* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
       [not found] ` <CANn89i+ekXHJSePzQ0rWx2KKqwYGTrok3-ZZ1RdEygVJcGDqRQ@mail.gmail.com>
@ 2014-11-10  1:59   ` Wei Yang
  2014-11-10  2:07     ` Wei Yang
  2014-11-10  2:46     ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Wei Yang @ 2014-11-10  1:59 UTC
  To: Eric Dumazet; +Cc: Wei Yang, Amir Vadai, David Miller, netdev

On Fri, Nov 07, 2014 at 07:38:15PM -0800, Eric Dumazet wrote:
>On Fri, Nov 7, 2014 at 6:57 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>> Eric and Amir
>>
>> I am testing the VF on PowerNV platform with 3.18-rc2.
>> After applying this patch I face some errors.
>>
>> First is the compiling error.
>>
>>     drivers/net/ethernet/mellanox/mlx4//en_tx.c: In function ‘mlx4_en_xmit’:
>>     drivers/net/ethernet/mellanox/mlx4//en_tx.c:802:8: error: ‘shinfo’ undeclared (first use in this function)
>>             shinfo->tx_flags & SKBTX_HW_TSTAMP)) {
>>             ^
>>     include/linux/compiler.h:160:42: note: in definition of macro ‘unlikely’
>>      # define unlikely(x) __builtin_expect(!!(x), 0)
>>                                               ^
>>     drivers/net/ethernet/mellanox/mlx4//en_tx.c:802:8: note: each undeclared identifier is reported only once for each function it appears in
>>             shinfo->tx_flags & SKBTX_HW_TSTAMP)) {
>>             ^
>>     include/linux/compiler.h:160:42: note: in definition of macro ‘unlikely’
>>      # define unlikely(x) __builtin_expect(!!(x), 0)
>>                                               ^
>>     make[1]: *** [drivers/net/ethernet/mellanox/mlx4//en_tx.o] Error 1
>>     make: *** [_module_drivers/net/ethernet/mellanox/mlx4/] Error 2
>>
>
>
>This compilation error seems strange.
>
>Are you sure your tree is pristine, not corrupted in any way ?

I believe I did the revert one by one with git revert.

>
>
>> I tried to fix this with following change:
>>
>>     [root@tian-lp1 3.18]# git diff
>>     diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/m
>>     index eaf23eb..d2f06a7 100644
>>     --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>>     +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>>     @@ -799,8 +799,8 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_dev
>>              * set flag for further reference
>>              */
>>             if (unlikely(ring->hwtstamp_tx_type == HWTSTAMP_TX_ON &&
>>     -                    shinfo->tx_flags & SKBTX_HW_TSTAMP)) {
>>     -               shinfo->tx_flags |= SKBTX_IN_PROGRESS;
>>     +                    skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)) {
>>     +               skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
>>                     tx_info->ts_requested = 1;
>>             }
>>
>> But seems to face another error.
>>
>
>I suspect your tree is not the official tree, I do not see how you got
>this compilation error.


I checked the upstream git tree again, and found this commit:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7dfa4b414d4eec8da56e44fb2b4aea3e549b092f


The shinfo local variable, however, is only introduced in this commit:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b9d8839a44092cb4268ef2813c34d5dbf3363603

In my local log, and in upstream as well, this one is applied after the
first one. The compile error does not disappear until I apply this one.

So you cannot reproduce this compile error on your side? You did a reset --hard
to the "Code cleanups" commit and did not see the error? That is strange.
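
For anyone who wants to double-check the ordering locally, something like the
following should work. It is only a sketch: applied_before is a helper name
made up for illustration, and it assumes it is run inside a clone of the
upstream tree that contains both commits.

```shell
#!/bin/sh
# applied_before A B: succeed if commit A is an ancestor of commit B,
# i.e. A sits earlier in the history than B.
# (The helper name is hypothetical; "git merge-base --is-ancestor" does the check.)
applied_before() {
    git merge-base --is-ancestor "$1" "$2"
}

# Example, run inside a kernel clone:
#   applied_before 7dfa4b414d4e b9d8839a4409 && echo "cleanup commit comes first"
```

If the check succeeds, the "Code cleanups" commit is built before the commit
that declares shinfo, which would match the compile error above.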

-- 
Richard Yang
Help you, Help me


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-10  1:59   ` Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path) Wei Yang
@ 2014-11-10  2:07     ` Wei Yang
  2014-11-10  2:46     ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Wei Yang @ 2014-11-10  2:07 UTC
  To: Eric Dumazet; +Cc: Amir Vadai, David Miller, netdev

This is the git output on my machine.

[ywywyang@tian-lp1 3.18]$ git status 
# On branch p8-sriov-3.18-rc2-mlx4
# Your branch is behind 'origin/p8-sriov-3.18-rc2-mlx4' by 14 commits, and can be fast-forwarded.
#   (use "git pull" to update your local branch)
#
nothing to commit, working directory clean
[ywywyang@tian-lp1 3.18]$ git oneline -40
40c4198 Revert "net/mlx4_en: Align tx path structures to cache lines"
7d071a0 Revert "net/mlx4_en: Avoid calling bswap in tx fast path"
5aa717e Revert "net/mlx4_en: tx_info allocated with kmalloc() instead of vmalloc
77ab7f7 Revert "net/mlx4_en: Avoid a cache line miss in TX completion for single
6fee4f6 Revert "net/mlx4_en: Use prefetch in tx path"
6509cd2 Revert "net/mlx4_en: Avoid false sharing in mlx4_en_en_process_tx_cq()"
50b7df1 Revert "net/mlx4_en: mlx4_en_xmit() reads ring->cons once, and ahead of 
a0cc8e8 Revert "net/mlx4_en: Use local var in tx flow for skb_shinfo(skb)"
8cc9b1d Revert "net/mlx4_en: Use local var for skb_headlen(skb)"
2894ad1 Revert "net/mlx4_en: tx_info->ts_requested was not cleared"
10f191f Revert "net/mlx4_en: Enable the compiler to make is_inline() inlined"
5b6e300 Revert "net/mlx4_en: Use the new tx_copybreak to set inline threshold"
277c194 Revert "net/mlx4_en: remove NETDEV_TX_BUSY"
3f626de Revert "net: add netdev_txq_bql_{enqueue, complete}_prefetchw() helpers"
c1d0c02 Revert "mlx4: fix race accessing page->_count"
cac7f24 Linux 3.18-rc2

As you can see, the current tree is clean and based on v3.18-rc2.


-- 
Richard Yang
Help you, Help me


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-10  1:59   ` Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path) Wei Yang
  2014-11-10  2:07     ` Wei Yang
@ 2014-11-10  2:46     ` Eric Dumazet
  2014-11-10  5:40       ` Wei Yang
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2014-11-10  2:46 UTC
  To: Wei Yang; +Cc: Eric Dumazet, Amir Vadai, David Miller, netdev

On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
> [...]

Okay, your message was not clear: I thought you had a compilation error
on the current tree.

The true story of these patches is that Mellanox split an initial big
chunk [1] I gave into multiple patches.

Maybe they missed that one patch did not actually compile.

[1] https://patchwork.ozlabs.org/patch/394256/

Now, it is done, there is nothing we can do.

I'll let Mellanox comment, but it looks like your hardware does not like
something.

Have you tried disabling BlueFlame?


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-10  2:46     ` Eric Dumazet
@ 2014-11-10  5:40       ` Wei Yang
  2014-11-10  8:00         ` Amir Vadai
  0 siblings, 1 reply; 11+ messages in thread
From: Wei Yang @ 2014-11-10  5:40 UTC
  To: Eric Dumazet; +Cc: Wei Yang, Eric Dumazet, Amir Vadai, David Miller, netdev

On Sun, Nov 09, 2014 at 06:46:14PM -0800, Eric Dumazet wrote:
>On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
>> [...]
>
>Okay, your message was not clear : I thought you had a compilation error
>on current tree.
>
>The true story of these patches is that Mellanox split an initial big
>chunk [1] I gave into multiple patches.
>
>Maybe they missed that one patch did not actually compile.
>
>[1] https://patchwork.ozlabs.org/patch/394256/
>
>Now, it is done, there is nothing we can do.
>
>I'll let Mellanox comment, but it looks like your hardware does not like
>something.
>
>Have you tried to disable Blue Frame ?
>

Yes, it looks like the PF works fine. But with the current FW I can't enable only the PF.

How do I disable BlueFlame? I am not clear about this.


-- 
Richard Yang
Help you, Help me


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-10  5:40       ` Wei Yang
@ 2014-11-10  8:00         ` Amir Vadai
  2014-11-11  1:57           ` Wei Yang
  0 siblings, 1 reply; 11+ messages in thread
From: Amir Vadai @ 2014-11-10  8:00 UTC
  To: Wei Yang, Eric Dumazet; +Cc: Eric Dumazet, David Miller, netdev

On 11/10/2014 7:40 AM, Wei Yang wrote:
> On Sun, Nov 09, 2014 at 06:46:14PM -0800, Eric Dumazet wrote:
>> On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
>>> On Fri, Nov 07, 2014 at 07:38:15PM -0800, Eric Dumazet wrote:
>>>> On Fri, Nov 7, 2014 at 6:57 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>>>> Eric and Amir
>>>>>

[...]

>>>
>>
>> Okay, your message was not clear : I thought you had a compilation error
>> on current tree.
>>
>> The true story of these patches is that Mellanox split an initial big
>> chunk [1] I gave into multiple patches.
>>
>> Maybe they missed that one patch did not actually compile.
>>
>> [1] https://patchwork.ozlabs.org/patch/394256/
>>
>> Now, it is done, there is nothing we can do.
>>
>> I'll let Mellanox comment, but it looks like your hardware does not like
>> something.
>>
>> Have you tried to disable Blue Frame ?
>>
> 
> Yep, looks the PF works fine. But the current FW I can't just enable the PF.
> 
> How to disable Blue Frame? I am not clear about this.
> 
Hi,

Let's make sure we're on the same page here:
1. There was a compilation problem that you fixed (Yes, it was my fault
- I discovered it just a minute after the code was applied).
2. When you're using SR-IOV, during initialization, you get a CQE error
with syndrome 0x2 on one of the probed VFs.
3. Regarding the BlueFlame - I don't see how it is related to the issue
that you see. But it is a very easy experiment. Issue: "ethtool
--set-priv-flags eth1 blueflame off"
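
To make the experiment in item 3 easy to repeat, here is a rough sketch. The
interface name eth1 comes from the command above; the run wrapper, the
blueflame_off function and the DRY_RUN variable are made up for illustration,
and the flag name is assumed to match what --show-priv-flags lists on your
system.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the private flags, turn BlueFlame off, then show the flags again.
blueflame_off() {
    dev=$1
    run ethtool --show-priv-flags "$dev"
    run ethtool --set-priv-flags "$dev" blueflame off
    run ethtool --show-priv-flags "$dev"
}

# Usage (needs root and an mlx4_en interface):
#   blueflame_off eth1
```

Running it once with DRY_RUN=1 only prints the three commands, which is a
cheap way to review them before touching the interface.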

Please send me the module parameters you used when loading mlx4_core, and a
full dmesg covering the loading of both mlx4_core and mlx4_en.
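
One possible way to collect the parameters is to read them back from sysfs
once the module is loaded. This is a sketch only: dump_module_params is a
made-up helper, and it lists only the parameters that the module exposes
under /sys/module.

```shell
#!/bin/sh
# Print name=value for every readable parameter of a loaded module.
# The optional second argument (sysfs root) exists only so the function can
# be pointed at a test directory; it defaults to /sys/module.
dump_module_params() {
    mod=$1
    root=${2:-/sys/module}
    for f in "$root/$mod/parameters"/*; do
        [ -r "$f" ] || continue      # also skips the unmatched glob pattern
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Usage:
#   dump_module_params mlx4_core
#   dmesg > mlx4-dmesg.txt
```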

Amir.


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-10  8:00         ` Amir Vadai
@ 2014-11-11  1:57           ` Wei Yang
  2014-11-11  6:49             ` Or Gerlitz
  2014-11-11  7:28             ` Amir Vadai
  0 siblings, 2 replies; 11+ messages in thread
From: Wei Yang @ 2014-11-11  1:57 UTC
  To: Amir Vadai; +Cc: Wei Yang, Eric Dumazet, Eric Dumazet, David Miller, netdev

On Mon, Nov 10, 2014 at 10:00:21AM +0200, Amir Vadai wrote:
>On 11/10/2014 7:40 AM, Wei Yang wrote:
>> On Sun, Nov 09, 2014 at 06:46:14PM -0800, Eric Dumazet wrote:
>>> On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
>>>> On Fri, Nov 07, 2014 at 07:38:15PM -0800, Eric Dumazet wrote:
>>>>> On Fri, Nov 7, 2014 at 6:57 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>>>>> Eric and Amir
>>>>>>
>
>[...]
>
>>>>
>>>
>>> Okay, your message was not clear : I thought you had a compilation error
>>> on current tree.
>>>
>>> The true story of these patches is that Mellanox split an initial big
>>> chunk [1] I gave into multiple patches.
>>>
>>> Maybe they missed that one patch did not actually compile.
>>>
>>> [1] https://patchwork.ozlabs.org/patch/394256/
>>>
>>> Now, it is done, there is nothing we can do.
>>>
>>> I'll let Mellanox comment, but it looks like your hardware does not like
>>> something.
>>>
>>> Have you tried to disable Blue Frame ?
>>>
>> 
>> Yep, looks the PF works fine. But the current FW I can't just enable the PF.
>> 
>> How to disable Blue Frame? I am not clear about this.
>> 
>Hi,
>
>Lets see that we're on the same page here:
>1. There was a compilation problem that you fixed (Yes, it was my fault
>- I just discovered it a minute after the code was applied).
>2. When you're using SR-IOV, during initialization, you get a CQE error
>with syndrome 0x2 on one of the probed VF's.

From the log, it seems so.

>3. Regarding the BlueFlame - I don't see how it is related to the issue
>that you see. But it is a very easy experiment. Issue: "ethtool
>--set-priv-flags eth1 blueflame off"

I tried this after mlx4_en was loaded and still see the CQE error.

>
>Please send me the module parameters you used when loading mlx4_core, a
>full dmesg with both mlx4_core and mlx4_en loading.

The command line I use is:
	modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2

The log I sent in the first mail is the full log: it includes the CQE error, one
watchdog warning, and then the CQE error printed periodically. What other
messages would you like me to capture?

This error is always reported from the VF. After the error, the other network
interface no longer seems to function.

>
>Amir.

-- 
Richard Yang
Help you, Help me


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-11  1:57           ` Wei Yang
@ 2014-11-11  6:49             ` Or Gerlitz
  2014-11-11  7:28             ` Amir Vadai
  1 sibling, 0 replies; 11+ messages in thread
From: Or Gerlitz @ 2014-11-11  6:49 UTC
  To: Wei Yang
  Cc: Amir Vadai, Eric Dumazet, Eric Dumazet, David Miller, Linux Netdev List

On Tue, Nov 11, 2014 at 3:57 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>Please send me the module parameters you used when loading mlx4_core, a
>>full dmesg with both mlx4_core and mlx4_en loading.
>
> The command line I use is:
>         modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2
>
> The log I sent in the first mail is the full log, including the CQE error, one
> warning in watchdog, and then print the CQE error periodicly. What else
> message you would like me to capture?
>
> And this error is reported from VF always. After the error, the other network
> interface seems can't function.

Guys, do we have here something which isn't bisectable (i.e. the code
doesn't work if you force it to an arbitrary commit within the
chain), or something which doesn't work as a whole? If the former,
we will work to do better next time. If the latter, Wei, as
Amir asked, please send us full logs from loading and running the driver
with the debug level set: for mlx4_core, load it with debug_level=1;
for mlx4_en, you should be able to raise the debug level using ethtool.
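
For the ethtool part, the message level is a bitmask of the NETIF_MSG_*
classes, and ethtool accepts it as a number via "-s <dev> msglvl <mask>". As a
sketch only, a helper such as the one below can build the mask from class
names. msglvl_mask is a made-up name, and which classes are actually useful
for this CQE error is an assumption.

```shell
#!/bin/sh
# Map ethtool msglvl class names to their NETIF_MSG_* bit values and OR them.
msglvl_mask() {
    mask=0
    for name in "$@"; do
        case $name in
            drv)       bit=0x0001 ;;
            probe)     bit=0x0002 ;;
            link)      bit=0x0004 ;;
            timer)     bit=0x0008 ;;
            ifdown)    bit=0x0010 ;;
            ifup)      bit=0x0020 ;;
            rx_err)    bit=0x0040 ;;
            tx_err)    bit=0x0080 ;;
            tx_queued) bit=0x0100 ;;
            intr)      bit=0x0200 ;;
            tx_done)   bit=0x0400 ;;
            rx_status) bit=0x0800 ;;
            pktdata)   bit=0x1000 ;;
            hw)        bit=0x2000 ;;
            wol)       bit=0x4000 ;;
            *) echo "unknown message class: $name" >&2; return 1 ;;
        esac
        mask=$(( $mask | $bit ))
    done
    printf '0x%04x\n' "$mask"
}

# Usage (as root; eth1 is an assumed interface name):
#   ethtool -s eth1 msglvl "$(msglvl_mask drv hw tx_err)"
```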

Or.


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-11  1:57           ` Wei Yang
  2014-11-11  6:49             ` Or Gerlitz
@ 2014-11-11  7:28             ` Amir Vadai
  2014-11-11  7:42               ` Wei Yang
  1 sibling, 1 reply; 11+ messages in thread
From: Amir Vadai @ 2014-11-11  7:28 UTC
  To: Wei Yang; +Cc: Amir Vadai, Eric Dumazet, Eric Dumazet, David Miller, netdev

On Tue, Nov 11, 2014 at 3:57 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
> On Mon, Nov 10, 2014 at 10:00:21AM +0200, Amir Vadai wrote:
>>On 11/10/2014 7:40 AM, Wei Yang wrote:
>>> On Sun, Nov 09, 2014 at 06:46:14PM -0800, Eric Dumazet wrote:
>>>> On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
>>>>> On Fri, Nov 07, 2014 at 07:38:15PM -0800, Eric Dumazet wrote:
>>>>>> On Fri, Nov 7, 2014 at 6:57 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>>>>>> Eric and Amir
>>>>>>>
>>
>>[...]
>>
>>>>>
>>>>
>>>> Okay, your message was not clear : I thought you had a compilation error
>>>> on current tree.
>>>>
>>>> The true story of these patches is that Mellanox split an initial big
>>>> chunk [1] I gave into multiple patches.
>>>>
>>>> Maybe they missed that one patch did not actually compile.
>>>>
>>>> [1] https://patchwork.ozlabs.org/patch/394256/
>>>>
>>>> Now, it is done, there is nothing we can do.
>>>>
>>>> I'll let Mellanox comment, but it looks like your hardware does not like
>>>> something.
>>>>
>>>> Have you tried to disable Blue Frame ?
>>>>
>>>
>>> Yep, looks the PF works fine. But the current FW I can't just enable the PF.
>>>
>>> How to disable Blue Frame? I am not clear about this.
>>>
>>Hi,
>>
>>Lets see that we're on the same page here:
>>1. There was a compilation problem that you fixed (Yes, it was my fault
>>- I just discovered it a minute after the code was applied).
>>2. When you're using SR-IOV, during initialization, you get a CQE error
>>with syndrome 0x2 on one of the probed VF's.
>
> From the log, seems yes.
>
>>3. Regarding the BlueFlame - I don't see how it is related to the issue
>>that you see. But it is a very easy experiment. Issue: "ethtool
>>--set-priv-flags eth1 blueflame off"
>
> I tried to use this after mlx4_en is loaded, still see the CQE error.
>
>>
>>Please send me the module parameters you used when loading mlx4_core, a
>>full dmesg with both mlx4_core and mlx4_en loading.
>
> The command line I use is:
>         modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2
>
> The log I sent in the first mail is the full log, including the CQE error, one
> warning in watchdog, and then print the CQE error periodicly. What else
> message you would like me to capture?

The log in the first mail has only mlx4_en messages. I would like to see
the full log, including the mlx4_core messages. And as Or suggested,
debug_level=1 could be useful here too.

>
> And this error is reported from VF always. After the error, the other network
> interface seems can't function.
>
>>
>>Amir.
>
> --
> Richard Yang
> Help you, Help me
>


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-11  7:28             ` Amir Vadai
@ 2014-11-11  7:42               ` Wei Yang
  2014-11-11  8:40                 ` Amir Vadai
  0 siblings, 1 reply; 11+ messages in thread
From: Wei Yang @ 2014-11-11  7:42 UTC
  To: Amir Vadai
  Cc: Wei Yang, Eric Dumazet, Eric Dumazet, David Miller, netdev, gerlitz.or

On Tue, Nov 11, 2014 at 09:28:34AM +0200, Amir Vadai wrote:
>On Tue, Nov 11, 2014 at 3:57 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>> On Mon, Nov 10, 2014 at 10:00:21AM +0200, Amir Vadai wrote:
>>>On 11/10/2014 7:40 AM, Wei Yang wrote:
>>>> On Sun, Nov 09, 2014 at 06:46:14PM -0800, Eric Dumazet wrote:
>>>>> On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
>>>>>> On Fri, Nov 07, 2014 at 07:38:15PM -0800, Eric Dumazet wrote:
>>>>>>> On Fri, Nov 7, 2014 at 6:57 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>>>>>>> Eric and Amir
>>>>>>>>
>>>
>>>[...]
>>>
>>>>>>
>>>>>
>>>>> Okay, your message was not clear : I thought you had a compilation error
>>>>> on current tree.
>>>>>
>>>>> The true story of these patches is that Mellanox split an initial big
>>>>> chunk [1] I gave into multiple patches.
>>>>>
>>>>> Maybe they missed that one patch did not actually compile.
>>>>>
>>>>> [1] https://patchwork.ozlabs.org/patch/394256/
>>>>>
>>>>> Now, it is done, there is nothing we can do.
>>>>>
>>>>> I'll let Mellanox comment, but it looks like your hardware does not like
>>>>> something.
>>>>>
>>>>> Have you tried to disable Blue Frame ?
>>>>>
>>>>
>>>> Yep, looks the PF works fine. But the current FW I can't just enable the PF.
>>>>
>>>> How to disable Blue Frame? I am not clear about this.
>>>>
>>>Hi,
>>>
>>>Lets see that we're on the same page here:
>>>1. There was a compilation problem that you fixed (Yes, it was my fault
>>>- I just discovered it a minute after the code was applied).
>>>2. When you're using SR-IOV, during initialization, you get a CQE error
>>>with syndrome 0x2 on one of the probed VF's.
>>
>> From the log, seems yes.
>>
>>>3. Regarding the BlueFlame - I don't see how it is related to the issue
>>>that you see. But it is a very easy experiment. Issue: "ethtool
>>>--set-priv-flags eth1 blueflame off"
>>
>> I tried to use this after mlx4_en is loaded, still see the CQE error.
>>
>>>
>>>Please send me the module parameters you used when loading mlx4_core, a
>>>full dmesg with both mlx4_core and mlx4_en loading.
>>
>> The command line I use is:
>>         modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2
>>
>> The log I sent in the first mail is the full log, including the CQE error, one
>> warning in watchdog, and then print the CQE error periodicly. What else
>> message you would like me to capture?
>
>The log in the first mail has only mlx4_en logs. I would like to see
>the full log, that has mlx4_core messages too. And as Or suggested,
>debug_level=1 could be useful here too.
>

Ah, you need the log from mlx4_core too. Ok, I will do it again.

BTW, how do I add debug_level=1 on the command line? Like this?

modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2 debug_level=1

But for mlx4_en, I am not sure I can raise the debug level with ethtool,
since the ethernet driver may not work properly. Actually, I am not sure how to
raise the level with ethtool at all. Could you give me an example?
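
A possible capture sequence, as a sketch only: the step wrapper, the
reload_mlx4_debug function and the DRY_RUN variable are made up for
illustration, and the unload step assumes that nothing else (mlx4_ib, for
example) is still holding mlx4_core.

```shell
#!/bin/sh
# Hypothetical wrapper: print the command when DRY_RUN=1, otherwise run it.
step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reload mlx4 with debug tracing so that dmesg holds one clean load sequence.
reload_mlx4_debug() {
    step modprobe -r mlx4_en mlx4_core   # unload the modules if they are loaded
    step dmesg -C                        # clear the kernel ring buffer
    step modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2 debug_level=1
    step modprobe mlx4_en
}

# Usage (as root), then save the log:
#   reload_mlx4_debug
#   dmesg > mlx4-debug.log
```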

>>
>> And this error is reported from VF always. After the error, the other network
>> interface seems can't function.
>>
>>>
>>>Amir.
>>
>> --
>> Richard Yang
>> Help you, Help me
>>

-- 
Richard Yang
Help you, Help me


* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-11  7:42               ` Wei Yang
@ 2014-11-11  8:40                 ` Amir Vadai
  2014-11-11  9:12                   ` Wei Yang
  0 siblings, 1 reply; 11+ messages in thread
From: Amir Vadai @ 2014-11-11  8:40 UTC (permalink / raw)
  To: Wei Yang
  Cc: Amir Vadai, Eric Dumazet, Eric Dumazet, David Miller, netdev, gerlitz.or

On Tue, Nov 11, 2014 at 9:42 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
> On Tue, Nov 11, 2014 at 09:28:34AM +0200, Amir Vadai wrote:
>>On Tue, Nov 11, 2014 at 3:57 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>> On Mon, Nov 10, 2014 at 10:00:21AM +0200, Amir Vadai wrote:
>>>>On 11/10/2014 7:40 AM, Wei Yang wrote:
>>>>> On Sun, Nov 09, 2014 at 06:46:14PM -0800, Eric Dumazet wrote:
>>>>>> On Mon, 2014-11-10 at 09:59 +0800, Wei Yang wrote:
>>>>>>> On Fri, Nov 07, 2014 at 07:38:15PM -0800, Eric Dumazet wrote:
>>>>>>>> On Fri, Nov 7, 2014 at 6:57 PM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>>>>>>>> Eric and Amir
>>>>>>>>>
>>>>
>>>>[...]
>>>>
>>>>>>>
>>>>>>
>>>>>> Okay, your message was not clear : I thought you had a compilation error
>>>>>> on current tree.
>>>>>>
>>>>>> The true story of these patches is that Mellanox split an initial big
>>>>>> chunk [1] I gave into multiple patches.
>>>>>>
>>>>>> Maybe they missed that one patch did not actually compile.
>>>>>>
>>>>>> [1] https://patchwork.ozlabs.org/patch/394256/
>>>>>>
>>>>>> Now, it is done, there is nothing we can do.
>>>>>>
>>>>>> I'll let Mellanox comment, but it looks like your hardware does not like
>>>>>> something.
>>>>>>
>>>>>> Have you tried to disable BlueFlame?
>>>>>>
>>>>>
>>>>> Yep, it looks like the PF works fine. But with the current FW I can't enable only the PF.
>>>>>
>>>>> How do I disable BlueFlame? I am not clear about this.
>>>>>
>>>>Hi,
>>>>
>>>>Let's make sure we're on the same page here:
>>>>1. There was a compilation problem that you fixed (Yes, it was my fault
>>>>- I just discovered it a minute after the code was applied).
>>>>2. When you're using SR-IOV, during initialization, you get a CQE error
>>>>with syndrome 0x2 on one of the probed VF's.
>>>
>>> From the log, seems yes.
>>>
>>>>3. Regarding the BlueFlame - I don't see how it is related to the issue
>>>>that you see. But it is a very easy experiment. Issue: "ethtool
>>>>--set-priv-flags eth1 blueflame off"
>>>
>>> I tried to use this after mlx4_en is loaded, still see the CQE error.
>>>
>>>>
>>>>Please send me the module parameters you used when loading mlx4_core, a
>>>>full dmesg with both mlx4_core and mlx4_en loading.
>>>
>>> The command line I use is:
>>>         modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2
>>>
>>> The log I sent in the first mail is the full log, including the CQE error, one
>>> warning in watchdog, and then the CQE error printed periodically. What other
>>> messages would you like me to capture?
>>
>>The log in the first mail has only mlx4_en logs. I would like to see
>>the full log, that has mlx4_core messages too. And as Or suggested,
>>debug_level=1 could be useful here too.
>>
>
> Ah, you need the log from mlx4_core too. Ok, I will do it again.
>
> BTW, how do I add debug_level=1 to the command line? Like this?
>
> modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2 debug_level=1
yes

>
> But for mlx4_en, I am not sure I could raise the debug level with ethtool,
> since the ethernet driver may not work properly. Actually I am not sure how to
> raise the level with ethtool. Could you give me an example?

# ethtool -s ens5f1d1 msglvl 0xffff
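
For reference, the msglvl value is a bitmask of the NETIF_MSG_* flags from
include/linux/netdevice.h. A minimal Python sketch that lists which message
classes a given value enables is below. The helper name decode_msglvl is
hypothetical and only for illustration; it is not part of ethtool or the driver.

```python
# NETIF_MSG_* bit values as defined in include/linux/netdevice.h,
# keyed by bit and named as ethtool names them.
NETIF_MSG_BITS = {
    0x0001: "drv",
    0x0002: "probe",
    0x0004: "link",
    0x0008: "timer",
    0x0010: "ifdown",
    0x0020: "ifup",
    0x0040: "rx_err",
    0x0080: "tx_err",
    0x0100: "tx_queued",
    0x0200: "intr",
    0x0400: "tx_done",
    0x0800: "rx_status",
    0x1000: "pktdata",
    0x2000: "hw",
    0x4000: "wol",
}


def decode_msglvl(value):
    """Return the names of the message classes enabled in a msglvl bitmask.

    decode_msglvl is a hypothetical helper for illustration only.
    """
    return [name for bit, name in NETIF_MSG_BITS.items() if value & bit]


if __name__ == "__main__":
    # 0xffff sets every defined class; list them one per line.
    for name in decode_msglvl(0xFFFF):
        print(name)
```

A narrower mask, for example only the tx_err and hw bits, should keep the log
smaller if 0xffff turns out to be too noisy.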

>
>>>
>>> And this error is always reported from the VF. After the error, the other
>>> network interface no longer seems to function.
>>>
>>>>
>>>>Amir.
>>>
>>> --
>>> Richard Yang
>>> Help you, Help me
>
> --
> Richard Yang
> Help you, Help me

* Re: Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path)
  2014-11-11  8:40                 ` Amir Vadai
@ 2014-11-11  9:12                   ` Wei Yang
  0 siblings, 0 replies; 11+ messages in thread
From: Wei Yang @ 2014-11-11  9:12 UTC (permalink / raw)
  To: Amir Vadai
  Cc: Wei Yang, Eric Dumazet, Eric Dumazet, David Miller, netdev, gerlitz.or

On Tue, Nov 11, 2014 at 10:40:24AM +0200, Amir Vadai wrote:
>On Tue, Nov 11, 2014 at 9:42 AM, Wei Yang <weiyang@linux.vnet.ibm.com> wrote:
>>>>>Hi,
>>>>>
>>>>>Let's make sure we're on the same page here:
>>>>>1. There was a compilation problem that you fixed (Yes, it was my fault
>>>>>- I just discovered it a minute after the code was applied).
>>>>>2. When you're using SR-IOV, during initialization, you get a CQE error
>>>>>with syndrome 0x2 on one of the probed VF's.
>>>>
>>>> From the log, seems yes.
>>>>
>>>>>3. Regarding the BlueFlame - I don't see how it is related to the issue
>>>>>that you see. But it is a very easy experiment. Issue: "ethtool
>>>>>--set-priv-flags eth1 blueflame off"
>>>>
>>>> I tried to use this after mlx4_en is loaded, still see the CQE error.
>>>>
>>>>>
>>>>>Please send me the module parameters you used when loading mlx4_core, a
>>>>>full dmesg with both mlx4_core and mlx4_en loading.
>>>>
>>>> The command line I use is:
>>>>         modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2
>>>>
>>>> The log I sent in the first mail is the full log, including the CQE error, one
>>>> warning in watchdog, and then the CQE error printed periodically. What other
>>>> messages would you like me to capture?
>>>
>>>The log in the first mail has only mlx4_en logs. I would like to see
>>>the full log, that has mlx4_core messages too. And as Or suggested,
>>>debug_level=1 could be useful here too.
>>>
>>
>> Ah, you need the log from mlx4_core too. Ok, I will do it again.
>>
>> BTW, how do I add debug_level=1 to the command line? Like this?
>>
>> modprobe mlx4_core num_vfs=1 probe_vf=1 port_type_array=2,2 debug_level=1
>yes
>

I have pasted the full log here, but I don't see the debug level taking effect.


[ 5723.612882] mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
[ 5723.612979] mlx4_core: Initializing 0003:05:00.0
[ 5723.613023] pci 0003:05: 0.1: [PE# 222] VF 0003:05:00.1 associated with PE#222
[ 5723.613299] pci 0003:05: 0.1: [PE# 222] Setting up 32-bit TCE table at 0..80000000
[ 5723.684140] pci 0003:05: 0.1: [PE# 222] Enabling 64-bit DMA bypass
[ 5723.684215] pci 0003:05: 0.2: [PE# 223] VF 0003:05:00.2 associated with PE#223
[ 5723.684409] pci 0003:05: 0.2: [PE# 223] Setting up 32-bit TCE table at 0..80000000
[ 5723.754871] pci 0003:05: 0.2: [PE# 223] Enabling 64-bit DMA bypass
[ 5723.754944] pci 0003:05: 0.3: [PE# 224] VF 0003:05:00.3 associated with PE#224
[ 5723.755150] pci 0003:05: 0.3: [PE# 224] Setting up 32-bit TCE table at 0..80000000
[ 5723.825646] pci 0003:05: 0.3: [PE# 224] Enabling 64-bit DMA bypass
[ 5723.825718] pci 0003:05: 0.4: [PE# 225] VF 0003:05:00.4 associated with PE#225
[ 5723.825911] pci 0003:05: 0.4: [PE# 225] Setting up 32-bit TCE table at 0..80000000
[ 5723.896342] pci 0003:05: 0.4: [PE# 225] Enabling 64-bit DMA bypass
[ 5723.896416] pci 0003:05: 0.5: [PE# 226] VF 0003:05:00.5 associated with PE#226
[ 5723.896610] pci 0003:05: 0.5: [PE# 226] Setting up 32-bit TCE table at 0..80000000
[ 5723.967052] pci 0003:05: 0.5: [PE# 226] Enabling 64-bit DMA bypass
[ 5723.967135] pci 0003:05: 0.6: [PE# 227] VF 0003:05:00.6 associated with PE#227
[ 5723.967328] pci 0003:05: 0.6: [PE# 227] Setting up 32-bit TCE table at 0..80000000
[ 5724.037755] pci 0003:05: 0.6: [PE# 227] Enabling 64-bit DMA bypass
[ 5724.037829] pci 0003:05: 0.7: [PE# 228] VF 0003:05:00.7 associated with PE#228
[ 5724.038022] pci 0003:05: 0.7: [PE# 228] Setting up 32-bit TCE table at 0..80000000
[ 5724.108452] pci 0003:05: 0.7: [PE# 228] Enabling 64-bit DMA bypass
[ 5724.108525] pci 0003:05: 1.0: [PE# 229] VF 0003:05:01.0 associated with PE#229
[ 5724.108718] pci 0003:05: 1.0: [PE# 229] Setting up 32-bit TCE table at 0..80000000
[ 5724.179209] pci 0003:05: 1.0: [PE# 229] Enabling 64-bit DMA bypass
[ 5724.179293] pci 0003:05: 1.1: [PE# 230] VF 0003:05:01.1 associated with PE#230
[ 5724.179486] pci 0003:05: 1.1: [PE# 230] Setting up 32-bit TCE table at 0..80000000
[ 5724.249946] pci 0003:05: 1.1: [PE# 230] Enabling 64-bit DMA bypass
[ 5724.250020] pci 0003:05: 1.2: [PE# 231] VF 0003:05:01.2 associated with PE#231
[ 5724.250214] pci 0003:05: 1.2: [PE# 231] Setting up 32-bit TCE table at 0..80000000
[ 5724.320703] pci 0003:05: 1.2: [PE# 231] Enabling 64-bit DMA bypass
[ 5724.320787] pci 0003:05: 1.3: [PE# 232] VF 0003:05:01.3 associated with PE#232
[ 5724.320982] pci 0003:05: 1.3: [PE# 232] Setting up 32-bit TCE table at 0..80000000
[ 5724.391424] pci 0003:05: 1.3: [PE# 232] Enabling 64-bit DMA bypass
[ 5724.391497] pci 0003:05: 1.4: [PE# 233] VF 0003:05:01.4 associated with PE#233
[ 5724.391690] pci 0003:05: 1.4: [PE# 233] Setting up 32-bit TCE table at 0..80000000
[ 5724.462129] pci 0003:05: 1.4: [PE# 233] Enabling 64-bit DMA bypass
[ 5724.462215] pci 0003:05: 1.5: [PE# 234] VF 0003:05:01.5 associated with PE#234
[ 5724.462408] pci 0003:05: 1.5: [PE# 234] Setting up 32-bit TCE table at 0..80000000
[ 5724.532834] pci 0003:05: 1.5: [PE# 234] Enabling 64-bit DMA bypass
[ 5724.532907] pci 0003:05: 1.6: [PE# 235] VF 0003:05:01.6 associated with PE#235
[ 5724.533100] pci 0003:05: 1.6: [PE# 235] Setting up 32-bit TCE table at 0..80000000
[ 5724.603556] pci 0003:05: 1.6: [PE# 235] Enabling 64-bit DMA bypass
[ 5724.603628] pci 0003:05: 1.7: [PE# 236] VF 0003:05:01.7 associated with PE#236
[ 5724.603822] pci 0003:05: 1.7: [PE# 236] Setting up 32-bit TCE table at 0..80000000
[ 5724.674261] pci 0003:05: 1.7: [PE# 236] Enabling 64-bit DMA bypass
[ 5724.674334] pci 0003:05: 2.0: [PE# 237] VF 0003:05:02.0 associated with PE#237
[ 5724.674528] pci 0003:05: 2.0: [PE# 237] Setting up 32-bit TCE table at 0..80000000
[ 5724.745012] pci 0003:05: 2.0: [PE# 237] Enabling 64-bit DMA bypass
[ 5724.745088] pci 0003:05: 2.1: [PE# 238] VF 0003:05:02.1 associated with PE#238
[ 5724.745281] pci 0003:05: 2.1: [PE# 238] Setting up 32-bit TCE table at 0..80000000
[ 5724.815714] pci 0003:05: 2.1: [PE# 238] Enabling 64-bit DMA bypass
[ 5724.815786] pci 0003:05: 2.2: [PE# 239] VF 0003:05:02.2 associated with PE#239
[ 5724.815979] pci 0003:05: 2.2: [PE# 239] Setting up 32-bit TCE table at 0..80000000
[ 5724.886413] pci 0003:05: 2.2: [PE# 239] Enabling 64-bit DMA bypass
[ 5724.886487] pci 0003:05: 2.3: [PE# 240] VF 0003:05:02.3 associated with PE#240
[ 5724.886680] pci 0003:05: 2.3: [PE# 240] Setting up 32-bit TCE table at 0..80000000
[ 5724.957126] pci 0003:05: 2.3: [PE# 240] Enabling 64-bit DMA bypass
[ 5724.957210] pci 0003:05: 2.4: [PE# 241] VF 0003:05:02.4 associated with PE#241
[ 5724.957403] pci 0003:05: 2.4: [PE# 241] Setting up 32-bit TCE table at 0..80000000
[ 5725.027842] pci 0003:05: 2.4: [PE# 241] Enabling 64-bit DMA bypass
[ 5725.027917] pci 0003:05: 2.5: [PE# 242] VF 0003:05:02.5 associated with PE#242
[ 5725.028109] pci 0003:05: 2.5: [PE# 242] Setting up 32-bit TCE table at 0..80000000
[ 5725.098551] pci 0003:05: 2.5: [PE# 242] Enabling 64-bit DMA bypass
[ 5725.098624] pci 0003:05: 2.6: [PE# 243] VF 0003:05:02.6 associated with PE#243
[ 5725.098818] pci 0003:05: 2.6: [PE# 243] Setting up 32-bit TCE table at 0..80000000
[ 5725.169255] pci 0003:05: 2.6: [PE# 243] Enabling 64-bit DMA bypass
[ 5725.169338] pci 0003:05: 2.7: [PE# 244] VF 0003:05:02.7 associated with PE#244
[ 5725.169532] pci 0003:05: 2.7: [PE# 244] Setting up 32-bit TCE table at 0..80000000
[ 5725.239962] pci 0003:05: 2.7: [PE# 244] Enabling 64-bit DMA bypass
[ 5725.240036] pci 0003:05: 3.0: [PE# 245] VF 0003:05:03.0 associated with PE#245
[ 5725.240229] pci 0003:05: 3.0: [PE# 245] Setting up 32-bit TCE table at 0..80000000
[ 5725.310678] pci 0003:05: 3.0: [PE# 245] Enabling 64-bit DMA bypass
[ 5725.310751] pci 0003:05: 3.1: [PE# 246] VF 0003:05:03.1 associated with PE#246
[ 5725.310944] pci 0003:05: 3.1: [PE# 246] Setting up 32-bit TCE table at 0..80000000
[ 5725.381375] pci 0003:05: 3.1: [PE# 246] Enabling 64-bit DMA bypass
[ 5725.381459] pci 0003:05: 3.2: [PE# 247] VF 0003:05:03.2 associated with PE#247
[ 5725.381653] pci 0003:05: 3.2: [PE# 247] Setting up 32-bit TCE table at 0..80000000
[ 5725.452087] pci 0003:05: 3.2: [PE# 247] Enabling 64-bit DMA bypass
[ 5725.452171] pci 0003:05: 3.3: [PE# 248] VF 0003:05:03.3 associated with PE#248
[ 5725.452365] pci 0003:05: 3.3: [PE# 248] Setting up 32-bit TCE table at 0..80000000
[ 5725.522815] pci 0003:05: 3.3: [PE# 248] Enabling 64-bit DMA bypass
[ 5725.522889] pci 0003:05: 3.4: [PE# 249] VF 0003:05:03.4 associated with PE#249
[ 5725.523082] pci 0003:05: 3.4: [PE# 249] Setting up 32-bit TCE table at 0..80000000
[ 5725.593516] pci 0003:05: 3.4: [PE# 249] Enabling 64-bit DMA bypass
[ 5725.593590] pci 0003:05: 3.5: [PE# 250] VF 0003:05:03.5 associated with PE#250
[ 5725.593783] pci 0003:05: 3.5: [PE# 250] Setting up 32-bit TCE table at 0..80000000
[ 5725.664330] pci 0003:05: 3.5: [PE# 250] Enabling 64-bit DMA bypass
[ 5725.664403] pci 0003:05: 3.6: [PE# 251] VF 0003:05:03.6 associated with PE#251
[ 5725.664597] pci 0003:05: 3.6: [PE# 251] Setting up 32-bit TCE table at 0..80000000
[ 5725.735081] pci 0003:05: 3.6: [PE# 251] Enabling 64-bit DMA bypass
[ 5725.735154] pci 0003:05: 3.7: [PE# 252] VF 0003:05:03.7 associated with PE#252
[ 5725.735348] pci 0003:05: 3.7: [PE# 252] Setting up 32-bit TCE table at 0..80000000
[ 5725.805792] pci 0003:05: 3.7: [PE# 252] Enabling 64-bit DMA bypass
[ 5725.805865] pci 0003:05: 4.0: [PE# 253] VF 0003:05:04.0 associated with PE#253
[ 5725.806058] pci 0003:05: 4.0: [PE# 253] Setting up 32-bit TCE table at 0..80000000
[ 5725.876496] pci 0003:05: 4.0: [PE# 253] Enabling 64-bit DMA bypass
[ 5725.876570] pci 0003:05: 4.1: [PE# 254] VF 0003:05:04.1 associated with PE#254
[ 5725.876763] pci 0003:05: 4.1: [PE# 254] Setting up 32-bit TCE table at 0..80000000
[ 5725.945133] pci 0003:05: 4.1: [PE# 254] Enabling 64-bit DMA bypass
[ 5725.945212] pci 0003:05: 4.2: [PE# 255] VF 0003:05:04.2 associated with PE#255
[ 5725.945316] pci 0003:05: 4.2: [PE# 255] Setting up 32-bit TCE table at 0..80000000
[ 5726.006975] pci 0003:05: 4.2: [PE# 255] Enabling 64-bit DMA bypass
[ 5726.007163] mlx4_core 0003:05:00.0: Using 64-bit DMA iommu bypass
[ 5726.007227] mlx4_core 0003:05:00.0: Enabling SR-IOV with 1 VFs
[ 5726.116742] mlx4_core: Initializing 0003:05:00.1
[ 5726.116825] mlx4_core 0003:05:00.1: enabling device (0000 -> 0002)
[ 5726.116914] mlx4_core 0003:05:00.1: Using 64-bit DMA iommu bypass
[ 5726.116976] mlx4_core 0003:05:00.1: Detected virtual function - running in slave mode
[ 5726.117191] mlx4_core 0003:05:00.1: PF is not ready - Deferring probe
[ 5726.117276] pci 0003:05:00.1: Driver mlx4_core requests probe deferral
[ 5726.117363] mlx4_core 0003:05:00.0: Running in master mode
[ 5731.123148] mlx4_core 0003:05:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
[ 5731.123227] mlx4_core 0003:05:00.0: PCIe link width is x8, device supports x8
[ 5731.151307] mlx4_core: Initializing 0003:05:00.1
[ 5731.151389] mlx4_core 0003:05:00.1: enabling device (0000 -> 0002)
[ 5731.151501] mlx4_core 0003:05:00.1: Using 64-bit DMA iommu bypass
[ 5731.151559] mlx4_core 0003:05:00.1: Detected virtual function - running in slave mode
[ 5731.151636] mlx4_core 0003:05:00.1: Sending reset
[ 5731.151770] mlx4_core 0003:05:00.0: Received reset from slave:1
[ 5731.152189] mlx4_core 0003:05:00.1: Sending vhcr0
[ 5731.152991] mlx4_core 0003:05:00.1: HCA minimum page size:512
[ 5731.153444] mlx4_core 0003:05:00.1: Timestamping is not supported in slave mode
[ 5731.229958] mlx4_en: Mellanox ConnectX HCA Ethernet driver v2.2-1 (Feb 2014)
[ 5731.230572] mlx4_en 0003:05:00.0: registered PHC clock
[ 5731.230817] mlx4_en 0003:05:00.0: Activating port:1
[ 5731.262377] mlx4_en: eth12: Using 256 TX rings
[ 5731.262494] mlx4_en: eth12: Using 4 RX rings
[ 5731.262546] mlx4_en: eth12:   frag:0 - size:1518 prefix:0 align:0 stride:1536
[ 5731.262761] mlx4_en: eth12: Initializing port
[ 5731.263357] mlx4_en 0003:05:00.1: Activating port:1
[ 5731.263467] mlx4_en: 0003:05:00.1: Port 1: Assigned random MAC address d6:0f:0b:57:36:24
[ 5731.442926] mlx4_en: eth13: Using 256 TX rings
[ 5731.443010] mlx4_en: eth13: Using 4 RX rings
[ 5731.443018] mlx4_en: eth13:   frag:0 - size:1518 prefix:0 align:0 stride:1536
[ 5731.443304] mlx4_en: eth13: Initializing port
[ 5732.096548] mlx4_en: eth13:   frag:0 - size:1518 prefix:0 align:0 stride:1536
[ 5732.452283] IPv6: ADDRCONF(NETDEV_UP): eth13: link is not ready
[ 5732.951420] mlx4_en: eth12: Link Up
[ 5732.951477] mlx4_en: eth13: Link Up
[ 5732.951627] IPv6: ADDRCONF(NETDEV_CHANGE): eth13: link becomes ready
[ 5733.171020] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
[ 5733.315756] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
[ 5734.165387] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
[ 5748.045084] NETDEV WATCHDOG: eth13 (mlx4_core): transmit queue 1 timed out
[ 5748.045245] ------------[ cut here ]------------
[ 5748.045301] WARNING: at net/sched/sch_generic.c:303
[ 5748.045355] Modules linked in: mlx4_en(O) mlx4_core nf_conntrack_ipv6 nf_defrag_ipv6 bnep nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw be2net tg3 ptp pps_core nfsd nfs_acl ses enclosure kvm_hv kvm_pr kvm lockd grace binfmt_misc shpchp sunrpc uinput lpfc ipr scsi_transport_fc [last unloaded: mlx4_core]
[ 5748.046069] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P           O   3.18.0-rc2yw+ #161
[ 5748.046142] task: c0000027ef4f92d0 ti: c0000027ef574000 task.ti: c0000027ef574000
[ 5748.046215] NIP: c00000000079a9d4 LR: c00000000079a9d0 CTR: 0000000030044878
[ 5748.046286] REGS: c0000027ef5772d0 TRAP: 0700   Tainted: P           O    (3.18.0-rc2yw+)
[ 5748.046357] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI>  CR: 28004024  XER: 00000000
[ 5748.046524] CFAR: c0000000008be150 SOFTE: 1 
GPR00: c00000000079a9d0 c0000027ef577550 c00000000147acf0 000000000000003e 
GPR04: c000002004145888 c000002004156240 9000000000009032 0000000000000029 
GPR08: c000000000d1acf0 0000000000000000 0000002003430000 0000000030044954 
GPR12: 0000000028004022 c00000000fdc0900 0000000000000001 c0000000008f10a8 
GPR16: c0000027ef795438 c0000027ef795838 c0000027ef795c38 0000000000000000 
GPR20: c0000027ef795038 c0000000014b2200 0000000000000000 0000000000000000 
GPR24: 0000000000000000 ffffffffffffffff 0000000000000000 0000000000000001 
GPR28: 0000000000000004 c0000000014b2200 c000002781340000 0000000000000001 
[ 5748.047435] NIP [c00000000079a9d4] .dev_watchdog+0x354/0x370
[ 5748.047496] LR [c00000000079a9d0] .dev_watchdog+0x350/0x370
[ 5748.047544] Call Trace:
[ 5748.047570] [c0000027ef577550] [c00000000079a9d0] .dev_watchdog+0x350/0x370 (unreliable)
[ 5748.047655] [c0000027ef577600] [c0000000001275a4] .call_timer_fn+0x64/0x190
[ 5748.047727] [c0000027ef5776b0] [c00000000012814c] .run_timer_softirq+0x2bc/0x3d0
[ 5748.047812] [c0000027ef5777b0] [c0000000000ad738] .__do_softirq+0x168/0x390
[ 5748.047885] [c0000027ef5778b0] [c0000000000adcc8] .irq_exit+0xc8/0x110
[ 5748.047958] [c0000027ef577930] [c00000000001d15c] .timer_interrupt+0x9c/0xd0
[ 5748.048030] [c0000027ef5779b0] [c000000000002658] decrementer_common+0x158/0x180
[ 5748.048116] --- interrupt: 901 at .arch_local_irq_restore+0x74/0x90
[ 5748.048116]     LR = .arch_local_irq_restore+0x74/0x90
[ 5748.048224] [c0000027ef577ca0] [c0000027ef577d30] 0xc0000027ef577d30 (unreliable)
[ 5748.048309] [c0000027ef577d10] [c00000000070887c] .cpuidle_enter_state+0xac/0x260
[ 5748.048393] [c0000027ef577dd0] [c0000000000f6c90] .cpu_startup_entry+0x410/0x460
[ 5748.048478] [c0000027ef577ed0] [c000000000040b40] .start_secondary+0x310/0x340
[ 5748.048562] [c0000027ef577f90] [c000000000008b6c] start_secondary_prolog+0x10/0x14
[ 5748.048643] Instruction dump:
[ 5748.048680] 994d02d4 4bffff10 7fc3f378 4bfd5ff1 60000000 7fc4f378 7fe6fb78 7c651b78 
[ 5748.048799] 3c62ff75 3863afb8 48123721 60000000 <0fe00000> 39200001 3d02fff0 99282b88 
[ 5748.048920] ---[ end trace a04deb5eef8eba41 ]---
[ 5748.385369] mlx4_en: eth13:   frag:0 - size:1518 prefix:0 align:0 stride:1536
[ 5760.841519] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
[ 5781.165457] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
[ 5781.315631] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
[ 5782.165413] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2
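
To summarize the CQE errors in a log like the one above, a small Python sketch
may help. The function name summarize_cqe_errors and the regular expression are
assumptions based only on the line format shown in this log; this is not an
mlx4 tool.

```python
import re
from collections import Counter

# Pattern assumed from the "CQE error" lines in the pasted dmesg output.
CQE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+mlx4_en: (?P<dev>\S+): CQE error - "
    r"vendor syndrome: (?P<vendor>0x[0-9a-fA-F]+) "
    r"syndrome: (?P<syndrome>0x[0-9a-fA-F]+)"
)


def summarize_cqe_errors(log_text):
    """Count CQE error lines per (device, vendor syndrome, syndrome)."""
    counts = Counter()
    for match in CQE_RE.finditer(log_text):
        key = (match.group("dev"), match.group("vendor"), match.group("syndrome"))
        counts[key] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "[ 5733.171020] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2\n"
        "[ 5733.315756] mlx4_en: eth13: CQE error - vendor syndrome: 0x6f syndrome: 0x2\n"
        "[ 5748.045084] NETDEV WATCHDOG: eth13 (mlx4_core): transmit queue 1 timed out\n"
    )
    for (dev, vendor, syndrome), n in summarize_cqe_errors(sample).items():
        print(dev, vendor, syndrome, n)
```

Feeding it the saved dmesg output should show whether every error comes from
the VF interface with the same syndrome pair.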



>>
>> But for mlx4_en, I am not sure I could raise the debug level with ethtool,
>> since the ethernet driver may not work properly. Actually I am not sure how to
>> raise the level with ethtool. Could you give me an example?
>
># ethtool -s ens5f1d1 msglvl 0xffff
>
>>
>>>>
>>>> And this error is always reported from the VF. After the error, the other
>>>> network interface no longer seems to function.
>>>>
>>>>>
>>>>>Amir.
>>>>
>>>> --
>>>> Richard Yang
>>>> Help you, Help me
>>
>> --
>> Richard Yang
>> Help you, Help me

-- 
Richard Yang
Help you, Help me

Thread overview: 11+ messages
     [not found] <20141108025758.GA13875@richard>
     [not found] ` <CANn89i+ekXHJSePzQ0rWx2KKqwYGTrok3-ZZ1RdEygVJcGDqRQ@mail.gmail.com>
2014-11-10  1:59   ` Face some error after applying commit 7dfa4b414d4(net/mlx4_en: Code cleanups in tx path) Wei Yang
2014-11-10  2:07     ` Wei Yang
2014-11-10  2:46     ` Eric Dumazet
2014-11-10  5:40       ` Wei Yang
2014-11-10  8:00         ` Amir Vadai
2014-11-11  1:57           ` Wei Yang
2014-11-11  6:49             ` Or Gerlitz
2014-11-11  7:28             ` Amir Vadai
2014-11-11  7:42               ` Wei Yang
2014-11-11  8:40                 ` Amir Vadai
2014-11-11  9:12                   ` Wei Yang
