From: "lihuisong (C)" <lihuisong@huawei.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>,
	"Min Hu (Connor)" <humin29@huawei.com>, <dev@dpdk.org>
Cc: <thomas@monjalon.net>, <xiaoyun.li@intel.com>,
	Radu Nicolau <radu.nicolau@intel.com>,
	"Singh, Aman Deep" <aman.deep.singh@intel.com>
Subject: Re: [dpdk-dev] [PATCH 1/3] app/testpmd: fix port status of active slave device
Date: Tue, 8 Feb 2022 09:19:45 +0800
Message-ID: <c12c4c70-2c21-0e3a-2153-865be6e1f698@huawei.com>
In-Reply-To: <a793b6f1-4820-f4cd-7119-e14ccd7f6700@intel.com>


On 2022/2/4 20:07, Ferruh Yigit wrote:
> On 10/25/2021 7:39 AM, Min Hu (Connor) wrote:
>> From: Huisong Li <lihuisong@huawei.com>
>>
>> Stopping a bond device also stops all active slaves under the bond 
>> device.
>> If this port is bond device, we need to modify the port status of all
>> slaves from RTE_PORT_STARTED to RTE_PORT_STOPPED.
>>
>> Fixes: 0e545d3047fe ("app/testpmd: check stopping port is not in 
>> bonding")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
>> ---
>>   app/test-pmd/cmdline.c |  1 +
>>   app/test-pmd/testpmd.c | 49 +++++++++++++++++++++++++++++++++++++++---
>>   app/test-pmd/testpmd.h |  3 ++-
>>   3 files changed, 49 insertions(+), 4 deletions(-)
>>
>> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
>> index 722f4fb9d9..5bfb4b509b 100644
>> --- a/app/test-pmd/cmdline.c
>> +++ b/app/test-pmd/cmdline.c
>> @@ -6639,6 +6639,7 @@ static void 
>> cmd_create_bonded_device_parsed(void *parsed_result,
>>                   "Failed to enable promiscuous mode for port %u: %s 
>> - ignore\n",
>>                   port_id, rte_strerror(-ret));
>>   +        ports[port_id].bond_flag = 1;
>>           ports[port_id].need_setup = 0;
>>           ports[port_id].port_status = RTE_PORT_STOPPED;
>>       }
>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
>> index af0e79fe6d..d6b9ebc4dd 100644
>> --- a/app/test-pmd/testpmd.c
>> +++ b/app/test-pmd/testpmd.c
>> @@ -65,6 +65,9 @@
>>   #ifdef RTE_EXEC_ENV_WINDOWS
>>   #include <process.h>
>>   #endif
>> +#ifdef RTE_NET_BOND
>> +#include <rte_eth_bond.h>
>> +#endif
>>     #include "testpmd.h"
>>   @@ -2986,6 +2989,35 @@ start_port(portid_t pid)
>>       return 0;
>>   }
>>   +#ifdef RTE_NET_BOND
>> +static void
>> +change_bonding_active_slave_port_status(portid_t bond_pid)
>
> The function sets the status explicitly to PORT_STOPPED, but function
> name is more generic, should we update the function name to reflect the
> functionality?
ok
>
>> +{
>> +    portid_t slave_pids[RTE_MAX_ETHPORTS];
>> +    struct rte_port *port;
>> +    int num_active_slaves;
>> +    portid_t slave_pid;
>> +    int i;
>> +
>> +    num_active_slaves = rte_eth_bond_active_slaves_get(bond_pid, 
>> slave_pids,
>> +                               RTE_MAX_ETHPORTS);
>> +    if (num_active_slaves < 0) {
>> +        fprintf(stderr, "Failed to get slave list for port = %u\n",
>> +            bond_pid);
>> +        return;
>> +    }
>> +
>> +    for (i = 0; i < num_active_slaves; i++) {
>> +        slave_pid = slave_pids[i];
>> +        port = &ports[slave_pid];
>> +        if (rte_atomic16_cmpset(&(port->port_status),
>> +            RTE_PORT_STARTED, RTE_PORT_STOPPED) == 0)
>> +            fprintf(stderr, "Port %u can not be set into stopped\n",
>> +                slave_pid);
>> +    }
>> +}
>> +#endif
>> +
>>   void
>>   stop_port(portid_t pid)
>>   {
>> @@ -3042,9 +3074,20 @@ stop_port(portid_t pid)
>>           if (port->flow_list)
>>               port_flow_flush(pi);
>>   -        if (eth_dev_stop_mp(pi) != 0)
>> -            RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
>> -                pi);
>
> Can you please remove the 'eth_dev_stop_mp()' function in this patch,
> which is removed in patch 2/3.
ok
>
>> +        if (is_proc_primary()) {
>> +#ifdef RTE_NET_BOND
>> +            /*
>> +             * Stopping a bond device also stops all active slaves
>> +             * under the bond device. If this port is bond device,
>> +             * we need to modify the port status of all slaves.
>> +             */
>> +            if (port->bond_flag == 1)
>> +                change_bonding_active_slave_port_status(pi);
>> +#endif
>> +            if (rte_eth_dev_stop(pi) != 0)
>> +                RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port 
>> %u\n",
>> +                    pi);
>
> Should we roll back the slave port status if 'rte_eth_dev_stop(pi)' 
> fails?
Yes, a rollback is necessary here, for the case where dev_stop() fails
for a slave in the bonding driver.
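
A rough sketch of what such a rollback could look like, written as
standalone C so it can be tried without DPDK. Everything here is a
hypothetical stand-in: port_status[] models ports[].port_status,
port_cmpset() models rte_atomic16_cmpset(), and stop_fn models
rte_eth_dev_stop(). It is not the code of this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for testpmd's port states and ports[] table. */
enum { PORT_STOPPED, PORT_STARTED };
#define MAX_PORTS 8

static _Atomic int port_status[MAX_PORTS];

/* Atomically change one port from 'from' to 'to'; true if it changed. */
static bool
port_cmpset(int pid, int from, int to)
{
	int expected = from;

	assert(pid >= 0 && pid < MAX_PORTS);
	return atomic_compare_exchange_strong(&port_status[pid], &expected, to);
}

/*
 * Mark started slaves as stopped, stop the bond, and undo the marking
 * if the stop fails. 'stop_fn' stands in for rte_eth_dev_stop().
 */
static int
stop_bond_port(int bond_pid, const int *slaves, int n, int (*stop_fn)(int))
{
	bool changed[MAX_PORTS] = { false };
	int ret, i;

	assert(n >= 0 && n <= MAX_PORTS);
	for (i = 0; i < n; i++)
		changed[i] = port_cmpset(slaves[i], PORT_STARTED, PORT_STOPPED);

	ret = stop_fn(bond_pid);
	if (ret != 0) {
		/* Roll back only the slaves this call actually changed. */
		for (i = 0; i < n; i++)
			if (changed[i])
				port_cmpset(slaves[i], PORT_STOPPED, PORT_STARTED);
	}
	return ret;
}

/* Fake stop callbacks for trying the sketch out. */
static int stop_ok(int pid) { (void)pid; return 0; }
static int stop_fail(int pid) { (void)pid; return -1; }
```

Recording which slaves were changed keeps the rollback from marking a
slave as started if it was already stopped before the call.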

By the way, while thinking about this, I found a behavior that does not
seem reasonable: only active slaves are stopped when a bonding device is
stopped. This can cause confusion about port status. For example, an
application has to set only the active slaves' status to
RTE_PORT_STOPPED, while the status of non-active slaves remains
RTE_PORT_STARTED.
I think the bonding PMD should stop all slaves when a bonding device is
stopped.
I checked the history of this code in the bonding PMD. The behavior was
introduced by the following patch.

/*
commit 0911d4ec01839c9149a0df5758d00d9d57a47cea
Author: Radu Nicolau <radu.nicolau@intel.com>
Date:   Thu Nov 8 15:26:42 2018 +0000

     net/bonding: fix crash when stopping mode 4 port

     When stopping a bonded port all slaves are deactivated. Attempting
     to deactivate a slave that was never activated will result in a 
segfault
     when mode 4 is used.

     Fixes: 7486331308f6 ("net/bonding: stop and deactivate slaves on stop")
     Cc: stable@dpdk.org

     Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
     Acked-by: Chas Williams <chas3@att.com>
*/

The root cause of the problem described in that patch is that, in mode 4,
the bonding PMD does not allocate Rx/Tx rings for non-active slave
devices. The call stack is as follows:
#0  0x0000000000b1250c in rte_ring_dequeue_bulk_elem (available=0x0, n=1, esize=8, obj_table=0xffffffff7c80, r=0x0) at ../dpdk-next-net/lib/ring/rte_ring_elem.h:380
#1  rte_ring_dequeue_elem (esize=8, obj_p=0xffffffff7c80, r=0x0) at ../dpdk-next-net/lib/ring/rte_ring_elem.h:476
#2  rte_ring_dequeue (obj_p=0xffffffff7c80, r=0x0) at ../dpdk-next-net/lib/ring/rte_ring.h:463
#3  bond_mode_8023ad_deactivate_slave (bond_dev=0x4753200 <rte_eth_devices+33024>, slave_id=0) at ../dpdk-next-net/drivers/net/bonding/rte_eth_bond_8023ad.c:1163
#4  0x0000000000b29e10 in deactivate_slave (eth_dev=0x4753200 <rte_eth_devices+33024>, port_id=0) at ../dpdk-next-net/drivers/net/bonding/rte_eth_bond_api.c:117
#5  0x0000000000b44208 in bond_ethdev_stop (eth_dev=0x4753200 <rte_eth_devices+33024>) at ../dpdk-next-net/drivers/net/bonding/rte_eth_bond_pmd.c:2103
#6  0x00000000007966fc in rte_eth_dev_stop (port_id=2) at ../dpdk-next-net/lib/ethdev/rte_ethdev.c:1894
#7  0x000000000055ea60 in eth_dev_stop_mp (port_id=2) at ../dpdk-next-net/app/test-pmd/testpmd.c:613
#8  0x0000000000565230 in stop_port (pid=2) at ../dpdk-next-net/app/test-pmd/testpmd.c:3059
#9  0x00000000004f7614 in cmd_operate_specific_port_parsed (parsed_result=0xffffffff91b0, cl=0x4829250, data=0x0) at ../dpdk-next-net/app/test-pmd/cmdline.c:1261
#10 0x000000000078be24 in cmdline_parse (cl=0x4829250, buf=0x4829298 "port stop 2\n") at ../dpdk-next-net/lib/cmdline/cmdline_parse.c:290
#11 0x0000000000789c34 in cmdline_valid_buffer (rdl=0x4829260, buf=0x4829298 "port stop 2\n", size=13) at ../dpdk-next-net/lib/cmdline/cmdline.c:26
#12 0x000000000078f160 in rdline_char_in (rdl=0x4829260, c=10 '\n') at ../dpdk-next-net/lib/cmdline/cmdline_rdline.c:446
#13 0x000000000078a0c8 in cmdline_in (cl=0x4829250, buf=0xfffffffff2e7 "\n", size=1) at ../dpdk-next-net/lib/cmdline/cmdline.c:148
#14 0x000000000078a3b4 in cmdline_interact (cl=0x4829250) at ../dpdk-next-net/lib/cmdline/cmdline.c:222
#15 0x000000000050bf98 in prompt () at ../dpdk-next-net/app/test-pmd/cmdline.c:18001
#16 0x00000000005687c4 in main (argc=4, argv=0xfffffffff510) at ../dpdk-next-net/app/test-pmd/testpmd.c:4268

To fix the problem Radu encountered, we only need to ensure that
non-active slaves are not deactivated.
I plan to add a patch to this patchset to fix this.
What do you think, Ferruh?
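
To make the idea concrete, here is a small standalone model of the
intended stop behavior: every slave is stopped, but only slaves on the
active list are deactivated. The structures and function names below are
simplified and hypothetical; they are not the bonding PMD's real data
structures, and the real fix may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLAVES 4

/* Hypothetical, simplified model of a bonded device and its slaves. */
struct slave {
	int port_id;
	bool active;   /* present on the active slave list */
	bool started;
	void *rx_ring; /* in this model, only activated slaves own a ring */
};

struct bond {
	struct slave slaves[MAX_SLAVES];
	int slave_count;
};

/* Releases per-slave state; must only be called for an activated slave. */
static void
deactivate_slave(struct slave *s)
{
	assert(s->rx_ring != NULL); /* the reported crash was a NULL ring here */
	s->rx_ring = NULL;
	s->active = false;
}

/* Stop all slaves; deactivate only the active ones. Returns that count. */
static int
bond_stop(struct bond *b)
{
	int deactivated = 0;
	int i;

	for (i = 0; i < b->slave_count; i++) {
		struct slave *s = &b->slaves[i];

		if (s->active) {
			deactivate_slave(s);
			deactivated++;
		}
		s->started = false; /* stop regardless of active state */
	}
	return deactivated;
}
```

With this split, an application can treat every slave of a stopped bond
as stopped, without first asking which slaves were active.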
>
>> +        }
>>             if (rte_atomic16_cmpset(&(port->port_status),
>>               RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
>> diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
>> index e3995d24ab..ad3b4f875c 100644
>> --- a/app/test-pmd/testpmd.h
>> +++ b/app/test-pmd/testpmd.h
>> @@ -237,7 +237,8 @@ struct rte_port {
>>       struct rte_eth_txconf tx_conf[RTE_MAX_QUEUES_PER_PORT+1]; /**< 
>> per queue tx configuration */
>>       struct rte_ether_addr   *mc_addr_pool; /**< pool of multicast 
>> addrs */
>>       uint32_t                mc_addr_nb; /**< nb. of addr. in 
>> mc_addr_pool */
>> -    uint8_t                 slave_flag; /**< bonding slave port */
>> +    uint8_t                 slave_flag : 1, /**< bonding slave port */
>> +                bond_flag : 1; /**< port is bond device */
>
> Can't we detect if the port is a bonding port without introducing a new
> variable/state?
A bonding device is also an ethdev, and I could not find an external API
that can be used to detect whether a port is a bonding port.
>
>>       struct port_flow        *flow_list; /**< Associated flows. */
>>       struct port_indirect_action *actions_list;
>>       /**< Associated indirect actions. */
>
> .
