From mboxrd@z Thu Jan 1 00:00:00 1970 From: Adrien Mazarguil Subject: Re: [PATCH v3] net/mlx4: fix dev rmv not detected after port stop Date: Wed, 31 Jan 2018 11:43:52 +0100 Message-ID: <20180131104352.GT4256@6wind.com> References: <1516357009-15463-1-git-send-email-motih@mellanox.com> <1517214877-126768-1-git-send-email-motih@mellanox.com> <20180130093958.GE4256@6wind.com> <20180131091513.GS4256@6wind.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Shahaf Shuler , Mordechay Haimovsky , "dev@dpdk.org" , "stable@dpdk.org" To: Matan Azrad Return-path: Received: from mail-wr0-f195.google.com (mail-wr0-f195.google.com [209.85.128.195]) by dpdk.org (Postfix) with ESMTP id 53EF81B747 for ; Wed, 31 Jan 2018 11:44:06 +0100 (CET) Received: by mail-wr0-f195.google.com with SMTP id g21so14398230wrb.13 for ; Wed, 31 Jan 2018 02:44:06 -0800 (PST) Content-Disposition: inline In-Reply-To: List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Wed, Jan 31, 2018 at 10:08:06AM +0000, Matan Azrad wrote: > Hi all > > From: Adrien Mazarguil > > On Tue, Jan 30, 2018 at 08:37:06PM +0000, Shahaf Shuler wrote: > > > Tuesday, January 30, 2018 11:40 AM, Adrien Mazarguil: > > > > Unfortunately I didn't get a chance to review this patch before it was > > applied. > > > > I'm not sure a stopped port is supposed to report events > > > > (interrupts). Will applications expect them to occur at this point? > > > > > > Why not? > > > > > > Stopped port is still counted as attached. The fact the application stopped > > the packet receive on it doesn't mean it should not receive a sync events > > (such as the remove event). > > > async events, by definition, are not related to traffic being flows through > > the port. > > > > My comment is based on my understanding of rte_eth_dev_stop(), which is > > a device (or port) is completely stopped, in a suspended state and no > > interrupts shall occur, as a means for applications to temporarily not be > > bothered by them until restarted. > > Stopping traffic is not saying that the application is not interesting in the device, > I think that you mean to dev_close(). No, dev_close() releases resources and destroys configuration. Good luck using dev_start() or any other devop after dev_close(). > Any event may still be usable for application between dev_stop() to dev_start(), > especially RMV or LCS can still be interested. Possibly, but then, how come no PMD implements it that way? Neither did mlx4 before this patch got applied. At the very least it cannot be considered a "fix". > > Think about it that way: applications do not want to get interrupts > > immediately after the device is initialized, because they might not be ready > > to process them at this point. An explicit call to rte_eth_dev_start() tells the > > PMD when it's OK to do so. The converse is rte_eth_dev_stop(). > > So, they can delay the event registration to the time they interesting in the events. > And use event unregister when they are not interesting in it anymore. Of course you can ask application maintainers to adapt to the new behavior, or you know, leave things as they used to be. Setting up RMV/LSC callbacks is not the only configuration an application usually performs before calling dev_start(). Think about setting up flow rules, MAC addresses, VLANs, and so on, this on multiple ports before starting them up all at once. Previously it could be done in an unspecified order, now they have to take special care for RMV/LSC. Many devops are only safe when called while a device is stopped. It's even documented in rte_ethdev.h. > > Stopping traffic can already be achieved by not polling from the application > > side, calling rte_eth_dev_[rt]x_queue_stop() and/or toggling RX/TX > > interrupts through rte_eth_dev_[rt]x_intr_enable(). rte_eth_dev_stop() > > provides lower-level device control. > > I think it makes sense only for Rx interrupt which is traffic oriented(like stop and start). OK, looks like we disagree :) > > Perhaps documentation is not clear, however that's how LSC seems > > implemented in all PMDs; it gets disabled after rte_eth_dev_stop() and one > > should explicitly use rte_eth_link_get() to retrieve link status afterward. I > > think RMV should behave similarly with rte_eth_dev_is_removed(). > > Adapting fail-safe should be easier than modifying all the remaining PMDs. > > Or maybe PMDs which do it make mistakes. I'm not convinced mlx4 is the only PMD doing the right thing, we should ask other maintainers as well as application writers' opinion on the matter. If it's not a problem for RMV/LSC to occur while a device is stopped, then I'll agree with the approach. It still won't make it a fix regardless. > > > > In my opinion it's not a fix, as in, it doesn't address an issue > > > > introduced by the mentioned patch whose behavior was correct. > > > > > > > > It's probably too late to change it now and it does address an issue > > > > seen with a use case involving this PMD, however I think the > > > > fail-safe PMD could as well poll using the recently-added > > > > rte_eth_dev_is_removed() when it's aware the underlying port is > > stopped instead of expecting interrupts. -- Adrien Mazarguil 6WIND