From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ferruh Yigit Subject: Re: [PATCH v3 3/4] bonding: take queue spinlock in rx/tx burst functions Date: Wed, 15 Feb 2017 18:01:45 +0000 Message-ID: References: <1464280727-25752-2-git-send-email-bernard.iremonger@intel.com> <5839452.IzJ3v2K0cK@xps13> <8CEF83825BEC744B83065625E567D7C21A03A6D7@IRSMSX108.ger.corp.intel.com> <1632739.trvk2NaClS@xps13> Mime-Version: 1.0 Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 8bit Cc: DPDK To: Thomas Monjalon , Bernard Iremonger , Bruce Richardson , "Ananyev, Konstantin" , Declan Doherty Return-path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id B7F1898 for ; Wed, 15 Feb 2017 19:01:49 +0100 (CET) In-Reply-To: <1632739.trvk2NaClS@xps13> List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 6/16/2016 7:38 PM, thomas.monjalon at 6wind.com (Thomas Monjalon) wrote: > 2016-06-16 16:41, Iremonger, Bernard: >> Hi Thomas, >> >>> 2016-06-16 15:32, Bruce Richardson: >>>> On Mon, Jun 13, 2016 at 01:28:08PM +0100, Iremonger, Bernard wrote: >>>>>> Why does this particular PMD need spinlocks when doing RX and TX, >>>>>> while other device types do not? How is adding/removing devices >>>>>> from a bonded device different to other control operations that >>>>>> can be done on physical PMDs? Is this not similar to say bringing >>>>>> down or hotplugging out a physical port just before an RX or TX >>> operation takes place? >>>>>> For all other PMDs we rely on the app to synchronise control and >>>>>> data plane operation - why not here? >>>>>> >>>>>> /Bruce >>>>> >>>>> This issue arose during VM live migration testing. >>>>> For VM live migration it is necessary (while traffic is running) to be able to >>> remove a bonded slave device, stop it, close it and detach it. >>>>> It a slave device is removed from a bonded device while traffic is running >>> a segmentation fault may occur in the rx/tx burst function. The spinlock has >>> been added to prevent this occurring. >>>>> >>>>> The bonding device already uses a spinlock to synchronise between the >>> add and remove functionality and the slave_link_status_change_monitor >>> code. >>>>> >>>>> Previously testpmd did not allow, stop, close or detach of PMD while >>>>> traffic was running. Testpmd has been modified with the following >>>>> patchset >>>>> >>>>> http://dpdk.org/dev/patchwork/patch/13472/ >>>>> >>>>> It now allows stop, close and detach of a PMD provided in it is not >>> forwarding and is not a slave of bonded PMD. >>>>> >>>> I will admit to not being fully convinced, but if nobody else has any >>>> serious objections, and since this patch has been reviewed and acked, >>>> I'm ok to merge it in. I'll do so shortly. >>> >>> Please hold on. >>> Seeing locks introduced in the Rx/Tx path is an alert. >>> We clearly need a design document to explain where locks can be used and >>> what are the responsibility of the control plane. >>> If everybody agrees in this document that DPDK can have some locks in the >>> fast path, then OK to merge it. >>> >>> So I would say NACK for 16.07 and maybe postpone to 16.11. >> >> Looking at the documentation for the bonding PMD. >> >> http://dpdk.org/doc/guides/prog_guide/link_bonding_poll_mode_drv_lib.html >> >> In section 10.2 it states the following: >> >> Bonded devices support the dynamical addition and removal of slave devices using the rte_eth_bond_slave_add / rte_eth_bond_slave_remove APIs. >> >> If a slave device is added or removed while traffic is running, there is the possibility of a segmentation fault in the rx/tx burst functions. This is most likely to occur in the round robin bonding mode. >> >> This patch set fixes what appears to be a bug in the bonding PMD. > > It can be fixed by removing this statement in the doc. > > One of the design principle of DPDK is to avoid locks. > >> Performance measurements have been made with this patch set applied and without the patches applied using 64 byte packets. >> >> With the patches applied the following drop in performance was observed: >> >> % drop for fwd+io: 0.16% >> % drop for fwd+mac: 0.39% >> >> This patch set has been reviewed and ack'ed, so I think it should be applied in 16.07 > > I understand your point of view and I gave mine. > Now we need more opinions from others. > Hi, These patches are sitting in the patchwork for a long time. Discussion never concluded and patches kept deferred each release. I think we should give a decision about them: 1- We can merge them in this release, they are fixing a valid problem, and patches are already acked. 2- We can reject them, if not having them for more than six months not caused a problem, perhaps they are not really that required. And if somebody needs them in the future, we can resurrect them from patchwork. I vote for option 2, any comments? Thanks, ferruh