From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753610AbcLHUcg (ORCPT ); Thu, 8 Dec 2016 15:32:36 -0500 Received: from mout.gmx.net ([212.227.17.22]:54848 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752035AbcLHUce (ORCPT ); Thu, 8 Dec 2016 15:32:34 -0500 Subject: Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock To: Francois Romieu References: <1481141138-19466-1-git-send-email-LinoSanfilippo@gmx.de> <1481141138-19466-2-git-send-email-LinoSanfilippo@gmx.de> <20161207231534.GB5889@electric-eye.fr.zoreil.com> Cc: bh74.an@samsung.com, ks.giri@samsung.com, vipul.pandya@samsung.com, peppe.cavallaro@st.com, alexandre.torgue@st.com, pavel@ucw.cz, davem@davemloft.net, linux-kernel@vger.kernel.org, netdev@vger.kernel.org From: Lino Sanfilippo Message-ID: <051e3043-8b58-0591-36e3-99e2267f67f4@gmx.de> Date: Thu, 8 Dec 2016 21:32:12 +0100 User-Agent: Mozilla/5.0 (X11; Linux i686; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: <20161207231534.GB5889@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:KYXgkZ5QzBpAc1n+41onkWlBWCjmtCuW3Ia0fA/kRSclET8A0GW kKqn6ycO5UHOaNDKHEcjzaPI5bqNMCgTcR7+YSzNoFmzhr3aj9ptzEvmhkCG/lSRhNh20Vm yglV+20ew07dGfLPYoX6sn6AG9xcF2x2lcB9b9bEElss7vkYXlGEPjwNuhWkEOkdEWIDNdd 8sDf1q2G3WbtpgOPeew6A== X-UI-Out-Filterresults: notjunk:1;V01:K0:VbcMdgjOKr0=:TW6Vp64Gde3w9CbJKzbOHO a3ua+sEXt/NeizknrJZhA03u3InIf2bU2QBtNQbbXtVdlUj0TM7qUIClBKlSYCfUCyMJadJjb aw0gxXYbduW9szaUui2q9c2MO7VjGS67H2oGvxIAFLBH4+kxy0L5KE4EmDUje8kJuVC2H3VOr R2PRl6MggJNUNM4QzHmjSM1VnZR+05OvAtSZbGlFmtVsFH8mVLFbxG+NYdfqAhqasUZboSHkq FyM94Av3Kr5p+bgDPFjxZP9v8u6k0m+DPHvunKpzf9qwLZlQbxGSfb4o9UrI3nGdSbIJOv56E q5pN4QrgxyK0yQ0dRFRFMnLt41iw+jJ3BBbuQ6CNUo89Al/ZoXCI8VoxLCMrIw6vB/vp1NKPL jr7c/5AKZiXTBcAm5EaoCdHxc+BLV4rHQ+Z/MJOzr+pcl/AvKIkjLbomriCA4asOxKoBvV4WF 2EXBs1vOWfRsKeoSv652efFcF4bXuzCwQDHrvdAIBFrkb9Fbf0lfLevdE7Et9jdZDEzkwZboP c2YnIYkPwaT88Rvi8wD3Yl+13pUfOUFxlszGxtL1k1oURqzxBrp/ll9VW6vUtNPXfIw4AJasW RQoYMyTkXv7cmEY4Oe91O7Psjq8vrLmo7PU0z4GMdWPFs+YDctOdd5VITrqCPcelfzmN3HqAt WwMZ4k308FXLbMY7WIbbEHZMHF16/Adavywbgsngk4HM/nOep/Lg147YuMXE0FYtmsOAYrums G8Qct576BgQP5lNxpmFXL9plknC09/pTxIjay9Vb1eKRMNQBLb0/N2M08ljSFViI9V8IbuFUV vge/Ods Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On 08.12.2016 00:15, Francois Romieu wrote: > Lino Sanfilippo : >> The driver uses a private lock for synchronization between the xmit >> function and the xmit completion handler, but since the NETIF_F_LLTX flag >> is not set, the xmit function is also called with the xmit_lock held. >> >> On the other hand the xmit completion handler first takes the private lock >> and (in case that the tx queue has been stopped) the xmit_lock, leading >> to a reverse locking order and the potential danger of a deadlock. > > netif_tx_stop_queue is used by: > 1. xmit function before releasing lock and returning. > 2. sxgbe_restart_tx_queue() > <- sxgbe_tx_interrupt > <- sxgbe_reset_all_tx_queues() > <- sxgbe_tx_timeout() > > Given xmit won't be called again until tx queue is enabled, it's not clear > how a deadlock could happen due to #1. > After spending more thoughts on this I tend to agree with you. Yes, we have the different locking order for the xmit_lock and the private lock in two concurrent threads. And one of the first things one learns about locking is that this is a good way to create a deadlock sooner or later. But in our case the deadlock can only occur if the xmit function and the tx completion handler perceive different states for the tx queue, or to be more specific: the completion handler sees the tx queue in state "stopped" while the xmit handler sees it in state "running" at the same time. Only then both functions would try to take both locks, which could lead to a deadlock. OTOH Pavel said that he actually could produce a deadlock. Now I wonder if this is caused by that locking scheme (in a way I have not figured out yet) or if it is a different issue. Regards, Lino