From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752088AbcLJC0I (ORCPT ); Fri, 9 Dec 2016 21:26:08 -0500 Received: from mout.gmx.net ([212.227.15.18]:54163 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751367AbcLJC0H (ORCPT ); Fri, 9 Dec 2016 21:26:07 -0500 Subject: Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock To: Pavel Machek , Francois Romieu References: <1481141138-19466-1-git-send-email-LinoSanfilippo@gmx.de> <1481141138-19466-2-git-send-email-LinoSanfilippo@gmx.de> <20161207231534.GB5889@electric-eye.fr.zoreil.com> <051e3043-8b58-0591-36e3-99e2267f67f4@gmx.de> <20161208231943.GA13102@electric-eye.fr.zoreil.com> <20161209112142.GA22710@amd> Cc: bh74.an@samsung.com, ks.giri@samsung.com, vipul.pandya@samsung.com, peppe.cavallaro@st.com, alexandre.torgue@st.com, davem@davemloft.net, linux-kernel@vger.kernel.org, netdev@vger.kernel.org From: Lino Sanfilippo Message-ID: Date: Sat, 10 Dec 2016 03:25:47 +0100 User-Agent: Mozilla/5.0 (X11; Linux i686; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: <20161209112142.GA22710@amd> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit X-Provags-ID: V03:K0:fl0tBSunTn2ujCXSmAGVNVE5iiPN24x2dYdsbe8IeoIZGOIzJFN vIzJld6ZCSzqq9+3KgM4Y9Mm8Dxwy4FIgrOcoXWc2cs49EAheK3SjID/GHtLdiDpHC5n+S6 N/1dxfm0IioAO/t7XXrqJs33TQLccTdxgY31wEgoJtZlSgIe6Z6jB7R9cFl47bErtm7BfMB YgzkZX8FgtvBR0BFrNF+w== X-UI-Out-Filterresults: notjunk:1;V01:K0:WiGRpdzQuBs=:dTLW3yXQ8WFDVYhq8M3gFm 93P8gE3QtJz5m4lG5X9ccMETPURszHcgaLU4d2LsGPDRgJAVQCShGgvR0V1+RuCiUiZ12DwCy zzZR/lFaHRQ1ni3Ery3QyZ3IfMpzGDSO++D4r7QWEDSUQtLb75QvjmSYZpr9V0XshTyGR0t/d O7lcyjcPHP1M8b7AijZXE5iuU8TQogfYR6kgPDounwIP8TLPMIftz6HuVVn/JQSy/9MSL1rsl xp4mLP85jhAgS0iQlxJpjXreLSsPIInnS9rgykb8pjpTBa+aeF3ZskjJJLxAZw2Q6B21NAIJy zsMYtvx39QRgLqqH2IXFVKwfZ3ydT9ueCyc+JOi+DIwJy3ANmHVliV8JhFOb92uVqKw9GlXn7 Kgt2WdRxmxAPcoMzNnoqGACOE+fZDnGAf7CFfkodcPAYBouuZ9qXnxuqYWoThGHMBW+2qrCPI QYO7NhAzornGvTkWQ7COAf5SkIsuCrd3wPear6WbZ+DODfkpOmNk24O4OjqEfzriSYnOwRWyc HvPiAxUeyRDL+JZqzYui8Nhu+LuAehFJ6jGwQ76lgLOWa8lHereJVgEYk+m9hK+Vlzckvi6ih DGlv3BvjUoM/l2Cjxt7C7uMM933VjK74peAgwaH0ylIdOq4YJETpAQRNFnD7S7/ImpG7Ykdah qdpnCL1LGgJiwU1qJTptzH1TQdZPDK0uGV9oJOARF4cOBfSdzytFhIPkPex34ffQKr7mJ2H0N u/FkIn5vA65OW512s87YXDnfBiY1YOL+3XunHmsnMeFro/7BPP5wHII36Kmm2712FweSxYJ9d EC0kwWt Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On 09.12.2016 12:21, Pavel Machek wrote: > On Fri 2016-12-09 00:19:43, Francois Romieu wrote: >> Lino Sanfilippo : >> [...] >> > OTOH Pavel said that he actually could produce a deadlock. Now I wonder if >> > this is caused by that locking scheme (in a way I have not figured out yet) >> > or if it is a different issue. >> >> stmmac_tx_err races with stmmac_xmit. > > Umm, yes, that looks real. > > And that means that removing tx_lock will not be completely trivial > :-(. Lino, any ideas there? > Ok, the race is there but it looks like a problem that is not related to the use or removal of the private lock. By a glimpse into other drivers (e.g sky2 or e1000), a possible way to handle a tx error is to start a separate task and restart the tx path in that task instead the irq handler (or timer in case of the watchdog). In that task we could do: 1. deactivate napi 2. deactivate irqs 3. wait for running napi/irqs do complete (_sync) 4. call stmmac_tx_err() 5. reenable napi 6. reenable irqs We have to ensure that no xmit() is executing while stmmac_tx_err() does the cleanup, so stmmac_tx_err() should IMO rather call netif_tx_disable() instead of netif_stop_queue() (the former grabs the xmit lock before it sets __QUEUE_STATE_DRV_XOFF to disable the queue). Regards, Lino