From mboxrd@z Thu Jan 1 00:00:00 1970
From: Josef Bacik
Subject: Re: Soft lockup in inet_put_port on 4.6
Date: Tue, 20 Dec 2016 03:40:56 +0000
Message-ID: <9DF94C8E-1463-4C10-81E3-E6F4534097CB@fb.com>
References: <1481928610.17731.0@smtp.office365.com> <286A21B1-2A15-4DDF-B334-A016DA3D52EA@fb.com> <20161219.205646.1955469060856026212.davem@davemloft.net> ,<1482201702.1521.13.camel@edumazet-glaptop3.roam.corp.google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Cc: Tom Herbert, David Miller, Hannes Frederic Sowa, Craig Gallek, Linux Kernel Network Developers
To: Eric Dumazet
Return-path:
Received: from mx0a-00082601.pphosted.com ([67.231.145.42]:41959 "EHLO mx0a-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752111AbcLTDlE (ORCPT ); Mon, 19 Dec 2016 22:41:04 -0500
In-Reply-To: <1482201702.1521.13.camel@edumazet-glaptop3.roam.corp.google.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

> On Dec 19, 2016, at 9:42 PM, Eric Dumazet wrote:
>
>> On Mon, 2016-12-19 at 18:07 -0800, Tom Herbert wrote:
>>
>> When sockets created SO_REUSEPORT move to TW state they are placed
>> back on the the tb->owners. fastreuse port is no longer set so we have
>> to walk potential long list of sockets in tb->owners to open a new
>> listener socket. I imagine this is happens when we try to open a new
>> listener SO_REUSEPORT after the system has been running a while and so
>> we hit the long tb->owners list.
>
> Hmm... __inet_twsk_hashdance() does not change tb->fastreuse
>
> So where tb->fastreuse is cleared ?
>
> If all your sockets have SO_REUSEPORT set, this should not happen.
>

The app starts out with no SO_REUSEPORT, and then we restart it with that
option enabled. What I suspect is we have all the twsks from the original
service, and the fastreuse stuff is cleared.
My naive patch resets it once we add a reuseport sk to the tb and that
makes the problem go away. I'm reworking all of this logic and adding some
extra info to the tb to make the reset actually safe. I'll send those
patches out tomorrow. Thanks,

Josef
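
For readers following the thread, here is a rough userspace sketch of the
behaviour discussed above. It is a toy model, not kernel code and not the
patch itself: every name in it (toy_sock, toy_bucket, toy_bind_ok,
toy_add_owner, toy_demo) is hypothetical, the conflict rule is deliberately
simplified (TIME_WAIT owners never conflict), and the socket counts are
arbitrary. It only shows why a cleared fast-reuse flag forces every new
SO_REUSEPORT listener to walk the whole owners list, and how resetting the
flag when a reuseport socket is added avoids the repeated walks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a socket and a bind bucket; NOT the kernel structures. */
struct toy_sock {
    bool reuseport;            /* bound with SO_REUSEPORT */
    bool timewait;             /* socket is in TIME_WAIT */
    struct toy_sock *next;
};

struct toy_bucket {
    bool fastreuseport;        /* set: a reuseport bind may skip the walk */
    struct toy_sock *owners;   /* all sockets on this port */
    size_t walked;             /* owners visited by the slow path */
};

/* Returns true if sk may bind to the bucket (simplified conflict rule). */
static bool toy_bind_ok(struct toy_bucket *tb, const struct toy_sock *sk)
{
    if (tb->owners == NULL)
        return true;
    if (tb->fastreuseport && sk->reuseport)
        return true;           /* fast path: no list walk */

    for (const struct toy_sock *o = tb->owners; o != NULL; o = o->next) {
        tb->walked++;
        /* Assumption: only live owners can conflict, and only when
         * the two sockets do not both have reuseport set. */
        if (!o->timewait && !(o->reuseport && sk->reuseport))
            return false;
    }
    return true;
}

/* Adds sk to the bucket. With naive_reset, adding a reuseport socket turns
 * the flag back on. The email notes that the real reset needs extra bucket
 * state to be safe; this toy ignores that. */
static void toy_add_owner(struct toy_bucket *tb, struct toy_sock *sk,
                          bool naive_reset)
{
    if (tb->owners == NULL)
        tb->fastreuseport = sk->reuseport;
    else if (!sk->reuseport)
        tb->fastreuseport = false;
    else if (naive_reset)
        tb->fastreuseport = true;

    sk->next = tb->owners;
    tb->owners = sk;
}

#define TOY_TW        1000     /* leftover TIME_WAIT sockets, no reuseport */
#define TOY_LISTENERS 10       /* new SO_REUSEPORT listeners after restart */

/* Replays the scenario and returns how many owners the slow path visited. */
size_t toy_demo(bool naive_reset)
{
    static struct toy_sock socks[TOY_TW + TOY_LISTENERS];
    struct toy_bucket tb = { .fastreuseport = false, .owners = NULL,
                             .walked = 0 };
    size_t i;

    for (i = 0; i < TOY_TW; i++) {
        socks[i] = (struct toy_sock){ .reuseport = false, .timewait = true };
        toy_add_owner(&tb, &socks[i], naive_reset);
    }
    for (; i < TOY_TW + TOY_LISTENERS; i++) {
        socks[i] = (struct toy_sock){ .reuseport = true, .timewait = false };
        if (toy_bind_ok(&tb, &socks[i]))
            toy_add_owner(&tb, &socks[i], naive_reset);
    }
    return tb.walked;
}
```

In this model, toy_demo(false) walks the full list once per new listener,
while toy_demo(true) walks it only for the first one; after that the flag
is set again and later binds take the fast path.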