From mboxrd@z Thu Jan  1 00:00:00 1970
From: ebiederm@xmission.com (Eric W. Biederman)
To: Dave Jones
Cc: Vegard Nossum, trinity@vger.kernel.org, Thomas Gleixner, Tejun Heo,
	LKML, Bjorn Helgaas, Russell King
References: <5790C376.2010206@oracle.com>
	<20160721131346.GA1705@codemonkey.org.uk>
Date: Sat, 30 Jul 2016 07:58:55 -0500
In-Reply-To: <20160721131346.GA1705@codemonkey.org.uk> (Dave Jones's message
	of "Thu, 21 Jul 2016 09:13:46 -0400")
Message-ID: <87fuqrcmeo.fsf@x220.int.ebiederm.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: cleanup_net()/net_mutex hung tasks + kobject release debugging
X-Mailing-List: linux-kernel@vger.kernel.org

Dave Jones writes:

> On Thu, Jul 21, 2016 at 02:43:34PM +0200, Vegard Nossum wrote:
>
> > The rules for net_mutex are very simple, it's used in very few places so
> > I don't see how the locking could get messed up there. I'll buy your
> > theory that the lock is held for a long time if there are a lot of
> > namespaces to iterate over. I decided to time it myself and it seems
> > that cleanup_net() can hold the mutex for 30-40 seconds at a time, which
> > is surely wrong.
> >
> > so on a hunch I disabled DEBUG_KOBJECT_RELEASE, and that does indeed
> > solve the problem -- cleanup_net() still holds the mutex for fairly
> > long, but only up to max ~5 seconds at a time as opposed to 30-40.
>
> Yeah, I never ran with that option enabled (it used to cause my testbox
> to not boot, and I never got around to debugging why). I thought five
> seconds was painful enough. I guess we have different thresholds for
> acceptable behaviour here :-)
>
> Could be one of the other debug options I had enabled exacerbates the
> cleanup_net problem in a similar way though.
>
> > There's maybe a case for cleanup_net() to release the mutex every now
> > and again during cleanup, but I was also seeing a few other hung tasks
> > unrelated to net_mutex when I disabled the unshare() system call in
> > trinity, which makes me wonder if we need a more general solution.
>
> Not sure. We may have to just look at these on a case by case basis.

The best you can easily do in cleanup_net with net_mutex is to reduce
the number of net namespaces you free at once.  Which sounds attractive
except that, last I looked, most of the time was spent in
synchronize_rcu.  Because the namespaces can share those
synchronize_rcu calls, cleaning up a bunch of network namespaces all at
once is actually a pretty big optimization in terms of system
performance.

Though if someone wants to dig in and point out non-shared
synchronize_rcu calls or other obvious sillies happening in cleanup_net,
I will be happy to see what we can do.

Eric