From: Chris Mason
To: Linus Torvalds
CC: Dâniel Fraga, Dave Jones, Sasha Levin, "Paul E. McKenney",
    Linux Kernel Mailing List
Date: Mon, 1 Dec 2014 18:08:38 -0500
Subject: Re: frequent lockups in 3.18rc4
Message-ID: <20141201230339.GA20487@ret.masoncoding.com>
References: <20141127225637.GA24019@redhat.com>
    <547b8a45.6e608c0a.20f9.1002@mx.google.com>
    <547bbe36.48548c0a.105c.779c@mx.google.com>
    <20141201191431.GA17385@linux.vnet.ibm.com>
    <547ccf74.a5198c0a.25de.26d9@mx.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

I'm not sure if this is related, but running trinity here, I noticed it
was stuck at 100% system time on every CPU. perf report tells me we are
spending all of our time in spin_lock under the sync system call.

I think it's coming from contention in the bdi_queue_work() call from
inside sync_inodes_sb(), which is spin_lock_bh().

I wonder if we're just spinning so hard on this one bh lock that we're
starving the watchdog?

Dave, do you have spinlock debugging on?

-chris