Date: Wed, 25 Sep 2019 10:35:57 +0200
From: Peter Zijlstra
To: Dave Chinner
Cc: Waiman Long, Ingo Molnar, Will Deacon, Alexander Viro, Mike Kravetz, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Davidlohr Bueso
Subject: Re: [PATCH 0/5] hugetlbfs: Disable PMD sharing for large systems
Message-ID: <20190925083557.GA4553@hirez.programming.kicks-ass.net>
References: <20190911150537.19527-1-longman@redhat.com> <20190913015043.GF27547@dread.disaster.area>
In-Reply-To: <20190913015043.GF27547@dread.disaster.area>

On Fri, Sep 13, 2019 at 11:50:43AM +1000, Dave Chinner wrote:
> On Wed, Sep 11, 2019 at 04:05:32PM +0100, Waiman Long wrote:
> > A customer with large SMP systems (up to 16 sockets) with application
> > that uses large amount of static hugepages (~500-1500GB) are experiencing
> > random multisecond delays. These delays was caused by the long time it
> > took to scan the VMA interval tree with mmap_sem held.
> >
> > To fix this problem while perserving existing behavior as much as
> > possible, we need to allow timeout in down_write() and disabling PMD
> > sharing when it is taking too long to do so. Since a transaction can
> > involving touching multiple huge pages, timing out for each of the huge
> > page interactions does not completely solve the problem. So a threshold
> > is set to completely disable PMD sharing if too many timeouts happen.
> >
> > The first 4 patches of this 5-patch series adds a new
> > down_write_timedlock() API which accepts a timeout argument and return
> > true is locking is successful or false otherwise. It works more or less
> > than a down_write_trylock() but the calling thread may sleep.
>
> Just on general principle, this is a non-starter. If a lock is being
> held too long, then whatever the lock is protecting needs fixing.
> Adding timeouts to locks and sysctls to tune them is not a viable
> solution to address latencies caused by algorithm scalability
> issues.

I'm very much agreeing here.
Lock functions with timeouts are a sign of horrific design.
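
For anyone skimming the thread, the pattern being objected to can be sketched roughly in userspace. This is only an analogy in Python, not the kernel patch: the class name, its parameters, and the "disable after N timeouts" rule are hypothetical, and `threading.Lock.acquire(timeout=...)` merely stands in for the proposed down_write_timedlock().

```python
import threading


class TimedSharingGuard:
    """Hypothetical userspace analogy of the proposed pattern (not kernel code)."""

    def __init__(self, timeout_s, max_timeouts):
        self._lock = threading.Lock()        # stands in for the rwsem taken for write
        self._stats_lock = threading.Lock()  # protects the counters below
        self.timeout_s = timeout_s           # assumed tunable, like the proposed sysctl
        self.max_timeouts = max_timeouts     # assumed threshold for giving up on sharing
        self.timeouts = 0
        self.sharing_enabled = True

    def try_share(self, do_share):
        """Run do_share() under the lock; return False if the caller must fall back."""
        with self._stats_lock:
            if not self.sharing_enabled:
                return False
        # Like a trylock, except the caller may sleep for up to timeout_s.
        if not self._lock.acquire(timeout=self.timeout_s):
            with self._stats_lock:
                self.timeouts += 1
                if self.timeouts >= self.max_timeouts:
                    # Too many timeouts: stop attempting the shared path at all.
                    self.sharing_enabled = False
            return False
        try:
            do_share()
            return True
        finally:
            self._lock.release()
```

Note that the timeout only bounds how long a caller waits. Whatever holds the lock for seconds at a time is left exactly as slow as before, which is the objection raised above.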