Date: Wed, 12 Aug 2015 21:00:27 -0700
From: Andrew Morton
To: Linus Torvalds
Cc: Joonsoo Kim, Al Viro, Linux Kernel Mailing List
Subject: Re: get_vmalloc_info() and /proc/meminfo insanely expensive
Message-Id: <20150812210027.88dfcf90.akpm@linux-foundation.org>

On Wed, 12 Aug 2015 20:29:34 -0700 Linus Torvalds wrote:

> I just did some profiling of a simple "make test" in the git repo, and
> was surprised by the top kernel offender: get_vmalloc_info() showed up
> at roughly 4% cpu use.
>
> It turns out that bash ends up reading /proc/meminfo on every single
> activation, and "make test" is basically just running a huge
> collection of shell scripts. You can verify by just doing
>
>     strace -o trace sh -c "echo"
>
> to see what bash does on your system. I suspect it's actually glibc,
> because a quick google finds the function "get_phys_pages()" that just
> looks at the "MemTotal" line (or possibly get_avphys_pages(), which
> looks at the "MemFree" line). And bash surely isn't interested in
> vmalloc stats.

Putting all these things in the same file wasn't the smartest thing
we've ever done.

> Ok, so bash is insane for caring so deeply that it does this
> regardless of anything else. But what else is new - user space does
> odd things. It's like a truism.
>
> My gut feel for this is that we should just rate-limit this and cache
> the vmalloc information for a fraction of a second or something. Maybe
> we could expose total memory sizes in some more efficient format, but
> it's not like existing binaries will magically de-crapify themselves,
> so just speeding up meminfo sounds like a good thing.
>
> Maybe we could even cache the whole seqfile buffer - Al? How painful
> would something like that be? Although from the profiles, it's really
> just the vmalloc info gathering that shows up as actually wasting CPU
> cycles..

Do your /proc/meminfo vmalloc numbers actually change during that
build? Mine don't.

Perhaps we can cache the most recent vmalloc_info and invalidate that
cache whenever someone does a vmalloc/vfree/etc.
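
Something along the lines of the below, perhaps. Wild, untested
sketch: calc_vmalloc_info() here stands in for the current
list-walking guts of get_vmalloc_info(), vmap_info_dirty() is a
made-up hook which the vmalloc/vfree/vmap paths would need to call,
and the real locking in mm/vmalloc.c may want to be something
different.

static atomic_t vmap_info_gen = ATOMIC_INIT(1);
static atomic_t vmap_info_cache_gen;
static struct vmalloc_info vmap_info_cache;
static DEFINE_SPINLOCK(vmap_info_lock);

/* Call from the vmalloc/vfree/vmap paths to invalidate the cache */
static inline void vmap_info_dirty(void)
{
	atomic_inc(&vmap_info_gen);
}

void get_vmalloc_info(struct vmalloc_info *vmi)
{
	int gen = atomic_read(&vmap_info_gen);

	/*
	 * Fast path: no vmalloc/vfree activity since the last walk,
	 * so the cached numbers are still good.
	 */
	if (atomic_read(&vmap_info_cache_gen) == gen) {
		spin_lock(&vmap_info_lock);
		*vmi = vmap_info_cache;
		spin_unlock(&vmap_info_lock);
		return;
	}

	calc_vmalloc_info(vmi);		/* the expensive walk */

	spin_lock(&vmap_info_lock);
	vmap_info_cache = *vmi;
	/*
	 * Use the generation sampled before the walk, so a
	 * vmalloc/vfree which raced with the walk forces a recompute
	 * on the next read rather than going unnoticed.
	 */
	atomic_set(&vmap_info_cache_gen, gen);
	spin_unlock(&vmap_info_lock);
}

That way a workload which never touches vmalloc pays for the walk once,
and every /proc/meminfo read after that is just a locked copy of a
small struct.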
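
(Off topic, but the userspace side is easy to reproduce if anyone
wants to play with it. Assuming current (2015-era) glibc behaviour,
where the get_phys_pages()/get_avphys_pages() GNU extensions from
<sys/sysinfo.h> parse /proc/meminfo on each call:

/*
 * Each call below should make glibc open and parse /proc/meminfo -
 * run under strace to watch the opens, and under perf to see
 * get_vmalloc_info() show up on the kernel side.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/sysinfo.h>

int main(void)
{
	printf("phys pages:   %ld\n", get_phys_pages());
	printf("avphys pages: %ld\n", get_avphys_pages());
	return 0;
}

which is presumably more or less what bash ends up doing at every
startup.)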