From: Miklos Szeredi
Date: Wed, 1 Apr 2020 10:18:14 +0200
Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]
To: Ian Kent
Cc: David Howells, Linus Torvalds, Al Viro, Linux NFS list,
    Andreas Dilger, Anna Schumaker, "Theodore Ts'o", Linux API,
    linux-ext4@vger.kernel.org, Trond Myklebust, Miklos Szeredi,
    Christian Brauner, Jann Horn, "Darrick J. Wong", Karel Zak,
    Jeff Layton, linux-fsdevel@vger.kernel.org, LSM,
    linux-kernel@vger.kernel.org
References: <158454408854.2864823.5910520544515668590.stgit@warthog.procyon.org.uk>
    <50caf93782ba1d66bd6acf098fb8dcb0ecc98610.camel@themaw.net>
In-Reply-To: <50caf93782ba1d66bd6acf098fb8dcb0ecc98610.camel@themaw.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 1, 2020 at 7:22 AM Ian Kent wrote:
>
> On Wed, 2020-03-18 at 17:05 +0100, Miklos Szeredi wrote:
> > On Wed, Mar 18, 2020 at 4:08 PM David Howells wrote:
> > >
> > > ============================
> > > WHY NOT USE PROCFS OR SYSFS?
> > > ============================
> > >
> > > Why is it better to go with a new system call rather than adding
> > > more magic stuff to /proc or /sysfs for each superblock object and
> > > each mount object?
> > >
> > >  (1) It can be targeted.  It makes it easy to query directly by
> > >      path.  procfs and sysfs cannot do this easily.
> > >
> > >  (2) It's more efficient as we can return specific binary data
> > >      rather than making huge text dumps.  Granted, sysfs and procfs
> > >      could present the same data, though as lots of little files
> > >      which have to be individually opened, read, closed and parsed.
> >
> > Asked this a number of times, but you haven't answered yet: what
> > application would require such a high efficiency?
>
> Umm ... systemd and udisks2 and about 4 others.
>
> A problem I've had with autofs for years is using autofs direct mount
> maps of any appreciable size cause several key user space applications
> to consume all available CPU while autofs is starting or stopping which
> takes a fair while with a very large mount table. I saw a couple of
> applications affected purely because of the large mount table but not
> as badly as starting or stopping autofs.
>
> Maps of 5,000 to 10,000 map entries can almost be handled, not uncommon
> for heavy autofs users in spite of the problem, but much larger than
> that and you've got a serious problem.
>
> There are problems with expiration as well but that's more an autofs
> problem that I need to fix.
>
> To be clear it's not autofs that needs the improvement (I need to
> deal with this in autofs itself) it's the effect that these large
> mount tables have on the rest of the user space and that's quite
> significant.

According to dhowells's measurements, processing 100k mounts would take
a few seconds of system time (that's the time spent by the kernel to
retrieve the data; the userspace processing would obviously add to
that, but that's independent of the kernel patchset).

I think that sort of time spent by the kernel is entirely reasonable
and is probably not worth heavy optimization, since userspace is
probably going to spend as much, if not more, time on each mount entry.

> I can't even think about resolving my autofs problem until this
> problem is resolved and handling very large numbers of mounts
> as efficiently as possible must be part of that solution for me
> and I think for the OS overall too.

The key to that is allowing userspace to retrieve individual mount
entries instead of having to parse the complete mount table on every
change.

Thanks,
Miklos
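
For illustration only (this is not from the patchset or from anyone in
the thread): a minimal Python sketch of what "parse the complete mount
table" means for userspace today. To learn about one mount, a program
has to walk every line of the mountinfo text. The names MountEntry,
parse_mountinfo_line and find_mount, and the SAMPLE data, are
hypothetical; the field layout is assumed to follow the
/proc/[pid]/mountinfo description in proc(5).

```python
from typing import NamedTuple, Optional


class MountEntry(NamedTuple):
    mount_id: int
    parent_id: int
    mount_point: str
    fs_type: str
    source: str


def parse_mountinfo_line(line: str) -> MountEntry:
    # Layout assumed from proc(5): ID, parent ID, major:minor, root,
    # mount point, options, zero or more optional fields, then a "-"
    # separator followed by fs type, mount source and super options.
    left, right = line.split(" - ", 1)
    pre = left.split()
    post = right.split()
    return MountEntry(int(pre[0]), int(pre[1]), pre[4], post[0], post[1])


def find_mount(table_text: str, mount_point: str) -> Optional[MountEntry]:
    # Linear scan over the whole table: the cost grows with the total
    # number of mounts, even though only one entry is wanted.
    found = None
    for line in table_text.splitlines():
        if not line.strip():
            continue
        entry = parse_mountinfo_line(line)
        if entry.mount_point == mount_point:
            found = entry  # keep the last match (the topmost mount)
    return found


# Hypothetical sample data in mountinfo format.
SAMPLE = """\
22 1 8:1 / / rw,relatime shared:1 - ext4 /dev/sda1 rw
36 22 0:32 / /mnt/a rw,relatime shared:20 - autofs /etc/auto.direct rw,fd=5
37 22 0:33 / /mnt/b rw,relatime - autofs /etc/auto.direct rw,fd=5
"""
```

On a Linux system the same function could be fed the contents of
/proc/self/mountinfo. With tens of thousands of autofs entries, every
change notification repeats this scan in each interested process, which
is the userspace cost that retrieving individual mount entries would
avoid.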