From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.4 required=3.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7704BC433E1 for ; Wed, 8 Jul 2020 11:15:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5497420772 for ; Wed, 8 Jul 2020 11:15:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="oR6mc7oX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728782AbgGHLPP (ORCPT ); Wed, 8 Jul 2020 07:15:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725972AbgGHLPO (ORCPT ); Wed, 8 Jul 2020 07:15:14 -0400 Received: from mail-qt1-x842.google.com (mail-qt1-x842.google.com [IPv6:2607:f8b0:4864:20::842]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7961CC08C5DC for ; Wed, 8 Jul 2020 04:15:13 -0700 (PDT) Received: by mail-qt1-x842.google.com with SMTP id w34so14055101qte.1 for ; Wed, 08 Jul 2020 04:15:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=mi6TL9T9iKfT5X0caZoWFf4L6ARlFDMGl6M6HUbAyGQ=; b=oR6mc7oXySioKIzpAZZS+JHd5d+4ZTUEYZi/yss9U0fkOuZo40/tparovX9Ueoos6v kOL2BlB3OVJnT9TYSGkFIkw1wRcR2b4G+Qs/l/o/QjiKD37PwGsiOQFBO6ee0cMMX6EN e0/BF/D9gdAKMXV08vSlx6b01PpCxrKoANdz6XjdNITkozPTjWilO8xoDBxeFiwyOzZS ioVwvbF16lpoIb4GVmeBEcB6B9q+SZRM9XUq1+Rgf5dvl7XdRD9RYz2eHw0wu/O9/zsB OtmSn++vBAbqSlmhKfRp1UGX5z9i3HU0Q2ac/5TVOq0bYpiQs8RUhhyLdlTY9mZNMJtC /dvA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=mi6TL9T9iKfT5X0caZoWFf4L6ARlFDMGl6M6HUbAyGQ=; b=rbEOgYLIflmF8fGHxUxg+lrJWZw1L7xoo3bQ5nT6KzgKTPxCyZ8llYzTbZPID9ylGt bVTe0iExEFdseDIKnWZKrhgUsHK7V9yAilptBI87XxNe1nGchdoRST8ZVpoERdHHlhrD 64i4uSJFjr61ZS0WjC73xcA6Jy4BWtnFEX9M9lmNdV2gj/7E+BRKfik9Deh0lhTzJN5x 00dMj0gq23INKQSz1gJCbLDyCcf0V5wppMbBCKKOaquT0rq/Vq0Tpdw/hhquWEmhKrYj lA3EAHAwPd5Itfter4kw6TS3KHn+35qRJlsfxPY+T0EOWDjElXlE4ci6e75DCLNx3v4q cRMg== X-Gm-Message-State: AOAM533qKXn6aNrlM+4cqErZc4VXmRt+Sx3veNa0pJyMnd896ZX8A40Z ejRi3uceWvPFXZDJpPGGXYfNR4tD5yGxdq3cvwIrgg== X-Google-Smtp-Source: ABdhPJz0cuf2U1RcCfJfUPmZLeauOtF48OOTeofNCW7eLZV45C29Pz+WifPZjuIr9BN+3xKxRdS7nyJmcjvANfOCiIA= X-Received: by 2002:ac8:4c88:: with SMTP id j8mr57010251qtv.57.1594206912372; Wed, 08 Jul 2020 04:15:12 -0700 (PDT) MIME-Version: 1.0 References: <000000000000a47ace05a9c7b825@google.com> <20200707152411.GD25069@quack2.suse.cz> <20200707181710.GD32331@gaia> <20200708110814.GA6308@gaia> In-Reply-To: <20200708110814.GA6308@gaia> From: Dmitry Vyukov Date: Wed, 8 Jul 2020 13:15:00 +0200 Message-ID: Subject: Re: memory leak in inotify_update_watch To: Catalin Marinas Cc: Jan Kara , syzbot , Amir Goldstein , linux-fsdevel , LKML , syzkaller-bugs , syzkaller Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jul 8, 2020 at 1:08 PM Catalin Marinas wrote: > > On Wed, Jul 08, 2020 at 09:17:37AM +0200, Dmitry Vyukov wrote: > > On Tue, Jul 7, 2020 at 8:17 PM Catalin Marinas wrote: > > > Kmemleak never performs well under heavy load. Normally you'd need to > > > let the system settle for a bit before checking whether the leaks are > > > still reported. The issue is caused by the memory scanning not stopping > > > the whole machine, so pointers may be hidden in registers on different > > > CPUs (list insertion/deletion for example causes transient kmemleak > > > confusion). > > > > > > I think the syzkaller guys tried a year or so ago to run it in parallel > > > with kmemleak and gave up shortly. The proposal was to add a "stopscan" > > > command to kmemleak which would do this under stop_machine(). However, > > > no-one got to implementing it. > > > > > > So, in this case, does the leak still appear with the reproducer, once > > > the system went idle? > > > > This report came from syzbot, so obviously we did not give up :) > > That's good to know ;). > > > We don't run scanning in parallel with fuzzing and do a very intricate > > multi-step dance to overcome false positives: > > https://github.com/google/syzkaller/blob/5962a2dc88f6511b77100acdf687c1088f253f6b/executor/common_linux.h#L3407-L3478 > > and only report leaks that are reproducible. > > So far I have not seen any noticable amount of false positives, and > > you can see 70 already fixed leaks here: > > https://syzkaller.appspot.com/upstream/fixed?manager=ci-upstream-gce-leak > > https://syzkaller.appspot.com/upstream?manager=ci-upstream-gce-leak > > Thanks for the information and the good work here. If you have time, you > could implement the stop_machine() kmemleak scan as well ;). stop_machine will only help with pointers stored in registers/jumping in memory. But there may be other sources of false positives like hidden pointers via some hashing, offsets, reused low/high bits. Doing several scans and crc checksum of object contents helps with these as well and is orthogonal to stop_machine. So now I wonder if using stop_machine will actually solve all problems... because if not, then doing this work but then having to do several scans and checksums anyway is kinda pointless...