From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.9 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF39EC10F03 for ; Tue, 23 Apr 2019 15:05:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B598921852 for ; Tue, 23 Apr 2019 15:05:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="sUk6XvU/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727839AbfDWPF2 (ORCPT ); Tue, 23 Apr 2019 11:05:28 -0400 Received: from mail-yw1-f66.google.com ([209.85.161.66]:45670 "EHLO mail-yw1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727673AbfDWPF1 (ORCPT ); Tue, 23 Apr 2019 11:05:27 -0400 Received: by mail-yw1-f66.google.com with SMTP id r139so3000245ywe.12; Tue, 23 Apr 2019 08:05:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=yM9IZXAjfmuawC0fb8xSTK2sR89bHmk2dYLe8Yk2uxY=; b=sUk6XvU/Q2bOiwTFBLOzFWYV0gjU0cuMHCnMB0TUDIJc7I1OtuTmq5UBXUY+H4y5lF 4hjz2JVgZ6nU+Qyhn1CaCDTJzne5d1KcOO9bN4T+SqZnmtbW5Orc7dNrxgjJk/fxpHwV PgGOjhsugz1D+tfJs5BnhQO8n3zsjSesKWkkzYToqBu3PJy5kqp30P409ULeIqjRLFQO MINDxWLxx7Eqt1EXygM2hjZ6vrexyax9SHlRFXo8A1/Jpx+TdCrhvVwhRiMQ7wMAnpEF aswW5jUL0GkEB7gTkG4fJWrvFlJ/Nqm5nPYoIZbXav20ghm8VNibYGjXWXsA3Qfn22d2 PLsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=yM9IZXAjfmuawC0fb8xSTK2sR89bHmk2dYLe8Yk2uxY=; b=Fg+/OgkcOq34SQL1p3XEPFxCb9HnQO72oMWgRC6iXkHcSrSF8ZSpagS50kfuQ05NJ3 8FsSu87LwBkCSqAAsgTB9yg4QO1z4SsfNvPgGmI8cW5VuXA0mivKL1tOV8RDIsOO4XGB DOm62/XvhaTmOZUD+3GKSgD+zQ/E0KewdJKPztqmvoDOY0yRhHlcAYYXJu2NNCo3bYNl fyJBG87BrFk5wh8N7Z3EUpxErQj3nMcYpMPS1qLzOInBAgC/860poCvWEbrnM9Ztf9kI XdF0Sh4ibv7G50vbJ1oM/pgPVqs8CIwxzCG58LiuJuAqmw7Mn4MIpD/q1SnM87y9wIh2 kf4A== X-Gm-Message-State: APjAAAXkhOEAqj6nF7FiYquIwtBel1uIPYfOGef/Uc+MRmJQmoZ9BTc1 1vcauPW1z9XH7TPuQi4Lhe7x4a1rsInFUPO81qiLpaIu X-Google-Smtp-Source: APXvYqxsqNaA9FFQ3o19fdkonPiV0mZ04LsvibnCFrkvds709htLZslz0ZzRyEeHQIQmhywhYR3fRdjWb/UF8XzMIKM= X-Received: by 2002:a0d:ff82:: with SMTP id p124mr22616214ywf.409.1556031926345; Tue, 23 Apr 2019 08:05:26 -0700 (PDT) MIME-Version: 1.0 References: <20190423065024.12695-1-jencce.kernel@gmail.com> In-Reply-To: From: Amir Goldstein Date: Tue, 23 Apr 2019 18:05:14 +0300 Message-ID: Subject: Re: [PATCH] ovl: add RWF_NONITIFY flag to skip wrong duplicate fanotify event To: Miklos Szeredi Cc: Murphy Zhou , overlayfs , linux-fsdevel , Jan Kara Content-Type: text/plain; charset="UTF-8" Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Tue, Apr 23, 2019 at 5:16 PM Miklos Szeredi wrote: > > On Tue, Apr 23, 2019 at 3:53 PM Amir Goldstein wrote: > > > > On Tue, Apr 23, 2019 at 4:41 PM Miklos Szeredi wrote: > > > > > > On Tue, Apr 23, 2019 at 2:44 PM Amir Goldstein wrote: > > > > > > > > On Tue, Apr 23, 2019 at 2:40 PM Miklos Szeredi wrote: > > > > > > > > > > On Tue, Apr 23, 2019 at 1:00 PM Amir Goldstein wrote: > > > > > > > > > > > > On Tue, Apr 23, 2019 at 9:51 AM Murphy Zhou wrote: > > > > > > > > > > > > > > Overlays ovl_iter_write calls vfs_iter_write to write on real file, > > > > > > > in which calls fsnotify_modify on this change, however vfs_write also > > > > > > > calls fsnotify_modify after ovl_iter_write. The first notification > > > > > > > sent by vfs_iter_write grabs marks from upper inode and overlay mnt, > > > > > > > because of its fake path. The second one sent by vfs_write grabs marks > > > > > > > from ovl inode and ovl mnt. > > > > > > > > > > > > > > LTP fanotify06 add modify mark for mnt point, then add ignore modify > > > > > > > mask on testfile, then truncate and write the file. Because the ignore > > > > > > > mask is marked on ovl inode, not the upper inode, the first event is not > > > > > > > masked like the second one. So we get a modification event even with a > > > > > > > mask on the file. > > > > > > > > > > > > Care to extend fanotify06 in a similar manner to the way readahead02 > > > > > > was extended to test overlay test case regardless of the base LTP filesystem? > > > > > > > > > > > > > > > > > > > > Proposing fixing this by add a new RWF flag to skip fsnotify on this IO. > > > > > > > vfs_iter_write used by ovl can use this flag to skip one duplicate event. > > > > > > > > > > > > > > > > > > > This fix is wrong for several reasons: > > > > > > - It exports RWF_NONOTIFY to uapi > > > > > > - It will cause no events at all when overlay writes to file even when user > > > > > > requested events on upper inode > > > > > > > > > > > > Please try attached patch. > > > > > > > > > > Would be nice, but until mmap stops using realfile this isn't a good solution. > > > > > > > > > > > > > Sigh! I figured there was a catch... > > > > Will it be ok if fake path used a cloned private mount of overlay mount? > > > > > > So the reason we have the fake path is e.g. /proc/$$/maps. That can > > > only work with the original mount, AFAICS. > > > > > > We could have one realfile for regular I/O and a separate one for mmap > > > but that would increase complexity as well as resource use, so I'm not > > > quite sure if that's the right solution. > > > > > > > I see. Well what we could do is set a file flag for fake path, so at least > > fsnotify will be able to ignore f_path->mnt when calculating ignore mask. > > FMODE_NONOTIFY_MNT: like FMODE_NONOTIFY, but only affect mount marks? > On second thought I don't think there a need for an explicit FMODE_NONOTIFY_MNT/FMODE_FAKE_PATH flag because code can just check for file_inode(f) != d_inode(f->f_path->dentry). The question is what do do about this. I think situation is actually worse than reported (CC Jan). This problem existed even before stacked f_op, but is now manifesting differently because of stacked f_op. When user sets a mark on real inode or real parent dir inode, the user events that user gets when operations are performed via overlayfs file are with event->fd created by dentry_open(&event->path, ... So even when ignore mount marks are not involved, user will get the wrong path (overlayfs path) and referenced by event->fd while the mark was set on the real inode. The good new is that I believe fanotify marks are most commonly set on mount marks in the wild (?), so the events via overlayfs file on real inode won't be triggered in this case anyway and that's fine. FAN_MARK_FILESYSTEM may have opened up the possibility of exposing this bug to new users. In fsnotify_path(), when inode mismatches path->dentry->d_inode calling fsnotify_parent() is wrong (we are looking for parent watch of inode) and passing FSNOTIFY_EVENT_PATH info is wrong, because fsnotify() will filter by wrong mount and wrong path will we reported to user via event->fd. We could still report the event with FSNOTIFY_EVENT_INODE info, so at least inotify/audit watch on inode itself will work, but parent watches will not work as well as legacy fanotify inode watches. New FAN_REPORT_FID fanotify watches should work just fine. Any other ideas how to untangle? Thanks, Amir.