From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755382AbbBCH4T (ORCPT );
        Tue, 3 Feb 2015 02:56:19 -0500
Received: from zeniv.linux.org.uk ([195.92.253.2]:39033 "EHLO
        ZenIV.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752000AbbBCH4R (ORCPT );
        Tue, 3 Feb 2015 02:56:17 -0500
Date: Tue, 3 Feb 2015 07:56:16 +0000
From: Al Viro
To: Alexander Holler
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] WIP: Add syscall unlinkat_s (currently x86* only)
Message-ID: <20150203075616.GA29656@ZenIV.linux.org.uk>
References: <1422896713-25367-1-git-send-email-holler@ahsoftware.de>
        <1422896713-25367-2-git-send-email-holler@ahsoftware.de>
        <20150203060542.GZ29656@ZenIV.linux.org.uk>
        <54D071AA.1030302@ahsoftware.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <54D071AA.1030302@ahsoftware.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 03, 2015 at 07:58:50AM +0100, Alexander Holler wrote:
> > Charming. Now, what exactly happens if two such syscalls overlap in time?
>
> What do you think will happen? I assume you haven't looked at how I've
> implemented set_secure_delete().

Charming. AFAICS, any unrelated unlink() that happens at the same time
gets hit by that mess, whether it asked for it or not. What's more, this
counter of yours is *not* guaranteed to be elevated during the final
iput() of the inode you wanted to get - again, ls -lR racing with that
syscall can elevate the refcount of the dentry, making d_delete() in
vfs_unlink() just remove that dentry from the hash, while keeping it
positive.
If the dentry reference grabbed by stat(2) is released after both dput()
and iput() in do_unlinkat(), the final iput() will be done when stat(2)
drops its reference to the dentry, triggering an immediate dentry_kill()
(since the dentry has already been unhashed) and dentry_iput() from it.

IOW, this counter is both too crude (it's fs-wide, for crying out loud)
*and* not guaranteed to cover enough. _IF_ you want that behaviour at all,
it ought to be an in-core inode flag set by that syscall and checked by
the truncation logic to decide whether to do a normal truncate or this
"overwrite with zeroes" thing.

While we are at it, "overwrite with zeroes" is too weak if the attacker
might get hold of the actual hardware. Google for details - it's far too
long a story for an l-k posting. Look for data recovery and secure data
erasure...
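
For illustration only, the per-inode flag idea above can be modelled in
plain userspace C. This is a rough sketch, not kernel code: every name in
it (model_inode, MODEL_I_SECURE_DELETE, model_mark_secure_delete,
model_iput, model_truncate_final) is hypothetical and does not correspond
to an existing kernel structure, flag or function, and all locking is left
out. What it tries to show is that a flag stored on the inode stays with
it until the last reference goes away, no matter which caller happens to
drop that reference, and that inodes without the flag are not affected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-core flag; NOT an existing kernel inode flag. */
#define MODEL_I_SECURE_DELETE 0x1u

/* Toy stand-in for an in-core inode (assumption, not struct inode). */
struct model_inode {
	unsigned int flags;     /* in-core flags only, never written out */
	unsigned int refcount;  /* references held by unlink, stat, ... */
	size_t size;            /* logical file size */
	unsigned char data[16]; /* stand-in for the file's data blocks */
	bool freed;             /* set once the final truncate has run */
	bool zeroed;            /* set if the data was overwritten first */
};

/* What the syscall would do: mark this one inode, not the whole fs. */
static void model_mark_secure_delete(struct model_inode *inode)
{
	inode->flags |= MODEL_I_SECURE_DELETE;
}

/* Truncation path: checks the inode's own flag to pick the behaviour. */
static void model_truncate_final(struct model_inode *inode)
{
	if (inode->flags & MODEL_I_SECURE_DELETE) {
		/* "overwrite with zeroes" before releasing the blocks */
		memset(inode->data, 0, sizeof(inode->data));
		inode->zeroed = true;
	}
	inode->size = 0;
	inode->freed = true;
}

/* Drop one reference; whoever drops the last one runs the truncate. */
static void model_iput(struct model_inode *inode)
{
	if (--inode->refcount == 0)
		model_truncate_final(inode);
}
```

In this model an inode with two references (one from the unlinking
caller, one from a racing stat) is still zeroed when the second reference
is dropped, because the decision is made from the inode itself rather than
from a counter that the unlinking caller has already decremented. An
unrelated inode that never had the flag set goes through the normal
truncate.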