From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 773F8C433EF for ; Mon, 11 Oct 2021 17:38:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5AA7360EBB for ; Mon, 11 Oct 2021 17:38:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233171AbhJKRj7 (ORCPT ); Mon, 11 Oct 2021 13:39:59 -0400 Received: from mail.kernel.org ([198.145.29.99]:58900 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232708AbhJKRj6 (ORCPT ); Mon, 11 Oct 2021 13:39:58 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 2A26E60560; Mon, 11 Oct 2021 17:37:56 +0000 (UTC) Date: Mon, 11 Oct 2021 18:37:52 +0100 From: Catalin Marinas To: Al Viro Cc: Linus Torvalds , Andreas Gruenbacher , Christoph Hellwig , "Darrick J. Wong" , Jan Kara , Matthew Wilcox , cluster-devel , linux-fsdevel , Linux Kernel Mailing List , "ocfs2-devel@oss.oracle.com" , Josef Bacik , Will Deacon Subject: Re: [RFC][arm64] possible infinite loop in btrfs search_ioctl() Message-ID: References: <20210827164926.1726765-6-agruenba@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org On Tue, Aug 31, 2021 at 03:28:57PM +0000, Al Viro wrote: > On Tue, Aug 31, 2021 at 02:54:50PM +0100, Catalin Marinas wrote: > > An arm64-specific workaround would be for pagefault_disable() to disable > > tag checking. It's a pretty big hammer, weakening the out of bounds > > access detection of MTE. My preference would be a fix in the btrfs code. > > > > A btrfs option would be for copy_to_sk() to return an indication of > > where the fault occurred and get fault_in_pages_writeable() to check > > that location, even if the copying would restart from an earlier offset > > (this requires open-coding copy_to_user_nofault()). An attempt below, > > untested and does not cover read_extent_buffer_to_user_nofault(): > > Umm... There's another copy_to_user_nofault() call in the same function > (same story, AFAICS). I cleaned up this patch [1] but I realised it still doesn't solve it. The arm64 __copy_to_user_inatomic(), while ensuring progress if called in a loop, it does not guarantee precise copy to the fault position. The copy_to_sk(), after returning an error, starts again from the previous sizeof(sh) boundary rather than from where the __copy_to_user_inatomic() stopped. So it can get stuck attempting to copy the same search header. An ugly fix is to fall back to byte by byte copying so that we can attempt the actual fault address in fault_in_pages_writeable(). If the sh being recreated in copy_to_sk() is the same on the retried iteration, we could use an *sk_offset that is not a multiple of sizeof(sh) in order to have progress. But it's not clear to me whether the data being copied can change once btrfs_release_path() is called. [1] https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=devel/btrfs-fix -- Catalin