Date: Fri, 20 Mar 2020 11:24:52 -0700
From: Eric Biggers
To: Matthew Wilcox
Cc: Andrew Morton, linux-xfs@vger.kernel.org, William Kucharski,
	John Hubbard, linux-kernel@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com,
	linux-mm@kvack.org, ocfs2-devel@oss.oracle.com,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-erofs@lists.ozlabs.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH v9 12/25] mm: Move end_index check out of readahead loop
Message-ID: <20200320182452.GF851@sol.localdomain>
References: <20200320142231.2402-1-willy@infradead.org>
	<20200320142231.2402-13-willy@infradead.org>
	<20200320165828.GB851@sol.localdomain>
	<20200320173040.GB4971@bombadil.infradead.org>
	<20200320180017.GE851@sol.localdomain>
	<20200320181132.GD4971@bombadil.infradead.org>
In-Reply-To: <20200320181132.GD4971@bombadil.infradead.org>

On Fri, Mar 20, 2020 at 11:11:32AM -0700, Matthew Wilcox wrote:
> On Fri, Mar 20, 2020 at 11:00:17AM -0700, Eric Biggers wrote:
> > On Fri, Mar 20, 2020 at 10:30:40AM -0700, Matthew Wilcox wrote:
> > > On Fri, Mar 20, 2020 at 09:58:28AM -0700, Eric Biggers wrote:
> > > > On Fri, Mar 20, 2020 at 07:22:18AM -0700, Matthew Wilcox wrote:
> > > > > +	/* Avoid wrapping to the beginning of the file */
> > > > > +	if (index + nr_to_read < index)
> > > > > +		nr_to_read = ULONG_MAX - index + 1;
> > > > > +	/* Don't read past the page containing the last byte of the file */
> > > > > +	if (index + nr_to_read >= end_index)
> > > > > +		nr_to_read = end_index - index + 1;
> > > >
> > > > There seem to be a couple off-by-one errors here.  Shouldn't it be:
> > > >
> > > > 	/* Avoid wrapping to the beginning of the file */
> > > > 	if (index + nr_to_read < index)
> > > > 		nr_to_read = ULONG_MAX - index;
> > >
> > > I think it's right.  Imagine that index is ULONG_MAX.  We should read one
> > > page (the one at ULONG_MAX).  That would be ULONG_MAX - ULONG_MAX + 1.
> > >
> > > > 	/* Don't read past the page containing the last byte of the file */
> > > > 	if (index + nr_to_read > end_index)
> > > > 		nr_to_read = end_index - index + 1;
> > > >
> > > > I.e., 'ULONG_MAX - index' rather than 'ULONG_MAX - index + 1', so that
> > > > 'index + nr_to_read' is then ULONG_MAX rather than overflowed to 0.
> > > >
> > > > Then 'index + nr_to_read > end_index' rather than
> > > > 'index + nr_to_read >= end_index', since otherwise nr_to_read can be
> > > > increased by 1 rather than decreased or stay the same as expected.
> > >
> > > Ooh, I missed the overflow case here.  It should be:
> > >
> > > +	if (index + nr_to_read - 1 > end_index)
> > > +		nr_to_read = end_index - index + 1;
> > >
> >
> > But then if someone passes index=0 and nr_to_read=0, this underflows and the
> > entire file gets read.
>
> nr_to_read == 0 doesn't make sense ... I thought we filtered that out
> earlier, but I can't find anywhere that does that right now.  I'd
> rather return early from __do_page_cache_readahead() to fix that.
>
> > The page cache isn't actually supposed to contain a page at index ULONG_MAX,
> > since MAX_LFS_FILESIZE is at most ((loff_t)ULONG_MAX << PAGE_SHIFT), right?  So
> > I don't think we need to worry about reading the page with index ULONG_MAX.
> > I.e. I think it's fine to limit nr_to_read to 'ULONG_MAX - index', if that makes
> > it easier to avoid an overflow or underflow in the next check.
>
> I think we can get a page at ULONG_MAX on 32-bit systems?  I mean, we can buy
> hard drives which are larger than 16TiB these days:
> https://www.pcmag.com/news/seagate-will-ship-18tb-and-20tb-hard-drives-in-2020
> (even ignoring RAID devices)

The max file size is ((loff_t)ULONG_MAX << PAGE_SHIFT) which means the maximum
page *index* is ULONG_MAX - 1, not ULONG_MAX.

Anyway, I think we may be making this much too complicated.  How about just:

	pgoff_t i_nrpages = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);

	if (index >= i_nrpages)
		return;
	/* Don't read past the end of the file */
	nr_to_read = min(nr_to_read, i_nrpages - index);

That's 2 branches instead of 4.  (Note that assigning to i_nrpages can't
overflow, since the max number of pages is ULONG_MAX not ULONG_MAX + 1.)

- Eric
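
For anyone who wants to poke at the arithmetic outside the kernel, here is a
minimal userspace sketch of the clamp suggested above, next to the
'index + nr_to_read - 1 > end_index' variant from earlier in the thread.  It is
not the kernel code: the sketch_* names and SKETCH_PAGE_SIZE (assumed 4096) are
hypothetical stand-ins, pgoff_t is re-declared locally, the file size is passed
in directly instead of coming from i_size_read(inode), and the helper returns 0
where the kernel function would simply return.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace stand-ins; the kernel has its own definitions. */
typedef unsigned long pgoff_t;
#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size for this sketch */

/* Pages needed to hold i_size bytes, i.e. a DIV_ROUND_UP() equivalent. */
static pgoff_t sketch_nrpages(unsigned long long i_size)
{
	return (pgoff_t)(i_size / SKETCH_PAGE_SIZE +
			 (i_size % SKETCH_PAGE_SIZE != 0));
}

/*
 * Clamp in the style proposed in the mail: returns the number of pages
 * to read starting at index, or 0 when index is at or past end of file.
 */
static unsigned long sketch_clamp(pgoff_t index, unsigned long nr_to_read,
				  unsigned long long i_size)
{
	pgoff_t i_nrpages = sketch_nrpages(i_size);

	if (index >= i_nrpages)
		return 0;
	/* index < i_nrpages here, so the subtraction below cannot wrap. */
	assert(i_nrpages - index >= 1);
	if (nr_to_read > i_nrpages - index)
		nr_to_read = i_nrpages - index;
	return nr_to_read;
}

/*
 * The earlier variant, kept only for contrast: with nr_to_read == 0 the
 * expression 'index + nr_to_read - 1' can wrap to ULONG_MAX, so the
 * request grows instead of shrinking.
 */
static unsigned long sketch_clamp_v2(pgoff_t index, unsigned long nr_to_read,
				     pgoff_t end_index)
{
	if (index + nr_to_read - 1 > end_index)
		nr_to_read = end_index - index + 1;
	return nr_to_read;
}
```

With a 10-page file, sketch_clamp(0, 0, ...) stays at 0 and
sketch_clamp(8, 32, ...) is trimmed to 2, whereas sketch_clamp_v2(0, 0, 9)
comes back as 10, which is the "entire file gets read" case mentioned above.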