Date: Wed, 17 Jun 2020 18:05:58 -0700
From: Andrew Morton
To: Kent Overstreet
Cc: linux-kernel@vger.kernel.org, viro@zeniv.linux.org.uk, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 2/2] fs: generic_file_buffered_read() now uses find_get_pages_contig
Message-Id: <20200617180558.9722e7337cbe3b88c4767126@linux-foundation.org>
In-Reply-To: <20200610013642.4171512-2-kent.overstreet@gmail.com>
References: <20200610001036.3904844-1-kent.overstreet@gmail.com> <20200610013642.4171512-2-kent.overstreet@gmail.com>

On Tue, 9 Jun 2020 21:36:42 -0400 Kent Overstreet wrote:

> Convert generic_file_buffered_read() to get pages to read from in
> batches, and then copy data to userspace from many pages at once - in
> particular, we now don't touch any cachelines that might be contended
> while we're in the loop to copy data to userspace.
>
> This is a performance improvement on workloads that do buffered reads
> with large blocksizes, and a very large performance improvement if that
> file is also being accessed concurrently by different threads.
>
> On smaller reads (512 bytes), there's a very small performance
> improvement (1%, within the margin of error).
>

checkpatch goes fairly crazy over this one, and its complaints are
mostly legitimate.

> @@ -2255,6 +2194,79 @@ generic_file_buffered_read_no_cached_page(struct kiocb *iocb,
>  	return generic_file_buffered_read_readpage(filp, mapping, page);
>  }
>
> +static int generic_file_buffered_read_get_pages(struct kiocb *iocb,
> +						struct iov_iter *iter,
> +						struct page **pages,
> +						unsigned nr)
> +{
> +	struct file *filp = iocb->ki_filp;
> +	struct address_space *mapping = filp->f_mapping;
> +	struct file_ra_state *ra = &filp->f_ra;
> +	pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
> +	pgoff_t last_index = (iocb->ki_pos + iter->count + PAGE_SIZE-1) >> PAGE_SHIFT;
> +	int i, j, ret, err = 0;
> +
> +	nr = min_t(unsigned long, last_index - index, nr);
> +find_page:
> +	if (fatal_signal_pending(current))
> +		return -EINTR;
> +
> +	ret = find_get_pages_contig(mapping, index, nr, pages);
> +	if (ret)
> +		goto got_pages;
> +
> +	if (iocb->ki_flags & IOCB_NOWAIT)
> +		return -EAGAIN;
> +
> +	page_cache_sync_readahead(mapping, ra, filp, index, last_index - index);
> +
> +	ret = find_get_pages_contig(mapping, index, nr, pages);
> +	if (ret)
> +		goto got_pages;
> +
> +	pages[0] = generic_file_buffered_read_no_cached_page(iocb, iter);
> +	err = PTR_ERR_OR_ZERO(pages[0]);
> +	ret = !IS_ERR_OR_NULL(pages[0]);

what?

> +got_pages:
> +	for (i = 0; i < ret; i++) {

Comparing i with ret here just hurts my brain.
Two lines ago ret was a boolean, now it's a count.

> +		struct page *page = pages[i];
> +		pgoff_t pg_index = index +i;
> +		loff_t pg_pos = max(iocb->ki_pos,
> +				    (loff_t) pg_index << PAGE_SHIFT);

hm.  I guess we can't use max_t here because we need to cast the
pgoff_t before the << to avoid overflows on 32-bit.  Perhaps this could
be cleaned up by using additional suitably typed and named locals.

> +		loff_t pg_count = iocb->ki_pos + iter->count - pg_pos;
> +
> +		if (PageReadahead(page))
> +			page_cache_async_readahead(mapping, ra, filp, page,
> +					pg_index, last_index - pg_index);
> +
> +		if (!PageUptodate(page)) {
> +			if (iocb->ki_flags & IOCB_NOWAIT) {
> +				for (j = i; j < ret; j++)
> +					put_page(pages[j]);
> +				ret = i;
> +				err = -EAGAIN;
> +				break;
> +			}
> +
> +			page = generic_file_buffered_read_pagenotuptodate(filp,
> +						iter, page, pg_pos, pg_count);
> +			if (IS_ERR_OR_NULL(page)) {
> +				for (j = i + 1; j < ret; j++)
> +					put_page(pages[j]);
> +				ret = i;
> +				err = PTR_ERR_OR_ZERO(page);
> +				break;
> +			}
> +		}
> +	}
> +
> +	if (likely(ret))
> +		return ret;
> +	if (err)
> +		return err;
> +	goto find_page;
> +}
> +
>  /**
>   * generic_file_buffered_read - generic file read routine
>   * @iocb:	the iocb to read
> @@ -2275,86 +2287,108 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
>  		struct iov_iter *iter, ssize_t written)
>  {
>  	struct file *filp = iocb->ki_filp;
> +	struct file_ra_state *ra = &filp->f_ra;
>  	struct address_space *mapping = filp->f_mapping;
>  	struct inode *inode = mapping->host;
> -	struct file_ra_state *ra = &filp->f_ra;
>  	size_t orig_count = iov_iter_count(iter);
> -	pgoff_t last_index;
> -	int error = 0;
> +	struct page *page_array[8], **pages;
> +	unsigned nr_pages = ARRAY_SIZE(page_array);
> +	unsigned read_nr_pages = ((iocb->ki_pos + iter->count + PAGE_SIZE-1) >> PAGE_SHIFT) -
> +		(iocb->ki_pos >> PAGE_SHIFT);
> +	int i, pg_nr, error = 0;
> +	bool writably_mapped;
> +	loff_t isize, end_offset;
>
>  	if (unlikely(iocb->ki_pos >= inode->i_sb->s_maxbytes))
>  		return 0;
>  	iov_iter_truncate(iter, inode->i_sb->s_maxbytes);
>
> -	last_index = (iocb->ki_pos + iter->count + PAGE_SIZE-1) >> PAGE_SHIFT;
> -
> -	for (;;) {
> -		pgoff_t index = iocb->ki_pos >> PAGE_SHIFT;
> -		struct page *page;
> +	if (read_nr_pages > nr_pages &&
> +	    (pages = kmalloc_array(read_nr_pages, sizeof(void *), GFP_KERNEL)))

I agree with checkpatch!

> +		nr_pages = read_nr_pages;
> +	else
> +		pages = page_array;
>
> +	do {
>  		cond_resched();
>
> ...
>

Please, can we make all this code nice to read?
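
To make the pg_pos remark above concrete, here is a minimal user-space
sketch of the "suitably typed and named locals" idea.  Everything in it
is an assumption for illustration: sketch_pgoff_t models a 32-bit
pgoff_t, sketch_loff_t models loff_t, SKETCH_PAGE_SHIFT assumes 4 KiB
pages, and none of the function names exist in the kernel.  It is not a
proposed replacement for the patch, only a demonstration of why the
widening has to happen before the shift.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */

typedef uint32_t sketch_pgoff_t;	/* stand-in for pgoff_t on a 32-bit kernel */
typedef int64_t sketch_loff_t;		/* stand-in for loff_t */

/*
 * Shift done in the 32-bit index type, widened afterwards.  The high
 * bits of the byte offset are already lost by the time of the cast.
 */
static sketch_loff_t page_start_truncating(sketch_pgoff_t pg_index)
{
	return (sketch_loff_t)(sketch_pgoff_t)(pg_index << SKETCH_PAGE_SHIFT);
}

/* Widen first, then shift: the full byte offset fits in the 64-bit type. */
static sketch_loff_t page_start(sketch_pgoff_t pg_index)
{
	return (sketch_loff_t)pg_index << SKETCH_PAGE_SHIFT;
}

/*
 * File position at which the read begins within page pg_index.  The
 * named local pg_start carries the widened type, so the comparison
 * needs no cast and no max_t.
 */
static sketch_loff_t read_start_in_page(sketch_loff_t ki_pos,
					sketch_pgoff_t pg_index)
{
	sketch_loff_t pg_start = page_start(pg_index);

	assert(ki_pos >= 0);
	return ki_pos > pg_start ? ki_pos : pg_start;
}
```

With a page index of 0x100000 (the 4 GiB boundary with 4 KiB pages),
page_start_truncating() returns 0 while page_start() returns 1 << 32,
which is the overflow the cast guards against.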