All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Marshall <hubcap@omnibond.com>
To: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Mike Marshall <hubcap@omnibond.com>
Subject: [RFC PATCH] implement orangefs_readahead
Date: Sun, 31 Jan 2021 17:25:02 -0500	[thread overview]
Message-ID: <CAOg9mSTQ-zNKXQGBK9QEnwJCvwqh=zFLbLJZy-ibGZwLve4o0w@mail.gmail.com> (raw)

A few weeks ago Matthew Wilcox helped me see how
mm/readahead.c/read_pages was dropping down into
some code designed to take over for filesystems
that didn't implement ->readahead, and how this "failover"
code was screwing over the readahead-like code I'd put
into orangefs_readpage.

I studied a bunch of readahead code in fs and mm and other
filesystems and came up with this patch that seems to work
in the tests I've done so far. Sometimes code I like is
instead irritating to Linus or Al Viro :-), so hopefully some
of y'all will look over what I've got here. There's a
couple of printk's I've left in orangefs_readpage that don't
belong upstream, they help me now for testing though...
Besides the diff at the end of this message, the code is
in: https://github.com/hubcapsc/linux/tree/readahead

I wish I knew how to specify _nr_pages in the readahead_control
structure so that all the extra pages I need could be obtained
in readahead_page instead of part there and the rest in my
open-coded stuff in orangefs_readpage. But it looks to me as
if values in the readahead_control structure are set heuristically
outside of my control over in ondemand_readahead?

[root@vm3 linux]# git diff master..readahead
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 48f0547d4850..682a968cb82a 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -244,6 +244,25 @@ static int orangefs_writepages(struct
address_space *mapping,

 static int orangefs_launder_page(struct page *);

+/*
+ * Prefill the page cache with some pages that we're probably
+ * about to need...
+ */
+static void orangefs_readahead(struct readahead_control *rac)
+{
+       pgoff_t index = readahead_index(rac);
+       struct page *page;
+
+       while ((page = readahead_page(rac))) {
+               prefetchw(&page->flags);
+               put_page(page);
+               unlock_page(page);
+               index++;
+       }
+
+       return;
+}
+
 static int orangefs_readpage(struct file *file, struct page *page)
 {
        struct inode *inode = page->mapping->host;
@@ -260,11 +279,16 @@ static int orangefs_readpage(struct file *file,
struct page *page)
        int remaining;

        /*
-        * Get up to this many bytes from Orangefs at a time and try
-        * to fill them into the page cache at once. Tests with dd made
-        * this seem like a reasonable static number, if there was
-        * interest perhaps this number could be made setable through
-        * sysfs...
+        * Orangefs isn't a good fit for reading files one page at
+        * a time. Get up to "read_size" bytes from Orangefs at a time and
+        * try to fill them into the page cache at once. Readahead code in
+        * mm already got us some extra pages by calling orangefs_readahead,
+        * but it doesn't know how many we actually wanted, so we'll
+        * get some more after we use up the extra ones we got from
+        * orangefs_readahead. Tests with dd made "read_size" seem
+        * like a reasonable static number of bytes to get from orangefs,
+        * if there was interest perhaps "read_size" could be made
+        * setable through sysfs or something...
         */
        read_size = 524288;

@@ -302,31 +326,19 @@ static int orangefs_readpage(struct file *file,
struct page *page)
                slot_index = 0;
                while ((remaining - PAGE_SIZE) >= PAGE_SIZE) {
                        remaining -= PAGE_SIZE;
-                       /*
-                        * It is an optimization to try and fill more than one
-                        * page... by now we've already gotten the single
-                        * page we were after, if stuff doesn't seem to
-                        * be going our way at this point just return
-                        * and hope for the best.
-                        *
-                        * If we look for pages and they're already there is
-                        * one reason to give up, and if they're not there
-                        * and we can't create them is another reason.
-                        */

                        index++;
                        slot_index++;
-                       next_page = find_get_page(inode->i_mapping, index);
+                       next_page = find_lock_page(inode->i_mapping, index);
                        if (next_page) {
-                               gossip_debug(GOSSIP_FILE_DEBUG,
-                                       "%s: found next page, quitting\n",
-                                       __func__);
-                               put_page(next_page);
-                               goto out;
+                               printk("%s: found readahead page\n", __func__);
+                       } else {
+                               next_page =
+                                       find_or_create_page(inode->i_mapping,
+                                                               index,
+                                                               GFP_KERNEL);
+                               printk("%s: alloced my own page\n", __func__);
                        }
-                       next_page = find_or_create_page(inode->i_mapping,
-                                                       index,
-                                                       GFP_KERNEL);
                        /*
                         * I've never hit this, leave it as a printk for
                         * now so it will be obvious.
@@ -659,6 +671,7 @@ static ssize_t orangefs_direct_IO(struct kiocb *iocb,
 /** ORANGEFS2 implementation of address space operations */
 static const struct address_space_operations orangefs_address_operations = {
        .writepage = orangefs_writepage,
+       .readahead = orangefs_readahead,
        .readpage = orangefs_readpage,
        .writepages = orangefs_writepages,
        .set_page_dirty = __set_page_dirty_nobuffers,
diff --git a/fs/orangefs/orangefs-mod.c b/fs/orangefs/orangefs-mod.c
index 74a3d6337ef4..cd7297815f91 100644
--- a/fs/orangefs/orangefs-mod.c
+++ b/fs/orangefs/orangefs-mod.c
@@ -31,7 +31,7 @@ static ulong module_parm_debug_mask;
 __u64 orangefs_gossip_debug_mask;
 int op_timeout_secs = ORANGEFS_DEFAULT_OP_TIMEOUT_SECS;
 int slot_timeout_secs = ORANGEFS_DEFAULT_SLOT_TIMEOUT_SECS;
-int orangefs_cache_timeout_msecs = 50;
+int orangefs_cache_timeout_msecs = 500;
 int orangefs_dcache_timeout_msecs = 50;
 int orangefs_getattr_timeout_msecs = 50;

             reply	other threads:[~2021-01-31 22:28 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-31 22:25 Mike Marshall [this message]
2021-02-01 13:08 ` [RFC PATCH] implement orangefs_readahead Matthew Wilcox
2021-02-02  3:32   ` Mike Marshall
2021-03-13 15:31     ` [RFC PATCH v2] " Mike Marshall
2021-03-17  3:04       ` Mike Marshall
2021-03-24 11:10     ` David Howells
2021-03-27  2:55       ` Mike Marshall
2021-03-27  3:50         ` Matthew Wilcox
2021-03-27  8:31         ` David Howells
2021-03-27 13:56           ` Matthew Wilcox
2021-03-27 15:40             ` Mike Marshall
2021-03-27 15:56               ` Matthew Wilcox
2021-03-28  3:04                 ` Mike Marshall
2021-03-29  1:51                   ` Mike Marshall
2021-03-29  9:37                   ` David Howells
2021-03-29 23:25                     ` Mike Marshall
     [not found]                       ` <3726695.1617284551@warthog.procyon.org.uk >
2021-04-13 15:08                         ` David Howells
2021-04-16 14:36                           ` Mike Marshall
2021-04-16 15:14                             ` Matthew Wilcox
2021-04-25  1:51                               ` Mike Marshall
2021-04-16 15:41                           ` David Howells
2021-04-25  1:43                             ` Mike Marshall
2021-04-25  7:49                             ` David Howells
2021-04-26 14:53                               ` Mike Marshall
2021-04-26 19:01                                 ` Mike Marshall
2021-04-26 20:01                                 ` David Howells
2021-04-26  8:37                             ` David Howells
2021-04-01 13:42                     ` David Howells
2021-04-08 20:39                       ` Mike Marshall

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOg9mSTQ-zNKXQGBK9QEnwJCvwqh=zFLbLJZy-ibGZwLve4o0w@mail.gmail.com' \
    --to=hubcap@omnibond.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.