All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] implement orangefs_readahead
@ 2021-01-31 22:25 Mike Marshall
  2021-02-01 13:08 ` Matthew Wilcox
  0 siblings, 1 reply; 29+ messages in thread
From: Mike Marshall @ 2021-01-31 22:25 UTC (permalink / raw)
  To: linux-fsdevel, Matthew Wilcox, Mike Marshall

A few weeks ago Matthew Wilcox helped me see how
mm/readahead.c/read_pages was dropping down into
some code designed to take over for filesystems
that didn't implement ->readahead, and how this "failover"
code was screwing over the readahead-like code I'd put
into orangefs_readpage.

I studied a bunch of readahead code in fs and mm and other
filesystems and came up with this patch that seems to work
in the tests I've done so far. Sometimes code I like is
instead irritating to Linus or Al Viro :-), so hopefully some
of y'all will look over what I've got here. There's a
couple of printk's I've left in orangefs_readpage that don't
belong upstream, they help me now for testing though...
Besides the diff at the end of this message, the code is
in: https://github.com/hubcapsc/linux/tree/readahead

I wish I knew how to specify _nr_pages in the readahead_control
structure so that all the extra pages I need could be obtained
in readahead_page instead of part there and the rest in my
open-coded stuff in orangefs_readpage. But it looks to me as
if values in the readahead_control structure are set heuristically
outside of my control over in ondemand_readahead?

[root@vm3 linux]# git diff master..readahead
diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c
index 48f0547d4850..682a968cb82a 100644
--- a/fs/orangefs/inode.c
+++ b/fs/orangefs/inode.c
@@ -244,6 +244,25 @@ static int orangefs_writepages(struct
address_space *mapping,

 static int orangefs_launder_page(struct page *);

+/*
+ * Prefill the page cache with some pages that we're probably
+ * about to need...
+ */
+static void orangefs_readahead(struct readahead_control *rac)
+{
+       pgoff_t index = readahead_index(rac);
+       struct page *page;
+
+       while ((page = readahead_page(rac))) {
+               prefetchw(&page->flags);
+               put_page(page);
+               unlock_page(page);
+               index++;
+       }
+
+       return;
+}
+
 static int orangefs_readpage(struct file *file, struct page *page)
 {
        struct inode *inode = page->mapping->host;
@@ -260,11 +279,16 @@ static int orangefs_readpage(struct file *file,
struct page *page)
        int remaining;

        /*
-        * Get up to this many bytes from Orangefs at a time and try
-        * to fill them into the page cache at once. Tests with dd made
-        * this seem like a reasonable static number, if there was
-        * interest perhaps this number could be made setable through
-        * sysfs...
+        * Orangefs isn't a good fit for reading files one page at
+        * a time. Get up to "read_size" bytes from Orangefs at a time and
+        * try to fill them into the page cache at once. Readahead code in
+        * mm already got us some extra pages by calling orangefs_readahead,
+        * but it doesn't know how many we actually wanted, so we'll
+        * get some more after we use up the extra ones we got from
+        * orangefs_readahead. Tests with dd made "read_size" seem
+        * like a reasonable static number of bytes to get from orangefs,
+        * if there was interest perhaps "read_size" could be made
+        * setable through sysfs or something...
         */
        read_size = 524288;

@@ -302,31 +326,19 @@ static int orangefs_readpage(struct file *file,
struct page *page)
                slot_index = 0;
                while ((remaining - PAGE_SIZE) >= PAGE_SIZE) {
                        remaining -= PAGE_SIZE;
-                       /*
-                        * It is an optimization to try and fill more than one
-                        * page... by now we've already gotten the single
-                        * page we were after, if stuff doesn't seem to
-                        * be going our way at this point just return
-                        * and hope for the best.
-                        *
-                        * If we look for pages and they're already there is
-                        * one reason to give up, and if they're not there
-                        * and we can't create them is another reason.
-                        */

                        index++;
                        slot_index++;
-                       next_page = find_get_page(inode->i_mapping, index);
+                       next_page = find_lock_page(inode->i_mapping, index);
                        if (next_page) {
-                               gossip_debug(GOSSIP_FILE_DEBUG,
-                                       "%s: found next page, quitting\n",
-                                       __func__);
-                               put_page(next_page);
-                               goto out;
+                               printk("%s: found readahead page\n", __func__);
+                       } else {
+                               next_page =
+                                       find_or_create_page(inode->i_mapping,
+                                                               index,
+                                                               GFP_KERNEL);
+                               printk("%s: alloced my own page\n", __func__);
                        }
-                       next_page = find_or_create_page(inode->i_mapping,
-                                                       index,
-                                                       GFP_KERNEL);
                        /*
                         * I've never hit this, leave it as a printk for
                         * now so it will be obvious.
@@ -659,6 +671,7 @@ static ssize_t orangefs_direct_IO(struct kiocb *iocb,
 /** ORANGEFS2 implementation of address space operations */
 static const struct address_space_operations orangefs_address_operations = {
        .writepage = orangefs_writepage,
+       .readahead = orangefs_readahead,
        .readpage = orangefs_readpage,
        .writepages = orangefs_writepages,
        .set_page_dirty = __set_page_dirty_nobuffers,
diff --git a/fs/orangefs/orangefs-mod.c b/fs/orangefs/orangefs-mod.c
index 74a3d6337ef4..cd7297815f91 100644
--- a/fs/orangefs/orangefs-mod.c
+++ b/fs/orangefs/orangefs-mod.c
@@ -31,7 +31,7 @@ static ulong module_parm_debug_mask;
 __u64 orangefs_gossip_debug_mask;
 int op_timeout_secs = ORANGEFS_DEFAULT_OP_TIMEOUT_SECS;
 int slot_timeout_secs = ORANGEFS_DEFAULT_SLOT_TIMEOUT_SECS;
-int orangefs_cache_timeout_msecs = 50;
+int orangefs_cache_timeout_msecs = 500;
 int orangefs_dcache_timeout_msecs = 50;
 int orangefs_getattr_timeout_msecs = 50;

^ permalink raw reply related	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2021-04-26 20:02 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-31 22:25 [RFC PATCH] implement orangefs_readahead Mike Marshall
2021-02-01 13:08 ` Matthew Wilcox
2021-02-02  3:32   ` Mike Marshall
2021-03-13 15:31     ` [RFC PATCH v2] " Mike Marshall
2021-03-17  3:04       ` Mike Marshall
2021-03-24 11:10     ` David Howells
2021-03-27  2:55       ` Mike Marshall
2021-03-27  3:50         ` Matthew Wilcox
2021-03-27  8:31         ` David Howells
2021-03-27 13:56           ` Matthew Wilcox
2021-03-27 15:40             ` Mike Marshall
2021-03-27 15:56               ` Matthew Wilcox
2021-03-28  3:04                 ` Mike Marshall
2021-03-29  1:51                   ` Mike Marshall
2021-03-29  9:37                   ` David Howells
2021-03-29 23:25                     ` Mike Marshall
     [not found]                       ` <3726695.1617284551@warthog.procyon.org.uk >
2021-04-13 15:08                         ` David Howells
2021-04-16 14:36                           ` Mike Marshall
2021-04-16 15:14                             ` Matthew Wilcox
2021-04-25  1:51                               ` Mike Marshall
2021-04-16 15:41                           ` David Howells
2021-04-25  1:43                             ` Mike Marshall
2021-04-25  7:49                             ` David Howells
2021-04-26 14:53                               ` Mike Marshall
2021-04-26 19:01                                 ` Mike Marshall
2021-04-26 20:01                                 ` David Howells
2021-04-26  8:37                             ` David Howells
2021-04-01 13:42                     ` David Howells
2021-04-08 20:39                       ` Mike Marshall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.