linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] Make frontswap+cleancache and its friend be modularized.
@ 2013-02-01 20:22 Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules Konrad Rzeszutek Wilk
                   ` (15 more replies)
  0 siblings, 16 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

Parts of this patch have been posted in the post (way back in November), but
this patchset expanded it a bit. The goal of the patches is to make the
different frontswap/cleancache API backends be modules - and load way way after
the swap system (or filesystem) has been initialized. Naturally one can still
build the frontswap+cleancache backend devices in the kernel. The next goal
(after these patches) is to also be able to unload the backend drivers - but
that places some interesting requirements to "reload" the swap device with
swap pages (don't need to worry that much about cleancache as it is a "secondary"
cache and can be dumped). Seth had posted some patches for that in the zswap
backend - and they could be more generally repurporsed.

Anyhow, I did not want to lose the authorship of some of the patches so I
didn't squash the ones that were made by Dan and mine. I can do it for review
if it would make it easier, but from my recollection on how Linus likes things
run he would prefer to keep the history (even the kludge parts).

The general flow prior to these patches was [I am concentrating on the
frontswap here, but the cleancache is similar, just s/swapon/mount/]:

 1) kernel inits frontswap_init
 2) kernel inits zcache (or some other backend)
 3) user does swapon /dev/XX and the writes to the swap disk end up in
    frontswap and then in the backend.

With the module loading, the 1) is still part of the bootup, but the
2) or 3) can be run at anytime. This means one could load the backend
_after_ the swap disk has been initialized and running along. Or
_before_ the swap disk has been setup - but that is similar to the
existing case so not that exciting.

To deal with that scenario the frontswap keeps an queue (actually an atomic
bitmap of the swap disks that have been init) and when the backend registers -
frontswap runs the backend init on the queued up swap disks.

The interesting thing is that we can be to certain degree racy when the
swap system starts steering pages to frontswap. Meaning after the backend
has registered it is OK if the pages are still hitting the disk instead of
the backend. Naturally this is unacceptable if one were to unload the
backend (not yet supported) - as we need to be quite atomic at that stage
and need to stop processing the pages the moment the backend is being
unloaded. To support this, the frontswap is using the struct static_key
which are incredibly light when they are in usage. They are incredibly heavy
when the value switches (on/off), but that is OK. The next part of unloading is
also taking the pages that are in the backend and feed them in the swap
storage (and Seth's patches do some of this).

Also attached is one patch from Minchan that fixes the condition where the
backend was constricted in allocating memory at init - b/c we were holding
a spin-lock. His patch fixes that and we are just holding the swapon_mutex
instead. It has been rebased on top of my patches.

This patchset is based on Greg KH's staging tree (since the zcache2 has
now been renamed to zcache). To be exact, it is based on
085494ac2039433a5df9fdd6fb653579e18b8c71

Dan Magenheimer (4):
      mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
      mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
      staging: zcache: enable ramster to be built/loaded as a module
      xen: tmem: enable Xen tmem shim to be built/loaded as a module

Konrad Rzeszutek Wilk (10):
      frontswap: Make frontswap_init use a pointer for the ops.
      cleancache: Make cleancache_init use a pointer for the ops
      staging: zcache: enable zcache to be built/loaded as a module
      xen/tmem: Remove the subsys call.
      frontswap: Remove the check for frontswap_enabled.
      frontswap: Use static_key instead of frontswap_enabled and frontswap_ops
      cleancache: Remove the check for cleancache_enabled.
      cleancache: Use static_key instead of cleancache_ops and cleancache_enabled.
      zcache/tmem: Better error checking on frontswap_register_ops return     value.
      xen/tmem: Add missing %s in the printk statement.

Minchan Kim (1):
      frontswap: Get rid of swap_lock dependency


 drivers/staging/zcache/Kconfig                     |   6 +-
 drivers/staging/zcache/Makefile                    |  11 +-
 drivers/staging/zcache/ramster.h                   |   6 +-
 drivers/staging/zcache/ramster/nodemanager.c       |   9 +-
 drivers/staging/zcache/ramster/ramster.c           |  29 ++-
 drivers/staging/zcache/ramster/ramster.h           |   2 +-
 .../staging/zcache/ramster/ramster_nodemanager.h   |   2 +
 drivers/staging/zcache/tmem.c                      |   6 +-
 drivers/staging/zcache/tmem.h                      |   8 +-
 drivers/staging/zcache/zcache-main.c               |  64 +++++-
 drivers/staging/zcache/zcache.h                    |   2 +-
 drivers/xen/Kconfig                                |   4 +-
 drivers/xen/tmem.c                                 |  55 +++--
 drivers/xen/xen-selfballoon.c                      |  13 +-
 include/linux/cleancache.h                         |  27 ++-
 include/linux/frontswap.h                          |  31 +--
 include/xen/tmem.h                                 |   8 +
 mm/cleancache.c                                    | 241 ++++++++++++++++++---
 mm/frontswap.c                                     | 121 ++++++++---
 mm/swapfile.c                                      |   7 +-
 20 files changed, 505 insertions(+), 147 deletions(-)


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-03  8:06   ` Ric Mason
  2013-02-01 20:22 ` [PATCH 02/15] mm: frontswap: " Konrad Rzeszutek Wilk
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel
  Cc: Stefan Hengelein, Florian Schmaus, Andor Daam

From: Dan Magenheimer <dan.magenheimer@oracle.com>

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
cleancache even after filesystems were mounted. Calls to init_fs and
init_shared_fs are remembered as fake poolids but no real tmem_pools created.
On backend registration the fake poolids are mapped to real poolids and
respective tmem_pools.

Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
Signed-off-by: Andor Daam <andor.daam@googlemail.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Minor fixes: used #define for some values and bools]
[v2: Removed CLEANCACHE_HAS_LAZY_INIT]
[v3: Added more comments, added a lock for [shared_|]fs_poolid_map]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 mm/cleancache.c | 240 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 219 insertions(+), 21 deletions(-)

diff --git a/mm/cleancache.c b/mm/cleancache.c
index 32e6f41..e4dc314 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -45,15 +45,99 @@ static u64 cleancache_puts;
 static u64 cleancache_invalidates;
 
 /*
- * register operations for cleancache, returning previous thus allowing
- * detection of multiple backends and possible nesting
+ * When no backend is registered all calls to init_fs and init_shared_fs
+ * are registered and fake poolids (FAKE_FS_POOLID_OFFSET or
+ * FAKE_SHARED_FS_POOLID_OFFSET, plus offset in the respective array
+ * [shared_|]fs_poolid_map) are given to the respective super block
+ * (sb->cleancache_poolid) and no tmem_pools are created. When a backend
+ * registers with cleancache the previous calls to init_fs and init_shared_fs
+ * are executed to create tmem_pools and set the respective poolids. While no
+ * backend is registered all "puts", "gets" and "flushes" are ignored or failed.
+ */
+#define MAX_INITIALIZABLE_FS 32
+#define FAKE_FS_POOLID_OFFSET 1000
+#define FAKE_SHARED_FS_POOLID_OFFSET 2000
+
+#define FS_NO_BACKEND (-1)
+#define FS_UNKNOWN (-2)
+static int fs_poolid_map[MAX_INITIALIZABLE_FS];
+static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
+static char *uuids[MAX_INITIALIZABLE_FS];
+/*
+ * Mutex for the [shared_|]fs_poolid_map to guard against multiple threads
+ * invoking umount (and ending in __cleancache_invalidate_fs) and also multiple
+ * threads calling mount (and ending up in __cleancache_init_[shared|]fs).
+ */
+static DEFINE_MUTEX(poolid_mutex);
+/*
+ * When set to false (default) all calls to the cleancache functions, except
+ * the __cleancache_invalidate_fs and __cleancache_init_[shared|]fs are guarded
+ * by the if (!backend_registered) return. This means multiple threads (from
+ * different filesystems) will be checking backend_registered. The usage of a
+ * bool instead of a atomic_t or a bool guarded by a spinlock is OK - we are
+ * OK if the time between the backend's have been initialized (and
+ * backend_registered has been set to true) and when the filesystems start
+ * actually calling the backends. The inverse (when unloading) is obviously
+ * not good - but this shim does not do that (yet).
+ */
+static bool backend_registered __read_mostly;
+
+/*
+ * The backends and filesystems work all asynchronously. This is b/c the
+ * backends can be built as modules.
+ * The usual sequence of events is:
+ * 	a) mount /	-> __cleancache_init_fs is called. We set the
+ * 		[shared_|]fs_poolid_map and uuids for.
+ *
+ * 	b). user does I/Os -> we call the rest of __cleancache_* functions
+ * 		which return immediately as backend_registered is false.
+ *
+ * 	c). modprobe zcache -> cleancache_register_ops. We init the backend
+ * 		and set backend_registered to true, and for any fs_poolid_map
+ * 		(which is set by __cleancache_init_fs) we initialize the poolid.
+ *
+ * 	d). user does I/Os -> now that backend_registered is true all the
+ * 		__cleancache_* functions can call the backend. They all check
+ * 		that fs_poolid_map is valid and if so invoke the backend.
+ *
+ * 	e). umount /	-> __cleancache_invalidate_fs, the fs_poolid_map is
+ * 		reset (which is the second check in the __cleancache_* ops
+ * 		to call the backend).
+ *
+ * The sequence of event could also be c), followed by a), and d). and e). The
+ * c) would not happen anymore. There is also the chance of c), and one thread
+ * doing a) + d), and another doing e). For that case we depend on the
+ * filesystem calling __cleancache_invalidate_fs in the proper sequence (so
+ * that it handles all I/Os before it invalidates the fs (which is last part
+ * of unmounting process).
+ *
+ * Note: The acute reader will notice that there is no "rmmod zcache" case.
+ * This is b/c the functionality for that is not yet implemented and when
+ * done, will require some extra locking not yet devised.
+ */
+
+/*
+ * Register operations for cleancache, returning previous thus allowing
+ * detection of multiple backends and possible nesting.
  */
 struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
 {
 	struct cleancache_ops old = cleancache_ops;
+	int i;
 
+	mutex_lock(&poolid_mutex);
 	cleancache_ops = *ops;
-	cleancache_enabled = 1;
+
+	backend_registered = true;
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (fs_poolid_map[i] == FS_NO_BACKEND)
+			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
+			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+					(uuids[i], PAGE_SIZE);
+	}
+out:
+	mutex_unlock(&poolid_mutex);
 	return old;
 }
 EXPORT_SYMBOL(cleancache_register_ops);
@@ -61,15 +145,42 @@ EXPORT_SYMBOL(cleancache_register_ops);
 /* Called by a cleancache-enabled filesystem at time of mount */
 void __cleancache_init_fs(struct super_block *sb)
 {
-	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
+	int i;
+
+	mutex_lock(&poolid_mutex);
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (fs_poolid_map[i] == FS_UNKNOWN) {
+			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
+			if (backend_registered)
+				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			else
+				fs_poolid_map[i] = FS_NO_BACKEND;
+			break;
+		}
+	}
+	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_init_fs);
 
 /* Called by a cleancache-enabled clustered filesystem at time of mount */
 void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
 {
-	sb->cleancache_poolid =
-		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
+	int i;
+
+	mutex_lock(&poolid_mutex);
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
+			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
+			uuids[i] = uuid;
+			if (backend_registered)
+				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+						(uuid, PAGE_SIZE);
+			else
+				shared_fs_poolid_map[i] = FS_NO_BACKEND;
+			break;
+		}
+	}
+	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_init_shared_fs);
 
@@ -99,27 +210,53 @@ static int cleancache_get_key(struct inode *inode,
 }
 
 /*
+ * Returns a pool_id that is associated with a given fake poolid.
+ */
+static int get_poolid_from_fake(int fake_pool_id)
+{
+	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
+		return shared_fs_poolid_map[fake_pool_id -
+			FAKE_SHARED_FS_POOLID_OFFSET];
+	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
+		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
+	return FS_NO_BACKEND;
+}
+
+/*
  * "Get" data from cleancache associated with the poolid/inode/index
  * that were specified when the data was put to cleanache and, if
  * successful, use it to fill the specified page with data and return 0.
  * The pageframe is unchanged and returns -1 if the get fails.
  * Page must be locked by caller.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 int __cleancache_get_page(struct page *page)
 {
 	int ret = -1;
 	int pool_id;
+	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered) {
+		cleancache_failed_gets++;
+		goto out;
+	}
+
 	VM_BUG_ON(!PageLocked(page));
-	pool_id = page->mapping->host->i_sb->cleancache_poolid;
-	if (pool_id < 0)
+	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	if (fake_pool_id < 0)
 		goto out;
+	pool_id = get_poolid_from_fake(fake_pool_id);
 
 	if (cleancache_get_key(page->mapping->host, &key) < 0)
 		goto out;
 
-	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
+	if (pool_id >= 0)
+		ret = (*cleancache_ops.get_page)(pool_id,
+				key, page->index, page);
 	if (ret == 0)
 		cleancache_succ_gets++;
 	else
@@ -134,16 +271,31 @@ EXPORT_SYMBOL(__cleancache_get_page);
  * (previously-obtained per-filesystem) poolid and the page's,
  * inode and page index.  Page must be locked.  Note that a put_page
  * always "succeeds", though a subsequent get_page may succeed or fail.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 void __cleancache_put_page(struct page *page)
 {
 	int pool_id;
+	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered) {
+		cleancache_puts++;
+		return;
+	}
+
 	VM_BUG_ON(!PageLocked(page));
-	pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	if (fake_pool_id < 0)
+		return;
+
+	pool_id = get_poolid_from_fake(fake_pool_id);
+
 	if (pool_id >= 0 &&
-	      cleancache_get_key(page->mapping->host, &key) >= 0) {
+		cleancache_get_key(page->mapping->host, &key) >= 0) {
 		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
 		cleancache_puts++;
 	}
@@ -153,19 +305,31 @@ EXPORT_SYMBOL(__cleancache_put_page);
 /*
  * Invalidate any data from cleancache associated with the poolid and the
  * page's inode and page index so that a subsequent "get" will fail.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 void __cleancache_invalidate_page(struct address_space *mapping,
 					struct page *page)
 {
 	/* careful... page->mapping is NULL sometimes when this is called */
-	int pool_id = mapping->host->i_sb->cleancache_poolid;
+	int pool_id;
+	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (pool_id >= 0) {
+	if (!backend_registered)
+		return;
+
+	if (fake_pool_id >= 0) {
+		pool_id = get_poolid_from_fake(fake_pool_id);
+		if (pool_id < 0)
+			return;
+
 		VM_BUG_ON(!PageLocked(page));
 		if (cleancache_get_key(mapping->host, &key) >= 0) {
 			(*cleancache_ops.invalidate_page)(pool_id,
-							  key, page->index);
+					key, page->index);
 			cleancache_invalidates++;
 		}
 	}
@@ -176,12 +340,25 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
  * Invalidate all data from cleancache associated with the poolid and the
  * mappings's inode so that all subsequent gets to this poolid/inode
  * will fail.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 void __cleancache_invalidate_inode(struct address_space *mapping)
 {
-	int pool_id = mapping->host->i_sb->cleancache_poolid;
+	int pool_id;
+	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered)
+		return;
+
+	if (fake_pool_id < 0)
+		return;
+
+	pool_id = get_poolid_from_fake(fake_pool_id);
+
 	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
 		(*cleancache_ops.invalidate_inode)(pool_id, key);
 }
@@ -189,21 +366,37 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
 
 /*
  * Called by any cleancache-enabled filesystem at time of unmount;
- * note that pool_id is surrendered and may be reutrned by a subsequent
- * cleancache_init_fs or cleancache_init_shared_fs
+ * note that pool_id is surrendered and may be returned by a subsequent
+ * cleancache_init_fs or cleancache_init_shared_fs.
  */
 void __cleancache_invalidate_fs(struct super_block *sb)
 {
-	if (sb->cleancache_poolid >= 0) {
-		int old_poolid = sb->cleancache_poolid;
-		sb->cleancache_poolid = -1;
-		(*cleancache_ops.invalidate_fs)(old_poolid);
+	int index;
+	int fake_pool_id = sb->cleancache_poolid;
+	int old_poolid = fake_pool_id;
+
+	mutex_lock(&poolid_mutex);
+	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
+		index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
+		old_poolid = shared_fs_poolid_map[index];
+		shared_fs_poolid_map[index] = FS_UNKNOWN;
+		uuids[index] = NULL;
+	} else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
+		index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
+		old_poolid = fs_poolid_map[index];
+		fs_poolid_map[index] = FS_UNKNOWN;
 	}
+	sb->cleancache_poolid = -1;
+	if (backend_registered)
+		(*cleancache_ops.invalidate_fs)(old_poolid);
+	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_fs);
 
 static int __init init_cleancache(void)
 {
+	int i;
+
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *root = debugfs_create_dir("cleancache", NULL);
 	if (root == NULL)
@@ -215,6 +408,11 @@ static int __init init_cleancache(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &cleancache_invalidates);
 #endif
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		fs_poolid_map[i] = FS_UNKNOWN;
+		shared_fs_poolid_map[i] = FS_UNKNOWN;
+	}
+	cleancache_enabled = 1;
 	return 0;
 }
 module_init(init_cleancache)
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 02/15] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-03  7:07   ` Ric Mason
  2013-02-01 20:22 ` [PATCH 03/15] frontswap: Make frontswap_init use a pointer for the ops Konrad Rzeszutek Wilk
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel
  Cc: Stefan Hengelein, Florian Schmaus, Andor Daam

From: Dan Magenheimer <dan.magenheimer@oracle.com>

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
frontswap even after swapon was run. Before a backend registers all calls
to init are recorded and the creation of tmem_pools delayed until a backend
registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
Signed-off-by: Andor Daam <andor.daam@googlemail.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Fixes per Seth Jennings suggestions]
[v2: Removed FRONTSWAP_HAS_.. ]
[v3: Fix up per Bob Liu <lliubbo@gmail.com> recommendations]
[v4: Fix up per Andrew's comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 mm/frontswap.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 85 insertions(+), 10 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..c05a9db 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,46 @@ static inline void inc_frontswap_succ_stores(void) { }
 static inline void inc_frontswap_failed_stores(void) { }
 static inline void inc_frontswap_invalidates(void) { }
 #endif
+
+/*
+ * Due to the asynchronous nature of the backends loading potentially
+ * _after_ the swap system has been activated, we have chokepoints
+ * on all frontswap functions to not call the backend until the backend
+ * has registered.
+ *
+ * Specifically when no backend is registered (nobody called
+ * frontswap_register_ops) all calls to frontswap_init (which is done via
+ * swapon -> enable_swap_info -> frontswap_init) are registered and remembered
+ * (via the setting of need_init bitmap) but fail to create tmem_pools. When a
+ * backend registers with frontswap at some later point the previous
+ * calls to frontswap_init are executed (by iterating over the need_init
+ * bitmap) to create tmem_pools and set the respective poolids. All of that is
+ * guarded by us using atomic bit operations on the 'need_init' bitmap.
+ *
+ * This would not guards us against the user deciding to call swapoff right as
+ * we are calling the backend to initialize (so swapon is in action).
+ * Fortunatly for us, the swapon_mutex has been taked by the callee so we are
+ * OK. The other scenario where calls to frontswap_store (called via
+ * swap_writepage) is racing with frontswap_invalidate_area (called via
+ * swapoff) is again guarded by the swap subsystem.
+ *
+ * While no backend is registered all calls to frontswap_[store|load|
+ * invalidate_area|invalidate_page] are ignored or fail.
+ *
+ * The time between the backend being registered and the swap file system
+ * calling the backend (via the frontswap_* functions) is indeterminate as
+ * backend_registered is not atomic_t (or a value guarded by a spinlock).
+ * That is OK as we are comfortable missing some of these calls to the newly
+ * registered backend.
+ *
+ * Obviously the opposite (unloading the backend) must be done after all
+ * the frontswap_[store|load|invalidate_area|invalidate_page] start
+ * ignorning or failing the requests - at which point backend_registered
+ * would have to be made in some fashion atomic.
+ */
+static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
+static bool backend_registered __read_mostly;
+
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
@@ -87,9 +127,22 @@ static inline void inc_frontswap_invalidates(void) { }
 struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
 {
 	struct frontswap_ops old = frontswap_ops;
+	int i;
 
 	frontswap_ops = *ops;
 	frontswap_enabled = true;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		if (test_and_clear_bit(i, need_init))
+			(*frontswap_ops.init)(i);
+	}
+	/*
+	 * We MUST have backend_registered set _after_ the frontswap_init's
+	 * have been called. Otherwise __frontswap_store might fail. Hence
+	 * the barrier to make sure compiler does not re-order us.
+	 */
+	barrier();
+	backend_registered = true;
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -119,10 +172,17 @@ void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.init(type);
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.init)(type);
+	}
+	else {
+		BUG_ON(type > MAX_SWAPFILES);
+		set_bit(type, need_init);
+	}
+
 }
 EXPORT_SYMBOL(__frontswap_init);
 
@@ -147,6 +207,11 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
+	if (!backend_registered) {
+		inc_frontswap_failed_stores();
+		return ret;
+	}
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -186,6 +251,9 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
+	if (!backend_registered)
+		return ret;
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -209,6 +277,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
+	if (!backend_registered)
+		return;
+
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
 		frontswap_ops.invalidate_page(type, offset);
@@ -226,12 +297,15 @@ void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.invalidate_area(type);
-	atomic_set(&sis->frontswap_pages, 0);
-	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.invalidate_area)(type);
+		atomic_set(&sis->frontswap_pages, 0);
+		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	}
+	clear_bit(type, need_init);
 }
 EXPORT_SYMBOL(__frontswap_invalidate_area);
 
@@ -364,6 +438,7 @@ static int __init init_frontswap(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &frontswap_invalidates);
 #endif
+	frontswap_enabled = 1;
 	return 0;
 }
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 03/15] frontswap: Make frontswap_init use a pointer for the ops.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 02/15] mm: frontswap: " Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 04/15] cleancache: Make cleancache_init " Konrad Rzeszutek Wilk
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

This simplifies the code in the frontswap - we can get rid
of the 'backend_registered' test and instead check against
frontswap_ops.

[v1: Rebase on top of 703ba7fe5e085f2c85eeb451c2ac13cf275c7cb2
(ramster->zcache move]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/zcache/zcache-main.c |  8 ++++----
 drivers/xen/tmem.c                   |  6 +++---
 include/linux/frontswap.h            |  2 +-
 mm/frontswap.c                       | 38 +++++++++++++++++-------------------
 4 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index c1ac905..05bfa41 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1625,9 +1625,9 @@ static struct frontswap_ops zcache_frontswap_ops = {
 	.init = zcache_frontswap_init
 };
 
-struct frontswap_ops zcache_frontswap_register_ops(void)
+struct frontswap_ops *zcache_frontswap_register_ops(void)
 {
-	struct frontswap_ops old_ops =
+	struct frontswap_ops *old_ops =
 		frontswap_register_ops(&zcache_frontswap_ops);
 
 	return old_ops;
@@ -1794,7 +1794,7 @@ static int __init zcache_init(void)
 			pr_warn("%s: cleancache_ops overridden\n", namestr);
 	}
 	if (zcache_enabled && !disable_frontswap) {
-		struct frontswap_ops old_ops;
+		struct frontswap_ops *old_ops;
 
 		old_ops = zcache_frontswap_register_ops();
 		if (frontswap_has_exclusive_gets)
@@ -1806,7 +1806,7 @@ static int __init zcache_init(void)
 			namestr, frontswap_has_exclusive_gets,
 			!disable_frontswap_ignore_nonactive);
 #endif
-		if (old_ops.init != NULL)
+		if (old_ops != NULL)
 			pr_warn("%s: frontswap_ops overridden\n", namestr);
 	}
 	if (ramster_enabled)
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 144564e..4b02c07 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -362,7 +362,7 @@ static int __init no_frontswap(char *s)
 }
 __setup("nofrontswap", no_frontswap);
 
-static struct frontswap_ops __initdata tmem_frontswap_ops = {
+static struct frontswap_ops tmem_frontswap_ops = {
 	.store = tmem_frontswap_store,
 	.load = tmem_frontswap_load,
 	.invalidate_page = tmem_frontswap_flush_page,
@@ -378,11 +378,11 @@ static int __init xen_tmem_init(void)
 #ifdef CONFIG_FRONTSWAP
 	if (tmem_enabled && use_frontswap) {
 		char *s = "";
-		struct frontswap_ops old_ops =
+		struct frontswap_ops *old_ops =
 			frontswap_register_ops(&tmem_frontswap_ops);
 
 		tmem_frontswap_poolid = -1;
-		if (old_ops.init != NULL)
+		if (old_ops)
 			s = " (WARNING: frontswap_ops overridden)";
 		printk(KERN_INFO "frontswap enabled, RAM provided by "
 				 "Xen Transcendent Memory\n");
diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 3044254..d4f2987 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -14,7 +14,7 @@ struct frontswap_ops {
 };
 
 extern bool frontswap_enabled;
-extern struct frontswap_ops
+extern struct frontswap_ops *
 	frontswap_register_ops(struct frontswap_ops *ops);
 extern void frontswap_shrink(unsigned long);
 extern unsigned long frontswap_curr_pages(void);
diff --git a/mm/frontswap.c b/mm/frontswap.c
index c05a9db..b9b23b1 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -24,7 +24,7 @@
  * frontswap_ops is set by frontswap_register_ops to contain the pointers
  * to the frontswap "backend" implementation functions.
  */
-static struct frontswap_ops frontswap_ops __read_mostly;
+static struct frontswap_ops *frontswap_ops __read_mostly;
 
 /*
  * This global enablement flag reduces overhead on systems where frontswap_ops
@@ -108,41 +108,39 @@ static inline void inc_frontswap_invalidates(void) { }
  *
  * The time between the backend being registered and the swap file system
  * calling the backend (via the frontswap_* functions) is indeterminate as
- * backend_registered is not atomic_t (or a value guarded by a spinlock).
+ * frontswap_ops is not atomic_t (or a value guarded by a spinlock).
  * That is OK as we are comfortable missing some of these calls to the newly
  * registered backend.
  *
  * Obviously the opposite (unloading the backend) must be done after all
  * the frontswap_[store|load|invalidate_area|invalidate_page] start
- * ignorning or failing the requests - at which point backend_registered
+ * ignorning or failing the requests - at which point frontswap_ops
  * would have to be made in some fashion atomic.
  */
 static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
-static bool backend_registered __read_mostly;
 
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
  */
-struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
+struct frontswap_ops *frontswap_register_ops(struct frontswap_ops *ops)
 {
-	struct frontswap_ops old = frontswap_ops;
+	struct frontswap_ops *old = frontswap_ops;
 	int i;
 
-	frontswap_ops = *ops;
 	frontswap_enabled = true;
 
 	for (i = 0; i < MAX_SWAPFILES; i++) {
 		if (test_and_clear_bit(i, need_init))
-			(*frontswap_ops.init)(i);
+			ops->init(i);
 	}
 	/*
-	 * We MUST have backend_registered set _after_ the frontswap_init's
+	 * We MUST have frontswap_ops set _after_ the frontswap_init's
 	 * have been called. Otherwise __frontswap_store might fail. Hence
 	 * the barrier to make sure compiler does not re-order us.
 	 */
 	barrier();
-	backend_registered = true;
+	frontswap_ops = ops;
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -172,11 +170,11 @@ void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (backend_registered) {
+	if (frontswap_ops) {
 		BUG_ON(sis == NULL);
 		if (sis->frontswap_map == NULL)
 			return;
-		(*frontswap_ops.init)(type);
+		frontswap_ops->init(type);
 	}
 	else {
 		BUG_ON(type > MAX_SWAPFILES);
@@ -207,7 +205,7 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
-	if (!backend_registered) {
+	if (!frontswap_ops) {
 		inc_frontswap_failed_stores();
 		return ret;
 	}
@@ -216,7 +214,7 @@ int __frontswap_store(struct page *page)
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
 		dup = 1;
-	ret = frontswap_ops.store(type, offset, page);
+	ret = frontswap_ops->store(type, offset, page);
 	if (ret == 0) {
 		frontswap_set(sis, offset);
 		inc_frontswap_succ_stores();
@@ -251,13 +249,13 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
-	if (!backend_registered)
+	if (!frontswap_ops)
 		return ret;
 
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
-		ret = frontswap_ops.load(type, offset, page);
+		ret = frontswap_ops->load(type, offset, page);
 	if (ret == 0) {
 		inc_frontswap_loads();
 		if (frontswap_tmem_exclusive_gets_enabled) {
@@ -277,12 +275,12 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (!backend_registered)
+	if (!frontswap_ops)
 		return;
 
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
-		frontswap_ops.invalidate_page(type, offset);
+		frontswap_ops->invalidate_page(type, offset);
 		__frontswap_clear(sis, offset);
 		inc_frontswap_invalidates();
 	}
@@ -297,11 +295,11 @@ void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (backend_registered) {
+	if (frontswap_ops) {
 		BUG_ON(sis == NULL);
 		if (sis->frontswap_map == NULL)
 			return;
-		(*frontswap_ops.invalidate_area)(type);
+		frontswap_ops->invalidate_area(type);
 		atomic_set(&sis->frontswap_pages, 0);
 		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
 	}
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 04/15] cleancache: Make cleancache_init use a pointer for the ops
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (2 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 03/15] frontswap: Make frontswap_init use a pointer for the ops Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 05/15] staging: zcache: enable ramster to be built/loaded as a module Konrad Rzeszutek Wilk
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

Instead of using a backend_registered to determine whether
a backend is enabled. This allows us to remove the
backend_register check and just do 'if (cleancache_ops)'

[v1: Rebase on top of b97c4b430b0a405a57c78607b520d8000329e259
(ramster->zcache move]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/zcache/zcache-main.c |  8 ++---
 drivers/xen/tmem.c                   |  6 ++--
 include/linux/cleancache.h           |  2 +-
 mm/cleancache.c                      | 62 +++++++++++++++++++-----------------
 4 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 05bfa41..b5a9dd0 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1494,9 +1494,9 @@ static struct cleancache_ops zcache_cleancache_ops = {
 	.init_fs = zcache_cleancache_init_fs
 };
 
-struct cleancache_ops zcache_cleancache_register_ops(void)
+struct cleancache_ops *zcache_cleancache_register_ops(void)
 {
-	struct cleancache_ops old_ops =
+	struct cleancache_ops *old_ops =
 		cleancache_register_ops(&zcache_cleancache_ops);
 
 	return old_ops;
@@ -1780,7 +1780,7 @@ static int __init zcache_init(void)
 	}
 	zbud_init();
 	if (zcache_enabled && !disable_cleancache) {
-		struct cleancache_ops old_ops;
+		struct cleancache_ops *old_ops;
 
 		register_shrinker(&zcache_shrinker);
 		old_ops = zcache_cleancache_register_ops();
@@ -1790,7 +1790,7 @@ static int __init zcache_init(void)
 		pr_info("%s: cleancache: ignorenonactive = %d\n",
 			namestr, !disable_cleancache_ignore_nonactive);
 #endif
-		if (old_ops.init_fs != NULL)
+		if (old_ops != NULL)
 			pr_warn("%s: cleancache_ops overridden\n", namestr);
 	}
 	if (zcache_enabled && !disable_frontswap) {
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 4b02c07..15e776c 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -236,7 +236,7 @@ static int __init no_cleancache(char *s)
 }
 __setup("nocleancache", no_cleancache);
 
-static struct cleancache_ops __initdata tmem_cleancache_ops = {
+static struct cleancache_ops tmem_cleancache_ops = {
 	.put_page = tmem_cleancache_put_page,
 	.get_page = tmem_cleancache_get_page,
 	.invalidate_page = tmem_cleancache_flush_page,
@@ -392,9 +392,9 @@ static int __init xen_tmem_init(void)
 	BUG_ON(sizeof(struct cleancache_filekey) != sizeof(struct tmem_oid));
 	if (tmem_enabled && use_cleancache) {
 		char *s = "";
-		struct cleancache_ops old_ops =
+		struct cleancache_ops *old_ops =
 			cleancache_register_ops(&tmem_cleancache_ops);
-		if (old_ops.init_fs != NULL)
+		if (old_ops)
 			s = " (WARNING: cleancache_ops overridden)";
 		printk(KERN_INFO "cleancache enabled, RAM provided by "
 				 "Xen Transcendent Memory%s\n", s);
diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index 42e55de..3af5ea8 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -33,7 +33,7 @@ struct cleancache_ops {
 	void (*invalidate_fs)(int);
 };
 
-extern struct cleancache_ops
+extern struct cleancache_ops *
 	cleancache_register_ops(struct cleancache_ops *ops);
 extern void __cleancache_init_fs(struct super_block *);
 extern void __cleancache_init_shared_fs(char *, struct super_block *);
diff --git a/mm/cleancache.c b/mm/cleancache.c
index e4dc314..5d8dbb9 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -32,7 +32,7 @@ EXPORT_SYMBOL(cleancache_enabled);
  * cleancache_ops is set by cleancache_ops_register to contain the pointers
  * to the cleancache "backend" implementation functions.
  */
-static struct cleancache_ops cleancache_ops __read_mostly;
+static struct cleancache_ops *cleancache_ops __read_mostly;
 
 /*
  * Counters available via /sys/kernel/debug/frontswap (if debugfs is
@@ -72,15 +72,14 @@ static DEFINE_MUTEX(poolid_mutex);
 /*
  * When set to false (default) all calls to the cleancache functions, except
  * the __cleancache_invalidate_fs and __cleancache_init_[shared|]fs are guarded
- * by the if (!backend_registered) return. This means multiple threads (from
- * different filesystems) will be checking backend_registered. The usage of a
+ * by the if (!cleancache_ops) return. This means multiple threads (from
+ * different filesystems) will be checking cleancache_ops. The usage of a
  * bool instead of a atomic_t or a bool guarded by a spinlock is OK - we are
  * OK if the time between the backend's have been initialized (and
- * backend_registered has been set to true) and when the filesystems start
+ * cleancache_ops has been set to not NULL) and when the filesystems start
  * actually calling the backends. The inverse (when unloading) is obviously
  * not good - but this shim does not do that (yet).
  */
-static bool backend_registered __read_mostly;
 
 /*
  * The backends and filesystems work all asynchronously. This is b/c the
@@ -90,13 +89,13 @@ static bool backend_registered __read_mostly;
  * 		[shared_|]fs_poolid_map and uuids for.
  *
  * 	b). user does I/Os -> we call the rest of __cleancache_* functions
- * 		which return immediately as backend_registered is false.
+ * 		which return immediately as cleancache_ops is NULL.
  *
  * 	c). modprobe zcache -> cleancache_register_ops. We init the backend
- * 		and set backend_registered to true, and for any fs_poolid_map
+ * 		and set cleancache_ops to the backend, and for any fs_poolid_map
  * 		(which is set by __cleancache_init_fs) we initialize the poolid.
  *
- * 	d). user does I/Os -> now that backend_registered is true all the
+ * 	d). user does I/Os -> now that clean_ops is not NULL all the
  * 		__cleancache_* functions can call the backend. They all check
  * 		that fs_poolid_map is valid and if so invoke the backend.
  *
@@ -120,23 +119,26 @@ static bool backend_registered __read_mostly;
  * Register operations for cleancache, returning previous thus allowing
  * detection of multiple backends and possible nesting.
  */
-struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
+struct cleancache_ops *cleancache_register_ops(struct cleancache_ops *ops)
 {
-	struct cleancache_ops old = cleancache_ops;
+	struct cleancache_ops *old = cleancache_ops;
 	int i;
 
 	mutex_lock(&poolid_mutex);
-	cleancache_ops = *ops;
-
-	backend_registered = true;
 	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
 		if (fs_poolid_map[i] == FS_NO_BACKEND)
-			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			fs_poolid_map[i] = ops->init_fs(PAGE_SIZE);
 		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
-			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+			shared_fs_poolid_map[i] = ops->init_shared_fs
 					(uuids[i], PAGE_SIZE);
 	}
-out:
+	/*
+	 * We MUST set cleancache_ops _after_ we have called the backends
+	 * init_fs or init_shared_fs functions. Otherwise the compiler might
+	 * re-order where cleancache_ops is set in this function.
+	 */
+	barrier();
+	cleancache_ops = ops;
 	mutex_unlock(&poolid_mutex);
 	return old;
 }
@@ -151,8 +153,8 @@ void __cleancache_init_fs(struct super_block *sb)
 	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
 		if (fs_poolid_map[i] == FS_UNKNOWN) {
 			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
-			if (backend_registered)
-				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			if (cleancache_ops)
+				fs_poolid_map[i] = cleancache_ops->init_fs(PAGE_SIZE);
 			else
 				fs_poolid_map[i] = FS_NO_BACKEND;
 			break;
@@ -172,8 +174,8 @@ void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
 		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
 			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
 			uuids[i] = uuid;
-			if (backend_registered)
-				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+			if (cleancache_ops)
+				shared_fs_poolid_map[i] = cleancache_ops->init_shared_fs
 						(uuid, PAGE_SIZE);
 			else
 				shared_fs_poolid_map[i] = FS_NO_BACKEND;
@@ -240,7 +242,7 @@ int __cleancache_get_page(struct page *page)
 	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered) {
+	if (!cleancache_ops) {
 		cleancache_failed_gets++;
 		goto out;
 	}
@@ -255,7 +257,7 @@ int __cleancache_get_page(struct page *page)
 		goto out;
 
 	if (pool_id >= 0)
-		ret = (*cleancache_ops.get_page)(pool_id,
+		ret = cleancache_ops->get_page(pool_id,
 				key, page->index, page);
 	if (ret == 0)
 		cleancache_succ_gets++;
@@ -282,7 +284,7 @@ void __cleancache_put_page(struct page *page)
 	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered) {
+	if (!cleancache_ops) {
 		cleancache_puts++;
 		return;
 	}
@@ -296,7 +298,7 @@ void __cleancache_put_page(struct page *page)
 
 	if (pool_id >= 0 &&
 		cleancache_get_key(page->mapping->host, &key) >= 0) {
-		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
+		cleancache_ops->put_page(pool_id, key, page->index, page);
 		cleancache_puts++;
 	}
 }
@@ -318,7 +320,7 @@ void __cleancache_invalidate_page(struct address_space *mapping,
 	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered)
+	if (!cleancache_ops)
 		return;
 
 	if (fake_pool_id >= 0) {
@@ -328,7 +330,7 @@ void __cleancache_invalidate_page(struct address_space *mapping,
 
 		VM_BUG_ON(!PageLocked(page));
 		if (cleancache_get_key(mapping->host, &key) >= 0) {
-			(*cleancache_ops.invalidate_page)(pool_id,
+			cleancache_ops->invalidate_page(pool_id,
 					key, page->index);
 			cleancache_invalidates++;
 		}
@@ -351,7 +353,7 @@ void __cleancache_invalidate_inode(struct address_space *mapping)
 	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered)
+	if (!cleancache_ops)
 		return;
 
 	if (fake_pool_id < 0)
@@ -360,7 +362,7 @@ void __cleancache_invalidate_inode(struct address_space *mapping)
 	pool_id = get_poolid_from_fake(fake_pool_id);
 
 	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
-		(*cleancache_ops.invalidate_inode)(pool_id, key);
+		cleancache_ops->invalidate_inode(pool_id, key);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_inode);
 
@@ -387,8 +389,8 @@ void __cleancache_invalidate_fs(struct super_block *sb)
 		fs_poolid_map[index] = FS_UNKNOWN;
 	}
 	sb->cleancache_poolid = -1;
-	if (backend_registered)
-		(*cleancache_ops.invalidate_fs)(old_poolid);
+	if (cleancache_ops)
+		cleancache_ops->invalidate_fs(old_poolid);
 	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_fs);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 05/15] staging: zcache: enable ramster to be built/loaded as a module
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (3 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 04/15] cleancache: Make cleancache_init " Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 06/15] staging: zcache: enable zcache " Konrad Rzeszutek Wilk
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

From: Dan Magenheimer <dan.magenheimer@oracle.com>

Enable module support for ramster.  Note runtime dependency disallows
loading if cleancache/frontswap lazy initialization patches are not
present.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Fixed compile issues since ramster_init now has four arguments]
[v2: Fixed rebase on ramster->zcache move]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/zcache/ramster.h                   |  6 ++++-
 drivers/staging/zcache/ramster/nodemanager.c       |  9 ++++---
 drivers/staging/zcache/ramster/ramster.c           | 29 ++++++++++++++++++----
 drivers/staging/zcache/ramster/ramster.h           |  2 +-
 .../staging/zcache/ramster/ramster_nodemanager.h   |  2 ++
 drivers/staging/zcache/zcache-main.c               |  2 +-
 6 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/zcache/ramster.h b/drivers/staging/zcache/ramster.h
index 1b71aea..e1f91d5 100644
--- a/drivers/staging/zcache/ramster.h
+++ b/drivers/staging/zcache/ramster.h
@@ -11,10 +11,14 @@
 #ifndef _ZCACHE_RAMSTER_H_
 #define _ZCACHE_RAMSTER_H_
 
+#ifdef CONFIG_RAMSTER_MODULE
+#define CONFIG_RAMSTER
+#endif
+
 #ifdef CONFIG_RAMSTER
 #include "ramster/ramster.h"
 #else
-static inline void ramster_init(bool x, bool y, bool z)
+static inline void ramster_init(bool x, bool y, bool z, bool w)
 {
 }
 
diff --git a/drivers/staging/zcache/ramster/nodemanager.c b/drivers/staging/zcache/ramster/nodemanager.c
index c0f4815..2cfe933 100644
--- a/drivers/staging/zcache/ramster/nodemanager.c
+++ b/drivers/staging/zcache/ramster/nodemanager.c
@@ -949,7 +949,7 @@ static void __exit exit_r2nm(void)
 	r2hb_exit();
 }
 
-static int __init init_r2nm(void)
+int r2nm_init(void)
 {
 	int ret = -1;
 
@@ -986,10 +986,11 @@ out_r2hb:
 out:
 	return ret;
 }
+EXPORT_SYMBOL_GPL(r2nm_init);
 
 MODULE_AUTHOR("Oracle");
 MODULE_LICENSE("GPL");
 
-/* module_init(init_r2nm) */
-late_initcall(init_r2nm);
-/* module_exit(exit_r2nm) */
+#ifndef CONFIG_RAMSTER_MODULE
+late_initcall(r2nm_init);
+#endif
diff --git a/drivers/staging/zcache/ramster/ramster.c b/drivers/staging/zcache/ramster/ramster.c
index c06709f..491ec70 100644
--- a/drivers/staging/zcache/ramster/ramster.c
+++ b/drivers/staging/zcache/ramster/ramster.c
@@ -92,7 +92,7 @@ static unsigned long ramster_remote_page_flushes_failed;
 #include <linux/debugfs.h>
 #define	zdfs	debugfs_create_size_t
 #define	zdfs64	debugfs_create_u64
-static int __init ramster_debugfs_init(void)
+static int ramster_debugfs_init(void)
 {
 	struct dentry *root = debugfs_create_dir("ramster", NULL);
 	if (root == NULL)
@@ -191,6 +191,7 @@ int ramster_do_preload_flnode(struct tmem_pool *pool)
 		kmem_cache_free(ramster_flnode_cache, flnode);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(ramster_do_preload_flnode);
 
 /*
  * Called by the message handler after a (still compressed) page has been
@@ -458,6 +459,7 @@ void *ramster_pampd_free(void *pampd, struct tmem_pool *pool,
 	}
 	return local_pampd;
 }
+EXPORT_SYMBOL_GPL(ramster_pampd_free);
 
 void ramster_count_foreign_pages(bool eph, int count)
 {
@@ -489,6 +491,7 @@ void ramster_count_foreign_pages(bool eph, int count)
 		ramster_foreign_pers_pages = c;
 	}
 }
+EXPORT_SYMBOL_GPL(ramster_count_foreign_pages);
 
 /*
  * For now, just push over a few pages every few seconds to
@@ -674,7 +677,7 @@ requeue:
 	ramster_remotify_queue_delayed_work(HZ);
 }
 
-void __init ramster_remotify_init(void)
+void ramster_remotify_init(void)
 {
 	unsigned long n = 60UL;
 	ramster_remotify_workqueue =
@@ -849,8 +852,10 @@ static bool frontswap_selfshrinking __read_mostly;
 static void selfshrink_process(struct work_struct *work);
 static DECLARE_DELAYED_WORK(selfshrink_worker, selfshrink_process);
 
+#ifndef CONFIG_RAMSTER_MODULE
 /* Enable/disable with kernel boot option. */
 static bool use_frontswap_selfshrink __initdata = true;
+#endif
 
 /*
  * The default values for the following parameters were deemed reasonable
@@ -905,6 +910,7 @@ static void frontswap_selfshrink(void)
 	frontswap_shrink(tgt_frontswap_pages);
 }
 
+#ifndef CONFIG_RAMSTER_MODULE
 static int __init ramster_nofrontswap_selfshrink_setup(char *s)
 {
 	use_frontswap_selfshrink = false;
@@ -912,6 +918,7 @@ static int __init ramster_nofrontswap_selfshrink_setup(char *s)
 }
 
 __setup("noselfshrink", ramster_nofrontswap_selfshrink_setup);
+#endif
 
 static void selfshrink_process(struct work_struct *work)
 {
@@ -930,6 +937,7 @@ void ramster_cpu_up(int cpu)
 	per_cpu(ramster_remoteputmem1, cpu) = p1;
 	per_cpu(ramster_remoteputmem2, cpu) = p2;
 }
+EXPORT_SYMBOL_GPL(ramster_cpu_up);
 
 void ramster_cpu_down(int cpu)
 {
@@ -945,6 +953,7 @@ void ramster_cpu_down(int cpu)
 		kp->flnode = NULL;
 	}
 }
+EXPORT_SYMBOL_GPL(ramster_cpu_down);
 
 void ramster_register_pamops(struct tmem_pamops *pamops)
 {
@@ -955,9 +964,11 @@ void ramster_register_pamops(struct tmem_pamops *pamops)
 	pamops->repatriate = ramster_pampd_repatriate;
 	pamops->repatriate_preload = ramster_pampd_repatriate_preload;
 }
+EXPORT_SYMBOL_GPL(ramster_register_pamops);
 
-void __init ramster_init(bool cleancache, bool frontswap,
-				bool frontswap_exclusive_gets)
+void ramster_init(bool cleancache, bool frontswap,
+				bool frontswap_exclusive_gets,
+				bool frontswap_selfshrink)
 {
 	int ret = 0;
 
@@ -972,10 +983,17 @@ void __init ramster_init(bool cleancache, bool frontswap,
 	if (ret)
 		pr_err("ramster: can't create sysfs for ramster\n");
 	(void)r2net_register_handlers();
+#ifdef CONFIG_RAMSTER_MODULE
+	ret = r2nm_init();
+	if (ret)
+		pr_err("ramster: can't init r2net\n");
+	frontswap_selfshrinking = frontswap_selfshrink;
+#else
+	frontswap_selfshrinking = use_frontswap_selfshrink;
+#endif
 	INIT_LIST_HEAD(&ramster_rem_op_list);
 	ramster_flnode_cache = kmem_cache_create("ramster_flnode",
 				sizeof(struct flushlist_node), 0, 0, NULL);
-	frontswap_selfshrinking = use_frontswap_selfshrink;
 	if (frontswap_selfshrinking) {
 		pr_info("ramster: Initializing frontswap selfshrink driver.\n");
 		schedule_delayed_work(&selfshrink_worker,
@@ -983,3 +1001,4 @@ void __init ramster_init(bool cleancache, bool frontswap,
 	}
 	ramster_remotify_init();
 }
+EXPORT_SYMBOL_GPL(ramster_init);
diff --git a/drivers/staging/zcache/ramster/ramster.h b/drivers/staging/zcache/ramster/ramster.h
index 12ae56f..6d41a7a 100644
--- a/drivers/staging/zcache/ramster/ramster.h
+++ b/drivers/staging/zcache/ramster/ramster.h
@@ -147,7 +147,7 @@ extern int r2net_register_handlers(void);
 extern int r2net_remote_target_node_set(int);
 
 extern int ramster_remotify_pageframe(bool);
-extern void ramster_init(bool, bool, bool);
+extern void ramster_init(bool, bool, bool, bool);
 extern void ramster_register_pamops(struct tmem_pamops *);
 extern int ramster_localify(int, struct tmem_oid *oidp, uint32_t, char *,
 				unsigned int, void *);
diff --git a/drivers/staging/zcache/ramster/ramster_nodemanager.h b/drivers/staging/zcache/ramster/ramster_nodemanager.h
index 49f879d..dbaae34 100644
--- a/drivers/staging/zcache/ramster/ramster_nodemanager.h
+++ b/drivers/staging/zcache/ramster/ramster_nodemanager.h
@@ -36,4 +36,6 @@
 /* host name, group name, cluster name all 64 bytes */
 #define R2NM_MAX_NAME_LEN        64    /* __NEW_UTS_LEN */
 
+extern int r2nm_init(void);
+
 #endif /* _RAMSTER_NODEMANAGER_H */
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index b5a9dd0..bb59b78 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1811,7 +1811,7 @@ static int __init zcache_init(void)
 	}
 	if (ramster_enabled)
 		ramster_init(!disable_cleancache, !disable_frontswap,
-				frontswap_has_exclusive_gets);
+				frontswap_has_exclusive_gets, false);
 out:
 	return ret;
 }
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 06/15] staging: zcache: enable zcache to be built/loaded as a module
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (4 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 05/15] staging: zcache: enable ramster to be built/loaded as a module Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 07/15] xen: tmem: enable Xen tmem shim " Konrad Rzeszutek Wilk
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

Allow zcache to be built/loaded as a module.  Note runtime dependency
disallows loading if cleancache/frontswap lazy initialization patches
are not present.  Zsmalloc support has not yet been merged into zcache
but, once merged, could now easily be selected via a module_param.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Rebased with different order of patches]
[v2: Removed [CLEANCACHE|FRONTSWAP]_HAS_LAZY_INIT ifdef]
[v3: Rebased on top of ramster->zcache move]
[v4: Redid the Makefile]
[v5: s/ZCACHE2/ZCACHE/]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/zcache/Kconfig       |  6 ++---
 drivers/staging/zcache/Makefile      | 11 ++++-----
 drivers/staging/zcache/tmem.c        |  6 ++++-
 drivers/staging/zcache/tmem.h        |  8 +++----
 drivers/staging/zcache/zcache-main.c | 45 +++++++++++++++++++++++++++++++++---
 drivers/staging/zcache/zcache.h      |  2 +-
 6 files changed, 60 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/zcache/Kconfig b/drivers/staging/zcache/Kconfig
index c1dbd04..88e3c59 100644
--- a/drivers/staging/zcache/Kconfig
+++ b/drivers/staging/zcache/Kconfig
@@ -1,5 +1,5 @@
 config ZCACHE
-	bool "Dynamic compression of swap pages and clean pagecache pages"
+	tristate "Dynamic compression of swap pages and clean pagecache pages"
 	depends on CRYPTO=y && SWAP=y && CLEANCACHE && FRONTSWAP
 	select CRYPTO_LZO
 	default n
@@ -11,8 +11,8 @@ config ZCACHE
 	  providing a noticeable reduction in disk I/O.
 
 config RAMSTER
-	bool "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
-	depends on CONFIGFS_FS=y && SYSFS=y && !HIGHMEM && ZCACHE=y
+	tristate "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
+	depends on CONFIGFS_FS=y && SYSFS=y && !HIGHMEM && ZCACHE
 	depends on NET
 	# must ensure struct page is 8-byte aligned
 	select HAVE_ALIGNED_STRUCT_PAGE if !64_BIT
diff --git a/drivers/staging/zcache/Makefile b/drivers/staging/zcache/Makefile
index 4711049..98c4e32 100644
--- a/drivers/staging/zcache/Makefile
+++ b/drivers/staging/zcache/Makefile
@@ -1,6 +1,5 @@
-zcache-y	:=		zcache-main.o tmem.o zbud.o
-zcache-$(CONFIG_RAMSTER)	+=	ramster/ramster.o ramster/r2net.o
-zcache-$(CONFIG_RAMSTER)	+=	ramster/nodemanager.o ramster/tcp.o
-zcache-$(CONFIG_RAMSTER)	+=	ramster/heartbeat.o ramster/masklog.o
-
-obj-$(CONFIG_ZCACHE)	+=	zcache.o
+obj-$(CONFIG_ZCACHE)	+= zcache.o
+zcache-y		:=	zcache-main.o tmem.o zbud.o
+zcache-$(CONFIG_RAMSTER)+=	ramster/ramster.o ramster/r2net.o
+zcache-$(CONFIG_RAMSTER)+=	ramster/nodemanager.o ramster/tcp.o
+zcache-$(CONFIG_RAMSTER)+=	ramster/heartbeat.o ramster/masklog.o
diff --git a/drivers/staging/zcache/tmem.c b/drivers/staging/zcache/tmem.c
index a2b7e03..d7e51e4 100644
--- a/drivers/staging/zcache/tmem.c
+++ b/drivers/staging/zcache/tmem.c
@@ -35,7 +35,8 @@
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/atomic.h>
-#ifdef CONFIG_RAMSTER
+#include <linux/export.h>
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 #include <linux/delay.h>
 #endif
 
@@ -641,6 +642,7 @@ void *tmem_localify_get_pampd(struct tmem_pool *pool, struct tmem_oid *oidp,
 	/* note, hashbucket remains locked */
 	return pampd;
 }
+EXPORT_SYMBOL_GPL(tmem_localify_get_pampd);
 
 void tmem_localify_finish(struct tmem_obj *obj, uint32_t index,
 			  void *pampd, void *saved_hb, bool delete)
@@ -658,6 +660,7 @@ void tmem_localify_finish(struct tmem_obj *obj, uint32_t index,
 	}
 	spin_unlock(&hb->lock);
 }
+EXPORT_SYMBOL_GPL(tmem_localify_finish);
 
 /*
  * For ramster only.  Helper function to support asynchronous tmem_get.
@@ -719,6 +722,7 @@ out:
 	spin_unlock(&hb->lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(tmem_replace);
 #endif
 
 /*
diff --git a/drivers/staging/zcache/tmem.h b/drivers/staging/zcache/tmem.h
index adbe5a8..d128ce2 100644
--- a/drivers/staging/zcache/tmem.h
+++ b/drivers/staging/zcache/tmem.h
@@ -126,7 +126,7 @@ static inline unsigned tmem_oid_hash(struct tmem_oid *oidp)
 				TMEM_HASH_BUCKET_BITS);
 }
 
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 struct tmem_xhandle {
 	uint8_t client_id;
 	uint8_t xh_data_cksum;
@@ -171,7 +171,7 @@ struct tmem_obj {
 	unsigned int objnode_tree_height;
 	unsigned long objnode_count;
 	long pampd_count;
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 	/*
 	 * for current design of ramster, all pages belonging to
 	 * an object reside on the same remotenode and extra is
@@ -215,7 +215,7 @@ struct tmem_pamops {
 				uint32_t);
 	void (*free)(void *, struct tmem_pool *,
 				struct tmem_oid *, uint32_t, bool);
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 	void (*new_obj)(struct tmem_obj *);
 	void (*free_obj)(struct tmem_pool *, struct tmem_obj *, bool);
 	void *(*repatriate_preload)(void *, struct tmem_pool *,
@@ -247,7 +247,7 @@ extern int tmem_flush_page(struct tmem_pool *, struct tmem_oid *,
 extern int tmem_flush_object(struct tmem_pool *, struct tmem_oid *);
 extern int tmem_destroy_pool(struct tmem_pool *);
 extern void tmem_new_pool(struct tmem_pool *, uint32_t);
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 extern int tmem_replace(struct tmem_pool *, struct tmem_oid *, uint32_t index,
 			void *);
 extern void *tmem_localify_get_pampd(struct tmem_pool *, struct tmem_oid *,
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index bb59b78..288c841 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -31,8 +31,10 @@
 #include "ramster.h"
 #ifdef CONFIG_RAMSTER
 static int ramster_enabled;
+static int disable_frontswap_selfshrink;
 #else
 #define ramster_enabled 0
+#define disable_frontswap_selfshrink 0
 #endif
 
 #ifndef __PG_WAS_ACTIVE
@@ -68,8 +70,12 @@ static char *namestr __read_mostly = "zcache";
 MODULE_LICENSE("GPL");
 
 /* crypto API for zcache  */
+#ifdef CONFIG_ZCACHE_MODULE
+static char *zcache_comp_name = "lzo";
+#else
 #define ZCACHE_COMP_NAME_SZ CRYPTO_MAX_ALG_NAME
 static char zcache_comp_name[ZCACHE_COMP_NAME_SZ] __read_mostly;
+#endif
 static struct crypto_comp * __percpu *zcache_comp_pcpu_tfms __read_mostly;
 
 enum comp_op {
@@ -1639,6 +1645,7 @@ struct frontswap_ops *zcache_frontswap_register_ops(void)
  * OR NOTHING HAPPENS!
  */
 
+#ifndef CONFIG_ZCACHE_MODULE
 static int __init enable_zcache(char *s)
 {
 	zcache_enabled = 1;
@@ -1705,18 +1712,27 @@ static int __init enable_zcache_compressor(char *s)
 	return 1;
 }
 __setup("zcache=", enable_zcache_compressor);
+#endif
 
 
-static int __init zcache_comp_init(void)
+static int zcache_comp_init(void)
 {
 	int ret = 0;
 
 	/* check crypto algorithm */
+#ifdef CONFIG_ZCACHE_MODULE
+	ret = crypto_has_comp(zcache_comp_name, 0, 0);
+	if (!ret) {
+		ret = -1;
+		goto out;
+	}
+#else
 	if (*zcache_comp_name != '\0') {
 		ret = crypto_has_comp(zcache_comp_name, 0, 0);
 		if (!ret)
 			pr_info("zcache: %s not supported\n",
 					zcache_comp_name);
+		goto out;
 	}
 	if (!ret)
 		strcpy(zcache_comp_name, "lzo");
@@ -1725,6 +1741,7 @@ static int __init zcache_comp_init(void)
 		ret = 1;
 		goto out;
 	}
+#endif
 	pr_info("zcache: using %s compressor\n", zcache_comp_name);
 
 	/* alloc percpu transforms */
@@ -1736,10 +1753,13 @@ out:
 	return ret;
 }
 
-static int __init zcache_init(void)
+static int zcache_init(void)
 {
 	int ret = 0;
 
+#ifdef CONFIG_ZCACHE_MODULE
+	zcache_enabled = 1;
+#endif
 	if (ramster_enabled) {
 		namestr = "ramster";
 		ramster_register_pamops(&zcache_pamops);
@@ -1811,9 +1831,28 @@ static int __init zcache_init(void)
 	}
 	if (ramster_enabled)
 		ramster_init(!disable_cleancache, !disable_frontswap,
-				frontswap_has_exclusive_gets, false);
+				frontswap_has_exclusive_gets,
+				!disable_frontswap_selfshrink);
 out:
 	return ret;
 }
 
+#ifdef CONFIG_ZCACHE_MODULE
+#ifdef CONFIG_RAMSTER
+module_param(ramster_enabled, int, S_IRUGO);
+module_param(disable_frontswap_selfshrink, int, S_IRUGO);
+#endif
+module_param(disable_cleancache, int, S_IRUGO);
+module_param(disable_frontswap, int, S_IRUGO);
+#ifdef FRONTSWAP_HAS_EXCLUSIVE_GETS
+module_param(frontswap_has_exclusive_gets, bool, S_IRUGO);
+#endif
+module_param(disable_frontswap_ignore_nonactive, int, S_IRUGO);
+module_param(zcache_comp_name, charp, S_IRUGO);
+module_init(zcache_init);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Magenheimer <dan.magenheimer@oracle.com>");
+MODULE_DESCRIPTION("In-kernel compression of cleancache/frontswap pages");
+#else
 late_initcall(zcache_init);
+#endif
diff --git a/drivers/staging/zcache/zcache.h b/drivers/staging/zcache/zcache.h
index 81722b3..8491200 100644
--- a/drivers/staging/zcache/zcache.h
+++ b/drivers/staging/zcache/zcache.h
@@ -39,7 +39,7 @@ extern int zcache_flush_page(int, int, struct tmem_oid *, uint32_t);
 extern int zcache_flush_object(int, int, struct tmem_oid *);
 extern void zcache_decompress_to_page(char *, unsigned int, struct page *);
 
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 extern void *zcache_pampd_create(char *, unsigned int, bool, int,
 				struct tmem_handle *);
 int zcache_autocreate_pool(unsigned int cli_id, unsigned int pool_id, bool eph);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 07/15] xen: tmem: enable Xen tmem shim to be built/loaded as a module
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (5 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 06/15] staging: zcache: enable zcache " Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 08/15] xen/tmem: Remove the subsys call Konrad Rzeszutek Wilk
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

From: Dan Magenheimer <dan.magenheimer@oracle.com>

Allow Xen tmem shim to be built/loaded as a module.  Xen self-ballooning
and frontswap-selfshrinking are now also "lazily" initialized when the
Xen tmem shim is loaded as a module, unless explicitly disabled
by module parameters.

Note runtime dependency disallows loading if cleancache/frontswap lazy
initialization patches are not present.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Removed the [CLEANCACHE|FRONTSWAP]_HAS_LAZY_INIT ifdef]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/Kconfig           |  4 ++--
 drivers/xen/tmem.c            | 38 +++++++++++++++++++++++++++++---------
 drivers/xen/xen-selfballoon.c | 13 +++++++------
 include/xen/tmem.h            |  8 ++++++++
 4 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index cabfa97..981646e 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -145,9 +145,9 @@ config SWIOTLB_XEN
 	select SWIOTLB
 
 config XEN_TMEM
-	bool
+	tristate
 	depends on !ARM
-	default y if (CLEANCACHE || FRONTSWAP)
+	default m if (CLEANCACHE || FRONTSWAP)
 	help
 	  Shim to interface in-kernel Transcendent Memory hooks
 	  (e.g. cleancache and frontswap) to Xen tmem hypercalls.
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 15e776c..9a4a9ec 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -5,6 +5,7 @@
  * Author: Dan Magenheimer
  */
 
+#include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/init.h>
@@ -128,6 +129,7 @@ static int xen_tmem_flush_object(u32 pool_id, struct tmem_oid oid)
 	return xen_tmem_op(TMEM_FLUSH_OBJECT, pool_id, oid, 0, 0, 0, 0, 0);
 }
 
+#ifndef CONFIG_XEN_TMEM_MODULE
 bool __read_mostly tmem_enabled = false;
 
 static int __init enable_tmem(char *s)
@@ -136,6 +138,7 @@ static int __init enable_tmem(char *s)
 	return 1;
 }
 __setup("tmem", enable_tmem);
+#endif
 
 #ifdef CONFIG_CLEANCACHE
 static int xen_tmem_destroy_pool(u32 pool_id)
@@ -227,14 +230,19 @@ static int tmem_cleancache_init_shared_fs(char *uuid, size_t pagesize)
 	return xen_tmem_new_pool(shared_uuid, TMEM_POOL_SHARED, pagesize);
 }
 
-static bool __initdata use_cleancache = true;
-
+static bool disable_cleancache __read_mostly;
+static bool disable_selfballooning __read_mostly;
+#ifdef CONFIG_XEN_TMEM_MODULE
+module_param(disable_cleancache, bool, S_IRUGO);
+module_param(disable_selfballooning, bool, S_IRUGO);
+#else
 static int __init no_cleancache(char *s)
 {
-	use_cleancache = false;
+	disable_cleancache = true;
 	return 1;
 }
 __setup("nocleancache", no_cleancache);
+#endif
 
 static struct cleancache_ops tmem_cleancache_ops = {
 	.put_page = tmem_cleancache_put_page,
@@ -353,14 +361,19 @@ static void tmem_frontswap_init(unsigned ignored)
 		    xen_tmem_new_pool(private, TMEM_POOL_PERSIST, PAGE_SIZE);
 }
 
-static bool __initdata use_frontswap = true;
-
+static bool disable_frontswap __read_mostly;
+static bool disable_frontswap_selfshrinking __read_mostly;
+#ifdef CONFIG_XEN_TMEM_MODULE
+module_param(disable_frontswap, bool, S_IRUGO);
+module_param(disable_frontswap_selfshrinking, bool, S_IRUGO);
+#else
 static int __init no_frontswap(char *s)
 {
-	use_frontswap = false;
+	disable_frontswap = true;
 	return 1;
 }
 __setup("nofrontswap", no_frontswap);
+#endif
 
 static struct frontswap_ops tmem_frontswap_ops = {
 	.store = tmem_frontswap_store,
@@ -371,12 +384,12 @@ static struct frontswap_ops tmem_frontswap_ops = {
 };
 #endif
 
-static int __init xen_tmem_init(void)
+static int xen_tmem_init(void)
 {
 	if (!xen_domain())
 		return 0;
 #ifdef CONFIG_FRONTSWAP
-	if (tmem_enabled && use_frontswap) {
+	if (tmem_enabled && !disable_frontswap) {
 		char *s = "";
 		struct frontswap_ops *old_ops =
 			frontswap_register_ops(&tmem_frontswap_ops);
@@ -390,7 +403,7 @@ static int __init xen_tmem_init(void)
 #endif
 #ifdef CONFIG_CLEANCACHE
 	BUG_ON(sizeof(struct cleancache_filekey) != sizeof(struct tmem_oid));
-	if (tmem_enabled && use_cleancache) {
+	if (tmem_enabled && !disable_cleancache) {
 		char *s = "";
 		struct cleancache_ops *old_ops =
 			cleancache_register_ops(&tmem_cleancache_ops);
@@ -400,7 +413,14 @@ static int __init xen_tmem_init(void)
 				 "Xen Transcendent Memory%s\n", s);
 	}
 #endif
+#ifdef CONFIG_XEN_SELFBALLOONING
+	xen_selfballoon_init(!disable_selfballooning,
+				!disable_frontswap_selfshrinking);
+#endif
 	return 0;
 }
 
 module_init(xen_tmem_init)
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Magenheimer <dan.magenheimer@oracle.com>");
+MODULE_DESCRIPTION("Shim to Xen transcendent memory");
diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 2552d3e..6965e9b 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -121,7 +121,7 @@ static DECLARE_DELAYED_WORK(selfballoon_worker, selfballoon_process);
 static bool frontswap_selfshrinking __read_mostly;
 
 /* Enable/disable with kernel boot option. */
-static bool use_frontswap_selfshrink __initdata = true;
+static bool use_frontswap_selfshrink = true;
 
 /*
  * The default values for the following parameters were deemed reasonable
@@ -185,7 +185,7 @@ static int __init xen_nofrontswap_selfshrink_setup(char *s)
 __setup("noselfshrink", xen_nofrontswap_selfshrink_setup);
 
 /* Disable with kernel boot option. */
-static bool use_selfballooning __initdata = true;
+static bool use_selfballooning = true;
 
 static int __init xen_noselfballooning_setup(char *s)
 {
@@ -196,7 +196,7 @@ static int __init xen_noselfballooning_setup(char *s)
 __setup("noselfballooning", xen_noselfballooning_setup);
 #else /* !CONFIG_FRONTSWAP */
 /* Enable with kernel boot option. */
-static bool use_selfballooning __initdata = false;
+static bool use_selfballooning;
 
 static int __init xen_selfballooning_setup(char *s)
 {
@@ -537,7 +537,7 @@ int register_xen_selfballooning(struct device *dev)
 }
 EXPORT_SYMBOL(register_xen_selfballooning);
 
-static int __init xen_selfballoon_init(void)
+int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
 {
 	bool enable = false;
 
@@ -571,7 +571,8 @@ static int __init xen_selfballoon_init(void)
 
 	return 0;
 }
+EXPORT_SYMBOL(xen_selfballoon_init);
 
+#ifndef CONFIG_XEN_TMEM_MODULE
 subsys_initcall(xen_selfballoon_init);
-
-MODULE_LICENSE("GPL");
+#endif
diff --git a/include/xen/tmem.h b/include/xen/tmem.h
index 591550a..3930a90 100644
--- a/include/xen/tmem.h
+++ b/include/xen/tmem.h
@@ -3,7 +3,15 @@
 
 #include <linux/types.h>
 
+#ifdef CONFIG_XEN_TMEM_MODULE
+#define tmem_enabled true
+#else
 /* defined in drivers/xen/tmem.c */
 extern bool tmem_enabled;
+#endif
+
+#ifdef CONFIG_XEN_SELFBALLOONING
+extern int xen_selfballoon_init(bool, bool);
+#endif
 
 #endif /* _XEN_TMEM_H */
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 08/15] xen/tmem: Remove the subsys call.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (6 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 07/15] xen: tmem: enable Xen tmem shim " Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 09/15] frontswap: Remove the check for frontswap_enabled Konrad Rzeszutek Wilk
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

We get:
drivers/xen/xen-selfballoon.c:577:134: warning: initialization from incompatible pointer type [enabled by default]

We actually do not need this function to be called
before tmem is loaded. So lets remove the subsys_init.

If tmem is built in as a module this is still OK as
xen_selfballoon_init ends up being exported and can
be called by the tmem module.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-selfballoon.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 6965e9b..f2ef569 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -572,7 +572,3 @@ int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
 	return 0;
 }
 EXPORT_SYMBOL(xen_selfballoon_init);
-
-#ifndef CONFIG_XEN_TMEM_MODULE
-subsys_initcall(xen_selfballoon_init);
-#endif
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 09/15] frontswap: Remove the check for frontswap_enabled.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (7 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 08/15] xen/tmem: Remove the subsys call Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:22 ` [PATCH 10/15] frontswap: Use static_key instead of frontswap_enabled and frontswap_ops Konrad Rzeszutek Wilk
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

With the support for loading of backends as modules (see for example:
"staging: zcache: enable zcache to be built/loaded as a module"), the
frontswap_enabled is always set to true ("mm: frontswap: lazy
initialization to allow tmem backends to build/run as modules").

The next patch "frontswap: Use static_key instead of frontswap_enabled and
frontswap_ops" is are going to convert the frontswap_enabled to be a bit more
selective and be on/off depending on whether the backend has registered - and
not whether the frontswap API is enabled.

The two functions: frontswap_init and frontswap_invalidate_area
can be called anytime - they queue up which of the swap devices are
active and can use the frontswap API - once the backend is loaded.

As such there is no need to check for 'frontswap_enabled' at all.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/frontswap.h | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index d4f2987..140323b 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -116,14 +116,12 @@ static inline void frontswap_invalidate_page(unsigned type, pgoff_t offset)
 
 static inline void frontswap_invalidate_area(unsigned type)
 {
-	if (frontswap_enabled)
-		__frontswap_invalidate_area(type);
+	__frontswap_invalidate_area(type);
 }
 
 static inline void frontswap_init(unsigned type)
 {
-	if (frontswap_enabled)
-		__frontswap_init(type);
+	__frontswap_init(type);
 }
 
 #endif /* _LINUX_FRONTSWAP_H */
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 10/15] frontswap: Use static_key instead of frontswap_enabled and frontswap_ops
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (8 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 09/15] frontswap: Remove the check for frontswap_enabled Konrad Rzeszutek Wilk
@ 2013-02-01 20:22 ` Konrad Rzeszutek Wilk
  2013-02-01 20:23 ` [PATCH 11/15] cleancache: Remove the check for cleancache_enabled Konrad Rzeszutek Wilk
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:22 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

As ways to determine whether to allow certain functions to be called.
This makes it easier to understand the code - the two functions
that can be called irregardless whether a backend is set or not is
the frontswap_init and frontswap_invalidate_area. The rest of the
frontswap functions end up being NOPs if the backend has not yet
registered.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/frontswap.h | 17 ++++++++------
 mm/frontswap.c            | 58 ++++++++++++++++++++---------------------------
 2 files changed, 34 insertions(+), 41 deletions(-)

diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 140323b..8d24167 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -12,8 +12,6 @@ struct frontswap_ops {
 	void (*invalidate_page)(unsigned, pgoff_t);
 	void (*invalidate_area)(unsigned);
 };
-
-extern bool frontswap_enabled;
 extern struct frontswap_ops *
 	frontswap_register_ops(struct frontswap_ops *ops);
 extern void frontswap_shrink(unsigned long);
@@ -29,25 +27,30 @@ extern void __frontswap_invalidate_page(unsigned, pgoff_t);
 extern void __frontswap_invalidate_area(unsigned);
 
 #ifdef CONFIG_FRONTSWAP
+#include <linux/jump_label.h>
+
+extern struct static_key frontswap_key;
+
+#define frontswap_enabled (1)
 
 static inline bool frontswap_test(struct swap_info_struct *sis, pgoff_t offset)
 {
 	bool ret = false;
 
-	if (frontswap_enabled && sis->frontswap_map)
+	if (static_key_false(&frontswap_key) && sis->frontswap_map)
 		ret = test_bit(offset, sis->frontswap_map);
 	return ret;
 }
 
 static inline void frontswap_set(struct swap_info_struct *sis, pgoff_t offset)
 {
-	if (frontswap_enabled && sis->frontswap_map)
+	if (static_key_false(&frontswap_key) && sis->frontswap_map)
 		set_bit(offset, sis->frontswap_map);
 }
 
 static inline void frontswap_clear(struct swap_info_struct *sis, pgoff_t offset)
 {
-	if (frontswap_enabled && sis->frontswap_map)
+	if (static_key_false(&frontswap_key) && sis->frontswap_map)
 		clear_bit(offset, sis->frontswap_map);
 }
 
@@ -94,7 +97,7 @@ static inline int frontswap_store(struct page *page)
 {
 	int ret = -1;
 
-	if (frontswap_enabled)
+	if (static_key_false(&frontswap_key))
 		ret = __frontswap_store(page);
 	return ret;
 }
@@ -103,7 +106,7 @@ static inline int frontswap_load(struct page *page)
 {
 	int ret = -1;
 
-	if (frontswap_enabled)
+	if (static_key_false(&frontswap_key))
 		ret = __frontswap_load(page);
 	return ret;
 }
diff --git a/mm/frontswap.c b/mm/frontswap.c
index b9b23b1..ebf4c18 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -27,13 +27,13 @@
 static struct frontswap_ops *frontswap_ops __read_mostly;
 
 /*
- * This global enablement flag reduces overhead on systems where frontswap_ops
- * has not been registered, so is preferred to the slower alternative: a
- * function call that checks a non-global.
+ * The key is enabled once a backend has registered itself. The key
+ * by default is set to false which means that all (except frontswap_init
+ * and frontswap_invalidate) calls are NOPs. Once the key is turned on
+ * all functions execute the __frontswap code.
  */
-bool frontswap_enabled __read_mostly;
-EXPORT_SYMBOL(frontswap_enabled);
-
+struct static_key frontswap_key = STATIC_KEY_INIT_FALSE;
+EXPORT_SYMBOL(frontswap_key);
 /*
  * If enabled, frontswap_store will return failure even on success.  As
  * a result, the swap subsystem will always write the page to swap, in
@@ -83,8 +83,8 @@ static inline void inc_frontswap_invalidates(void) { }
 
 /*
  * Due to the asynchronous nature of the backends loading potentially
- * _after_ the swap system has been activated, we have chokepoints
- * on all frontswap functions to not call the backend until the backend
+ * _after_ the swap system has been activated, we have a static key
+ * on all almost frontswap functions to not call the backend until the backend
  * has registered.
  *
  * Specifically when no backend is registered (nobody called
@@ -104,21 +104,19 @@ static inline void inc_frontswap_invalidates(void) { }
  * swapoff) is again guarded by the swap subsystem.
  *
  * While no backend is registered all calls to frontswap_[store|load|
- * invalidate_area|invalidate_page] are ignored or fail.
+ * invalidate_page] are never executed as the __frontswap_*
+ * macros are based on the key which is set to false
  *
  * The time between the backend being registered and the swap file system
- * calling the backend (via the frontswap_* functions) is indeterminate as
- * frontswap_ops is not atomic_t (or a value guarded by a spinlock).
- * That is OK as we are comfortable missing some of these calls to the newly
- * registered backend.
+ * calling the backend (via the frontswap_* functions) is instant as the
+ * static_key patches over the code in question.
  *
- * Obviously the opposite (unloading the backend) must be done after all
- * the frontswap_[store|load|invalidate_area|invalidate_page] start
- * ignorning or failing the requests - at which point frontswap_ops
- * would have to be made in some fashion atomic.
+ * Obviously the opposite (unloading the backend) can be done after the
+ * the frontswap_[store|load|invalidate_page] have been turned off so there
+ * are no more requests to process. The backend has to flush out all its
+ * pages back to the swap system after that.
  */
 static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
-
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
@@ -128,8 +126,6 @@ struct frontswap_ops *frontswap_register_ops(struct frontswap_ops *ops)
 	struct frontswap_ops *old = frontswap_ops;
 	int i;
 
-	frontswap_enabled = true;
-
 	for (i = 0; i < MAX_SWAPFILES; i++) {
 		if (test_and_clear_bit(i, need_init))
 			ops->init(i);
@@ -141,6 +137,8 @@ struct frontswap_ops *frontswap_register_ops(struct frontswap_ops *ops)
 	 */
 	barrier();
 	frontswap_ops = ops;
+	if (!static_key_enabled(&frontswap_key))
+		static_key_slow_inc(&frontswap_key);
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -165,12 +163,14 @@ EXPORT_SYMBOL(frontswap_tmem_exclusive_gets);
 
 /*
  * Called when a swap device is swapon'd.
+ *
+ * Can be called without any backend driver is registered.
  */
 void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (frontswap_ops) {
+	if (static_key_false(&frontswap_key)) {
 		BUG_ON(sis == NULL);
 		if (sis->frontswap_map == NULL)
 			return;
@@ -205,11 +205,6 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
-	if (!frontswap_ops) {
-		inc_frontswap_failed_stores();
-		return ret;
-	}
-
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -249,9 +244,6 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
-	if (!frontswap_ops)
-		return ret;
-
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -275,9 +267,6 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (!frontswap_ops)
-		return;
-
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
 		frontswap_ops->invalidate_page(type, offset);
@@ -290,12 +279,14 @@ EXPORT_SYMBOL(__frontswap_invalidate_page);
 /*
  * Invalidate all data from frontswap associated with all offsets for the
  * specified swaptype.
+ *
+ * Can be called without any backend driver registered.
  */
 void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (frontswap_ops) {
+	if (static_key_false(&frontswap_key)) {
 		BUG_ON(sis == NULL);
 		if (sis->frontswap_map == NULL)
 			return;
@@ -436,7 +427,6 @@ static int __init init_frontswap(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &frontswap_invalidates);
 #endif
-	frontswap_enabled = 1;
 	return 0;
 }
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 11/15] cleancache: Remove the check for cleancache_enabled.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (9 preceding siblings ...)
  2013-02-01 20:22 ` [PATCH 10/15] frontswap: Use static_key instead of frontswap_enabled and frontswap_ops Konrad Rzeszutek Wilk
@ 2013-02-01 20:23 ` Konrad Rzeszutek Wilk
  2013-02-01 20:23 ` [PATCH 12/15] cleancache: Use static_key instead of cleancache_ops and cleancache_enabled Konrad Rzeszutek Wilk
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:23 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

With the support for loading of backends as modules, the
cleancache_enabled is always set to true. The next subsequent patches
are going to convert the cleancache_enabled to be a bit more selective
and be on/off depending on whether the backend has registered - and not
whether the cleancache API is enabled.

The three functions: cleancache_init_[shared|]fs and
cleancache_invalidate_fs can be called anytime - they queue up which
of the filesystems are active and can use the cleancache API - and when
the backend is registered they will know which filesystem to
process.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/cleancache.h | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index 3af5ea8..dfa1ccb 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -74,14 +74,12 @@ static inline bool cleancache_fs_enabled_mapping(struct address_space *mapping)
 
 static inline void cleancache_init_fs(struct super_block *sb)
 {
-	if (cleancache_enabled)
-		__cleancache_init_fs(sb);
+	__cleancache_init_fs(sb);
 }
 
 static inline void cleancache_init_shared_fs(char *uuid, struct super_block *sb)
 {
-	if (cleancache_enabled)
-		__cleancache_init_shared_fs(uuid, sb);
+	__cleancache_init_shared_fs(uuid, sb);
 }
 
 static inline int cleancache_get_page(struct page *page)
@@ -115,8 +113,7 @@ static inline void cleancache_invalidate_inode(struct address_space *mapping)
 
 static inline void cleancache_invalidate_fs(struct super_block *sb)
 {
-	if (cleancache_enabled)
-		__cleancache_invalidate_fs(sb);
+	__cleancache_invalidate_fs(sb);
 }
 
 #endif /* _LINUX_CLEANCACHE_H */
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 12/15] cleancache: Use static_key instead of cleancache_ops and cleancache_enabled.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (10 preceding siblings ...)
  2013-02-01 20:23 ` [PATCH 11/15] cleancache: Remove the check for cleancache_enabled Konrad Rzeszutek Wilk
@ 2013-02-01 20:23 ` Konrad Rzeszutek Wilk
  2013-02-01 20:23 ` [PATCH 13/15] frontswap: Get rid of swap_lock dependency Konrad Rzeszutek Wilk
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:23 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

As ways to determine whether to allow certain functions to be
called.

This makes it easier to understand the code - the three functions
that can be called by the filesystem irregardless whether a backend is set or
not cleancache_init_fs, cleancache_init_shared_fs, and cleancache_invalidate_fs.

The rest of the cleancache functions end up being NOPs when the backend
has not yet registered.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 include/linux/cleancache.h | 16 +++++++++-----
 include/linux/frontswap.h  |  2 +-
 mm/cleancache.c            | 53 +++++++++++++++-------------------------------
 3 files changed, 29 insertions(+), 42 deletions(-)

diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index dfa1ccb..09df0e4 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -42,9 +42,15 @@ extern void __cleancache_put_page(struct page *);
 extern void __cleancache_invalidate_page(struct address_space *, struct page *);
 extern void __cleancache_invalidate_inode(struct address_space *);
 extern void __cleancache_invalidate_fs(struct super_block *);
-extern int cleancache_enabled;
 
 #ifdef CONFIG_CLEANCACHE
+#include <linux/jump_label.h>
+extern struct static_key cleancache_key;
+
+static inline bool cleancache_enabled(void)
+{
+	return static_key_false(&cleancache_key);
+}
 static inline bool cleancache_fs_enabled(struct page *page)
 {
 	return page->mapping->host->i_sb->cleancache_poolid >= 0;
@@ -86,14 +92,14 @@ static inline int cleancache_get_page(struct page *page)
 {
 	int ret = -1;
 
-	if (cleancache_enabled && cleancache_fs_enabled(page))
+	if (cleancache_enabled() && cleancache_fs_enabled(page))
 		ret = __cleancache_get_page(page);
 	return ret;
 }
 
 static inline void cleancache_put_page(struct page *page)
 {
-	if (cleancache_enabled && cleancache_fs_enabled(page))
+	if (cleancache_enabled() && cleancache_fs_enabled(page))
 		__cleancache_put_page(page);
 }
 
@@ -101,13 +107,13 @@ static inline void cleancache_invalidate_page(struct address_space *mapping,
 					struct page *page)
 {
 	/* careful... page->mapping is NULL sometimes when this is called */
-	if (cleancache_enabled && cleancache_fs_enabled_mapping(mapping))
+	if (cleancache_enabled() && cleancache_fs_enabled_mapping(mapping))
 		__cleancache_invalidate_page(mapping, page);
 }
 
 static inline void cleancache_invalidate_inode(struct address_space *mapping)
 {
-	if (cleancache_enabled && cleancache_fs_enabled_mapping(mapping))
+	if (cleancache_enabled() && cleancache_fs_enabled_mapping(mapping))
 		__cleancache_invalidate_inode(mapping);
 }
 
diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 8d24167..612c176 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -113,7 +113,7 @@ static inline int frontswap_load(struct page *page)
 
 static inline void frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
-	if (frontswap_enabled)
+	if (static_key_false(&frontswap_key))
 		__frontswap_invalidate_page(type, offset);
 }
 
diff --git a/mm/cleancache.c b/mm/cleancache.c
index 5d8dbb9..de0b905 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -24,9 +24,17 @@
  * is not claimed (e.g. cleancache is config'ed on but remains
  * disabled), so is preferred to the slower alternative: a function
  * call that checks a non-global.
+ *
+ * When set to false (default) all calls to the cleancache functions, except
+ * the __cleancache_invalidate_fs and __cleancache_init_[shared|]fs are guarded
+ * by the if (cleancache_enabled()) return. This means multiple threads (from
+ * different filesystems) will be doing a NOP - b/c by default the
+ * cleancache_enabled() is returing false. The usage of a static_key allows
+ * us to patch up the code and allow the flood-gates to open when a backend
+ * has registered. Vice versa when unloading a backend.
  */
-int cleancache_enabled __read_mostly;
-EXPORT_SYMBOL(cleancache_enabled);
+struct static_key cleancache_key __read_mostly;
+EXPORT_SYMBOL(cleancache_key);
 
 /*
  * cleancache_ops is set by cleancache_ops_register to contain the pointers
@@ -70,18 +78,6 @@ static char *uuids[MAX_INITIALIZABLE_FS];
  */
 static DEFINE_MUTEX(poolid_mutex);
 /*
- * When set to false (default) all calls to the cleancache functions, except
- * the __cleancache_invalidate_fs and __cleancache_init_[shared|]fs are guarded
- * by the if (!cleancache_ops) return. This means multiple threads (from
- * different filesystems) will be checking cleancache_ops. The usage of a
- * bool instead of a atomic_t or a bool guarded by a spinlock is OK - we are
- * OK if the time between the backend's have been initialized (and
- * cleancache_ops has been set to not NULL) and when the filesystems start
- * actually calling the backends. The inverse (when unloading) is obviously
- * not good - but this shim does not do that (yet).
- */
-
-/*
  * The backends and filesystems work all asynchronously. This is b/c the
  * backends can be built as modules.
  * The usual sequence of events is:
@@ -89,13 +85,13 @@ static DEFINE_MUTEX(poolid_mutex);
  * 		[shared_|]fs_poolid_map and uuids for.
  *
  * 	b). user does I/Os -> we call the rest of __cleancache_* functions
- * 		which return immediately as cleancache_ops is NULL.
+ * 		which return immediately as cleancache_enabled() returns false.
  *
  * 	c). modprobe zcache -> cleancache_register_ops. We init the backend
  * 		and set cleancache_ops to the backend, and for any fs_poolid_map
  * 		(which is set by __cleancache_init_fs) we initialize the poolid.
  *
- * 	d). user does I/Os -> now that clean_ops is not NULL all the
+ * 	d). user does I/Os -> now that cleancache_enabled is turned to on, the
  * 		__cleancache_* functions can call the backend. They all check
  * 		that fs_poolid_map is valid and if so invoke the backend.
  *
@@ -111,8 +107,8 @@ static DEFINE_MUTEX(poolid_mutex);
  * of unmounting process).
  *
  * Note: The acute reader will notice that there is no "rmmod zcache" case.
- * This is b/c the functionality for that is not yet implemented and when
- * done, will require some extra locking not yet devised.
+ * This is b/c the functionality for that is not yet implemented in the
+ * backend. In here, it will require turning the cleancache_key off.
  */
 
 /*
@@ -139,6 +135,8 @@ struct cleancache_ops *cleancache_register_ops(struct cleancache_ops *ops)
 	 */
 	barrier();
 	cleancache_ops = ops;
+	if (!static_key_enabled(&cleancache_key))
+		static_key_slow_inc(&cleancache_key);
 	mutex_unlock(&poolid_mutex);
 	return old;
 }
@@ -242,11 +240,6 @@ int __cleancache_get_page(struct page *page)
 	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!cleancache_ops) {
-		cleancache_failed_gets++;
-		goto out;
-	}
-
 	VM_BUG_ON(!PageLocked(page));
 	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
 	if (fake_pool_id < 0)
@@ -284,11 +277,6 @@ void __cleancache_put_page(struct page *page)
 	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!cleancache_ops) {
-		cleancache_puts++;
-		return;
-	}
-
 	VM_BUG_ON(!PageLocked(page));
 	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
 	if (fake_pool_id < 0)
@@ -320,9 +308,6 @@ void __cleancache_invalidate_page(struct address_space *mapping,
 	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!cleancache_ops)
-		return;
-
 	if (fake_pool_id >= 0) {
 		pool_id = get_poolid_from_fake(fake_pool_id);
 		if (pool_id < 0)
@@ -353,9 +338,6 @@ void __cleancache_invalidate_inode(struct address_space *mapping)
 	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!cleancache_ops)
-		return;
-
 	if (fake_pool_id < 0)
 		return;
 
@@ -389,7 +371,7 @@ void __cleancache_invalidate_fs(struct super_block *sb)
 		fs_poolid_map[index] = FS_UNKNOWN;
 	}
 	sb->cleancache_poolid = -1;
-	if (cleancache_ops)
+	if (cleancache_enabled())
 		cleancache_ops->invalidate_fs(old_poolid);
 	mutex_unlock(&poolid_mutex);
 }
@@ -414,7 +396,6 @@ static int __init init_cleancache(void)
 		fs_poolid_map[i] = FS_UNKNOWN;
 		shared_fs_poolid_map[i] = FS_UNKNOWN;
 	}
-	cleancache_enabled = 1;
 	return 0;
 }
 module_init(init_cleancache)
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 13/15] frontswap: Get rid of swap_lock dependency
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (11 preceding siblings ...)
  2013-02-01 20:23 ` [PATCH 12/15] cleancache: Use static_key instead of cleancache_ops and cleancache_enabled Konrad Rzeszutek Wilk
@ 2013-02-01 20:23 ` Konrad Rzeszutek Wilk
  2013-02-01 20:23 ` [PATCH 14/15] zcache/tmem: Better error checking on frontswap_register_ops return value Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:23 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel
  Cc: Minchan Kim, Konrad Rzeszutek Wilk

From: Minchan Kim <minchan@kernel.org>

Frontswap initialization routine depends on swap_lock, which want
to be atomic about frontswap's first appearance.
IOW, frontswap is not present and will fail all calls OR frontswap is
fully functional but if new swap_info_struct isn't registered
by enable_swap_info, swap subsystem doesn't start I/O so there is no
race
between init procedure and page I/O working on frontswap.

So let's remove unncessary swap_lock dependency.

Cc: Dan Magenheimer <dan.magenheimer@oracle.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
[v1: Rebased on my branch, reworked to work with backends loading late]
[v2: Added a check for !map]
Signed-off-by: Konrad Rzeszutek Wilk <konrad@darnok.org>

squash
---
 include/linux/frontswap.h |  6 +++---
 mm/frontswap.c            | 12 +++++++++---
 mm/swapfile.c             |  7 ++++++-
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 612c176..3d72f14 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -20,7 +20,7 @@ extern void frontswap_writethrough(bool);
 #define FRONTSWAP_HAS_EXCLUSIVE_GETS
 extern void frontswap_tmem_exclusive_gets(bool);
 
-extern void __frontswap_init(unsigned type);
+extern void __frontswap_init(unsigned type, unsigned long *map);
 extern int __frontswap_store(struct page *page);
 extern int __frontswap_load(struct page *page);
 extern void __frontswap_invalidate_page(unsigned, pgoff_t);
@@ -122,9 +122,9 @@ static inline void frontswap_invalidate_area(unsigned type)
 	__frontswap_invalidate_area(type);
 }
 
-static inline void frontswap_init(unsigned type)
+static inline void frontswap_init(unsigned type, unsigned long *map)
 {
-	__frontswap_init(type);
+	__frontswap_init(type, map);
 }
 
 #endif /* _LINUX_FRONTSWAP_H */
diff --git a/mm/frontswap.c b/mm/frontswap.c
index ebf4c18..8254a6a 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -127,8 +127,13 @@ struct frontswap_ops *frontswap_register_ops(struct frontswap_ops *ops)
 	int i;
 
 	for (i = 0; i < MAX_SWAPFILES; i++) {
-		if (test_and_clear_bit(i, need_init))
+		if (test_and_clear_bit(i, need_init)) {
+			struct swap_info_struct *sis = swap_info[i];
+			/* enable_swap_info _should_ have set it! */
+			if (!sis->frontswap_map)
+				return ERR_PTR(-EINVAL);
 			ops->init(i);
+		}
 	}
 	/*
 	 * We MUST have frontswap_ops set _after_ the frontswap_init's
@@ -166,14 +171,15 @@ EXPORT_SYMBOL(frontswap_tmem_exclusive_gets);
  *
  * Can be called without any backend driver is registered.
  */
-void __frontswap_init(unsigned type)
+void __frontswap_init(unsigned type, unsigned long *map)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
 	if (static_key_false(&frontswap_key)) {
 		BUG_ON(sis == NULL);
-		if (sis->frontswap_map == NULL)
+		if (!map)
 			return;
+		frontswap_map_set(sis, map);
 		frontswap_ops->init(type);
 	}
 	else {
diff --git a/mm/swapfile.c b/mm/swapfile.c
index e97a0e5..c1c3a62 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1454,6 +1454,10 @@ static void _enable_swap_info(struct swap_info_struct *p, int prio,
 	else
 		p->prio = --least_priority;
 	p->swap_map = swap_map;
+	/*
+	 * This is required for frontswap to handle backends loading
+	 * after the swap has been activated.
+	 */
 	frontswap_map_set(p, frontswap_map);
 	p->flags |= SWP_WRITEOK;
 	nr_swap_pages += p->pages;
@@ -1477,9 +1481,9 @@ static void enable_swap_info(struct swap_info_struct *p, int prio,
 				unsigned char *swap_map,
 				unsigned long *frontswap_map)
 {
+	frontswap_init(p->type, frontswap_map);
 	spin_lock(&swap_lock);
 	_enable_swap_info(p, prio, swap_map, frontswap_map);
-	frontswap_init(p->type);
 	spin_unlock(&swap_lock);
 }
 
@@ -1589,6 +1593,7 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 	p->swap_map = NULL;
 	p->flags = 0;
 	frontswap_invalidate_area(type);
+	frontswap_map_set(p, NULL);
 	spin_unlock(&swap_lock);
 	mutex_unlock(&swapon_mutex);
 	vfree(swap_map);
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 14/15] zcache/tmem: Better error checking on frontswap_register_ops return value.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (12 preceding siblings ...)
  2013-02-01 20:23 ` [PATCH 13/15] frontswap: Get rid of swap_lock dependency Konrad Rzeszutek Wilk
@ 2013-02-01 20:23 ` Konrad Rzeszutek Wilk
  2013-02-01 20:23 ` [PATCH 15/15] xen/tmem: Add missing %s in the printk statement Konrad Rzeszutek Wilk
  2013-02-03  8:52 ` [PATCH v2] Make frontswap+cleancache and its friend be modularized Ric Mason
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:23 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

In the past it either used to be NULL or the "older" backend. Now we
also return -Exx error codes.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/zcache/zcache-main.c | 5 ++++-
 drivers/xen/tmem.c                   | 5 ++++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 288c841..79c10af 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1826,8 +1826,11 @@ static int zcache_init(void)
 			namestr, frontswap_has_exclusive_gets,
 			!disable_frontswap_ignore_nonactive);
 #endif
-		if (old_ops != NULL)
+		if (IS_ERR(old_ops) || old_ops) {
+			if (IS_ERR(old_ops))
+				return PTR_RET(old_ops);
 			pr_warn("%s: frontswap_ops overridden\n", namestr);
+		}
 	}
 	if (ramster_enabled)
 		ramster_init(!disable_cleancache, !disable_frontswap,
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 9a4a9ec..2f939e5 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -395,8 +395,11 @@ static int xen_tmem_init(void)
 			frontswap_register_ops(&tmem_frontswap_ops);
 
 		tmem_frontswap_poolid = -1;
-		if (old_ops)
+		if (IS_ERR(old_ops) || old_ops) {
+			if (IS_ERR(old_ops))
+				return PTR_ERR(old_ops);
 			s = " (WARNING: frontswap_ops overridden)";
+		}
 		printk(KERN_INFO "frontswap enabled, RAM provided by "
 				 "Xen Transcendent Memory\n");
 	}
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 15/15] xen/tmem: Add missing %s in the printk statement.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (13 preceding siblings ...)
  2013-02-01 20:23 ` [PATCH 14/15] zcache/tmem: Better error checking on frontswap_register_ops return value Konrad Rzeszutek Wilk
@ 2013-02-01 20:23 ` Konrad Rzeszutek Wilk
  2013-02-03  8:52 ` [PATCH v2] Make frontswap+cleancache and its friend be modularized Ric Mason
  15 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-01 20:23 UTC (permalink / raw)
  To: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

Seems that it got lost.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/tmem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 2f939e5..4f3ff99 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -401,7 +401,7 @@ static int xen_tmem_init(void)
 			s = " (WARNING: frontswap_ops overridden)";
 		}
 		printk(KERN_INFO "frontswap enabled, RAM provided by "
-				 "Xen Transcendent Memory\n");
+				 "Xen Transcendent Memory%s\n", s);
 	}
 #endif
 #ifdef CONFIG_CLEANCACHE
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/15] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2013-02-01 20:22 ` [PATCH 02/15] mm: frontswap: " Konrad Rzeszutek Wilk
@ 2013-02-03  7:07   ` Ric Mason
  2013-02-04  5:53     ` Bob Liu
  0 siblings, 1 reply; 24+ messages in thread
From: Ric Mason @ 2013-02-03  7:07 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel, Stefan Hengelein,
	Florian Schmaus, Andor Daam

Hi Konrad,
On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> 
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,
> this patch provides "lazy initialization", allowing backends to register to
> frontswap even after swapon was run. Before a backend registers all calls
> to init are recorded and the creation of tmem_pools delayed until a backend
> registers or until a frontswap put is attempted.

Do you mean __frontswap_store? It seems that just add fail count if
backend doesn't register, why you said that the creation of tmem_pools
will delay until this time?

> 
> Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
> Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
> Signed-off-by: Andor Daam <andor.daam@googlemail.com>
> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
> [v1: Fixes per Seth Jennings suggestions]
> [v2: Removed FRONTSWAP_HAS_.. ]
> [v3: Fix up per Bob Liu <lliubbo@gmail.com> recommendations]
> [v4: Fix up per Andrew's comments]
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  mm/frontswap.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 85 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index 2890e67..c05a9db 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -80,6 +80,46 @@ static inline void inc_frontswap_succ_stores(void) { }
>  static inline void inc_frontswap_failed_stores(void) { }
>  static inline void inc_frontswap_invalidates(void) { }
>  #endif
> +
> +/*
> + * Due to the asynchronous nature of the backends loading potentially
> + * _after_ the swap system has been activated, we have chokepoints
> + * on all frontswap functions to not call the backend until the backend
> + * has registered.
> + *
> + * Specifically when no backend is registered (nobody called
> + * frontswap_register_ops) all calls to frontswap_init (which is done via
> + * swapon -> enable_swap_info -> frontswap_init) are registered and remembered
> + * (via the setting of need_init bitmap) but fail to create tmem_pools. When a
> + * backend registers with frontswap at some later point the previous
> + * calls to frontswap_init are executed (by iterating over the need_init
> + * bitmap) to create tmem_pools and set the respective poolids. All of that is
> + * guarded by us using atomic bit operations on the 'need_init' bitmap.
> + *
> + * This would not guards us against the user deciding to call swapoff right as
> + * we are calling the backend to initialize (so swapon is in action).
> + * Fortunatly for us, the swapon_mutex has been taked by the callee so we are
> + * OK. The other scenario where calls to frontswap_store (called via
> + * swap_writepage) is racing with frontswap_invalidate_area (called via
> + * swapoff) is again guarded by the swap subsystem.
> + *
> + * While no backend is registered all calls to frontswap_[store|load|
> + * invalidate_area|invalidate_page] are ignored or fail.
> + *
> + * The time between the backend being registered and the swap file system
> + * calling the backend (via the frontswap_* functions) is indeterminate as
> + * backend_registered is not atomic_t (or a value guarded by a spinlock).
> + * That is OK as we are comfortable missing some of these calls to the newly
> + * registered backend.
> + *
> + * Obviously the opposite (unloading the backend) must be done after all
> + * the frontswap_[store|load|invalidate_area|invalidate_page] start
> + * ignorning or failing the requests - at which point backend_registered
> + * would have to be made in some fashion atomic.
> + */
> +static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
> +static bool backend_registered __read_mostly;
> +
>  /*
>   * Register operations for frontswap, returning previous thus allowing
>   * detection of multiple backends and possible nesting.
> @@ -87,9 +127,22 @@ static inline void inc_frontswap_invalidates(void) { }
>  struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
>  {
>  	struct frontswap_ops old = frontswap_ops;
> +	int i;
>  
>  	frontswap_ops = *ops;
>  	frontswap_enabled = true;
> +
> +	for (i = 0; i < MAX_SWAPFILES; i++) {
> +		if (test_and_clear_bit(i, need_init))
> +			(*frontswap_ops.init)(i);
> +	}
> +	/*
> +	 * We MUST have backend_registered set _after_ the frontswap_init's
> +	 * have been called. Otherwise __frontswap_store might fail. Hence
> +	 * the barrier to make sure compiler does not re-order us.
> +	 */
> +	barrier();
> +	backend_registered = true;
>  	return old;
>  }
>  EXPORT_SYMBOL(frontswap_register_ops);
> @@ -119,10 +172,17 @@ void __frontswap_init(unsigned type)
>  {
>  	struct swap_info_struct *sis = swap_info[type];
>  
> -	BUG_ON(sis == NULL);
> -	if (sis->frontswap_map == NULL)
> -		return;
> -	frontswap_ops.init(type);
> +	if (backend_registered) {
> +		BUG_ON(sis == NULL);
> +		if (sis->frontswap_map == NULL)
> +			return;
> +		(*frontswap_ops.init)(type);
> +	}
> +	else {
> +		BUG_ON(type > MAX_SWAPFILES);
> +		set_bit(type, need_init);
> +	}
> +
>  }
>  EXPORT_SYMBOL(__frontswap_init);
>  
> @@ -147,6 +207,11 @@ int __frontswap_store(struct page *page)
>  	struct swap_info_struct *sis = swap_info[type];
>  	pgoff_t offset = swp_offset(entry);
>  
> +	if (!backend_registered) {
> +		inc_frontswap_failed_stores();
> +		return ret;
> +	}
> +
>  	BUG_ON(!PageLocked(page));
>  	BUG_ON(sis == NULL);
>  	if (frontswap_test(sis, offset))
> @@ -186,6 +251,9 @@ int __frontswap_load(struct page *page)
>  	struct swap_info_struct *sis = swap_info[type];
>  	pgoff_t offset = swp_offset(entry);
>  
> +	if (!backend_registered)
> +		return ret;
> +
>  	BUG_ON(!PageLocked(page));
>  	BUG_ON(sis == NULL);
>  	if (frontswap_test(sis, offset))
> @@ -209,6 +277,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
>  {
>  	struct swap_info_struct *sis = swap_info[type];
>  
> +	if (!backend_registered)
> +		return;
> +
>  	BUG_ON(sis == NULL);
>  	if (frontswap_test(sis, offset)) {
>  		frontswap_ops.invalidate_page(type, offset);
> @@ -226,12 +297,15 @@ void __frontswap_invalidate_area(unsigned type)
>  {
>  	struct swap_info_struct *sis = swap_info[type];
>  
> -	BUG_ON(sis == NULL);
> -	if (sis->frontswap_map == NULL)
> -		return;
> -	frontswap_ops.invalidate_area(type);
> -	atomic_set(&sis->frontswap_pages, 0);
> -	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> +	if (backend_registered) {
> +		BUG_ON(sis == NULL);
> +		if (sis->frontswap_map == NULL)
> +			return;
> +		(*frontswap_ops.invalidate_area)(type);
> +		atomic_set(&sis->frontswap_pages, 0);
> +		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> +	}
> +	clear_bit(type, need_init);
>  }
>  EXPORT_SYMBOL(__frontswap_invalidate_area);
>  
> @@ -364,6 +438,7 @@ static int __init init_frontswap(void)
>  	debugfs_create_u64("invalidates", S_IRUGO,
>  				root, &frontswap_invalidates);
>  #endif
> +	frontswap_enabled = 1;

Why has this change?

>  	return 0;
>  }
>  



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
  2013-02-01 20:22 ` [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules Konrad Rzeszutek Wilk
@ 2013-02-03  8:06   ` Ric Mason
  0 siblings, 0 replies; 24+ messages in thread
From: Ric Mason @ 2013-02-03  8:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel, Stefan Hengelein,
	Florian Schmaus, Andor Daam

Hi Konrad,
On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> 
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,

Which boot parameter? I can't find it in
Documentation/kernl-parameters.txt

> this patch provides "lazy initialization", allowing backends to register to
> cleancache even after filesystems were mounted. Calls to init_fs and
> init_shared_fs are remembered as fake poolids but no real tmem_pools created.
> On backend registration the fake poolids are mapped to real poolids and
> respective tmem_pools.
> 
> Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
> Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
> Signed-off-by: Andor Daam <andor.daam@googlemail.com>
> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
> [v1: Minor fixes: used #define for some values and bools]
> [v2: Removed CLEANCACHE_HAS_LAZY_INIT]
> [v3: Added more comments, added a lock for [shared_|]fs_poolid_map]
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  mm/cleancache.c | 240 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 219 insertions(+), 21 deletions(-)
> 
> diff --git a/mm/cleancache.c b/mm/cleancache.c
> index 32e6f41..e4dc314 100644
> --- a/mm/cleancache.c
> +++ b/mm/cleancache.c
> @@ -45,15 +45,99 @@ static u64 cleancache_puts;
>  static u64 cleancache_invalidates;
>  
>  /*
> - * register operations for cleancache, returning previous thus allowing
> - * detection of multiple backends and possible nesting
> + * When no backend is registered all calls to init_fs and init_shared_fs
> + * are registered and fake poolids (FAKE_FS_POOLID_OFFSET or
> + * FAKE_SHARED_FS_POOLID_OFFSET, plus offset in the respective array
> + * [shared_|]fs_poolid_map) are given to the respective super block
> + * (sb->cleancache_poolid) and no tmem_pools are created. When a backend
> + * registers with cleancache the previous calls to init_fs and init_shared_fs
> + * are executed to create tmem_pools and set the respective poolids. While no
> + * backend is registered all "puts", "gets" and "flushes" are ignored or failed.
> + */
> +#define MAX_INITIALIZABLE_FS 32
> +#define FAKE_FS_POOLID_OFFSET 1000
> +#define FAKE_SHARED_FS_POOLID_OFFSET 2000
> +
> +#define FS_NO_BACKEND (-1)
> +#define FS_UNKNOWN (-2)
> +static int fs_poolid_map[MAX_INITIALIZABLE_FS];
> +static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
> +static char *uuids[MAX_INITIALIZABLE_FS];
> +/*
> + * Mutex for the [shared_|]fs_poolid_map to guard against multiple threads
> + * invoking umount (and ending in __cleancache_invalidate_fs) and also multiple
> + * threads calling mount (and ending up in __cleancache_init_[shared|]fs).
> + */
> +static DEFINE_MUTEX(poolid_mutex);
> +/*
> + * When set to false (default) all calls to the cleancache functions, except
> + * the __cleancache_invalidate_fs and __cleancache_init_[shared|]fs are guarded
> + * by the if (!backend_registered) return. This means multiple threads (from
> + * different filesystems) will be checking backend_registered. The usage of a
> + * bool instead of a atomic_t or a bool guarded by a spinlock is OK - we are
> + * OK if the time between the backend's have been initialized (and
> + * backend_registered has been set to true) and when the filesystems start
> + * actually calling the backends. The inverse (when unloading) is obviously
> + * not good - but this shim does not do that (yet).
> + */
> +static bool backend_registered __read_mostly;
> +
> +/*
> + * The backends and filesystems work all asynchronously. This is b/c the

What's the meaning of b/c?zx

> + * backends can be built as modules.
> + * The usual sequence of events is:
> + * 	a) mount /	-> __cleancache_init_fs is called. We set the
> + * 		[shared_|]fs_poolid_map and uuids for.
> + *
> + * 	b). user does I/Os -> we call the rest of __cleancache_* functions
> + * 		which return immediately as backend_registered is false.
> + *
> + * 	c). modprobe zcache -> cleancache_register_ops. We init the backend
> + * 		and set backend_registered to true, and for any fs_poolid_map
> + * 		(which is set by __cleancache_init_fs) we initialize the poolid.
> + *
> + * 	d). user does I/Os -> now that backend_registered is true all the
> + * 		__cleancache_* functions can call the backend. They all check
> + * 		that fs_poolid_map is valid and if so invoke the backend.
> + *
> + * 	e). umount /	-> __cleancache_invalidate_fs, the fs_poolid_map is
> + * 		reset (which is the second check in the __cleancache_* ops
> + * 		to call the backend).
> + *
> + * The sequence of event could also be c), followed by a), and d). and e). The
> + * c) would not happen anymore. There is also the chance of c), and one thread
> + * doing a) + d), and another doing e). For that case we depend on the
> + * filesystem calling __cleancache_invalidate_fs in the proper sequence (so
> + * that it handles all I/Os before it invalidates the fs (which is last part
> + * of unmounting process).
> + *
> + * Note: The acute reader will notice that there is no "rmmod zcache" case.
> + * This is b/c the functionality for that is not yet implemented and when
> + * done, will require some extra locking not yet devised.
> + */
> +
> +/*
> + * Register operations for cleancache, returning previous thus allowing
> + * detection of multiple backends and possible nesting.
>   */
>  struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
>  {
>  	struct cleancache_ops old = cleancache_ops;
> +	int i;
>  
> +	mutex_lock(&poolid_mutex);
>  	cleancache_ops = *ops;
> -	cleancache_enabled = 1;
> +
> +	backend_registered = true;
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (fs_poolid_map[i] == FS_NO_BACKEND)
> +			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
> +			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> +					(uuids[i], PAGE_SIZE);
> +	}
> +out:
> +	mutex_unlock(&poolid_mutex);
>  	return old;
>  }
>  EXPORT_SYMBOL(cleancache_register_ops);
> @@ -61,15 +145,42 @@ EXPORT_SYMBOL(cleancache_register_ops);
>  /* Called by a cleancache-enabled filesystem at time of mount */
>  void __cleancache_init_fs(struct super_block *sb)
>  {
> -	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +	int i;
> +
> +	mutex_lock(&poolid_mutex);
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (fs_poolid_map[i] == FS_UNKNOWN) {
> +			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
> +			if (backend_registered)
> +				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +			else
> +				fs_poolid_map[i] = FS_NO_BACKEND;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&poolid_mutex);
>  }
>  EXPORT_SYMBOL(__cleancache_init_fs);
>  
>  /* Called by a cleancache-enabled clustered filesystem at time of mount */
>  void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
>  {
> -	sb->cleancache_poolid =
> -		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
> +	int i;
> +
> +	mutex_lock(&poolid_mutex);
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
> +			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
> +			uuids[i] = uuid;
> +			if (backend_registered)
> +				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> +						(uuid, PAGE_SIZE);
> +			else
> +				shared_fs_poolid_map[i] = FS_NO_BACKEND;
> +			break;
> +		}
> +	}
> +	mutex_unlock(&poolid_mutex);
>  }
>  EXPORT_SYMBOL(__cleancache_init_shared_fs);
>  
> @@ -99,27 +210,53 @@ static int cleancache_get_key(struct inode *inode,
>  }
>  
>  /*
> + * Returns a pool_id that is associated with a given fake poolid.
> + */
> +static int get_poolid_from_fake(int fake_pool_id)
> +{
> +	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
> +		return shared_fs_poolid_map[fake_pool_id -
> +			FAKE_SHARED_FS_POOLID_OFFSET];
> +	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
> +		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
> +	return FS_NO_BACKEND;
> +}
> +
> +/*
>   * "Get" data from cleancache associated with the poolid/inode/index
>   * that were specified when the data was put to cleanache and, if
>   * successful, use it to fill the specified page with data and return 0.
>   * The pageframe is unchanged and returns -1 if the get fails.
>   * Page must be locked by caller.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  int __cleancache_get_page(struct page *page)
>  {
>  	int ret = -1;
>  	int pool_id;
> +	int fake_pool_id;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered) {
> +		cleancache_failed_gets++;
> +		goto out;
> +	}
> +
>  	VM_BUG_ON(!PageLocked(page));
> -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> -	if (pool_id < 0)
> +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	if (fake_pool_id < 0)
>  		goto out;
> +	pool_id = get_poolid_from_fake(fake_pool_id);
>  
>  	if (cleancache_get_key(page->mapping->host, &key) < 0)
>  		goto out;
>  
> -	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
> +	if (pool_id >= 0)
> +		ret = (*cleancache_ops.get_page)(pool_id,
> +				key, page->index, page);
>  	if (ret == 0)
>  		cleancache_succ_gets++;
>  	else
> @@ -134,16 +271,31 @@ EXPORT_SYMBOL(__cleancache_get_page);
>   * (previously-obtained per-filesystem) poolid and the page's,
>   * inode and page index.  Page must be locked.  Note that a put_page
>   * always "succeeds", though a subsequent get_page may succeed or fail.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  void __cleancache_put_page(struct page *page)
>  {
>  	int pool_id;
> +	int fake_pool_id;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered) {
> +		cleancache_puts++;
> +		return;
> +	}
> +
>  	VM_BUG_ON(!PageLocked(page));
> -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	if (fake_pool_id < 0)
> +		return;
> +
> +	pool_id = get_poolid_from_fake(fake_pool_id);
> +
>  	if (pool_id >= 0 &&
> -	      cleancache_get_key(page->mapping->host, &key) >= 0) {
> +		cleancache_get_key(page->mapping->host, &key) >= 0) {
>  		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
>  		cleancache_puts++;
>  	}
> @@ -153,19 +305,31 @@ EXPORT_SYMBOL(__cleancache_put_page);
>  /*
>   * Invalidate any data from cleancache associated with the poolid and the
>   * page's inode and page index so that a subsequent "get" will fail.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  void __cleancache_invalidate_page(struct address_space *mapping,
>  					struct page *page)
>  {
>  	/* careful... page->mapping is NULL sometimes when this is called */
> -	int pool_id = mapping->host->i_sb->cleancache_poolid;
> +	int pool_id;
> +	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> -	if (pool_id >= 0) {
> +	if (!backend_registered)
> +		return;
> +
> +	if (fake_pool_id >= 0) {
> +		pool_id = get_poolid_from_fake(fake_pool_id);
> +		if (pool_id < 0)
> +			return;
> +
>  		VM_BUG_ON(!PageLocked(page));
>  		if (cleancache_get_key(mapping->host, &key) >= 0) {
>  			(*cleancache_ops.invalidate_page)(pool_id,
> -							  key, page->index);
> +					key, page->index);
>  			cleancache_invalidates++;
>  		}
>  	}
> @@ -176,12 +340,25 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
>   * Invalidate all data from cleancache associated with the poolid and the
>   * mappings's inode so that all subsequent gets to this poolid/inode
>   * will fail.
> + *
> + * The function has two checks before any action is taken - whether
> + * a backend is registered and whether the sb->cleancache_poolid
> + * is correct.
>   */
>  void __cleancache_invalidate_inode(struct address_space *mapping)
>  {
> -	int pool_id = mapping->host->i_sb->cleancache_poolid;
> +	int pool_id;
> +	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered)
> +		return;
> +
> +	if (fake_pool_id < 0)
> +		return;
> +
> +	pool_id = get_poolid_from_fake(fake_pool_id);
> +
>  	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
>  		(*cleancache_ops.invalidate_inode)(pool_id, key);
>  }
> @@ -189,21 +366,37 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
>  
>  /*
>   * Called by any cleancache-enabled filesystem at time of unmount;
> - * note that pool_id is surrendered and may be reutrned by a subsequent
> - * cleancache_init_fs or cleancache_init_shared_fs
> + * note that pool_id is surrendered and may be returned by a subsequent
> + * cleancache_init_fs or cleancache_init_shared_fs.
>   */
>  void __cleancache_invalidate_fs(struct super_block *sb)
>  {
> -	if (sb->cleancache_poolid >= 0) {
> -		int old_poolid = sb->cleancache_poolid;
> -		sb->cleancache_poolid = -1;
> -		(*cleancache_ops.invalidate_fs)(old_poolid);
> +	int index;
> +	int fake_pool_id = sb->cleancache_poolid;
> +	int old_poolid = fake_pool_id;
> +
> +	mutex_lock(&poolid_mutex);
> +	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
> +		index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
> +		old_poolid = shared_fs_poolid_map[index];
> +		shared_fs_poolid_map[index] = FS_UNKNOWN;
> +		uuids[index] = NULL;
> +	} else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
> +		index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
> +		old_poolid = fs_poolid_map[index];
> +		fs_poolid_map[index] = FS_UNKNOWN;
>  	}
> +	sb->cleancache_poolid = -1;
> +	if (backend_registered)
> +		(*cleancache_ops.invalidate_fs)(old_poolid);
> +	mutex_unlock(&poolid_mutex);
>  }
>  EXPORT_SYMBOL(__cleancache_invalidate_fs);
>  
>  static int __init init_cleancache(void)
>  {
> +	int i;
> +
>  #ifdef CONFIG_DEBUG_FS
>  	struct dentry *root = debugfs_create_dir("cleancache", NULL);
>  	if (root == NULL)
> @@ -215,6 +408,11 @@ static int __init init_cleancache(void)
>  	debugfs_create_u64("invalidates", S_IRUGO,
>  				root, &cleancache_invalidates);
>  #endif
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		fs_poolid_map[i] = FS_UNKNOWN;
> +		shared_fs_poolid_map[i] = FS_UNKNOWN;
> +	}
> +	cleancache_enabled = 1;
>  	return 0;
>  }
>  module_init(init_cleancache)



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] Make frontswap+cleancache and its friend be modularized.
  2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
                   ` (14 preceding siblings ...)
  2013-02-01 20:23 ` [PATCH 15/15] xen/tmem: Add missing %s in the printk statement Konrad Rzeszutek Wilk
@ 2013-02-03  8:52 ` Ric Mason
  2013-02-04 15:14   ` Seth Jennings
  15 siblings, 1 reply; 24+ messages in thread
From: Ric Mason @ 2013-02-03  8:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: dan.magenheimer, konrad.wilk, sjenning, gregkh, akpm, ngupta,
	rcj, linux-mm, linux-kernel, devel

Hi Konrad,
On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:

I have already enable frontswap,cleancache,zcache,
 FRONTSWAP [=y]  
 CLEANCACHE [=y]
 ZCACHE [=y]
But all of knode under /sys/kernel/debug/frontswap and cleancache still
zero, my swap device is enable, where I miss?

> Parts of this patch have been posted in the post (way back in November), but
> this patchset expanded it a bit. The goal of the patches is to make the
> different frontswap/cleancache API backends be modules - and load way way after
> the swap system (or filesystem) has been initialized. Naturally one can still
> build the frontswap+cleancache backend devices in the kernel. The next goal
> (after these patches) is to also be able to unload the backend drivers - but
> that places some interesting requirements to "reload" the swap device with
> swap pages (don't need to worry that much about cleancache as it is a "secondary"
> cache and can be dumped). Seth had posted some patches for that in the zswap
> backend - and they could be more generally repurporsed.
> 
> Anyhow, I did not want to lose the authorship of some of the patches so I
> didn't squash the ones that were made by Dan and mine. I can do it for review
> if it would make it easier, but from my recollection on how Linus likes things
> run he would prefer to keep the history (even the kludge parts).
> 
> The general flow prior to these patches was [I am concentrating on the
> frontswap here, but the cleancache is similar, just s/swapon/mount/]:
> 
>  1) kernel inits frontswap_init
>  2) kernel inits zcache (or some other backend)
>  3) user does swapon /dev/XX and the writes to the swap disk end up in
>     frontswap and then in the backend.
> 
> With the module loading, the 1) is still part of the bootup, but the
> 2) or 3) can be run at anytime. This means one could load the backend
> _after_ the swap disk has been initialized and running along. Or
> _before_ the swap disk has been setup - but that is similar to the
> existing case so not that exciting.
> 
> To deal with that scenario the frontswap keeps an queue (actually an atomic
> bitmap of the swap disks that have been init) and when the backend registers -
> frontswap runs the backend init on the queued up swap disks.
> 
> The interesting thing is that we can be to certain degree racy when the
> swap system starts steering pages to frontswap. Meaning after the backend
> has registered it is OK if the pages are still hitting the disk instead of
> the backend. Naturally this is unacceptable if one were to unload the
> backend (not yet supported) - as we need to be quite atomic at that stage
> and need to stop processing the pages the moment the backend is being
> unloaded. To support this, the frontswap is using the struct static_key
> which are incredibly light when they are in usage. They are incredibly heavy
> when the value switches (on/off), but that is OK. The next part of unloading is
> also taking the pages that are in the backend and feed them in the swap
> storage (and Seth's patches do some of this).
> 
> Also attached is one patch from Minchan that fixes the condition where the
> backend was constricted in allocating memory at init - b/c we were holding
> a spin-lock. His patch fixes that and we are just holding the swapon_mutex
> instead. It has been rebased on top of my patches.
> 
> This patchset is based on Greg KH's staging tree (since the zcache2 has
> now been renamed to zcache). To be exact, it is based on
> 085494ac2039433a5df9fdd6fb653579e18b8c71
> 
> Dan Magenheimer (4):
>       mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
>       mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
>       staging: zcache: enable ramster to be built/loaded as a module
>       xen: tmem: enable Xen tmem shim to be built/loaded as a module
> 
> Konrad Rzeszutek Wilk (10):
>       frontswap: Make frontswap_init use a pointer for the ops.
>       cleancache: Make cleancache_init use a pointer for the ops
>       staging: zcache: enable zcache to be built/loaded as a module
>       xen/tmem: Remove the subsys call.
>       frontswap: Remove the check for frontswap_enabled.
>       frontswap: Use static_key instead of frontswap_enabled and frontswap_ops
>       cleancache: Remove the check for cleancache_enabled.
>       cleancache: Use static_key instead of cleancache_ops and cleancache_enabled.
>       zcache/tmem: Better error checking on frontswap_register_ops return     value.
>       xen/tmem: Add missing %s in the printk statement.
> 
> Minchan Kim (1):
>       frontswap: Get rid of swap_lock dependency
> 
> 
>  drivers/staging/zcache/Kconfig                     |   6 +-
>  drivers/staging/zcache/Makefile                    |  11 +-
>  drivers/staging/zcache/ramster.h                   |   6 +-
>  drivers/staging/zcache/ramster/nodemanager.c       |   9 +-
>  drivers/staging/zcache/ramster/ramster.c           |  29 ++-
>  drivers/staging/zcache/ramster/ramster.h           |   2 +-
>  .../staging/zcache/ramster/ramster_nodemanager.h   |   2 +
>  drivers/staging/zcache/tmem.c                      |   6 +-
>  drivers/staging/zcache/tmem.h                      |   8 +-
>  drivers/staging/zcache/zcache-main.c               |  64 +++++-
>  drivers/staging/zcache/zcache.h                    |   2 +-
>  drivers/xen/Kconfig                                |   4 +-
>  drivers/xen/tmem.c                                 |  55 +++--
>  drivers/xen/xen-selfballoon.c                      |  13 +-
>  include/linux/cleancache.h                         |  27 ++-
>  include/linux/frontswap.h                          |  31 +--
>  include/xen/tmem.h                                 |   8 +
>  mm/cleancache.c                                    | 241 ++++++++++++++++++---
>  mm/frontswap.c                                     | 121 ++++++++---
>  mm/swapfile.c                                      |   7 +-
>  20 files changed, 505 insertions(+), 147 deletions(-)
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/15] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2013-02-03  7:07   ` Ric Mason
@ 2013-02-04  5:53     ` Bob Liu
  2013-02-05  0:23       ` Ric Mason
  0 siblings, 1 reply; 24+ messages in thread
From: Bob Liu @ 2013-02-04  5:53 UTC (permalink / raw)
  To: Ric Mason
  Cc: Konrad Rzeszutek Wilk, dan.magenheimer, konrad.wilk, sjenning,
	gregkh, akpm, ngupta, rcj, linux-mm, linux-kernel, devel,
	Stefan Hengelein, Florian Schmaus, Andor Daam

On Sun, Feb 3, 2013 at 3:07 PM, Ric Mason <ric.masonn@gmail.com> wrote:
> Hi Konrad,
> On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
>> From: Dan Magenheimer <dan.magenheimer@oracle.com>
>>
>> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
>> built/loaded as modules rather than built-in and enabled by a boot parameter,
>> this patch provides "lazy initialization", allowing backends to register to
>> frontswap even after swapon was run. Before a backend registers all calls
>> to init are recorded and the creation of tmem_pools delayed until a backend
>> registers or until a frontswap put is attempted.
>
> Do you mean __frontswap_store? It seems that just add fail count if
> backend doesn't register, why you said that the creation of tmem_pools
> will delay until this time?
>
>>
>> Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
>> Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
>> Signed-off-by: Andor Daam <andor.daam@googlemail.com>
>> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
>> [v1: Fixes per Seth Jennings suggestions]
>> [v2: Removed FRONTSWAP_HAS_.. ]
>> [v3: Fix up per Bob Liu <lliubbo@gmail.com> recommendations]
>> [v4: Fix up per Andrew's comments]
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> ---
>>  mm/frontswap.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
>>  1 file changed, 85 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/frontswap.c b/mm/frontswap.c
>> index 2890e67..c05a9db 100644
>> --- a/mm/frontswap.c
>> +++ b/mm/frontswap.c
>> @@ -80,6 +80,46 @@ static inline void inc_frontswap_succ_stores(void) { }
>>  static inline void inc_frontswap_failed_stores(void) { }
>>  static inline void inc_frontswap_invalidates(void) { }
>>  #endif
>> +
>> +/*
>> + * Due to the asynchronous nature of the backends loading potentially
>> + * _after_ the swap system has been activated, we have chokepoints
>> + * on all frontswap functions to not call the backend until the backend
>> + * has registered.
>> + *
>> + * Specifically when no backend is registered (nobody called
>> + * frontswap_register_ops) all calls to frontswap_init (which is done via
>> + * swapon -> enable_swap_info -> frontswap_init) are registered and remembered
>> + * (via the setting of need_init bitmap) but fail to create tmem_pools. When a
>> + * backend registers with frontswap at some later point the previous
>> + * calls to frontswap_init are executed (by iterating over the need_init
>> + * bitmap) to create tmem_pools and set the respective poolids. All of that is
>> + * guarded by us using atomic bit operations on the 'need_init' bitmap.
>> + *
>> + * This would not guards us against the user deciding to call swapoff right as
>> + * we are calling the backend to initialize (so swapon is in action).
>> + * Fortunatly for us, the swapon_mutex has been taked by the callee so we are
>> + * OK. The other scenario where calls to frontswap_store (called via
>> + * swap_writepage) is racing with frontswap_invalidate_area (called via
>> + * swapoff) is again guarded by the swap subsystem.
>> + *
>> + * While no backend is registered all calls to frontswap_[store|load|
>> + * invalidate_area|invalidate_page] are ignored or fail.
>> + *
>> + * The time between the backend being registered and the swap file system
>> + * calling the backend (via the frontswap_* functions) is indeterminate as
>> + * backend_registered is not atomic_t (or a value guarded by a spinlock).
>> + * That is OK as we are comfortable missing some of these calls to the newly
>> + * registered backend.
>> + *
>> + * Obviously the opposite (unloading the backend) must be done after all
>> + * the frontswap_[store|load|invalidate_area|invalidate_page] start
>> + * ignorning or failing the requests - at which point backend_registered
>> + * would have to be made in some fashion atomic.
>> + */
>> +static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
>> +static bool backend_registered __read_mostly;
>> +
>>  /*
>>   * Register operations for frontswap, returning previous thus allowing
>>   * detection of multiple backends and possible nesting.
>> @@ -87,9 +127,22 @@ static inline void inc_frontswap_invalidates(void) { }
>>  struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
>>  {
>>       struct frontswap_ops old = frontswap_ops;
>> +     int i;
>>
>>       frontswap_ops = *ops;
>>       frontswap_enabled = true;
>> +
>> +     for (i = 0; i < MAX_SWAPFILES; i++) {
>> +             if (test_and_clear_bit(i, need_init))
>> +                     (*frontswap_ops.init)(i);
>> +     }
>> +     /*
>> +      * We MUST have backend_registered set _after_ the frontswap_init's
>> +      * have been called. Otherwise __frontswap_store might fail. Hence
>> +      * the barrier to make sure compiler does not re-order us.
>> +      */
>> +     barrier();
>> +     backend_registered = true;
>>       return old;
>>  }
>>  EXPORT_SYMBOL(frontswap_register_ops);
>> @@ -119,10 +172,17 @@ void __frontswap_init(unsigned type)
>>  {
>>       struct swap_info_struct *sis = swap_info[type];
>>
>> -     BUG_ON(sis == NULL);
>> -     if (sis->frontswap_map == NULL)
>> -             return;
>> -     frontswap_ops.init(type);
>> +     if (backend_registered) {
>> +             BUG_ON(sis == NULL);
>> +             if (sis->frontswap_map == NULL)
>> +                     return;
>> +             (*frontswap_ops.init)(type);
>> +     }
>> +     else {
>> +             BUG_ON(type > MAX_SWAPFILES);
>> +             set_bit(type, need_init);
>> +     }
>> +
>>  }
>>  EXPORT_SYMBOL(__frontswap_init);
>>
>> @@ -147,6 +207,11 @@ int __frontswap_store(struct page *page)
>>       struct swap_info_struct *sis = swap_info[type];
>>       pgoff_t offset = swp_offset(entry);
>>
>> +     if (!backend_registered) {
>> +             inc_frontswap_failed_stores();
>> +             return ret;
>> +     }
>> +
>>       BUG_ON(!PageLocked(page));
>>       BUG_ON(sis == NULL);
>>       if (frontswap_test(sis, offset))
>> @@ -186,6 +251,9 @@ int __frontswap_load(struct page *page)
>>       struct swap_info_struct *sis = swap_info[type];
>>       pgoff_t offset = swp_offset(entry);
>>
>> +     if (!backend_registered)
>> +             return ret;
>> +
>>       BUG_ON(!PageLocked(page));
>>       BUG_ON(sis == NULL);
>>       if (frontswap_test(sis, offset))
>> @@ -209,6 +277,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
>>  {
>>       struct swap_info_struct *sis = swap_info[type];
>>
>> +     if (!backend_registered)
>> +             return;
>> +
>>       BUG_ON(sis == NULL);
>>       if (frontswap_test(sis, offset)) {
>>               frontswap_ops.invalidate_page(type, offset);
>> @@ -226,12 +297,15 @@ void __frontswap_invalidate_area(unsigned type)
>>  {
>>       struct swap_info_struct *sis = swap_info[type];
>>
>> -     BUG_ON(sis == NULL);
>> -     if (sis->frontswap_map == NULL)
>> -             return;
>> -     frontswap_ops.invalidate_area(type);
>> -     atomic_set(&sis->frontswap_pages, 0);
>> -     memset(sis->frontswap_map, 0, sis->max / sizeof(long));
>> +     if (backend_registered) {
>> +             BUG_ON(sis == NULL);
>> +             if (sis->frontswap_map == NULL)
>> +                     return;
>> +             (*frontswap_ops.invalidate_area)(type);
>> +             atomic_set(&sis->frontswap_pages, 0);
>> +             memset(sis->frontswap_map, 0, sis->max / sizeof(long));
>> +     }
>> +     clear_bit(type, need_init);
>>  }
>>  EXPORT_SYMBOL(__frontswap_invalidate_area);
>>
>> @@ -364,6 +438,7 @@ static int __init init_frontswap(void)
>>       debugfs_create_u64("invalidates", S_IRUGO,
>>                               root, &frontswap_invalidates);
>>  #endif
>> +     frontswap_enabled = 1;
>
> Why has this change?
>

If don't set frontswap_enabled to 1,  frontswap_init() will return
without record the swap type.

-- 
Regards,
--Bob

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] Make frontswap+cleancache and its friend be modularized.
  2013-02-03  8:52 ` [PATCH v2] Make frontswap+cleancache and its friend be modularized Ric Mason
@ 2013-02-04 15:14   ` Seth Jennings
  2013-02-05  0:21     ` Ric Mason
  0 siblings, 1 reply; 24+ messages in thread
From: Seth Jennings @ 2013-02-04 15:14 UTC (permalink / raw)
  To: Ric Mason
  Cc: Konrad Rzeszutek Wilk, dan.magenheimer, konrad.wilk, gregkh,
	akpm, ngupta, rcj, linux-mm, linux-kernel, devel

On 02/03/2013 02:52 AM, Ric Mason wrote:
> Hi Konrad,
> On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> 
> I have already enable frontswap,cleancache,zcache,
>  FRONTSWAP [=y]  
>  CLEANCACHE [=y]
>  ZCACHE [=y]
> But all of knode under /sys/kernel/debug/frontswap and cleancache still
> zero, my swap device is enable, where I miss?

Did you pass "zcache" in the kernel boot parameters?

Seth


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] Make frontswap+cleancache and its friend be modularized.
  2013-02-04 15:14   ` Seth Jennings
@ 2013-02-05  0:21     ` Ric Mason
  2013-02-05  0:38       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 24+ messages in thread
From: Ric Mason @ 2013-02-05  0:21 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Konrad Rzeszutek Wilk, dan.magenheimer, konrad.wilk, gregkh,
	akpm, ngupta, rcj, linux-mm, linux-kernel, devel

On Mon, 2013-02-04 at 09:14 -0600, Seth Jennings wrote:
> On 02/03/2013 02:52 AM, Ric Mason wrote:
> > Hi Konrad,
> > On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> > 
> > I have already enable frontswap,cleancache,zcache,
> >  FRONTSWAP [=y]  
> >  CLEANCACHE [=y]
> >  ZCACHE [=y]
> > But all of knode under /sys/kernel/debug/frontswap and cleancache still
> > zero, my swap device is enable, where I miss?
> 
> Did you pass "zcache" in the kernel boot parameters?

Thanks Seth, I think it should be add to kernel-parameters.txt.

> 
> Seth
> 



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 02/15] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2013-02-04  5:53     ` Bob Liu
@ 2013-02-05  0:23       ` Ric Mason
  0 siblings, 0 replies; 24+ messages in thread
From: Ric Mason @ 2013-02-05  0:23 UTC (permalink / raw)
  To: Bob Liu
  Cc: Konrad Rzeszutek Wilk, dan.magenheimer, konrad.wilk, sjenning,
	gregkh, akpm, ngupta, rcj, linux-mm, linux-kernel, devel,
	Stefan Hengelein, Florian Schmaus, Andor Daam

On Mon, 2013-02-04 at 13:53 +0800, Bob Liu wrote:
> On Sun, Feb 3, 2013 at 3:07 PM, Ric Mason <ric.masonn@gmail.com> wrote:
> > Hi Konrad,
> > On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> >> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> >>
> >> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> >> built/loaded as modules rather than built-in and enabled by a boot parameter,
> >> this patch provides "lazy initialization", allowing backends to register to
> >> frontswap even after swapon was run. Before a backend registers all calls
> >> to init are recorded and the creation of tmem_pools delayed until a backend
> >> registers or until a frontswap put is attempted.
> >
> > Do you mean __frontswap_store? It seems that just add fail count if
> > backend doesn't register, why you said that the creation of tmem_pools
> > will delay until this time?
> >
> >>
> >> Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
> >> Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
> >> Signed-off-by: Andor Daam <andor.daam@googlemail.com>
> >> Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
> >> [v1: Fixes per Seth Jennings suggestions]
> >> [v2: Removed FRONTSWAP_HAS_.. ]
> >> [v3: Fix up per Bob Liu <lliubbo@gmail.com> recommendations]
> >> [v4: Fix up per Andrew's comments]
> >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> >> ---
> >>  mm/frontswap.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
> >>  1 file changed, 85 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/mm/frontswap.c b/mm/frontswap.c
> >> index 2890e67..c05a9db 100644
> >> --- a/mm/frontswap.c
> >> +++ b/mm/frontswap.c
> >> @@ -80,6 +80,46 @@ static inline void inc_frontswap_succ_stores(void) { }
> >>  static inline void inc_frontswap_failed_stores(void) { }
> >>  static inline void inc_frontswap_invalidates(void) { }
> >>  #endif
> >> +
> >> +/*
> >> + * Due to the asynchronous nature of the backends loading potentially
> >> + * _after_ the swap system has been activated, we have chokepoints
> >> + * on all frontswap functions to not call the backend until the backend
> >> + * has registered.
> >> + *
> >> + * Specifically when no backend is registered (nobody called
> >> + * frontswap_register_ops) all calls to frontswap_init (which is done via
> >> + * swapon -> enable_swap_info -> frontswap_init) are registered and remembered
> >> + * (via the setting of need_init bitmap) but fail to create tmem_pools. When a
> >> + * backend registers with frontswap at some later point the previous
> >> + * calls to frontswap_init are executed (by iterating over the need_init
> >> + * bitmap) to create tmem_pools and set the respective poolids. All of that is
> >> + * guarded by us using atomic bit operations on the 'need_init' bitmap.
> >> + *
> >> + * This would not guards us against the user deciding to call swapoff right as
> >> + * we are calling the backend to initialize (so swapon is in action).
> >> + * Fortunatly for us, the swapon_mutex has been taked by the callee so we are
> >> + * OK. The other scenario where calls to frontswap_store (called via
> >> + * swap_writepage) is racing with frontswap_invalidate_area (called via
> >> + * swapoff) is again guarded by the swap subsystem.
> >> + *
> >> + * While no backend is registered all calls to frontswap_[store|load|
> >> + * invalidate_area|invalidate_page] are ignored or fail.
> >> + *
> >> + * The time between the backend being registered and the swap file system
> >> + * calling the backend (via the frontswap_* functions) is indeterminate as
> >> + * backend_registered is not atomic_t (or a value guarded by a spinlock).
> >> + * That is OK as we are comfortable missing some of these calls to the newly
> >> + * registered backend.
> >> + *
> >> + * Obviously the opposite (unloading the backend) must be done after all
> >> + * the frontswap_[store|load|invalidate_area|invalidate_page] start
> >> + * ignorning or failing the requests - at which point backend_registered
> >> + * would have to be made in some fashion atomic.
> >> + */
> >> +static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
> >> +static bool backend_registered __read_mostly;
> >> +
> >>  /*
> >>   * Register operations for frontswap, returning previous thus allowing
> >>   * detection of multiple backends and possible nesting.
> >> @@ -87,9 +127,22 @@ static inline void inc_frontswap_invalidates(void) { }
> >>  struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
> >>  {
> >>       struct frontswap_ops old = frontswap_ops;
> >> +     int i;
> >>
> >>       frontswap_ops = *ops;
> >>       frontswap_enabled = true;
> >> +
> >> +     for (i = 0; i < MAX_SWAPFILES; i++) {
> >> +             if (test_and_clear_bit(i, need_init))
> >> +                     (*frontswap_ops.init)(i);
> >> +     }
> >> +     /*
> >> +      * We MUST have backend_registered set _after_ the frontswap_init's
> >> +      * have been called. Otherwise __frontswap_store might fail. Hence
> >> +      * the barrier to make sure compiler does not re-order us.
> >> +      */
> >> +     barrier();
> >> +     backend_registered = true;
> >>       return old;
> >>  }
> >>  EXPORT_SYMBOL(frontswap_register_ops);
> >> @@ -119,10 +172,17 @@ void __frontswap_init(unsigned type)
> >>  {
> >>       struct swap_info_struct *sis = swap_info[type];
> >>
> >> -     BUG_ON(sis == NULL);
> >> -     if (sis->frontswap_map == NULL)
> >> -             return;
> >> -     frontswap_ops.init(type);
> >> +     if (backend_registered) {
> >> +             BUG_ON(sis == NULL);
> >> +             if (sis->frontswap_map == NULL)
> >> +                     return;
> >> +             (*frontswap_ops.init)(type);
> >> +     }
> >> +     else {
> >> +             BUG_ON(type > MAX_SWAPFILES);
> >> +             set_bit(type, need_init);
> >> +     }
> >> +
> >>  }
> >>  EXPORT_SYMBOL(__frontswap_init);
> >>
> >> @@ -147,6 +207,11 @@ int __frontswap_store(struct page *page)
> >>       struct swap_info_struct *sis = swap_info[type];
> >>       pgoff_t offset = swp_offset(entry);
> >>
> >> +     if (!backend_registered) {
> >> +             inc_frontswap_failed_stores();
> >> +             return ret;
> >> +     }
> >> +
> >>       BUG_ON(!PageLocked(page));
> >>       BUG_ON(sis == NULL);
> >>       if (frontswap_test(sis, offset))
> >> @@ -186,6 +251,9 @@ int __frontswap_load(struct page *page)
> >>       struct swap_info_struct *sis = swap_info[type];
> >>       pgoff_t offset = swp_offset(entry);
> >>
> >> +     if (!backend_registered)
> >> +             return ret;
> >> +
> >>       BUG_ON(!PageLocked(page));
> >>       BUG_ON(sis == NULL);
> >>       if (frontswap_test(sis, offset))
> >> @@ -209,6 +277,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
> >>  {
> >>       struct swap_info_struct *sis = swap_info[type];
> >>
> >> +     if (!backend_registered)
> >> +             return;
> >> +
> >>       BUG_ON(sis == NULL);
> >>       if (frontswap_test(sis, offset)) {
> >>               frontswap_ops.invalidate_page(type, offset);
> >> @@ -226,12 +297,15 @@ void __frontswap_invalidate_area(unsigned type)
> >>  {
> >>       struct swap_info_struct *sis = swap_info[type];
> >>
> >> -     BUG_ON(sis == NULL);
> >> -     if (sis->frontswap_map == NULL)
> >> -             return;
> >> -     frontswap_ops.invalidate_area(type);
> >> -     atomic_set(&sis->frontswap_pages, 0);
> >> -     memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> >> +     if (backend_registered) {
> >> +             BUG_ON(sis == NULL);
> >> +             if (sis->frontswap_map == NULL)
> >> +                     return;
> >> +             (*frontswap_ops.invalidate_area)(type);
> >> +             atomic_set(&sis->frontswap_pages, 0);
> >> +             memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> >> +     }
> >> +     clear_bit(type, need_init);
> >>  }
> >>  EXPORT_SYMBOL(__frontswap_invalidate_area);
> >>
> >> @@ -364,6 +438,7 @@ static int __init init_frontswap(void)
> >>       debugfs_create_u64("invalidates", S_IRUGO,
> >>                               root, &frontswap_invalidates);
> >>  #endif
> >> +     frontswap_enabled = 1;
> >
> > Why has this change?
> >
> 
> If don't set frontswap_enabled to 1,  frontswap_init() will return
> without record the swap type.
> 

Yup, thanks.




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2] Make frontswap+cleancache and its friend be modularized.
  2013-02-05  0:21     ` Ric Mason
@ 2013-02-05  0:38       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 24+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-02-05  0:38 UTC (permalink / raw)
  To: Ric Mason
  Cc: Seth Jennings, Konrad Rzeszutek Wilk, dan.magenheimer, gregkh,
	akpm, ngupta, rcj, linux-mm, linux-kernel, devel

On Mon, Feb 04, 2013 at 06:21:12PM -0600, Ric Mason wrote:
> On Mon, 2013-02-04 at 09:14 -0600, Seth Jennings wrote:
> > On 02/03/2013 02:52 AM, Ric Mason wrote:
> > > Hi Konrad,
> > > On Fri, 2013-02-01 at 15:22 -0500, Konrad Rzeszutek Wilk wrote:
> > > 
> > > I have already enable frontswap,cleancache,zcache,
> > >  FRONTSWAP [=y]  
> > >  CLEANCACHE [=y]
> > >  ZCACHE [=y]
> > > But all of knode under /sys/kernel/debug/frontswap and cleancache still
> > > zero, my swap device is enable, where I miss?
> > 
> > Did you pass "zcache" in the kernel boot parameters?
> 
> Thanks Seth, I think it should be add to kernel-parameters.txt.

Actually I think you spotted a bug. It made sense when the zcache was
built-in the kernel. But as a module - it should be enabled when the
system admin loads the module.
> 
> > 
> > Seth
> > 
> 
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2013-02-05  0:39 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-01 20:22 [PATCH v2] Make frontswap+cleancache and its friend be modularized Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 01/15] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules Konrad Rzeszutek Wilk
2013-02-03  8:06   ` Ric Mason
2013-02-01 20:22 ` [PATCH 02/15] mm: frontswap: " Konrad Rzeszutek Wilk
2013-02-03  7:07   ` Ric Mason
2013-02-04  5:53     ` Bob Liu
2013-02-05  0:23       ` Ric Mason
2013-02-01 20:22 ` [PATCH 03/15] frontswap: Make frontswap_init use a pointer for the ops Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 04/15] cleancache: Make cleancache_init " Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 05/15] staging: zcache: enable ramster to be built/loaded as a module Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 06/15] staging: zcache: enable zcache " Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 07/15] xen: tmem: enable Xen tmem shim " Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 08/15] xen/tmem: Remove the subsys call Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 09/15] frontswap: Remove the check for frontswap_enabled Konrad Rzeszutek Wilk
2013-02-01 20:22 ` [PATCH 10/15] frontswap: Use static_key instead of frontswap_enabled and frontswap_ops Konrad Rzeszutek Wilk
2013-02-01 20:23 ` [PATCH 11/15] cleancache: Remove the check for cleancache_enabled Konrad Rzeszutek Wilk
2013-02-01 20:23 ` [PATCH 12/15] cleancache: Use static_key instead of cleancache_ops and cleancache_enabled Konrad Rzeszutek Wilk
2013-02-01 20:23 ` [PATCH 13/15] frontswap: Get rid of swap_lock dependency Konrad Rzeszutek Wilk
2013-02-01 20:23 ` [PATCH 14/15] zcache/tmem: Better error checking on frontswap_register_ops return value Konrad Rzeszutek Wilk
2013-02-01 20:23 ` [PATCH 15/15] xen/tmem: Add missing %s in the printk statement Konrad Rzeszutek Wilk
2013-02-03  8:52 ` [PATCH v2] Make frontswap+cleancache and its friend be modularized Ric Mason
2013-02-04 15:14   ` Seth Jennings
2013-02-05  0:21     ` Ric Mason
2013-02-05  0:38       ` Konrad Rzeszutek Wilk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).