linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] enable all tmem backends to be built and loaded as modules.
@ 2012-11-14 18:57 Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run " Konrad Rzeszutek Wilk
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir

There are also some patches I wrote that are based on this patchset;
I will post them soonish.

I hope these patches come out intact - I am travelling and not sure if
my git-send-email foo will make this work.

I am copying here what Dan mentioned, with some modifications by me:

 Since various parts of transcendent memory ("tmem") [1] were first posted in
 2009, reviewers have suggested that various tmem features should be built
 as a module and enabled by loading the module, rather than the current clunky
 method of compiling as a built-in and enabling via boot parameter.  Due
 to certain tmem initialization steps, that was not feasible at the time.
 
 [1] http://lwn.net/Articles/454795/
 
 This patchset allows each of the three merged transcendent memory
 backends (zcache, ramster, Xen tmem) to be used as modules by first
 enabling transcendent memory frontends (cleancache, frontswap) to deal
 with "lazy initialization" and, second, by adding the necessary code for
 the backends to be built and loaded as modules.
 
 The original mechanism to enable tmem backends -- namely to hardwire
 them into the kernel and select/enable one with a kernel boot
 parameter --  is retained but should be considered deprecated.  When
 backends are loaded as modules, certain knobs will now be
 properly selected via module_params rather than via undocumented
 kernel boot parameters.  Note that module UNloading is not yet
 supported as it is lower priority and will require significant
 additional work.
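 
 To illustrate the conversion (a minimal sketch only -- the "before"
 shape follows the existing Xen tmem shim, and the real conversions are
 in the patches below): a knob that is wired up today as an undocumented
 __setup() boot parameter becomes an ordinary, sysfs-visible module
 parameter:
 
	/* Before: built-in only, toggled by an undocumented boot option. */
	static bool use_frontswap = true;
	static int __init no_frontswap(char *s)
	{
		use_frontswap = false;
		return 1;
	}
	__setup("nofrontswap", no_frontswap);

	/* After: a module parameter, visible under
	 * /sys/module/<module>/parameters/ and settable at load time. */
	static bool use_frontswap = true;
	module_param(use_frontswap, bool, S_IRUGO);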
 
 The lazy initialization support is necessary because filesystems
 and swap devices are normally mounted early in boot, and these
 activities normally trigger tmem calls to set up certain data structures;
 if the respective cleancache/frontswap ops are not yet registered
 by a backend, the tmem setup would fail for these devices and
 cleancache/frontswap would never be enabled for them, which limits
 much of the value of tmem in many system configurations.  Lazy
 initialization records the necessary information in cleancache/frontswap
 data structures and "replays" it after the ops are registered
 to ensure that all filesystems and swap devices can benefit from
 the loaded tmem backend.
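 
 In skeleton form, the record-and-replay pattern looks roughly like this
 (an illustrative sketch only; the names are simplified and the real
 code is in patches 1 and 2):
 
	#define MAX_FS		32
	#define FS_NO_BACKEND	(-1)	/* mounted before a backend loaded */
	#define FS_UNKNOWN	(-2)	/* slot unused */

	static int pool_map[MAX_FS];	 /* fake poolid -> real poolid */
	static struct tmem_ops *backend; /* NULL until a backend registers */

	/* Mount time: hand out a fake poolid; defer real pool creation
	 * if no backend has registered yet. */
	static int lazy_init_fs(void)
	{
		int i;

		for (i = 0; i < MAX_FS; i++) {
			if (pool_map[i] != FS_UNKNOWN)
				continue;
			pool_map[i] = backend ?
				backend->init_fs(PAGE_SIZE) : FS_NO_BACKEND;
			return i;	/* fake poolid, stored in the sb */
		}
		return -1;		/* out of slots */
	}

	/* Backend load time: "replay" the deferred init_fs calls. */
	static void lazy_register(struct tmem_ops *ops)
	{
		int i;

		for (i = 0; i < MAX_FS; i++)
			if (pool_map[i] == FS_NO_BACKEND)
				pool_map[i] = ops->init_fs(PAGE_SIZE);
		backend = ops;
	}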
 
 [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
 [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules

 Patches 1 and 2 are the original [2] patches to cleancache and frontswap
 proposed by Erlangen University, rebased to 3.7-rcN, plus a couple of
 bug fixes I found necessary to run properly and extra review comments.
 
 [2] http://www.spinics.net/lists/linux-mm/msg31490.html

 The other two:
 [PATCH 3/8] frontswap: Make frontswap_init use a pointer for the ops
 [PATCH 4/8] cleancache: Make cleancache_init use a pointer for the ops
 do a bit of code cleanup to make the code easier to read and also
 remove some of the bools.
 
 [PATCH 5/8] staging: zcache2+ramster: enable ramster to be built/loaded as a module
 Enables module support for ramster.  Ramster will now be
 enabled via a module_param to zcache2.
 
 [PATCH 6/8] staging: zcache2+ramster: enable zcache2 to be built/loaded as a module
 Enables module support for zcache2.  Zsmalloc support
 has not yet been merged into zcache2 but, once merged, could now
 easily be selected via a module_param.
 
 [PATCH 7/8] xen: tmem: enable Xen tmem shim to be built/loaded as a module
 [PATCH 8/8] xen/tmem: Remove the subsys call.
 Enables module support for the Xen tmem shim.  Xen self-ballooning and
 frontswap-selfshrinking are also "lazily" initialized when the Xen tmem
 shim is loaded as a module, unless explicitly disabled by module_params.
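 
 (Illustrative only, not code from the patches: the shim's knobs would
 take roughly this hypothetical form,
 
	static bool selfballooning = true;	/* names hypothetical */
	module_param(selfballooning, bool, S_IRUGO);
	static bool selfshrinking = true;
	module_param(selfshrinking, bool, S_IRUGO);
 
 so that both features default to on when the module is loaded and can
 be switched off at modprobe time.)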


Dan Magenheimer (5):
      mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
      mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
      staging: zcache2+ramster: enable ramster to be built/loaded as a module
      staging: zcache2+ramster: enable zcache2 to be built/loaded as a module
      xen: tmem: enable Xen tmem shim to be built/loaded as a module

Konrad Rzeszutek Wilk (3):
      frontswap: Make frontswap_init use a pointer for the ops.
      cleancache: Make cleancache_init use a pointer for the ops
      xen/tmem: Remove the subsys call.


 drivers/staging/ramster/Kconfig                    |    6 +-
 drivers/staging/ramster/Makefile                   |   11 +-
 drivers/staging/ramster/ramster.h                  |    6 +-
 drivers/staging/ramster/ramster/nodemanager.c      |    9 +-
 drivers/staging/ramster/ramster/ramster.c          |   29 +++-
 drivers/staging/ramster/ramster/ramster.h          |    2 +-
 .../staging/ramster/ramster/ramster_nodemanager.h  |    2 +
 drivers/staging/ramster/tmem.c                     |    6 +-
 drivers/staging/ramster/tmem.h                     |    8 +-
 drivers/staging/ramster/zcache-main.c              |   63 ++++++--
 drivers/staging/ramster/zcache.h                   |    2 +-
 drivers/staging/zcache/zcache-main.c               |   16 +-
 drivers/xen/Kconfig                                |    4 +-
 drivers/xen/tmem.c                                 |   50 ++++--
 drivers/xen/xen-selfballoon.c                      |   13 +-
 include/linux/cleancache.h                         |    2 +-
 include/linux/frontswap.h                          |    2 +-
 include/xen/tmem.h                                 |    8 +
 mm/cleancache.c                                    |  167 +++++++++++++++++---
 mm/frontswap.c                                     |   78 +++++++--
 20 files changed, 371 insertions(+), 113 deletions(-)


* [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-16 23:10   ` Andrew Morton
  2012-11-14 18:57 ` [PATCH 2/8] mm: frontswap: " Konrad Rzeszutek Wilk
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

From: Dan Magenheimer <dan.magenheimer@oracle.com>

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
cleancache even after filesystems have been mounted. Calls to init_fs and
init_shared_fs are remembered as fake poolids, but no real tmem_pools are
created. On backend registration the fake poolids are mapped to real poolids
and respective tmem_pools.

Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
Signed-off-by: Andor Daam <andor.daam@googlemail.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Minor fixes: used #define for some values and bools]
[v2: Removed CLEANCACHE_HAS_LAZY_INIT]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 mm/cleancache.c |  156 +++++++++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 139 insertions(+), 17 deletions(-)

diff --git a/mm/cleancache.c b/mm/cleancache.c
index 32e6f41..318a0ad 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -45,15 +45,45 @@ static u64 cleancache_puts;
 static u64 cleancache_invalidates;
 
 /*
+ * When no backend is registered all calls to init_fs and init_shared_fs
+ * are recorded and fake poolids are given to the respective
+ * super block but no tmem_pools are created. When a backend
+ * registers with cleancache the previous calls to init_fs and
+ * init_shared_fs are executed to create tmem_pools and set the
+ * respective poolids. While no backend is registered all "puts",
+ * "gets" and "flushes" are ignored or fail.
+ */
+#define MAX_INITIALIZABLE_FS 32
+#define FAKE_FS_POOLID_OFFSET 1000
+#define FAKE_SHARED_FS_POOLID_OFFSET 2000
+
+#define FS_NO_BACKEND (-1)
+#define FS_UNKNOWN (-2)
+static int fs_poolid_map[MAX_INITIALIZABLE_FS];
+static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
+
+static char *uuids[MAX_INITIALIZABLE_FS];
+static bool __read_mostly backend_registered;
+
+/*
  * register operations for cleancache, returning previous thus allowing
  * detection of multiple backends and possible nesting
  */
 struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
 {
 	struct cleancache_ops old = cleancache_ops;
+	int i;
 
 	cleancache_ops = *ops;
-	cleancache_enabled = 1;
+
+	backend_registered = true;
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (fs_poolid_map[i] == FS_NO_BACKEND)
+			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
+			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+					(uuids[i], PAGE_SIZE);
+	}
 	return old;
 }
 EXPORT_SYMBOL(cleancache_register_ops);
@@ -61,15 +91,38 @@ EXPORT_SYMBOL(cleancache_register_ops);
 /* Called by a cleancache-enabled filesystem at time of mount */
 void __cleancache_init_fs(struct super_block *sb)
 {
-	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
+	int i;
+
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (fs_poolid_map[i] == FS_UNKNOWN) {
+			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
+			if (backend_registered)
+				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			else
+				fs_poolid_map[i] = FS_NO_BACKEND;
+			break;
+		}
+	}
 }
 EXPORT_SYMBOL(__cleancache_init_fs);
 
 /* Called by a cleancache-enabled clustered filesystem at time of mount */
 void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
 {
-	sb->cleancache_poolid =
-		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
+	int i;
+
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
+			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
+			uuids[i] = uuid;
+			if (backend_registered)
+				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+						(uuid, PAGE_SIZE);
+			else
+				shared_fs_poolid_map[i] = FS_NO_BACKEND;
+			break;
+		}
+	}
 }
 EXPORT_SYMBOL(__cleancache_init_shared_fs);
 
@@ -99,6 +152,19 @@ static int cleancache_get_key(struct inode *inode,
 }
 
 /*
+ * Returns a pool_id that is associated with a given fake poolid.
+ */
+static int get_poolid_from_fake(int fake_pool_id)
+{
+	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
+		return shared_fs_poolid_map[fake_pool_id -
+			FAKE_SHARED_FS_POOLID_OFFSET];
+	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
+		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
+	return FS_NO_BACKEND;
+}
+
+/*
  * "Get" data from cleancache associated with the poolid/inode/index
  * that were specified when the data was put to cleancache and, if
  * successful, use it to fill the specified page with data and return 0.
@@ -109,17 +175,26 @@ int __cleancache_get_page(struct page *page)
 {
 	int ret = -1;
 	int pool_id;
+	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered) {
+		cleancache_failed_gets++;
+		goto out;
+	}
+
 	VM_BUG_ON(!PageLocked(page));
-	pool_id = page->mapping->host->i_sb->cleancache_poolid;
-	if (pool_id < 0)
+	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	if (fake_pool_id < 0)
 		goto out;
+	pool_id = get_poolid_from_fake(fake_pool_id);
 
 	if (cleancache_get_key(page->mapping->host, &key) < 0)
 		goto out;
 
-	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
+	if (pool_id >= 0)
+		ret = (*cleancache_ops.get_page)(pool_id,
+				key, page->index, page);
 	if (ret == 0)
 		cleancache_succ_gets++;
 	else
@@ -138,12 +213,23 @@ EXPORT_SYMBOL(__cleancache_get_page);
 void __cleancache_put_page(struct page *page)
 {
 	int pool_id;
+	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered) {
+		cleancache_puts++;
+		return;
+	}
+
 	VM_BUG_ON(!PageLocked(page));
-	pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	if (fake_pool_id < 0)
+		return;
+
+	pool_id = get_poolid_from_fake(fake_pool_id);
+
 	if (pool_id >= 0 &&
-	      cleancache_get_key(page->mapping->host, &key) >= 0) {
+		cleancache_get_key(page->mapping->host, &key) >= 0) {
 		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
 		cleancache_puts++;
 	}
@@ -158,14 +244,22 @@ void __cleancache_invalidate_page(struct address_space *mapping,
 					struct page *page)
 {
 	/* careful... page->mapping is NULL sometimes when this is called */
-	int pool_id = mapping->host->i_sb->cleancache_poolid;
+	int pool_id;
+	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (pool_id >= 0) {
+	if (!backend_registered)
+		return;
+
+	if (fake_pool_id >= 0) {
+		pool_id = get_poolid_from_fake(fake_pool_id);
+		if (pool_id < 0)
+			return;
+
 		VM_BUG_ON(!PageLocked(page));
 		if (cleancache_get_key(mapping->host, &key) >= 0) {
 			(*cleancache_ops.invalidate_page)(pool_id,
-							  key, page->index);
+					key, page->index);
 			cleancache_invalidates++;
 		}
 	}
@@ -179,9 +273,18 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
  */
 void __cleancache_invalidate_inode(struct address_space *mapping)
 {
-	int pool_id = mapping->host->i_sb->cleancache_poolid;
+	int pool_id;
+	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered)
+		return;
+
+	if (fake_pool_id < 0)
+		return;
+
+	pool_id = get_poolid_from_fake(fake_pool_id);
+
 	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
 		(*cleancache_ops.invalidate_inode)(pool_id, key);
 }
@@ -194,16 +297,30 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
  */
 void __cleancache_invalidate_fs(struct super_block *sb)
 {
-	if (sb->cleancache_poolid >= 0) {
-		int old_poolid = sb->cleancache_poolid;
-		sb->cleancache_poolid = -1;
-		(*cleancache_ops.invalidate_fs)(old_poolid);
+	int index;
+	int fake_pool_id = sb->cleancache_poolid;
+	int old_poolid = fake_pool_id;
+
+	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
+		index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
+		old_poolid = shared_fs_poolid_map[index];
+		shared_fs_poolid_map[index] = FS_UNKNOWN;
+		uuids[index] = NULL;
+	} else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
+		index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
+		old_poolid = fs_poolid_map[index];
+		fs_poolid_map[index] = FS_UNKNOWN;
 	}
+	sb->cleancache_poolid = -1;
+	if (backend_registered)
+		(*cleancache_ops.invalidate_fs)(old_poolid);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_fs);
 
 static int __init init_cleancache(void)
 {
+	int i;
+
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *root = debugfs_create_dir("cleancache", NULL);
 	if (root == NULL)
@@ -215,6 +332,11 @@ static int __init init_cleancache(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &cleancache_invalidates);
 #endif
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		fs_poolid_map[i] = FS_UNKNOWN;
+		shared_fs_poolid_map[i] = FS_UNKNOWN;
+	}
+	cleancache_enabled = 1;
 	return 0;
 }
 module_init(init_cleancache)
-- 
1.7.7.6



* [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run " Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-16 23:16   ` Andrew Morton
  2012-11-14 18:57 ` [PATCH 3/8] frontswap: Make frontswap_init use a pointer for the ops Konrad Rzeszutek Wilk
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

From: Dan Magenheimer <dan.magenheimer@oracle.com>

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
frontswap even after swapon has been run. Before a backend registers, all
calls to init are recorded and the creation of tmem_pools is delayed until
a backend registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
Signed-off-by: Andor Daam <andor.daam@googlemail.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Fixes per Seth Jennings suggestions]
[v2: Removed FRONTSWAP_HAS_.. ]
[v3: Fix up per Bob Liu <lliubbo@gmail.com> recommendations]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 mm/frontswap.c |   66 +++++++++++++++++++++++++++++++++++++++++++++++--------
 1 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..ba58157 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
 static inline void inc_frontswap_failed_stores(void) { }
 static inline void inc_frontswap_invalidates(void) { }
 #endif
+
+/*
+ * When no backend is registered all calls to init are recorded and
+ * remembered but fail to create tmem_pools. When a backend registers with
+ * frontswap the previous calls to init are executed to create tmem_pools
+ * and set the respective poolids.
+ * While no backend is registered all "puts", "gets" and "flushes" are
+ * ignored or fail.
+ */
+static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
+static bool backend_registered __read_mostly;
+
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
@@ -87,9 +99,19 @@ static inline void inc_frontswap_invalidates(void) { }
 struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
 {
 	struct frontswap_ops old = frontswap_ops;
+	int i;
 
 	frontswap_ops = *ops;
 	frontswap_enabled = true;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		if (test_and_clear_bit(i, need_init))
+			(*frontswap_ops.init)(i);
+	}
+	/* We MUST have backend_registered set _after_ the frontswap_init's
+ 	 * have been called. Otherwise __frontswap_store might fail. */
+	barrier();
+	backend_registered = true;
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -119,10 +141,17 @@ void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.init(type);
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.init)(type);
+	}
+	else {
+		BUG_ON(type >= MAX_SWAPFILES);
+		set_bit(type, need_init);
+	}
+
 }
 EXPORT_SYMBOL(__frontswap_init);
 
@@ -147,6 +176,11 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
+	if (!backend_registered) {
+		inc_frontswap_failed_stores();
+		return ret;
+	}
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -186,6 +220,9 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
+	if (!backend_registered)
+		return ret;
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -209,6 +246,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
+	if (!backend_registered)
+		return;
+
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
 		frontswap_ops.invalidate_page(type, offset);
@@ -226,12 +266,15 @@ void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.invalidate_area(type);
-	atomic_set(&sis->frontswap_pages, 0);
-	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.invalidate_area)(type);
+		atomic_set(&sis->frontswap_pages, 0);
+		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	}
+	clear_bit(type, need_init);
 }
 EXPORT_SYMBOL(__frontswap_invalidate_area);
 
@@ -364,6 +407,9 @@ static int __init init_frontswap(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &frontswap_invalidates);
 #endif
+	bitmap_zero(need_init, MAX_SWAPFILES);
+
+	frontswap_enabled = 1;
 	return 0;
 }
 
-- 
1.7.7.6



* [PATCH 3/8] frontswap: Make frontswap_init use a pointer for the ops.
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run " Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 2/8] mm: frontswap: " Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 4/8] cleancache: Make cleancache_init " Konrad Rzeszutek Wilk
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

This simplifies the frontswap code - we can get rid
of the 'backend_registered' test and instead check
frontswap_ops directly.
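
Distilled, the change is (a sketch, not the diff itself; the helper
name here is made up for illustration):

	/* The ops pointer doubles as the "backend registered?" flag:
	 * it stays NULL until frontswap_register_ops() publishes it. */
	static struct frontswap_ops *frontswap_ops __read_mostly;

	static inline bool frontswap_backend_present(void)
	{
		return frontswap_ops != NULL;	/* was: backend_registered */
	}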

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/ramster/zcache-main.c |    8 +++---
 drivers/staging/zcache/zcache-main.c  |    8 +++---
 drivers/xen/tmem.c                    |    6 ++--
 include/linux/frontswap.h             |    2 +-
 mm/frontswap.c                        |   34 +++++++++++++++-----------------
 5 files changed, 28 insertions(+), 30 deletions(-)

diff --git a/drivers/staging/ramster/zcache-main.c b/drivers/staging/ramster/zcache-main.c
index a09dd5c..6c8959d 100644
--- a/drivers/staging/ramster/zcache-main.c
+++ b/drivers/staging/ramster/zcache-main.c
@@ -1626,9 +1626,9 @@ static struct frontswap_ops zcache_frontswap_ops = {
 	.init = zcache_frontswap_init
 };
 
-struct frontswap_ops zcache_frontswap_register_ops(void)
+struct frontswap_ops *zcache_frontswap_register_ops(void)
 {
-	struct frontswap_ops old_ops =
+	struct frontswap_ops *old_ops =
 		frontswap_register_ops(&zcache_frontswap_ops);
 
 	return old_ops;
@@ -1795,7 +1795,7 @@ static int __init zcache_init(void)
 			pr_warn("%s: cleancache_ops overridden\n", namestr);
 	}
 	if (zcache_enabled && !disable_frontswap) {
-		struct frontswap_ops old_ops;
+		struct frontswap_ops *old_ops;
 
 		old_ops = zcache_frontswap_register_ops();
 		if (frontswap_has_exclusive_gets)
@@ -1807,7 +1807,7 @@ static int __init zcache_init(void)
 			namestr, frontswap_has_exclusive_gets,
 			!disable_frontswap_ignore_nonactive);
 #endif
-		if (old_ops.init != NULL)
+		if (old_ops)
 			pr_warn("%s: frontswap_ops overridden\n", namestr);
 	}
 	if (ramster_enabled)
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 52b43b7..3db38cb 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1919,9 +1919,9 @@ static struct frontswap_ops zcache_frontswap_ops = {
 	.init = zcache_frontswap_init
 };
 
-struct frontswap_ops zcache_frontswap_register_ops(void)
+struct frontswap_ops *zcache_frontswap_register_ops(void)
 {
-	struct frontswap_ops old_ops =
+	struct frontswap_ops *old_ops =
 		frontswap_register_ops(&zcache_frontswap_ops);
 
 	return old_ops;
@@ -2061,12 +2061,12 @@ static int __init zcache_init(void)
 #endif
 #ifdef CONFIG_FRONTSWAP
 	if (zcache_enabled && use_frontswap) {
-		struct frontswap_ops old_ops;
+		struct frontswap_ops *old_ops;
 
 		old_ops = zcache_frontswap_register_ops();
 		pr_info("zcache: frontswap enabled using kernel "
 			"transcendent memory and zsmalloc\n");
-		if (old_ops.init != NULL)
+		if (old_ops)
 			pr_warning("zcache: frontswap_ops overridden");
 	}
 #endif
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 144564e..4b02c07 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -362,7 +362,7 @@ static int __init no_frontswap(char *s)
 }
 __setup("nofrontswap", no_frontswap);
 
-static struct frontswap_ops __initdata tmem_frontswap_ops = {
+static struct frontswap_ops tmem_frontswap_ops = {
 	.store = tmem_frontswap_store,
 	.load = tmem_frontswap_load,
 	.invalidate_page = tmem_frontswap_flush_page,
@@ -378,11 +378,11 @@ static int __init xen_tmem_init(void)
 #ifdef CONFIG_FRONTSWAP
 	if (tmem_enabled && use_frontswap) {
 		char *s = "";
-		struct frontswap_ops old_ops =
+		struct frontswap_ops *old_ops =
 			frontswap_register_ops(&tmem_frontswap_ops);
 
 		tmem_frontswap_poolid = -1;
-		if (old_ops.init != NULL)
+		if (old_ops)
 			s = " (WARNING: frontswap_ops overridden)";
 		printk(KERN_INFO "frontswap enabled, RAM provided by "
 				 "Xen Transcendent Memory\n");
diff --git a/include/linux/frontswap.h b/include/linux/frontswap.h
index 3044254..d4f2987 100644
--- a/include/linux/frontswap.h
+++ b/include/linux/frontswap.h
@@ -14,7 +14,7 @@ struct frontswap_ops {
 };
 
 extern bool frontswap_enabled;
-extern struct frontswap_ops
+extern struct frontswap_ops *
 	frontswap_register_ops(struct frontswap_ops *ops);
 extern void frontswap_shrink(unsigned long);
 extern unsigned long frontswap_curr_pages(void);
diff --git a/mm/frontswap.c b/mm/frontswap.c
index ba58157..e73dd23 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -24,7 +24,7 @@
  * frontswap_ops is set by frontswap_register_ops to contain the pointers
  * to the frontswap "backend" implementation functions.
  */
-static struct frontswap_ops frontswap_ops __read_mostly;
+static struct frontswap_ops *frontswap_ops __read_mostly;
 
 /*
  * This global enablement flag reduces overhead on systems where frontswap_ops
@@ -90,28 +90,26 @@ static inline void inc_frontswap_invalidates(void) { }
  * ignored or fail.
  */
 static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
-static bool backend_registered __read_mostly;
 
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
  */
-struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
+struct frontswap_ops *frontswap_register_ops(struct frontswap_ops *ops)
 {
-	struct frontswap_ops old = frontswap_ops;
+	struct frontswap_ops *old = frontswap_ops;
 	int i;
 
-	frontswap_ops = *ops;
 	frontswap_enabled = true;
 
 	for (i = 0; i < MAX_SWAPFILES; i++) {
 		if (test_and_clear_bit(i, need_init))
-			(*frontswap_ops.init)(i);
+			ops->init(i);
 	}
-	/* We MUST have backend_registered set _after_ the frontswap_init's
+	/* We MUST have frontswap_ops set _after_ the frontswap_init's
  	 * have been called. Otherwise __frontswap_store might fail. */
 	barrier();
-	backend_registered = true;
+	frontswap_ops = ops;
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -141,11 +139,11 @@ void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (backend_registered) {
+	if (frontswap_ops) {
 		BUG_ON(sis == NULL);
 		if (sis->frontswap_map == NULL)
 			return;
-		(*frontswap_ops.init)(type);
+		frontswap_ops->init(type);
 	}
 	else {
 		BUG_ON(type >= MAX_SWAPFILES);
@@ -176,7 +174,7 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
-	if (!backend_registered) {
+	if (!frontswap_ops) {
 		inc_frontswap_failed_stores();
 		return ret;
 	}
@@ -185,7 +183,7 @@ int __frontswap_store(struct page *page)
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
 		dup = 1;
-	ret = frontswap_ops.store(type, offset, page);
+	ret = frontswap_ops->store(type, offset, page);
 	if (ret == 0) {
 		frontswap_set(sis, offset);
 		inc_frontswap_succ_stores();
@@ -220,13 +218,13 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
-	if (!backend_registered)
+	if (!frontswap_ops)
 		return ret;
 
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
-		ret = frontswap_ops.load(type, offset, page);
+		ret = frontswap_ops->load(type, offset, page);
 	if (ret == 0) {
 		inc_frontswap_loads();
 		if (frontswap_tmem_exclusive_gets_enabled) {
@@ -246,12 +244,12 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (!backend_registered)
+	if (!frontswap_ops)
 		return;
 
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
-		frontswap_ops.invalidate_page(type, offset);
+		frontswap_ops->invalidate_page(type, offset);
 		__frontswap_clear(sis, offset);
 		inc_frontswap_invalidates();
 	}
@@ -266,11 +264,11 @@ void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	if (backend_registered) {
+	if (frontswap_ops) {
 		BUG_ON(sis == NULL);
 		if (sis->frontswap_map == NULL)
 			return;
-		(*frontswap_ops.invalidate_area)(type);
+		frontswap_ops->invalidate_area(type);
 		atomic_set(&sis->frontswap_pages, 0);
 		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
 	}
-- 
1.7.7.6



* [PATCH 4/8] cleancache: Make cleancache_init use a pointer for the ops
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
                   ` (2 preceding siblings ...)
  2012-11-14 18:57 ` [PATCH 3/8] frontswap: Make frontswap_init use a pointer for the ops Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 5/8] staging: zcache2+ramster: enable ramster to be built/loaded as a module Konrad Rzeszutek Wilk
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

Instead of using backend_registered to determine whether
a backend is enabled, use a pointer to the ops. This allows us to remove
the backend_registered check and just do 'if (cleancache_ops)'.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/ramster/zcache-main.c |    8 +++---
 drivers/staging/zcache/zcache-main.c  |    8 +++---
 drivers/xen/tmem.c                    |    6 ++--
 include/linux/cleancache.h            |    2 +-
 mm/cleancache.c                       |   43 +++++++++++++++-----------------
 5 files changed, 32 insertions(+), 35 deletions(-)

diff --git a/drivers/staging/ramster/zcache-main.c b/drivers/staging/ramster/zcache-main.c
index 6c8959d..ed99170 100644
--- a/drivers/staging/ramster/zcache-main.c
+++ b/drivers/staging/ramster/zcache-main.c
@@ -1495,9 +1495,9 @@ static struct cleancache_ops zcache_cleancache_ops = {
 	.init_fs = zcache_cleancache_init_fs
 };
 
-struct cleancache_ops zcache_cleancache_register_ops(void)
+struct cleancache_ops *zcache_cleancache_register_ops(void)
 {
-	struct cleancache_ops old_ops =
+	struct cleancache_ops *old_ops =
 		cleancache_register_ops(&zcache_cleancache_ops);
 
 	return old_ops;
@@ -1781,7 +1781,7 @@ static int __init zcache_init(void)
 	}
 	zbud_init();
 	if (zcache_enabled && !disable_cleancache) {
-		struct cleancache_ops old_ops;
+		struct cleancache_ops *old_ops;
 
 		register_shrinker(&zcache_shrinker);
 		old_ops = zcache_cleancache_register_ops();
@@ -1791,7 +1791,7 @@ static int __init zcache_init(void)
 		pr_info("%s: cleancache: ignorenonactive = %d\n",
 			namestr, !disable_cleancache_ignore_nonactive);
 #endif
-		if (old_ops.init_fs != NULL)
+		if (old_ops)
 			pr_warn("%s: cleancache_ops overridden\n", namestr);
 	}
 	if (zcache_enabled && !disable_frontswap) {
diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 3db38cb..f9ab874 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1811,9 +1811,9 @@ static struct cleancache_ops zcache_cleancache_ops = {
 	.init_fs = zcache_cleancache_init_fs
 };
 
-struct cleancache_ops zcache_cleancache_register_ops(void)
+struct cleancache_ops *zcache_cleancache_register_ops(void)
 {
-	struct cleancache_ops old_ops =
+	struct cleancache_ops *old_ops =
 		cleancache_register_ops(&zcache_cleancache_ops);
 
 	return old_ops;
@@ -2048,14 +2048,14 @@ static int __init zcache_init(void)
 
 #ifdef CONFIG_CLEANCACHE
 	if (zcache_enabled && use_cleancache) {
-		struct cleancache_ops old_ops;
+		struct cleancache_ops *old_ops;
 
 		zbud_init();
 		register_shrinker(&zcache_shrinker);
 		old_ops = zcache_cleancache_register_ops();
 		pr_info("zcache: cleancache enabled using kernel "
 			"transcendent memory and compression buddies\n");
-		if (old_ops.init_fs != NULL)
+		if (old_ops)
 			pr_warning("zcache: cleancache_ops overridden");
 	}
 #endif
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 4b02c07..15e776c 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -236,7 +236,7 @@ static int __init no_cleancache(char *s)
 }
 __setup("nocleancache", no_cleancache);
 
-static struct cleancache_ops __initdata tmem_cleancache_ops = {
+static struct cleancache_ops tmem_cleancache_ops = {
 	.put_page = tmem_cleancache_put_page,
 	.get_page = tmem_cleancache_get_page,
 	.invalidate_page = tmem_cleancache_flush_page,
@@ -392,9 +392,9 @@ static int __init xen_tmem_init(void)
 	BUG_ON(sizeof(struct cleancache_filekey) != sizeof(struct tmem_oid));
 	if (tmem_enabled && use_cleancache) {
 		char *s = "";
-		struct cleancache_ops old_ops =
+		struct cleancache_ops *old_ops =
 			cleancache_register_ops(&tmem_cleancache_ops);
-		if (old_ops.init_fs != NULL)
+		if (old_ops)
 			s = " (WARNING: cleancache_ops overridden)";
 		printk(KERN_INFO "cleancache enabled, RAM provided by "
 				 "Xen Transcendent Memory%s\n", s);
diff --git a/include/linux/cleancache.h b/include/linux/cleancache.h
index 42e55de..3af5ea8 100644
--- a/include/linux/cleancache.h
+++ b/include/linux/cleancache.h
@@ -33,7 +33,7 @@ struct cleancache_ops {
 	void (*invalidate_fs)(int);
 };
 
-extern struct cleancache_ops
+extern struct cleancache_ops *
 	cleancache_register_ops(struct cleancache_ops *ops);
 extern void __cleancache_init_fs(struct super_block *);
 extern void __cleancache_init_shared_fs(char *, struct super_block *);
diff --git a/mm/cleancache.c b/mm/cleancache.c
index 318a0ad..95f6618 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -32,7 +32,7 @@ EXPORT_SYMBOL(cleancache_enabled);
  * cleancache_ops is set by cleancache_register_ops to contain the pointers
  * to the cleancache "backend" implementation functions.
  */
-static struct cleancache_ops cleancache_ops __read_mostly;
+static struct cleancache_ops *cleancache_ops __read_mostly;
 
 /*
  * Counters available via /sys/kernel/debug/cleancache (if debugfs is
@@ -63,27 +63,24 @@ static int fs_poolid_map[MAX_INITIALIZABLE_FS];
 static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
 
 static char *uuids[MAX_INITIALIZABLE_FS];
-static bool __read_mostly backend_registered;
 
 /*
  * register operations for cleancache, returning previous thus allowing
  * detection of multiple backends and possible nesting
  */
-struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
+struct cleancache_ops *cleancache_register_ops(struct cleancache_ops *ops)
 {
-	struct cleancache_ops old = cleancache_ops;
+	struct cleancache_ops *old = cleancache_ops;
 	int i;
 
-	cleancache_ops = *ops;
-
-	backend_registered = true;
 	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
 		if (fs_poolid_map[i] == FS_NO_BACKEND)
-			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			fs_poolid_map[i] = ops->init_fs(PAGE_SIZE);
 		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
-			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+			shared_fs_poolid_map[i] = ops->init_shared_fs
 					(uuids[i], PAGE_SIZE);
 	}
+	cleancache_ops = ops;
 	return old;
 }
 EXPORT_SYMBOL(cleancache_register_ops);
@@ -96,8 +93,8 @@ void __cleancache_init_fs(struct super_block *sb)
 	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
 		if (fs_poolid_map[i] == FS_UNKNOWN) {
 			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
-			if (backend_registered)
-				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			if (cleancache_ops)
+				fs_poolid_map[i] = cleancache_ops->init_fs(PAGE_SIZE);
 			else
 				fs_poolid_map[i] = FS_NO_BACKEND;
 			break;
@@ -115,8 +112,8 @@ void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
 		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
 			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
 			uuids[i] = uuid;
-			if (backend_registered)
-				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+			if (cleancache_ops)
+				shared_fs_poolid_map[i] = cleancache_ops->init_shared_fs
 						(uuid, PAGE_SIZE);
 			else
 				shared_fs_poolid_map[i] = FS_NO_BACKEND;
@@ -178,7 +175,7 @@ int __cleancache_get_page(struct page *page)
 	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered) {
+	if (!cleancache_ops) {
 		cleancache_failed_gets++;
 		goto out;
 	}
@@ -193,7 +190,7 @@ int __cleancache_get_page(struct page *page)
 		goto out;
 
 	if (pool_id >= 0)
-		ret = (*cleancache_ops.get_page)(pool_id,
+		ret = cleancache_ops->get_page(pool_id,
 				key, page->index, page);
 	if (ret == 0)
 		cleancache_succ_gets++;
@@ -216,7 +213,7 @@ void __cleancache_put_page(struct page *page)
 	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered) {
+	if (!cleancache_ops) {
 		cleancache_puts++;
 		return;
 	}
@@ -230,7 +227,7 @@ void __cleancache_put_page(struct page *page)
 
 	if (pool_id >= 0 &&
 		cleancache_get_key(page->mapping->host, &key) >= 0) {
-		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
+		cleancache_ops->put_page(pool_id, key, page->index, page);
 		cleancache_puts++;
 	}
 }
@@ -248,7 +245,7 @@ void __cleancache_invalidate_page(struct address_space *mapping,
 	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered)
+	if (!cleancache_ops)
 		return;
 
 	if (fake_pool_id >= 0) {
@@ -258,7 +255,7 @@ void __cleancache_invalidate_page(struct address_space *mapping,
 
 		VM_BUG_ON(!PageLocked(page));
 		if (cleancache_get_key(mapping->host, &key) >= 0) {
-			(*cleancache_ops.invalidate_page)(pool_id,
+			cleancache_ops->invalidate_page(pool_id,
 					key, page->index);
 			cleancache_invalidates++;
 		}
@@ -277,7 +274,7 @@ void __cleancache_invalidate_inode(struct address_space *mapping)
 	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (!backend_registered)
+	if (!cleancache_ops)
 		return;
 
 	if (fake_pool_id < 0)
@@ -286,7 +283,7 @@ void __cleancache_invalidate_inode(struct address_space *mapping)
 	pool_id = get_poolid_from_fake(fake_pool_id);
 
 	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
-		(*cleancache_ops.invalidate_inode)(pool_id, key);
+		cleancache_ops->invalidate_inode(pool_id, key);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_inode);
 
@@ -312,8 +309,8 @@ void __cleancache_invalidate_fs(struct super_block *sb)
 		fs_poolid_map[index] = FS_UNKNOWN;
 	}
 	sb->cleancache_poolid = -1;
-	if (backend_registered)
-		(*cleancache_ops.invalidate_fs)(old_poolid);
+	if (cleancache_ops)
+		cleancache_ops->invalidate_fs(old_poolid);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_fs);
 
-- 
1.7.7.6



* [PATCH 5/8] staging: zcache2+ramster: enable ramster to be built/loaded as a module
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
                   ` (3 preceding siblings ...)
  2012-11-14 18:57 ` [PATCH 4/8] cleancache: Make cleancache_init " Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 6/8] staging: zcache2+ramster: enable zcache2 " Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

From: Dan Magenheimer <dan.magenheimer@oracle.com>

Enable module support for ramster.  Note that a runtime dependency
disallows loading if the cleancache/frontswap lazy-initialization
patches are not present.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Fixed compile issues since ramster_init now has four arguments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/ramster/ramster.h                  |    6 +++-
 drivers/staging/ramster/ramster/nodemanager.c      |    9 +++---
 drivers/staging/ramster/ramster/ramster.c          |   29 ++++++++++++++++---
 drivers/staging/ramster/ramster/ramster.h          |    2 +-
 .../staging/ramster/ramster/ramster_nodemanager.h  |    2 +
 drivers/staging/ramster/zcache-main.c              |    2 +-
 6 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/drivers/staging/ramster/ramster.h b/drivers/staging/ramster/ramster.h
index 1b71aea..e1f91d5 100644
--- a/drivers/staging/ramster/ramster.h
+++ b/drivers/staging/ramster/ramster.h
@@ -11,10 +11,14 @@
 #ifndef _ZCACHE_RAMSTER_H_
 #define _ZCACHE_RAMSTER_H_
 
+#ifdef CONFIG_RAMSTER_MODULE
+#define CONFIG_RAMSTER
+#endif
+
 #ifdef CONFIG_RAMSTER
 #include "ramster/ramster.h"
 #else
-static inline void ramster_init(bool x, bool y, bool z)
+static inline void ramster_init(bool x, bool y, bool z, bool w)
 {
 }
 
diff --git a/drivers/staging/ramster/ramster/nodemanager.c b/drivers/staging/ramster/ramster/nodemanager.c
index c0f4815..2cfe933 100644
--- a/drivers/staging/ramster/ramster/nodemanager.c
+++ b/drivers/staging/ramster/ramster/nodemanager.c
@@ -949,7 +949,7 @@ static void __exit exit_r2nm(void)
 	r2hb_exit();
 }
 
-static int __init init_r2nm(void)
+int r2nm_init(void)
 {
 	int ret = -1;
 
@@ -986,10 +986,11 @@ out_r2hb:
 out:
 	return ret;
 }
+EXPORT_SYMBOL_GPL(r2nm_init);
 
 MODULE_AUTHOR("Oracle");
 MODULE_LICENSE("GPL");
 
-/* module_init(init_r2nm) */
-late_initcall(init_r2nm);
-/* module_exit(exit_r2nm) */
+#ifndef CONFIG_RAMSTER_MODULE
+late_initcall(r2nm_init);
+#endif
diff --git a/drivers/staging/ramster/ramster/ramster.c b/drivers/staging/ramster/ramster/ramster.c
index c06709f..491ec70 100644
--- a/drivers/staging/ramster/ramster/ramster.c
+++ b/drivers/staging/ramster/ramster/ramster.c
@@ -92,7 +92,7 @@ static unsigned long ramster_remote_page_flushes_failed;
 #include <linux/debugfs.h>
 #define	zdfs	debugfs_create_size_t
 #define	zdfs64	debugfs_create_u64
-static int __init ramster_debugfs_init(void)
+static int ramster_debugfs_init(void)
 {
 	struct dentry *root = debugfs_create_dir("ramster", NULL);
 	if (root == NULL)
@@ -191,6 +191,7 @@ int ramster_do_preload_flnode(struct tmem_pool *pool)
 		kmem_cache_free(ramster_flnode_cache, flnode);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(ramster_do_preload_flnode);
 
 /*
  * Called by the message handler after a (still compressed) page has been
@@ -458,6 +459,7 @@ void *ramster_pampd_free(void *pampd, struct tmem_pool *pool,
 	}
 	return local_pampd;
 }
+EXPORT_SYMBOL_GPL(ramster_pampd_free);
 
 void ramster_count_foreign_pages(bool eph, int count)
 {
@@ -489,6 +491,7 @@ void ramster_count_foreign_pages(bool eph, int count)
 		ramster_foreign_pers_pages = c;
 	}
 }
+EXPORT_SYMBOL_GPL(ramster_count_foreign_pages);
 
 /*
  * For now, just push over a few pages every few seconds to
@@ -674,7 +677,7 @@ requeue:
 	ramster_remotify_queue_delayed_work(HZ);
 }
 
-void __init ramster_remotify_init(void)
+void ramster_remotify_init(void)
 {
 	unsigned long n = 60UL;
 	ramster_remotify_workqueue =
@@ -849,8 +852,10 @@ static bool frontswap_selfshrinking __read_mostly;
 static void selfshrink_process(struct work_struct *work);
 static DECLARE_DELAYED_WORK(selfshrink_worker, selfshrink_process);
 
+#ifndef CONFIG_RAMSTER_MODULE
 /* Enable/disable with kernel boot option. */
 static bool use_frontswap_selfshrink __initdata = true;
+#endif
 
 /*
  * The default values for the following parameters were deemed reasonable
@@ -905,6 +910,7 @@ static void frontswap_selfshrink(void)
 	frontswap_shrink(tgt_frontswap_pages);
 }
 
+#ifndef CONFIG_RAMSTER_MODULE
 static int __init ramster_nofrontswap_selfshrink_setup(char *s)
 {
 	use_frontswap_selfshrink = false;
@@ -912,6 +918,7 @@ static int __init ramster_nofrontswap_selfshrink_setup(char *s)
 }
 
 __setup("noselfshrink", ramster_nofrontswap_selfshrink_setup);
+#endif
 
 static void selfshrink_process(struct work_struct *work)
 {
@@ -930,6 +937,7 @@ void ramster_cpu_up(int cpu)
 	per_cpu(ramster_remoteputmem1, cpu) = p1;
 	per_cpu(ramster_remoteputmem2, cpu) = p2;
 }
+EXPORT_SYMBOL_GPL(ramster_cpu_up);
 
 void ramster_cpu_down(int cpu)
 {
@@ -945,6 +953,7 @@ void ramster_cpu_down(int cpu)
 		kp->flnode = NULL;
 	}
 }
+EXPORT_SYMBOL_GPL(ramster_cpu_down);
 
 void ramster_register_pamops(struct tmem_pamops *pamops)
 {
@@ -955,9 +964,11 @@ void ramster_register_pamops(struct tmem_pamops *pamops)
 	pamops->repatriate = ramster_pampd_repatriate;
 	pamops->repatriate_preload = ramster_pampd_repatriate_preload;
 }
+EXPORT_SYMBOL_GPL(ramster_register_pamops);
 
-void __init ramster_init(bool cleancache, bool frontswap,
-				bool frontswap_exclusive_gets)
+void ramster_init(bool cleancache, bool frontswap,
+				bool frontswap_exclusive_gets,
+				bool frontswap_selfshrink)
 {
 	int ret = 0;
 
@@ -972,10 +983,17 @@ void __init ramster_init(bool cleancache, bool frontswap,
 	if (ret)
 		pr_err("ramster: can't create sysfs for ramster\n");
 	(void)r2net_register_handlers();
+#ifdef CONFIG_RAMSTER_MODULE
+	ret = r2nm_init();
+	if (ret)
+		pr_err("ramster: can't init r2net\n");
+	frontswap_selfshrinking = frontswap_selfshrink;
+#else
+	frontswap_selfshrinking = use_frontswap_selfshrink;
+#endif
 	INIT_LIST_HEAD(&ramster_rem_op_list);
 	ramster_flnode_cache = kmem_cache_create("ramster_flnode",
 				sizeof(struct flushlist_node), 0, 0, NULL);
-	frontswap_selfshrinking = use_frontswap_selfshrink;
 	if (frontswap_selfshrinking) {
 		pr_info("ramster: Initializing frontswap selfshrink driver.\n");
 		schedule_delayed_work(&selfshrink_worker,
@@ -983,3 +1001,4 @@ void __init ramster_init(bool cleancache, bool frontswap,
 	}
 	ramster_remotify_init();
 }
+EXPORT_SYMBOL_GPL(ramster_init);
diff --git a/drivers/staging/ramster/ramster/ramster.h b/drivers/staging/ramster/ramster/ramster.h
index 12ae56f..6d41a7a 100644
--- a/drivers/staging/ramster/ramster/ramster.h
+++ b/drivers/staging/ramster/ramster/ramster.h
@@ -147,7 +147,7 @@ extern int r2net_register_handlers(void);
 extern int r2net_remote_target_node_set(int);
 
 extern int ramster_remotify_pageframe(bool);
-extern void ramster_init(bool, bool, bool);
+extern void ramster_init(bool, bool, bool, bool);
 extern void ramster_register_pamops(struct tmem_pamops *);
 extern int ramster_localify(int, struct tmem_oid *oidp, uint32_t, char *,
 				unsigned int, void *);
diff --git a/drivers/staging/ramster/ramster/ramster_nodemanager.h b/drivers/staging/ramster/ramster/ramster_nodemanager.h
index 49f879d..dbaae34 100644
--- a/drivers/staging/ramster/ramster/ramster_nodemanager.h
+++ b/drivers/staging/ramster/ramster/ramster_nodemanager.h
@@ -36,4 +36,6 @@
 /* host name, group name, cluster name all 64 bytes */
 #define R2NM_MAX_NAME_LEN        64    /* __NEW_UTS_LEN */
 
+extern int r2nm_init(void);
+
 #endif /* _RAMSTER_NODEMANAGER_H */
diff --git a/drivers/staging/ramster/zcache-main.c b/drivers/staging/ramster/zcache-main.c
index ed99170..38c78b3 100644
--- a/drivers/staging/ramster/zcache-main.c
+++ b/drivers/staging/ramster/zcache-main.c
@@ -1812,7 +1812,7 @@ static int __init zcache_init(void)
 	}
 	if (ramster_enabled)
 		ramster_init(!disable_cleancache, !disable_frontswap,
-				frontswap_has_exclusive_gets);
+				frontswap_has_exclusive_gets, false);
 out:
 	return ret;
 }
-- 
1.7.7.6



* [PATCH 6/8] staging: zcache2+ramster: enable zcache2 to be built/loaded as a module
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
                   ` (4 preceding siblings ...)
  2012-11-14 18:57 ` [PATCH 5/8] staging: zcache2+ramster: enable ramster to be built/loaded as a module Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 7/8] xen: tmem: enable Xen tmem shim " Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 8/8] xen/tmem: Remove the subsys call Konrad Rzeszutek Wilk
  7 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

From: Dan Magenheimer <dan.magenheimer@oracle.com>

Allow zcache2 to be built/loaded as a module.  Note that a runtime
dependency disallows loading if the cleancache/frontswap lazy-initialization
patches are not present.  Zsmalloc support has not yet been merged into
zcache2 but, once merged, could now easily be selected via a module_param.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.
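
(For illustration only, not part of this patch: once zsmalloc is merged,
selecting it could look much like the other knobs this patch exposes,
e.g. a hypothetical

	static char *zcache_allocator = "zbud";
	module_param(zcache_allocator, charp, S_IRUGO);

read back via /sys/module/zcache/parameters/zcache_allocator.)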

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Rebased with different order of patches]
[v2: Removed [CLEANCACHE|FRONTSWAP]_HAS_LAZY_INIT ifdef]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/staging/ramster/Kconfig       |    6 ++--
 drivers/staging/ramster/Makefile      |   11 ++++---
 drivers/staging/ramster/tmem.c        |    6 +++-
 drivers/staging/ramster/tmem.h        |    8 +++---
 drivers/staging/ramster/zcache-main.c |   47 ++++++++++++++++++++++++++++++---
 drivers/staging/ramster/zcache.h      |    2 +-
 6 files changed, 62 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/ramster/Kconfig b/drivers/staging/ramster/Kconfig
index 3abf661..9ce2590 100644
--- a/drivers/staging/ramster/Kconfig
+++ b/drivers/staging/ramster/Kconfig
@@ -1,5 +1,5 @@
 config ZCACHE2
-	bool "Dynamic compression of swap pages and clean pagecache pages"
+	tristate "Dynamic compression of swap pages and clean pagecache pages"
 	depends on CRYPTO=y && SWAP=y && CLEANCACHE && FRONTSWAP && !ZCACHE 
 	select CRYPTO_LZO
 	default n
@@ -16,8 +16,8 @@ config ZCACHE2
 	  version of ramster.
 
 config RAMSTER
-	bool "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
-	depends on CONFIGFS_FS=y && SYSFS=y && !HIGHMEM && ZCACHE2=y
+	tristate "Cross-machine RAM capacity sharing, aka peer-to-peer tmem"
+	depends on CONFIGFS_FS && SYSFS && !HIGHMEM && ZCACHE2
 	depends on NET
 	# must ensure struct page is 8-byte aligned
 	select HAVE_ALIGNED_STRUCT_PAGE if !64BIT
diff --git a/drivers/staging/ramster/Makefile b/drivers/staging/ramster/Makefile
index 2d8b9d0..fcb25cb 100644
--- a/drivers/staging/ramster/Makefile
+++ b/drivers/staging/ramster/Makefile
@@ -1,6 +1,7 @@
 zcache-y	:=		zcache-main.o tmem.o zbud.o
-zcache-$(CONFIG_RAMSTER)	+=	ramster/ramster.o ramster/r2net.o
-zcache-$(CONFIG_RAMSTER)	+=	ramster/nodemanager.o ramster/tcp.o
-zcache-$(CONFIG_RAMSTER)	+=	ramster/heartbeat.o ramster/masklog.o
-
-obj-$(CONFIG_ZCACHE2)	+=	zcache.o
+ifneq (,$(filter $(CONFIG_RAMSTER),y m))
+zcache-y	+=	ramster/ramster.o ramster/r2net.o
+zcache-y	+=	ramster/nodemanager.o ramster/tcp.o
+zcache-y	+=	ramster/heartbeat.o ramster/masklog.o
+endif
+obj-$(CONFIG_ZCACHE2)	+= zcache.o
diff --git a/drivers/staging/ramster/tmem.c b/drivers/staging/ramster/tmem.c
index a2b7e03..d7e51e4 100644
--- a/drivers/staging/ramster/tmem.c
+++ b/drivers/staging/ramster/tmem.c
@@ -35,7 +35,8 @@
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/atomic.h>
-#ifdef CONFIG_RAMSTER
+#include <linux/export.h>
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 #include <linux/delay.h>
 #endif
 
@@ -641,6 +642,7 @@ void *tmem_localify_get_pampd(struct tmem_pool *pool, struct tmem_oid *oidp,
 	/* note, hashbucket remains locked */
 	return pampd;
 }
+EXPORT_SYMBOL_GPL(tmem_localify_get_pampd);
 
 void tmem_localify_finish(struct tmem_obj *obj, uint32_t index,
 			  void *pampd, void *saved_hb, bool delete)
@@ -658,6 +660,7 @@ void tmem_localify_finish(struct tmem_obj *obj, uint32_t index,
 	}
 	spin_unlock(&hb->lock);
 }
+EXPORT_SYMBOL_GPL(tmem_localify_finish);
 
 /*
  * For ramster only.  Helper function to support asynchronous tmem_get.
@@ -719,6 +722,7 @@ out:
 	spin_unlock(&hb->lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(tmem_replace);
 #endif
 
 /*
diff --git a/drivers/staging/ramster/tmem.h b/drivers/staging/ramster/tmem.h
index adbe5a8..d128ce2 100644
--- a/drivers/staging/ramster/tmem.h
+++ b/drivers/staging/ramster/tmem.h
@@ -126,7 +126,7 @@ static inline unsigned tmem_oid_hash(struct tmem_oid *oidp)
 				TMEM_HASH_BUCKET_BITS);
 }
 
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 struct tmem_xhandle {
 	uint8_t client_id;
 	uint8_t xh_data_cksum;
@@ -171,7 +171,7 @@ struct tmem_obj {
 	unsigned int objnode_tree_height;
 	unsigned long objnode_count;
 	long pampd_count;
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 	/*
 	 * for current design of ramster, all pages belonging to
 	 * an object reside on the same remotenode and extra is
@@ -215,7 +215,7 @@ struct tmem_pamops {
 				uint32_t);
 	void (*free)(void *, struct tmem_pool *,
 				struct tmem_oid *, uint32_t, bool);
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 	void (*new_obj)(struct tmem_obj *);
 	void (*free_obj)(struct tmem_pool *, struct tmem_obj *, bool);
 	void *(*repatriate_preload)(void *, struct tmem_pool *,
@@ -247,7 +247,7 @@ extern int tmem_flush_page(struct tmem_pool *, struct tmem_oid *,
 extern int tmem_flush_object(struct tmem_pool *, struct tmem_oid *);
 extern int tmem_destroy_pool(struct tmem_pool *);
 extern void tmem_new_pool(struct tmem_pool *, uint32_t);
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 extern int tmem_replace(struct tmem_pool *, struct tmem_oid *, uint32_t index,
 			void *);
 extern void *tmem_localify_get_pampd(struct tmem_pool *, struct tmem_oid *,
diff --git a/drivers/staging/ramster/zcache-main.c b/drivers/staging/ramster/zcache-main.c
index 38c78b3..29e2523 100644
--- a/drivers/staging/ramster/zcache-main.c
+++ b/drivers/staging/ramster/zcache-main.c
@@ -31,8 +31,10 @@
 #include "ramster.h"
 #ifdef CONFIG_RAMSTER
 static int ramster_enabled;
+static int disable_frontswap_selfshrink;
 #else
 #define ramster_enabled 0
+#define disable_frontswap_selfshrink 0
 #endif
 
 #ifndef __PG_WAS_ACTIVE
@@ -68,8 +70,12 @@ static char *namestr __read_mostly = "zcache";
 MODULE_LICENSE("GPL");
 
 /* crypto API for zcache  */
+#ifdef CONFIG_ZCACHE2_MODULE
+static char *zcache_comp_name = "lzo";
+#else
 #define ZCACHE_COMP_NAME_SZ CRYPTO_MAX_ALG_NAME
 static char zcache_comp_name[ZCACHE_COMP_NAME_SZ] __read_mostly;
+#endif
 static struct crypto_comp * __percpu *zcache_comp_pcpu_tfms __read_mostly;
 
 enum comp_op {
@@ -1346,7 +1352,7 @@ static int zcache_local_new_pool(uint32_t flags)
 int zcache_autocreate_pool(unsigned int cli_id, unsigned int pool_id, bool eph)
 {
 	struct tmem_pool *pool;
-	struct zcache_client *cli;
+	struct zcache_client *cli = NULL;
 	uint32_t flags = eph ? 0 : TMEM_POOL_PERSIST;
 	int ret = -1;
 
@@ -1640,6 +1646,7 @@ struct frontswap_ops *zcache_frontswap_register_ops(void)
  * OR NOTHING HAPPENS!
  */
 
+#ifndef CONFIG_ZCACHE2_MODULE
 static int __init enable_zcache(char *s)
 {
 	zcache_enabled = 1;
@@ -1706,18 +1713,27 @@ static int __init enable_zcache_compressor(char *s)
 	return 1;
 }
 __setup("zcache=", enable_zcache_compressor);
+#endif
 
 
-static int __init zcache_comp_init(void)
+static int zcache_comp_init(void)
 {
 	int ret = 0;
 
 	/* check crypto algorithm */
+#ifdef CONFIG_ZCACHE2_MODULE
+	ret = crypto_has_comp(zcache_comp_name, 0, 0);
+	if (!ret) {
+		ret = -1;
+		goto out;
+	}
+#else
 	if (*zcache_comp_name != '\0') {
 		ret = crypto_has_comp(zcache_comp_name, 0, 0);
 		if (!ret)
 			pr_info("zcache: %s not supported\n",
 					zcache_comp_name);
+		goto out;
 	}
 	if (!ret)
 		strcpy(zcache_comp_name, "lzo");
@@ -1726,6 +1742,7 @@ static int __init zcache_comp_init(void)
 		ret = 1;
 		goto out;
 	}
+#endif
 	pr_info("zcache: using %s compressor\n", zcache_comp_name);
 
 	/* alloc percpu transforms */
@@ -1737,10 +1754,13 @@ out:
 	return ret;
 }
 
-static int __init zcache_init(void)
+static int zcache_init(void)
 {
 	int ret = 0;
 
+#ifdef CONFIG_ZCACHE2_MODULE
+	zcache_enabled = 1;
+#endif
 	if (ramster_enabled) {
 		namestr = "ramster";
 		ramster_register_pamops(&zcache_pamops);
@@ -1812,9 +1832,28 @@ static int __init zcache_init(void)
 	}
 	if (ramster_enabled)
 		ramster_init(!disable_cleancache, !disable_frontswap,
-				frontswap_has_exclusive_gets, false);
+				frontswap_has_exclusive_gets,
+				!disable_frontswap_selfshrink);
 out:
 	return ret;
 }
 
+#ifdef CONFIG_ZCACHE2_MODULE
+#ifdef CONFIG_RAMSTER
+module_param(ramster_enabled, int, S_IRUGO);
+module_param(disable_frontswap_selfshrink, int, S_IRUGO);
+#endif
+module_param(disable_cleancache, int, S_IRUGO);
+module_param(disable_frontswap, int, S_IRUGO);
+#ifdef FRONTSWAP_HAS_EXCLUSIVE_GETS
+module_param(frontswap_has_exclusive_gets, bool, S_IRUGO);
+#endif
+module_param(disable_frontswap_ignore_nonactive, int, S_IRUGO);
+module_param(zcache_comp_name, charp, S_IRUGO);
+module_init(zcache_init);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Magenheimer <dan.magenheimer@oracle.com>");
+MODULE_DESCRIPTION("In-kernel compression of cleancache/frontswap pages");
+#else
 late_initcall(zcache_init);
+#endif
diff --git a/drivers/staging/ramster/zcache.h b/drivers/staging/ramster/zcache.h
index 81722b3..8491200 100644
--- a/drivers/staging/ramster/zcache.h
+++ b/drivers/staging/ramster/zcache.h
@@ -39,7 +39,7 @@ extern int zcache_flush_page(int, int, struct tmem_oid *, uint32_t);
 extern int zcache_flush_object(int, int, struct tmem_oid *);
 extern void zcache_decompress_to_page(char *, unsigned int, struct page *);
 
-#ifdef CONFIG_RAMSTER
+#if defined(CONFIG_RAMSTER) || defined(CONFIG_RAMSTER_MODULE)
 extern void *zcache_pampd_create(char *, unsigned int, bool, int,
 				struct tmem_handle *);
 int zcache_autocreate_pool(unsigned int cli_id, unsigned int pool_id, bool eph);
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 7/8] xen: tmem: enable Xen tmem shim to be built/loaded as a module
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
                   ` (5 preceding siblings ...)
  2012-11-14 18:57 ` [PATCH 6/8] staging: zcache2+ramster: enable zcache2 " Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  2012-11-14 18:57 ` [PATCH 8/8] xen/tmem: Remove the subsys call Konrad Rzeszutek Wilk
  7 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

From: Dan Magenheimer <dan.magenheimer@oracle.com>

Allow Xen tmem shim to be built/loaded as a module.  Xen self-ballooning
and frontswap-selfshrinking are now also "lazily" initialized when the
Xen tmem shim is loaded as a module, unless explicitly disabled
by module parameters.

Note that a runtime dependency disallows loading the module if the
cleancache/frontswap lazy initialization patches are not present.

If built-in (not built as a module), the original mechanism of enabling via
a kernel boot parameter is retained, but this should be considered deprecated.

Note that module unload is explicitly not yet supported.

Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Removed the [CLEANCACHE|FRONTSWAP]_HAS_LAZY_INIT ifdef]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/Kconfig           |    4 ++--
 drivers/xen/tmem.c            |   38 +++++++++++++++++++++++++++++---------
 drivers/xen/xen-selfballoon.c |   13 +++++++------
 include/xen/tmem.h            |    8 ++++++++
 4 files changed, 46 insertions(+), 17 deletions(-)

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 126d8ce..1e70c60 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -145,9 +145,9 @@ config SWIOTLB_XEN
 	select SWIOTLB
 
 config XEN_TMEM
-	bool
+	tristate
 	depends on !ARM
-	default y if (CLEANCACHE || FRONTSWAP)
+	default m if (CLEANCACHE || FRONTSWAP)
 	help
 	  Shim to interface in-kernel Transcendent Memory hooks
 	  (e.g. cleancache and frontswap) to Xen tmem hypercalls.
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index 15e776c..9a4a9ec 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -5,6 +5,7 @@
  * Author: Dan Magenheimer
  */
 
+#include <linux/module.h>
 #include <linux/kernel.h>
 #include <linux/types.h>
 #include <linux/init.h>
@@ -128,6 +129,7 @@ static int xen_tmem_flush_object(u32 pool_id, struct tmem_oid oid)
 	return xen_tmem_op(TMEM_FLUSH_OBJECT, pool_id, oid, 0, 0, 0, 0, 0);
 }
 
+#ifndef CONFIG_XEN_TMEM_MODULE
 bool __read_mostly tmem_enabled = false;
 
 static int __init enable_tmem(char *s)
@@ -136,6 +138,7 @@ static int __init enable_tmem(char *s)
 	return 1;
 }
 __setup("tmem", enable_tmem);
+#endif
 
 #ifdef CONFIG_CLEANCACHE
 static int xen_tmem_destroy_pool(u32 pool_id)
@@ -227,14 +230,19 @@ static int tmem_cleancache_init_shared_fs(char *uuid, size_t pagesize)
 	return xen_tmem_new_pool(shared_uuid, TMEM_POOL_SHARED, pagesize);
 }
 
-static bool __initdata use_cleancache = true;
-
+static bool disable_cleancache __read_mostly;
+static bool disable_selfballooning __read_mostly;
+#ifdef CONFIG_XEN_TMEM_MODULE
+module_param(disable_cleancache, bool, S_IRUGO);
+module_param(disable_selfballooning, bool, S_IRUGO);
+#else
 static int __init no_cleancache(char *s)
 {
-	use_cleancache = false;
+	disable_cleancache = true;
 	return 1;
 }
 __setup("nocleancache", no_cleancache);
+#endif
 
 static struct cleancache_ops tmem_cleancache_ops = {
 	.put_page = tmem_cleancache_put_page,
@@ -353,14 +361,19 @@ static void tmem_frontswap_init(unsigned ignored)
 		    xen_tmem_new_pool(private, TMEM_POOL_PERSIST, PAGE_SIZE);
 }
 
-static bool __initdata use_frontswap = true;
-
+static bool disable_frontswap __read_mostly;
+static bool disable_frontswap_selfshrinking __read_mostly;
+#ifdef CONFIG_XEN_TMEM_MODULE
+module_param(disable_frontswap, bool, S_IRUGO);
+module_param(disable_frontswap_selfshrinking, bool, S_IRUGO);
+#else
 static int __init no_frontswap(char *s)
 {
-	use_frontswap = false;
+	disable_frontswap = true;
 	return 1;
 }
 __setup("nofrontswap", no_frontswap);
+#endif
 
 static struct frontswap_ops tmem_frontswap_ops = {
 	.store = tmem_frontswap_store,
@@ -371,12 +384,12 @@ static struct frontswap_ops tmem_frontswap_ops = {
 };
 #endif
 
-static int __init xen_tmem_init(void)
+static int xen_tmem_init(void)
 {
 	if (!xen_domain())
 		return 0;
 #ifdef CONFIG_FRONTSWAP
-	if (tmem_enabled && use_frontswap) {
+	if (tmem_enabled && !disable_frontswap) {
 		char *s = "";
 		struct frontswap_ops *old_ops =
 			frontswap_register_ops(&tmem_frontswap_ops);
@@ -390,7 +403,7 @@ static int __init xen_tmem_init(void)
 #endif
 #ifdef CONFIG_CLEANCACHE
 	BUG_ON(sizeof(struct cleancache_filekey) != sizeof(struct tmem_oid));
-	if (tmem_enabled && use_cleancache) {
+	if (tmem_enabled && !disable_cleancache) {
 		char *s = "";
 		struct cleancache_ops *old_ops =
 			cleancache_register_ops(&tmem_cleancache_ops);
@@ -400,7 +413,14 @@ static int __init xen_tmem_init(void)
 				 "Xen Transcendent Memory%s\n", s);
 	}
 #endif
+#ifdef CONFIG_XEN_SELFBALLOONING
+	xen_selfballoon_init(!disable_selfballooning,
+				!disable_frontswap_selfshrinking);
+#endif
 	return 0;
 }
 
 module_init(xen_tmem_init)
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Dan Magenheimer <dan.magenheimer@oracle.com>");
+MODULE_DESCRIPTION("Shim to Xen transcendent memory");
diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index 7d041cb..f4808aa 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -121,7 +121,7 @@ static DECLARE_DELAYED_WORK(selfballoon_worker, selfballoon_process);
 static bool frontswap_selfshrinking __read_mostly;
 
 /* Enable/disable with kernel boot option. */
-static bool use_frontswap_selfshrink __initdata = true;
+static bool use_frontswap_selfshrink = true;
 
 /*
  * The default values for the following parameters were deemed reasonable
@@ -185,7 +185,7 @@ static int __init xen_nofrontswap_selfshrink_setup(char *s)
 __setup("noselfshrink", xen_nofrontswap_selfshrink_setup);
 
 /* Disable with kernel boot option. */
-static bool use_selfballooning __initdata = true;
+static bool use_selfballooning = true;
 
 static int __init xen_noselfballooning_setup(char *s)
 {
@@ -196,7 +196,7 @@ static int __init xen_noselfballooning_setup(char *s)
 __setup("noselfballooning", xen_noselfballooning_setup);
 #else /* !CONFIG_FRONTSWAP */
 /* Enable with kernel boot option. */
-static bool use_selfballooning __initdata = false;
+static bool use_selfballooning;
 
 static int __init xen_selfballooning_setup(char *s)
 {
@@ -537,7 +537,7 @@ int register_xen_selfballooning(struct device *dev)
 }
 EXPORT_SYMBOL(register_xen_selfballooning);
 
-static int __init xen_selfballoon_init(void)
+int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
 {
 	bool enable = false;
 
@@ -571,7 +571,8 @@ static int __init xen_selfballoon_init(void)
 
 	return 0;
 }
+EXPORT_SYMBOL(xen_selfballoon_init);
 
+#ifndef CONFIG_XEN_TMEM_MODULE
 subsys_initcall(xen_selfballoon_init);
-
-MODULE_LICENSE("GPL");
+#endif
diff --git a/include/xen/tmem.h b/include/xen/tmem.h
index 591550a..3930a90 100644
--- a/include/xen/tmem.h
+++ b/include/xen/tmem.h
@@ -3,7 +3,15 @@
 
 #include <linux/types.h>
 
+#ifdef CONFIG_XEN_TMEM_MODULE
+#define tmem_enabled true
+#else
 /* defined in drivers/xen/tmem.c */
 extern bool tmem_enabled;
+#endif
+
+#ifdef CONFIG_XEN_SELFBALLOONING
+extern int xen_selfballoon_init(bool, bool);
+#endif
 
 #endif /* _XEN_TMEM_H */
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 8/8] xen/tmem: Remove the subsys call.
  2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
                   ` (6 preceding siblings ...)
  2012-11-14 18:57 ` [PATCH 7/8] xen: tmem: enable Xen tmem shim " Konrad Rzeszutek Wilk
@ 2012-11-14 18:57 ` Konrad Rzeszutek Wilk
  7 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-14 18:57 UTC (permalink / raw)
  To: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, akpm, mgorman
  Cc: fschmaus, andor.daam, ilendir, Konrad Rzeszutek Wilk

We get:
drivers/xen/xen-selfballoon.c:577:134: warning: initialization from incompatible pointer type [enabled by default]

We actually do not need this function to be called
before tmem is loaded. So let's remove the subsys_initcall.

If tmem is built as a module this is still OK, as
xen_selfballoon_init ends up being exported and can
be called by the tmem module.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/xen/xen-selfballoon.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/xen/xen-selfballoon.c b/drivers/xen/xen-selfballoon.c
index f4808aa..0bd551e 100644
--- a/drivers/xen/xen-selfballoon.c
+++ b/drivers/xen/xen-selfballoon.c
@@ -572,7 +572,3 @@ int xen_selfballoon_init(bool use_selfballooning, bool use_frontswap_selfshrink)
 	return 0;
 }
 EXPORT_SYMBOL(xen_selfballoon_init);
-
-#ifndef CONFIG_XEN_TMEM_MODULE
-subsys_initcall(xen_selfballoon_init);
-#endif
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
  2012-11-14 18:57 ` [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run " Konrad Rzeszutek Wilk
@ 2012-11-16 23:10   ` Andrew Morton
  2013-01-30 15:57     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2012-11-16 23:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, mgorman, fschmaus, andor.daam, ilendir

On Wed, 14 Nov 2012 13:57:05 -0500
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> 
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,
> this patch provides "lazy initialization", allowing backends to register to
> cleancache even after filesystems were mounted. Calls to init_fs and
> init_shared_fs are remembered as fake poolids but no real tmem_pools created.
> On backend registration the fake poolids are mapped to real poolids and
> respective tmem_pools.

What is your merge plan/path for this work?

>
> ...
>
> + * When no backend is registered all calls to init_fs and init_shard_fs

"init_shared_fs"

> + * are registered and fake poolids are given to the respective
> + * super block but no tmem_pools are created. When a backend
> + * registers with cleancache the previous calls to init_fs and
> + * init_shared_fs are executed to create tmem_pools and set the
> + * respective poolids. While no backend is registered all "puts",
> + * "gets" and "flushes" are ignored or fail.

The comment could use all 80 cols..

>
> ...
>
>  struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
>  {
>  	struct cleancache_ops old = cleancache_ops;
> +	int i;
>  
>  	cleancache_ops = *ops;
> -	cleancache_enabled = 1;
> +
> +	backend_registered = true;
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (fs_poolid_map[i] == FS_NO_BACKEND)
> +			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
> +			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> +					(uuids[i], PAGE_SIZE);
> +	}
>  	return old;
>  }

I never noticed before - this function returns a large structure by
value.  That's really really unusual in the kernel.  I see no problem
with it per-se, but it might generate awful code.

Also, this function has no locking and is blatantly racy.

>  EXPORT_SYMBOL(cleancache_register_ops);
> @@ -61,15 +91,38 @@ EXPORT_SYMBOL(cleancache_register_ops);
>  /* Called by a cleancache-enabled filesystem at time of mount */
>  void __cleancache_init_fs(struct super_block *sb)
>  {
> -	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +	int i;
> +
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (fs_poolid_map[i] == FS_UNKNOWN) {
> +			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
> +			if (backend_registered)
> +				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> +			else
> +				fs_poolid_map[i] = FS_NO_BACKEND;
> +			break;
> +		}
> +	}
>  }
>  EXPORT_SYMBOL(__cleancache_init_fs);

This also looks wildly racy.

>  /* Called by a cleancache-enabled clustered filesystem at time of mount */
>  void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
>  {
> -	sb->cleancache_poolid =
> -		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
> +	int i;
> +
> +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> +		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
> +			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
> +			uuids[i] = uuid;
> +			if (backend_registered)
> +				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> +						(uuid, PAGE_SIZE);
> +			else
> +				shared_fs_poolid_map[i] = FS_NO_BACKEND;
> +			break;
> +		}
> +	}
>  }
>  EXPORT_SYMBOL(__cleancache_init_shared_fs);

Again, a huge mess if two threads execute this concurrently.

> @@ -99,6 +152,19 @@ static int cleancache_get_key(struct inode *inode,
>  }
>  
>  /*
> + * Returns a pool_id that is associated with a given fake poolid.

Was there a comment anywhere which tells the reader what a "fake poolid" is?

> + */
> +static int get_poolid_from_fake(int fake_pool_id)
> +{
> +	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
> +		return shared_fs_poolid_map[fake_pool_id -
> +			FAKE_SHARED_FS_POOLID_OFFSET];
> +	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
> +		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
> +	return FS_NO_BACKEND;
> +}
> +
> +/*
>   * "Get" data from cleancache associated with the poolid/inode/index
>   * that were specified when the data was put to cleanache and, if
>   * successful, use it to fill the specified page with data and return 0.
> @@ -109,17 +175,26 @@ int __cleancache_get_page(struct page *page)
>  {
>  	int ret = -1;
>  	int pool_id;
> +	int fake_pool_id;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered) {
> +		cleancache_failed_gets++;
> +		goto out;
> +	}

Races everywhere...

>  	VM_BUG_ON(!PageLocked(page));
> -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> -	if (pool_id < 0)
> +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	if (fake_pool_id < 0)
>  		goto out;
> +	pool_id = get_poolid_from_fake(fake_pool_id);
>  
>  	if (cleancache_get_key(page->mapping->host, &key) < 0)
>  		goto out;
>  
> -	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
> +	if (pool_id >= 0)
> +		ret = (*cleancache_ops.get_page)(pool_id,
> +				key, page->index, page);
>  	if (ret == 0)
>  		cleancache_succ_gets++;
>  	else
> @@ -138,12 +213,23 @@ EXPORT_SYMBOL(__cleancache_get_page);
>  void __cleancache_put_page(struct page *page)
>  {
>  	int pool_id;
> +	int fake_pool_id;
>  	struct cleancache_filekey key = { .u.key = { 0 } };
>  
> +	if (!backend_registered) {
> +		cleancache_puts++;
> +		return;
> +	}

More..

>  	VM_BUG_ON(!PageLocked(page));
> -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> +	if (fake_pool_id < 0)
> +		return;
> +
> +	pool_id = get_poolid_from_fake(fake_pool_id);
> +
>  	if (pool_id >= 0 &&
> -	      cleancache_get_key(page->mapping->host, &key) >= 0) {
> +		cleancache_get_key(page->mapping->host, &key) >= 0) {
>  		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
>  		cleancache_puts++;
>  	}
>
> ...
>

I don't understand the timing flow here, nor the existing constraints
on what can be done and when, but....

The whole thing looks really hacky?  Why do we need to remember all
this stuff for later on?  What prevents us from simply synchonously
doing whatever we need to do when someone wants to register a backend?

Maybe a little ascii time chart/flow diagram would help.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2012-11-14 18:57 ` [PATCH 2/8] mm: frontswap: " Konrad Rzeszutek Wilk
@ 2012-11-16 23:16   ` Andrew Morton
  2012-11-19  0:53     ` Bob Liu
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2012-11-16 23:16 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, mgorman, fschmaus, andor.daam, ilendir

On Wed, 14 Nov 2012 13:57:06 -0500
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:

> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> 
> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> built/loaded as modules rather than built-in and enabled by a boot parameter,
> this patch provides "lazy initialization", allowing backends to register to
> frontswap even after swapon was run. Before a backend registers all calls
> to init are recorded and the creation of tmem_pools delayed until a backend
> registers or until a frontswap put is attempted.
> 
>
> ...
>
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
>  static inline void inc_frontswap_failed_stores(void) { }
>  static inline void inc_frontswap_invalidates(void) { }
>  #endif
> +
> +/*
> + * When no backend is registered all calls to init are registered and

What is "init"?  Spell it out fully, please.

> + * remembered but fail to create tmem_pools. When a backend registers with
> + * frontswap the previous calls to init are executed to create tmem_pools
> + * and set the respective poolids.

Again, seems really hacky.  Why can't we just change callers so they
call things in the correct order?

> + * While no backend is registered all "puts", "gets" and "flushes" are
> + * ignored or fail.
> + */
> +static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
> +static bool backend_registered __read_mostly;
> +
>  /*
>   * Register operations for frontswap, returning previous thus allowing
>   * detection of multiple backends and possible nesting.
> @@ -87,9 +99,19 @@ static inline void inc_frontswap_invalidates(void) { }
>  struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
>  {
>  	struct frontswap_ops old = frontswap_ops;
> +	int i;
>  
>  	frontswap_ops = *ops;
>  	frontswap_enabled = true;
> +
> +	for (i = 0; i < MAX_SWAPFILES; i++) {
> +		if (test_and_clear_bit(i, need_init))

ooh, that wasn't racy ;)

> +			(*frontswap_ops.init)(i);
> +	}
> +	/* We MUST have backend_registered called _after_ the frontswap_init's
> + 	 * have been called. Otherwise __frontswap_store might fail. */

Comment makes no sense - backend_registered is not a function.

Also, let's lay the comments out conventionally please:

	/*
	 * We MUST have backend_registered called _after_ the frontswap_init's
	 * have been called. Otherwise __frontswap_store might fail.
	 */


> +	barrier();
> +	backend_registered = true;
>  	return old;
>  }
>  EXPORT_SYMBOL(frontswap_register_ops);
>
> ...
>
> @@ -226,12 +266,15 @@ void __frontswap_invalidate_area(unsigned type)
>  {
>  	struct swap_info_struct *sis = swap_info[type];
>  
> -	BUG_ON(sis == NULL);
> -	if (sis->frontswap_map == NULL)
> -		return;
> -	frontswap_ops.invalidate_area(type);
> -	atomic_set(&sis->frontswap_pages, 0);
> -	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> +	if (backend_registered) {
> +		BUG_ON(sis == NULL);
> +		if (sis->frontswap_map == NULL)
> +			return;
> +		(*frontswap_ops.invalidate_area)(type);
> +		atomic_set(&sis->frontswap_pages, 0);
> +		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
> +	}
> +	clear_bit(type, need_init);
>  }
>  EXPORT_SYMBOL(__frontswap_invalidate_area);
>  
> @@ -364,6 +407,9 @@ static int __init init_frontswap(void)
>  	debugfs_create_u64("invalidates", S_IRUGO,
>  				root, &frontswap_invalidates);
>  #endif
> +	bitmap_zero(need_init, MAX_SWAPFILES);

unneeded?

> +	frontswap_enabled = 1;
>  	return 0;
>  }
>
> ...
>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2012-11-16 23:16   ` Andrew Morton
@ 2012-11-19  0:53     ` Bob Liu
  2012-11-19 22:25       ` Andrew Morton
  0 siblings, 1 reply; 16+ messages in thread
From: Bob Liu @ 2012-11-19  0:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Konrad Rzeszutek Wilk, sjenning, dan.magenheimer, devel,
	linux-kernel, linux-mm, ngupta, minchan, mgorman, fschmaus,
	andor.daam, ilendir

On Sat, Nov 17, 2012 at 7:16 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 14 Nov 2012 13:57:06 -0500
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
>
>> From: Dan Magenheimer <dan.magenheimer@oracle.com>
>>
>> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
>> built/loaded as modules rather than built-in and enabled by a boot parameter,
>> this patch provides "lazy initialization", allowing backends to register to
>> frontswap even after swapon was run. Before a backend registers all calls
>> to init are recorded and the creation of tmem_pools delayed until a backend
>> registers or until a frontswap put is attempted.
>>
>>
>> ...
>>
>> --- a/mm/frontswap.c
>> +++ b/mm/frontswap.c
>> @@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
>>  static inline void inc_frontswap_failed_stores(void) { }
>>  static inline void inc_frontswap_invalidates(void) { }
>>  #endif
>> +
>> +/*
>> + * When no backend is registered all calls to init are registered and
>
> What is "init"?  Spell it out fully, please.
>

I think it's frontswap_init().
swapon will call frontswap_init(), and in it we need to call the init
function of the backends with some parameters, like swap_type.
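
(To make that concrete - a rough sketch of the call path as I
understand it; the names are from mm/swapfile.c of this era and are
only my recollection, so treat them as illustrative:

	sys_swapon()
		-> enable_swap_info()
			-> frontswap_init(p->type)	/* once, at swapon time */

so a backend module loaded after this point has already missed its
only chance at that init call.)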

>> + * remembered but fail to create tmem_pools. When a backend registers with
>> + * frontswap the previous calls to init are executed to create tmem_pools
>> + * and set the respective poolids.
>
> Again, seems really hacky.  Why can't we just change callers so they
> call things in the correct order?
>

I don't think so, because it's asynchronous.

The original idea was to modularize backends like zcache/tmem, so
that they are more convenient and flexible to use and test.

But currently callers like swapon only invoke frontswap_init() once,
which fails if no backend is registered.
We have no way to notify swap to call frontswap_init() again when a
backend registers at some random time in the future.

>> + * While no backend is registered all "puts", "gets" and "flushes" are
>> + * ignored or fail.
>> + */
>> +static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
>> +static bool backend_registered __read_mostly;
>> +
>>  /*
>>   * Register operations for frontswap, returning previous thus allowing
>>   * detection of multiple backends and possible nesting.
>> @@ -87,9 +99,19 @@ static inline void inc_frontswap_invalidates(void) { }
>>  struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
>>  {
>>       struct frontswap_ops old = frontswap_ops;
>> +     int i;
>>
>>       frontswap_ops = *ops;
>>       frontswap_enabled = true;
>> +
>> +     for (i = 0; i < MAX_SWAPFILES; i++) {
>> +             if (test_and_clear_bit(i, need_init))
>
> ooh, that wasn't racy ;)
>

Hmm, I agree.
Seems some lock is needed; actually I think this code only supports one
backend at a time, so it's less risky.
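
Just to make the locking idea concrete - a hypothetical, untested
sketch of guarding the registration with a mutex, reusing the
need_init bitmap and backend_registered from the patch (this is not
part of the posted code):

	static DEFINE_MUTEX(frontswap_register_mutex);

	struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
	{
		struct frontswap_ops old;
		int i;

		mutex_lock(&frontswap_register_mutex);
		old = frontswap_ops;
		frontswap_ops = *ops;
		/* replay the swapons that happened before the backend loaded */
		for (i = 0; i < MAX_SWAPFILES; i++)
			if (test_and_clear_bit(i, need_init))
				(*frontswap_ops.init)(i);
		backend_registered = true;
		mutex_unlock(&frontswap_register_mutex);
		return old;
	}

With the mutex held across both the replay loop and the
backend_registered flip, a second registration can no longer race the
replay; the swapon side would still need the same lock (or the
existing need_init atomics) to be fully safe.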

>> +                     (*frontswap_ops.init)(i);
>> +     }
>> +     /* We MUST have backend_registered called _after_ the frontswap_init's
>> +      * have been called. Otherwise __frontswap_store might fail. */
>
> Comment makes no sense - backend_registered is not a function.
>
> Also, let's lay the comments out conventionally please:
>
>         /*
>          * We MUST have backend_registered called _after_ the frontswap_init's
>          * have been called. Otherwise __frontswap_store might fail.
>          */
>
>
>> +     barrier();
>> +     backend_registered = true;
>>       return old;
>>  }
>>  EXPORT_SYMBOL(frontswap_register_ops);
>>
>> ...
>>
>> @@ -226,12 +266,15 @@ void __frontswap_invalidate_area(unsigned type)
>>  {
>>       struct swap_info_struct *sis = swap_info[type];
>>
>> -     BUG_ON(sis == NULL);
>> -     if (sis->frontswap_map == NULL)
>> -             return;
>> -     frontswap_ops.invalidate_area(type);
>> -     atomic_set(&sis->frontswap_pages, 0);
>> -     memset(sis->frontswap_map, 0, sis->max / sizeof(long));
>> +     if (backend_registered) {
>> +             BUG_ON(sis == NULL);
>> +             if (sis->frontswap_map == NULL)
>> +                     return;
>> +             (*frontswap_ops.invalidate_area)(type);
>> +             atomic_set(&sis->frontswap_pages, 0);
>> +             memset(sis->frontswap_map, 0, sis->max / sizeof(long));
>> +     }
>> +     clear_bit(type, need_init);
>>  }
>>  EXPORT_SYMBOL(__frontswap_invalidate_area);
>>
>> @@ -364,6 +407,9 @@ static int __init init_frontswap(void)
>>       debugfs_create_u64("invalidates", S_IRUGO,
>>                               root, &frontswap_invalidates);
>>  #endif
>> +     bitmap_zero(need_init, MAX_SWAPFILES);
>
> unneeded?
>
>> +     frontswap_enabled = 1;
>>       return 0;
>>  }
>>
>> ...
>>

-- 
Thanks,
--Bob

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2012-11-19  0:53     ` Bob Liu
@ 2012-11-19 22:25       ` Andrew Morton
  2012-11-27 21:26         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 16+ messages in thread
From: Andrew Morton @ 2012-11-19 22:25 UTC (permalink / raw)
  To: Bob Liu
  Cc: Konrad Rzeszutek Wilk, sjenning, dan.magenheimer, devel,
	linux-kernel, linux-mm, ngupta, minchan, mgorman, fschmaus,
	andor.daam, ilendir

On Mon, 19 Nov 2012 08:53:46 +0800
Bob Liu <lliubbo@gmail.com> wrote:

> On Sat, Nov 17, 2012 at 7:16 AM, Andrew Morton
> <akpm@linux-foundation.org> wrote:
> > On Wed, 14 Nov 2012 13:57:06 -0500
> > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> >
> >> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> >>
> >> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> >> built/loaded as modules rather than built-in and enabled by a boot parameter,
> >> this patch provides "lazy initialization", allowing backends to register to
> >> frontswap even after swapon was run. Before a backend registers all calls
> >> to init are recorded and the creation of tmem_pools delayed until a backend
> >> registers or until a frontswap put is attempted.
> >>
> >>
> >> ...
> >>
> >> --- a/mm/frontswap.c
> >> +++ b/mm/frontswap.c
> >> @@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
> >>  static inline void inc_frontswap_failed_stores(void) { }
> >>  static inline void inc_frontswap_invalidates(void) { }
> >>  #endif
> >> +
> >> +/*
> >> + * When no backend is registered all calls to init are registered and
> >
> > What is "init"?  Spell it out fully, please.
> >
> 
> I think it's frontswap_init().
> swapon will call frontswap_init() and in it we need to call init
> function of backends with some parameters
> like swap_type.

Well, let's improve that comment please.

> >> + * remembered but fail to create tmem_pools. When a backend registers with
> >> + * frontswap the previous calls to init are executed to create tmem_pools
> >> + * and set the respective poolids.
> >
> > Again, seems really hacky.  Why can't we just change callers so they
> > call things in the correct order?
> >
> 
> I don't think so, because it asynchronous.
> 
> The original idea was to make backends like zcache/tmem modularization.
> So that it's more convenient and flexible to use and testing.
> 
> But currently callers like swapon only invoke frontswap_init() once,
> it fail if backend not registered.
> We have no way to notify swap to call frontswap_init() again when
> backend registered in some random time
>  in future.

We could add such a way?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2012-11-19 22:25       ` Andrew Morton
@ 2012-11-27 21:26         ` Konrad Rzeszutek Wilk
  2013-01-30 15:55           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2012-11-27 21:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Bob Liu, sjenning, dan.magenheimer, devel, linux-kernel,
	linux-mm, ngupta, minchan, mgorman, fschmaus, andor.daam,
	ilendir

On Mon, Nov 19, 2012 at 02:25:16PM -0800, Andrew Morton wrote:
> On Mon, 19 Nov 2012 08:53:46 +0800
> Bob Liu <lliubbo@gmail.com> wrote:
> 
> > On Sat, Nov 17, 2012 at 7:16 AM, Andrew Morton
> > <akpm@linux-foundation.org> wrote:
> > > On Wed, 14 Nov 2012 13:57:06 -0500
> > > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > >
> > >> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> > >>
> > >> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> > >> built/loaded as modules rather than built-in and enabled by a boot parameter,
> > >> this patch provides "lazy initialization", allowing backends to register to
> > >> frontswap even after swapon was run. Before a backend registers all calls
> > >> to init are recorded and the creation of tmem_pools delayed until a backend
> > >> registers or until a frontswap put is attempted.
> > >>
> > >>
> > >> ...
> > >>
> > >> --- a/mm/frontswap.c
> > >> +++ b/mm/frontswap.c
> > >> @@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
> > >>  static inline void inc_frontswap_failed_stores(void) { }
> > >>  static inline void inc_frontswap_invalidates(void) { }
> > >>  #endif
> > >> +
> > >> +/*
> > >> + * When no backend is registered all calls to init are registered and
> > >
> > > What is "init"?  Spell it out fully, please.
> > >
> > 
> > I think it's frontswap_init().
> > swapon will call frontswap_init() and in it we need to call init
> > function of backends with some parameters
> > like swap_type.
> 
> Well, let's improve that comment please.
> 
> > >> + * remembered but fail to create tmem_pools. When a backend registers with
> > >> + * frontswap the previous calls to init are executed to create tmem_pools
> > >> + * and set the respective poolids.
> > >
> > > Again, seems really hacky.  Why can't we just change callers so they
> > > call things in the correct order?
> > >
> > 
> > I don't think so, because it asynchronous.
> > 
> > The original idea was to make backends like zcache/tmem modularization.
> > So that it's more convenient and flexible to use and testing.
> > 
> > But currently callers like swapon only invoke frontswap_init() once,
> > it fail if backend not registered.
> > We have no way to notify swap to call frontswap_init() again when
> > backend registered in some random time
> >  in future.
> 
> We could add such a way?

Hey Andrew,

Sorry for the late email. Right as you posted your questions I went on vacation :-)
Let me respond to your email and rebase the patch per your comments/ideas this week.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem backends to build/run as modules
  2012-11-27 21:26         ` Konrad Rzeszutek Wilk
@ 2013-01-30 15:55           ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-30 15:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Bob Liu, sjenning, dan.magenheimer, devel, linux-kernel,
	linux-mm, ngupta, minchan, mgorman, fschmaus, andor.daam,
	ilendir

On Tue, Nov 27, 2012 at 04:26:17PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Nov 19, 2012 at 02:25:16PM -0800, Andrew Morton wrote:
> > On Mon, 19 Nov 2012 08:53:46 +0800
> > Bob Liu <lliubbo@gmail.com> wrote:
> > 
> > > On Sat, Nov 17, 2012 at 7:16 AM, Andrew Morton
> > > <akpm@linux-foundation.org> wrote:
> > > > On Wed, 14 Nov 2012 13:57:06 -0500
> > > > Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > > >
> > > >> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> > > >>
> > > >> With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> > > >> built/loaded as modules rather than built-in and enabled by a boot parameter,
> > > >> this patch provides "lazy initialization", allowing backends to register to
> > > >> frontswap even after swapon was run. Before a backend registers all calls
> > > >> to init are recorded and the creation of tmem_pools delayed until a backend
> > > >> registers or until a frontswap put is attempted.
> > > >>
> > > >>
> > > >> ...
> > > >>
> > > >> --- a/mm/frontswap.c
> > > >> +++ b/mm/frontswap.c
> > > >> @@ -80,6 +80,18 @@ static inline void inc_frontswap_succ_stores(void) { }
> > > >>  static inline void inc_frontswap_failed_stores(void) { }
> > > >>  static inline void inc_frontswap_invalidates(void) { }
> > > >>  #endif
> > > >> +
> > > >> +/*
> > > >> + * When no backend is registered all calls to init are registered and
> > > >
> > > > What is "init"?  Spell it out fully, please.
> > > >
> > > 
> > > I think it's frontswap_init().
> > > swapon will call frontswap_init() and in it we need to call init
> > > function of backends with some parameters
> > > like swap_type.
> > 
> > Well, let's improve that comment please.
> > 
> > > >> + * remembered but fail to create tmem_pools. When a backend registers with
> > > >> + * frontswap the previous calls to init are executed to create tmem_pools
> > > >> + * and set the respective poolids.
> > > >
> > > > Again, seems really hacky.  Why can't we just change callers so they
> > > > call things in the correct order?
> > > >
> > > 
> > > I don't think so, because it asynchronous.
> > > 
> > > The original idea was to make backends like zcache/tmem modularization.
> > > So that it's more convenient and flexible to use and testing.
> > > 
> > > But currently callers like swapon only invoke frontswap_init() once,
> > > it fail if backend not registered.
> > > We have no way to notify swap to call frontswap_init() again when
> > > backend registered in some random time
> > >  in future.
> > 
> > We could add such a way?
> 
> Hey Andrew,
> 
> Sorry for the late email. Right at as you posted your questions I went on vacation :-)
> Let me respond to your email and rebase the patch per your comments/ideas this week.

"This week" turned rather into a large couple of months :-(

Please see inline patch that tries to address the comments you made.

In regard to making swap and frontswap synchronous while supporting
module loading - that is a tricky thing. If we wanted the swap system to
call 'frontswap_init' outside of the 'swapon' call, one way to do this
would be to have a notifier chain - which the swap API would subscribe
to. The frontswap API, upon frontswap_register_ops being called (so a
backend module has loaded), could kick off the notifier and the swap API
would immediately call frontswap_init.

Something like this:

	swap API starts, makes a call to:
		register_frontswap_notifier(&swap_fnc), wherein

		.notifier_call = swap_notifier just does:

		swap_notifier(void) {
			struct swap_info_struct *p = NULL;
			int type;

			spin_lock(&swap_lock);
			for (type = swap_list.head; type >= 0;
			     type = swap_info[type]->next) {
				p = swap_info[type];
				frontswap_init(p->type);
			}
			spin_unlock(&swap_lock);
		}

	swapon /dev/XX makes a call to frontswap_init. Frontswap_init
		ignores it since there is no backend.

	I/Os on the swap device; the calls to frontswap_store/load all
		return immediately as there is no backend.

	modprobe zcache -> calls frontswap_register_ops().
		frontswap_register_ops-> kicks the notifier.


This patchset, by contrast, does not expose a notifier but just queues
up which p->type's to call (via an atomic bitmap) until a backend has
registered.
In this patchset we end up with:

	swap API inits..

	swapon /dev/XX makes a call to frontswap_init. Frontswap_init
		ignores it since there is no backend, but saves away
		the proper parameters.

	I/Os on the swap device; the calls to frontswap_store/load all
		return fast as there is no backend.

	modprobe zcache -> calls frontswap_register_ops().
		processes the frontswap_init on the queued up swap_file.
		enables backend_registered, and all I/Os now flow to the
		backend.

The difference here is that we would not queue anymore. My thinking is
to go with the queue system, then also implement a proper unloading
mechanism (perhaps by having a dummy frontswap_ops, or an atomic or
static_key gate to inhibit further frontswap API calls), drain all the
swap pages from the backend to the "regular" swap disk (using Seth's
patchset), and then allow the backend to unload.
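
A rough, hypothetical sketch of the "dummy frontswap_ops" idea - the
signatures below are only my recollection of struct frontswap_ops, so
treat them as illustrative:

	static void dummy_init(unsigned type) { }
	static int dummy_store(unsigned type, pgoff_t offset,
			       struct page *page) { return -1; }
	static int dummy_load(unsigned type, pgoff_t offset,
			      struct page *page) { return -1; }
	static void dummy_invalidate_page(unsigned type, pgoff_t offset) { }
	static void dummy_invalidate_area(unsigned type) { }

	static struct frontswap_ops dummy_ops = {
		.init = dummy_init,
		/* a failing store means the page just goes to the real swap */
		.store = dummy_store,
		.load = dummy_load,
		.invalidate_page = dummy_invalidate_page,
		.invalidate_area = dummy_invalidate_area,
	};

On unload we would swing frontswap_ops over to dummy_ops, wait for
in-flight calls to drain, write back or drop whatever the backend
holds, and only then let the module text go away.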

And then, if we decide that the bitmap queue is not appropriate (b/c the
swap system can now have more than 32 entries), revisit this?


>From ebc1f49f6593d4105f4927839bcfdb5162206ac4 Mon Sep 17 00:00:00 2001
From: Dan Magenheimer <dan.magenheimer@oracle.com>
Date: Wed, 14 Nov 2012 18:57:06 +0000
Subject: [PATCH 2/8] mm: frontswap: lazy initialization to allow tmem
 backends to build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
frontswap even after swapon was run. Before a backend registers all calls
to init are recorded and the creation of tmem_pools delayed until a backend
registers or until a frontswap put is attempted.

Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
Signed-off-by: Andor Daam <andor.daam@googlemail.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Fixes per Seth Jennings suggestions]
[v2: Removed FRONTSWAP_HAS_.. ]
[v3: Fix up per Bob Liu <lliubbo@gmail.com> recommendations]
[v4: Fix up per Andrew's comments]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 mm/frontswap.c | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 85 insertions(+), 10 deletions(-)

diff --git a/mm/frontswap.c b/mm/frontswap.c
index 2890e67..c05a9db 100644
--- a/mm/frontswap.c
+++ b/mm/frontswap.c
@@ -80,6 +80,46 @@ static inline void inc_frontswap_succ_stores(void) { }
 static inline void inc_frontswap_failed_stores(void) { }
 static inline void inc_frontswap_invalidates(void) { }
 #endif
+
+/*
+ * Due to the asynchronous nature of the backends loading potentially
+ * _after_ the swap system has been activated, we have chokepoints
+ * on all frontswap functions to not call the backend until the backend
+ * has registered.
+ *
+ * Specifically when no backend is registered (nobody called
+ * frontswap_register_ops) all calls to frontswap_init (which is done via
+ * swapon -> enable_swap_info -> frontswap_init) are registered and remembered
+ * (via the setting of need_init bitmap) but fail to create tmem_pools. When a
+ * backend registers with frontswap at some later point the previous
+ * calls to frontswap_init are executed (by iterating over the need_init
+ * bitmap) to create tmem_pools and set the respective poolids. All of that is
+ * guarded by us using atomic bit operations on the 'need_init' bitmap.
+ *
+ * This would not guard us against the user deciding to call swapoff right as
+ * we are calling the backend to initialize (so swapon is in action).
+ * Fortunately for us, the swapon_mutex has been taken by the caller so we are
+ * OK. The other scenario where calls to frontswap_store (called via
+ * swap_writepage) are racing with frontswap_invalidate_area (called via
+ * swapoff) is again guarded by the swap subsystem.
+ *
+ * While no backend is registered all calls to frontswap_[store|load|
+ * invalidate_area|invalidate_page] are ignored or fail.
+ *
+ * The time between the backend being registered and the swap file system
+ * calling the backend (via the frontswap_* functions) is indeterminate as
+ * backend_registered is not an atomic_t (or a value guarded by a spinlock).
+ * That is OK as we are comfortable missing some of these calls to the newly
+ * registered backend.
+ *
+ * Obviously the opposite (unloading the backend) must be done after all
+ * the frontswap_[store|load|invalidate_area|invalidate_page] start
+ * ignoring or failing the requests - at which point backend_registered
+ * would have to be made in some fashion atomic.
+ */
+static DECLARE_BITMAP(need_init, MAX_SWAPFILES);
+static bool backend_registered __read_mostly;
+
 /*
  * Register operations for frontswap, returning previous thus allowing
  * detection of multiple backends and possible nesting.
@@ -87,9 +127,22 @@ static inline void inc_frontswap_invalidates(void) { }
 struct frontswap_ops frontswap_register_ops(struct frontswap_ops *ops)
 {
 	struct frontswap_ops old = frontswap_ops;
+	int i;
 
 	frontswap_ops = *ops;
 	frontswap_enabled = true;
+
+	for (i = 0; i < MAX_SWAPFILES; i++) {
+		if (test_and_clear_bit(i, need_init))
+			(*frontswap_ops.init)(i);
+	}
+	/*
+	 * We MUST have backend_registered set _after_ the frontswap_init's
+	 * have been called. Otherwise __frontswap_store might fail. Hence
+	 * the barrier to make sure the compiler does not re-order us.
+	 */
+	barrier();
+	backend_registered = true;
 	return old;
 }
 EXPORT_SYMBOL(frontswap_register_ops);
@@ -119,10 +172,17 @@ void __frontswap_init(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.init(type);
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.init)(type);
+	}
+	else {
+		BUG_ON(type > MAX_SWAPFILES);
+		set_bit(type, need_init);
+	}
+
 }
 EXPORT_SYMBOL(__frontswap_init);
 
@@ -147,6 +207,11 @@ int __frontswap_store(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
+	if (!backend_registered) {
+		inc_frontswap_failed_stores();
+		return ret;
+	}
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -186,6 +251,9 @@ int __frontswap_load(struct page *page)
 	struct swap_info_struct *sis = swap_info[type];
 	pgoff_t offset = swp_offset(entry);
 
+	if (!backend_registered)
+		return ret;
+
 	BUG_ON(!PageLocked(page));
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset))
@@ -209,6 +277,9 @@ void __frontswap_invalidate_page(unsigned type, pgoff_t offset)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
+	if (!backend_registered)
+		return;
+
 	BUG_ON(sis == NULL);
 	if (frontswap_test(sis, offset)) {
 		frontswap_ops.invalidate_page(type, offset);
@@ -226,12 +297,15 @@ void __frontswap_invalidate_area(unsigned type)
 {
 	struct swap_info_struct *sis = swap_info[type];
 
-	BUG_ON(sis == NULL);
-	if (sis->frontswap_map == NULL)
-		return;
-	frontswap_ops.invalidate_area(type);
-	atomic_set(&sis->frontswap_pages, 0);
-	memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	if (backend_registered) {
+		BUG_ON(sis == NULL);
+		if (sis->frontswap_map == NULL)
+			return;
+		(*frontswap_ops.invalidate_area)(type);
+		atomic_set(&sis->frontswap_pages, 0);
+		memset(sis->frontswap_map, 0, sis->max / sizeof(long));
+	}
+	clear_bit(type, need_init);
 }
 EXPORT_SYMBOL(__frontswap_invalidate_area);
 
@@ -364,6 +438,7 @@ static int __init init_frontswap(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &frontswap_invalidates);
 #endif
+	frontswap_enabled = 1;
 	return 0;
 }
 
-- 
1.7.11.7


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run as modules
  2012-11-16 23:10   ` Andrew Morton
@ 2013-01-30 15:57     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 16+ messages in thread
From: Konrad Rzeszutek Wilk @ 2013-01-30 15:57 UTC (permalink / raw)
  To: Andrew Morton
  Cc: sjenning, dan.magenheimer, devel, linux-kernel, linux-mm, ngupta,
	minchan, mgorman, fschmaus, andor.daam, ilendir

On Fri, Nov 16, 2012 at 03:10:49PM -0800, Andrew Morton wrote:
> On Wed, 14 Nov 2012 13:57:05 -0500
> Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> 
> > From: Dan Magenheimer <dan.magenheimer@oracle.com>
> > 
> > With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
> > built/loaded as modules rather than built-in and enabled by a boot parameter,
> > this patch provides "lazy initialization", allowing backends to register to
> > cleancache even after filesystems were mounted. Calls to init_fs and
> > init_shared_fs are remembered as fake poolids but no real tmem_pools created.
> > On backend registration the fake poolids are mapped to real poolids and
> > respective tmem_pools.
> 
> What is your merge plan/path for this work?
> 
> >
> > ...
> >
> > + * When no backend is registered all calls to init_fs and init_shard_fs
> 
> "init_shared_fs"
> 
> > + * are registered and fake poolids are given to the respective
> > + * super block but no tmem_pools are created. When a backend
> > + * registers with cleancache the previous calls to init_fs and
> > + * init_shared_fs are executed to create tmem_pools and set the
> > + * respective poolids. While no backend is registered all "puts",
> > + * "gets" and "flushes" are ignored or fail.
> 
> The comment could use all 80 cols..
> 
> >
> > ...
> >
> >  struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
> >  {
> >  	struct cleancache_ops old = cleancache_ops;
> > +	int i;
> >  
> >  	cleancache_ops = *ops;
> > -	cleancache_enabled = 1;
> > +
> > +	backend_registered = true;
> > +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> > +		if (fs_poolid_map[i] == FS_NO_BACKEND)
> > +			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> > +		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
> > +			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> > +					(uuids[i], PAGE_SIZE);
> > +	}
> >  	return old;
> >  }
> 
> I never noticed before - this function returns a large structure by
> value.  That's really really unusual in the kernel.  I see no problem
> with it per-se, but it might generate awful code.
> 
> Also, this function has no locking and is blatantly racy.
> 
> >  EXPORT_SYMBOL(cleancache_register_ops);
> > @@ -61,15 +91,38 @@ EXPORT_SYMBOL(cleancache_register_ops);
> >  /* Called by a cleancache-enabled filesystem at time of mount */
> >  void __cleancache_init_fs(struct super_block *sb)
> >  {
> > -	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
> > +	int i;
> > +
> > +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> > +		if (fs_poolid_map[i] == FS_UNKNOWN) {
> > +			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
> > +			if (backend_registered)
> > +				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
> > +			else
> > +				fs_poolid_map[i] = FS_NO_BACKEND;
> > +			break;
> > +		}
> > +	}
> >  }
> >  EXPORT_SYMBOL(__cleancache_init_fs);
> 
> This also looks wildly racy.
> 
> >  /* Called by a cleancache-enabled clustered filesystem at time of mount */
> >  void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
> >  {
> > -	sb->cleancache_poolid =
> > -		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
> > +	int i;
> > +
> > +	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
> > +		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
> > +			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
> > +			uuids[i] = uuid;
> > +			if (backend_registered)
> > +				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
> > +						(uuid, PAGE_SIZE);
> > +			else
> > +				shared_fs_poolid_map[i] = FS_NO_BACKEND;
> > +			break;
> > +		}
> > +	}
> >  }
> >  EXPORT_SYMBOL(__cleancache_init_shared_fs);
> 
> Again, a huge mess if two threads execute this concurrently.
> 
> > @@ -99,6 +152,19 @@ static int cleancache_get_key(struct inode *inode,
> >  }
> >  
> >  /*
> > + * Returns a pool_id that is associated with a given fake poolid.
> 
> Was there a comment anywhere which tells the reader what a "fake poolid" is?
> 
> > + */
> > +static int get_poolid_from_fake(int fake_pool_id)
> > +{
> > +	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
> > +		return shared_fs_poolid_map[fake_pool_id -
> > +			FAKE_SHARED_FS_POOLID_OFFSET];
> > +	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
> > +		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
> > +	return FS_NO_BACKEND;
> > +}
> > +
> > +/*
> >   * "Get" data from cleancache associated with the poolid/inode/index
> >   * that were specified when the data was put to cleanache and, if
> >   * successful, use it to fill the specified page with data and return 0.
> > @@ -109,17 +175,26 @@ int __cleancache_get_page(struct page *page)
> >  {
> >  	int ret = -1;
> >  	int pool_id;
> > +	int fake_pool_id;
> >  	struct cleancache_filekey key = { .u.key = { 0 } };
> >  
> > +	if (!backend_registered) {
> > +		cleancache_failed_gets++;
> > +		goto out;
> > +	}
> 
> Races everywhere...
> 
> >  	VM_BUG_ON(!PageLocked(page));
> > -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> > -	if (pool_id < 0)
> > +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> > +	if (fake_pool_id < 0)
> >  		goto out;
> > +	pool_id = get_poolid_from_fake(fake_pool_id);
> >  
> >  	if (cleancache_get_key(page->mapping->host, &key) < 0)
> >  		goto out;
> >  
> > -	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
> > +	if (pool_id >= 0)
> > +		ret = (*cleancache_ops.get_page)(pool_id,
> > +				key, page->index, page);
> >  	if (ret == 0)
> >  		cleancache_succ_gets++;
> >  	else
> > @@ -138,12 +213,23 @@ EXPORT_SYMBOL(__cleancache_get_page);
> >  void __cleancache_put_page(struct page *page)
> >  {
> >  	int pool_id;
> > +	int fake_pool_id;
> >  	struct cleancache_filekey key = { .u.key = { 0 } };
> >  
> > +	if (!backend_registered) {
> > +		cleancache_puts++;
> > +		return;
> > +	}
> 
> More..
> 
> >  	VM_BUG_ON(!PageLocked(page));
> > -	pool_id = page->mapping->host->i_sb->cleancache_poolid;
> > +	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
> > +	if (fake_pool_id < 0)
> > +		return;
> > +
> > +	pool_id = get_poolid_from_fake(fake_pool_id);
> > +
> >  	if (pool_id >= 0 &&
> > -	      cleancache_get_key(page->mapping->host, &key) >= 0) {
> > +		cleancache_get_key(page->mapping->host, &key) >= 0) {
> >  		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
> >  		cleancache_puts++;
> >  	}
> >
> > ...
> >
> 
> I don't understand the timing flow here, nor the existing constraints
> on what can be done and when, but....
> 
> The whole thing looks really hacky?  Why do we need to remember all
> this stuff for later on?  What prevents us from simply synchonously
> doing whatever we need to do when someone wants to register a backend?
> 
> Maybe a little ascii time chart/flow diagram would help.

This patch hopefully answers the questions and comments. It has a bit
of an a), then b), then c) type chart to illustrate the issue of backends
being registered asynchronously.

>From aa306144b1be66cc3e53299a454e8fe2506e19c0 Mon Sep 17 00:00:00 2001
From: Dan Magenheimer <dan.magenheimer@oracle.com>
Date: Wed, 14 Nov 2012 13:57:05 -0500
Subject: [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem
 backends to build/run as modules

With the goal of allowing tmem backends (zcache, ramster, Xen tmem) to be
built/loaded as modules rather than built-in and enabled by a boot parameter,
this patch provides "lazy initialization", allowing backends to register to
cleancache even after filesystems were mounted. Calls to init_fs and
init_shared_fs are remembered as fake poolids but no real tmem_pools created.
On backend registration the fake poolids are mapped to real poolids and
respective tmem_pools.

Signed-off-by: Stefan Hengelein <ilendir@googlemail.com>
Signed-off-by: Florian Schmaus <fschmaus@gmail.com>
Signed-off-by: Andor Daam <andor.daam@googlemail.com>
Signed-off-by: Dan Magenheimer <dan.magenheimer@oracle.com>
[v1: Minor fixes: used #define for some values and bools]
[v2: Removed CLEANCACHE_HAS_LAZY_INIT]
[v3: Added more comments, added a lock for [shared_|]fs_poolid_map]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 mm/cleancache.c | 240 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 219 insertions(+), 21 deletions(-)

diff --git a/mm/cleancache.c b/mm/cleancache.c
index 32e6f41..e4dc314 100644
--- a/mm/cleancache.c
+++ b/mm/cleancache.c
@@ -45,15 +45,99 @@ static u64 cleancache_puts;
 static u64 cleancache_invalidates;
 
 /*
- * register operations for cleancache, returning previous thus allowing
- * detection of multiple backends and possible nesting
+ * When no backend is registered all calls to init_fs and init_shared_fs
+ * are recorded and fake poolids (FAKE_FS_POOLID_OFFSET or
+ * FAKE_SHARED_FS_POOLID_OFFSET, plus the offset in the respective array
+ * [shared_|]fs_poolid_map) are given to the respective super block
+ * (sb->cleancache_poolid) and no tmem_pools are created. When a backend
+ * registers with cleancache the recorded calls to init_fs and init_shared_fs
+ * are replayed to create tmem_pools and set the respective poolids. While no
+ * backend is registered all "puts", "gets" and "flushes" are ignored or fail.
+ */
+#define MAX_INITIALIZABLE_FS 32
+#define FAKE_FS_POOLID_OFFSET 1000
+#define FAKE_SHARED_FS_POOLID_OFFSET 2000
+
+#define FS_NO_BACKEND (-1)
+#define FS_UNKNOWN (-2)
+static int fs_poolid_map[MAX_INITIALIZABLE_FS];
+static int shared_fs_poolid_map[MAX_INITIALIZABLE_FS];
+static char *uuids[MAX_INITIALIZABLE_FS];
+/*
+ * Mutex for the [shared_|]fs_poolid_map to guard against multiple threads
+ * invoking umount (and ending in __cleancache_invalidate_fs) and also multiple
+ * threads calling mount (and ending up in __cleancache_init_[shared|]fs).
+ */
+static DEFINE_MUTEX(poolid_mutex);
+/*
+ * When set to false (the default) all calls to the cleancache functions,
+ * except __cleancache_invalidate_fs and __cleancache_init_[shared|]fs, are
+ * guarded by an "if (!backend_registered) return". This means multiple
+ * threads (from different filesystems) will be checking backend_registered.
+ * Using a plain bool instead of an atomic_t, or a bool guarded by a
+ * spinlock, is OK here - a window between the backend being initialized
+ * (and backend_registered set to true) and the filesystems actually
+ * calling into the backend is harmless. The inverse (when unloading) is
+ * obviously not good - but this shim does not do that (yet).
+ */
+static bool backend_registered __read_mostly;
+
+/*
+ * The backends and filesystems all work asynchronously. This is b/c the
+ * backends can be built as modules.
+ * The usual sequence of events is:
+ * 	a) mount /	-> __cleancache_init_fs is called. We set the
+ * 		[shared_|]fs_poolid_map and uuids for it.
+ *
+ * 	b) user does I/Os -> we call the rest of __cleancache_* functions
+ * 		which return immediately as backend_registered is false.
+ *
+ * 	c) modprobe zcache -> cleancache_register_ops. We init the backend
+ * 		and set backend_registered to true, and for any fs_poolid_map
+ * 		(which is set by __cleancache_init_fs) we initialize the poolid.
+ *
+ * 	d) user does I/Os -> now that backend_registered is true all the
+ * 		__cleancache_* functions can call the backend. They all check
+ * 		that fs_poolid_map is valid and if so invoke the backend.
+ *
+ * 	e) umount /	-> __cleancache_invalidate_fs, the fs_poolid_map is
+ * 		reset (which is the second check in the __cleancache_* ops
+ * 		to call the backend).
+ *
+ * The sequence of events could also be c), followed by a), d) and e). In
+ * that case the lazy path is not taken - a) creates the tmem_pool directly
+ * since the backend is already registered. There is also the chance of c),
+ * one thread doing a) + d), and another doing e). For that we depend on
+ * the filesystem calling __cleancache_invalidate_fs last in the unmount
+ * sequence, after it has handled all outstanding I/Os.
+ *
+ * Note: The astute reader will notice that there is no "rmmod zcache" case.
+ * This is b/c the functionality for that is not yet implemented and when
+ * done, will require some extra locking not yet devised.
+ */
+
+/*
+ * Register operations for cleancache, returning previous thus allowing
+ * detection of multiple backends and possible nesting.
  */
 struct cleancache_ops cleancache_register_ops(struct cleancache_ops *ops)
 {
 	struct cleancache_ops old = cleancache_ops;
+	int i;
 
+	mutex_lock(&poolid_mutex);
 	cleancache_ops = *ops;
-	cleancache_enabled = 1;
+
+	backend_registered = true;
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (fs_poolid_map[i] == FS_NO_BACKEND)
+			fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+		if (shared_fs_poolid_map[i] == FS_NO_BACKEND)
+			shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+					(uuids[i], PAGE_SIZE);
+	}
+
+	mutex_unlock(&poolid_mutex);
 	return old;
 }
 EXPORT_SYMBOL(cleancache_register_ops);
@@ -61,15 +145,42 @@ EXPORT_SYMBOL(cleancache_register_ops);
 /* Called by a cleancache-enabled filesystem at time of mount */
 void __cleancache_init_fs(struct super_block *sb)
 {
-	sb->cleancache_poolid = (*cleancache_ops.init_fs)(PAGE_SIZE);
+	int i;
+
+	mutex_lock(&poolid_mutex);
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (fs_poolid_map[i] == FS_UNKNOWN) {
+			sb->cleancache_poolid = i + FAKE_FS_POOLID_OFFSET;
+			if (backend_registered)
+				fs_poolid_map[i] = (*cleancache_ops.init_fs)(PAGE_SIZE);
+			else
+				fs_poolid_map[i] = FS_NO_BACKEND;
+			break;
+		}
+	}
+	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_init_fs);
 
 /* Called by a cleancache-enabled clustered filesystem at time of mount */
 void __cleancache_init_shared_fs(char *uuid, struct super_block *sb)
 {
-	sb->cleancache_poolid =
-		(*cleancache_ops.init_shared_fs)(uuid, PAGE_SIZE);
+	int i;
+
+	mutex_lock(&poolid_mutex);
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		if (shared_fs_poolid_map[i] == FS_UNKNOWN) {
+			sb->cleancache_poolid = i + FAKE_SHARED_FS_POOLID_OFFSET;
+			uuids[i] = uuid;
+			if (backend_registered)
+				shared_fs_poolid_map[i] = (*cleancache_ops.init_shared_fs)
+						(uuid, PAGE_SIZE);
+			else
+				shared_fs_poolid_map[i] = FS_NO_BACKEND;
+			break;
+		}
+	}
+	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_init_shared_fs);
 
@@ -99,27 +210,53 @@ static int cleancache_get_key(struct inode *inode,
 }
 
 /*
+ * Returns a pool_id that is associated with a given fake poolid.
+ */
+static int get_poolid_from_fake(int fake_pool_id)
+{
+	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET)
+		return shared_fs_poolid_map[fake_pool_id -
+			FAKE_SHARED_FS_POOLID_OFFSET];
+	else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET)
+		return fs_poolid_map[fake_pool_id - FAKE_FS_POOLID_OFFSET];
+	return FS_NO_BACKEND;
+}
+
+/*
  * "Get" data from cleancache associated with the poolid/inode/index
  * that were specified when the data was put to cleancache and, if
  * successful, use it to fill the specified page with data and return 0.
  * The pageframe is unchanged and -1 is returned if the get fails.
  * Page must be locked by caller.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 int __cleancache_get_page(struct page *page)
 {
 	int ret = -1;
 	int pool_id;
+	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered) {
+		cleancache_failed_gets++;
+		goto out;
+	}
+
 	VM_BUG_ON(!PageLocked(page));
-	pool_id = page->mapping->host->i_sb->cleancache_poolid;
-	if (pool_id < 0)
+	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	if (fake_pool_id < 0)
 		goto out;
+	pool_id = get_poolid_from_fake(fake_pool_id);
 
 	if (cleancache_get_key(page->mapping->host, &key) < 0)
 		goto out;
 
-	ret = (*cleancache_ops.get_page)(pool_id, key, page->index, page);
+	if (pool_id >= 0)
+		ret = (*cleancache_ops.get_page)(pool_id,
+				key, page->index, page);
 	if (ret == 0)
 		cleancache_succ_gets++;
 	else
@@ -134,16 +271,31 @@ EXPORT_SYMBOL(__cleancache_get_page);
  * (previously-obtained per-filesystem) poolid and the page's
  * inode and page index.  Page must be locked.  Note that a put_page
  * always "succeeds", though a subsequent get_page may succeed or fail.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 void __cleancache_put_page(struct page *page)
 {
 	int pool_id;
+	int fake_pool_id;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered) {
+		cleancache_puts++;
+		return;
+	}
+
 	VM_BUG_ON(!PageLocked(page));
-	pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	fake_pool_id = page->mapping->host->i_sb->cleancache_poolid;
+	if (fake_pool_id < 0)
+		return;
+
+	pool_id = get_poolid_from_fake(fake_pool_id);
+
 	if (pool_id >= 0 &&
-	      cleancache_get_key(page->mapping->host, &key) >= 0) {
+		cleancache_get_key(page->mapping->host, &key) >= 0) {
 		(*cleancache_ops.put_page)(pool_id, key, page->index, page);
 		cleancache_puts++;
 	}
@@ -153,19 +305,31 @@ EXPORT_SYMBOL(__cleancache_put_page);
 /*
  * Invalidate any data from cleancache associated with the poolid and the
  * page's inode and page index so that a subsequent "get" will fail.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 void __cleancache_invalidate_page(struct address_space *mapping,
 					struct page *page)
 {
 	/* careful... page->mapping is NULL sometimes when this is called */
-	int pool_id = mapping->host->i_sb->cleancache_poolid;
+	int pool_id;
+	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
-	if (pool_id >= 0) {
+	if (!backend_registered)
+		return;
+
+	if (fake_pool_id >= 0) {
+		pool_id = get_poolid_from_fake(fake_pool_id);
+		if (pool_id < 0)
+			return;
+
 		VM_BUG_ON(!PageLocked(page));
 		if (cleancache_get_key(mapping->host, &key) >= 0) {
 			(*cleancache_ops.invalidate_page)(pool_id,
-							  key, page->index);
+					key, page->index);
 			cleancache_invalidates++;
 		}
 	}
@@ -176,12 +340,25 @@ EXPORT_SYMBOL(__cleancache_invalidate_page);
  * Invalidate all data from cleancache associated with the poolid and the
  * mapping's inode so that all subsequent gets to this poolid/inode
  * will fail.
+ *
+ * The function has two checks before any action is taken - whether
+ * a backend is registered and whether the sb->cleancache_poolid
+ * is correct.
  */
 void __cleancache_invalidate_inode(struct address_space *mapping)
 {
-	int pool_id = mapping->host->i_sb->cleancache_poolid;
+	int pool_id;
+	int fake_pool_id = mapping->host->i_sb->cleancache_poolid;
 	struct cleancache_filekey key = { .u.key = { 0 } };
 
+	if (!backend_registered)
+		return;
+
+	if (fake_pool_id < 0)
+		return;
+
+	pool_id = get_poolid_from_fake(fake_pool_id);
+
 	if (pool_id >= 0 && cleancache_get_key(mapping->host, &key) >= 0)
 		(*cleancache_ops.invalidate_inode)(pool_id, key);
 }
@@ -189,21 +366,37 @@ EXPORT_SYMBOL(__cleancache_invalidate_inode);
 
 /*
  * Called by any cleancache-enabled filesystem at time of unmount;
- * note that pool_id is surrendered and may be reutrned by a subsequent
- * cleancache_init_fs or cleancache_init_shared_fs
+ * note that pool_id is surrendered and may be returned by a subsequent
+ * cleancache_init_fs or cleancache_init_shared_fs.
  */
 void __cleancache_invalidate_fs(struct super_block *sb)
 {
-	if (sb->cleancache_poolid >= 0) {
-		int old_poolid = sb->cleancache_poolid;
-		sb->cleancache_poolid = -1;
-		(*cleancache_ops.invalidate_fs)(old_poolid);
+	int index;
+	int fake_pool_id = sb->cleancache_poolid;
+	int old_poolid = fake_pool_id;
+
+	mutex_lock(&poolid_mutex);
+	if (fake_pool_id >= FAKE_SHARED_FS_POOLID_OFFSET) {
+		index = fake_pool_id - FAKE_SHARED_FS_POOLID_OFFSET;
+		old_poolid = shared_fs_poolid_map[index];
+		shared_fs_poolid_map[index] = FS_UNKNOWN;
+		uuids[index] = NULL;
+	} else if (fake_pool_id >= FAKE_FS_POOLID_OFFSET) {
+		index = fake_pool_id - FAKE_FS_POOLID_OFFSET;
+		old_poolid = fs_poolid_map[index];
+		fs_poolid_map[index] = FS_UNKNOWN;
 	}
+	sb->cleancache_poolid = -1;
+	if (backend_registered)
+		(*cleancache_ops.invalidate_fs)(old_poolid);
+	mutex_unlock(&poolid_mutex);
 }
 EXPORT_SYMBOL(__cleancache_invalidate_fs);
 
 static int __init init_cleancache(void)
 {
+	int i;
+
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *root = debugfs_create_dir("cleancache", NULL);
 	if (root == NULL)
@@ -215,6 +408,11 @@ static int __init init_cleancache(void)
 	debugfs_create_u64("invalidates", S_IRUGO,
 				root, &cleancache_invalidates);
 #endif
+	for (i = 0; i < MAX_INITIALIZABLE_FS; i++) {
+		fs_poolid_map[i] = FS_UNKNOWN;
+		shared_fs_poolid_map[i] = FS_UNKNOWN;
+	}
+	cleancache_enabled = 1;
 	return 0;
 }
 module_init(init_cleancache)
-- 
1.7.11.7
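
For completeness, here is a hedged sketch of the other side: what a
minimal backend module's init could look like against the interface
above. The toy_* stubs are hypothetical no-ops, not zcache's code - only
cleancache_register_ops() and the struct cleancache_ops layout are taken
from the patch. There is deliberately no module_exit, since (as noted in
the cover letter) unloading is not yet supported.

#include <linux/init.h>
#include <linux/module.h>
#include <linux/atomic.h>
#include <linux/cleancache.h>

static atomic_t toy_pool_count = ATOMIC_INIT(0);

/* Hypothetical backend: hands out poolids but stores nothing. */
static int toy_init_fs(size_t pagesize)
{
	return atomic_inc_return(&toy_pool_count) - 1;
}

static int toy_init_shared_fs(char *uuid, size_t pagesize)
{
	return atomic_inc_return(&toy_pool_count) - 1;
}

static int toy_get_page(int pool_id, struct cleancache_filekey key,
			pgoff_t index, struct page *page)
{
	return -1;	/* we never cached anything */
}

static void toy_put_page(int pool_id, struct cleancache_filekey key,
			 pgoff_t index, struct page *page)
{
}

static void toy_invalidate_page(int pool_id, struct cleancache_filekey key,
				pgoff_t index)
{
}

static void toy_invalidate_inode(int pool_id, struct cleancache_filekey key)
{
}

static void toy_invalidate_fs(int pool_id)
{
}

static struct cleancache_ops toy_ops = {
	.init_fs = toy_init_fs,
	.init_shared_fs = toy_init_shared_fs,
	.get_page = toy_get_page,
	.put_page = toy_put_page,
	.invalidate_page = toy_invalidate_page,
	.invalidate_inode = toy_invalidate_inode,
	.invalidate_fs = toy_invalidate_fs,
};

static int __init toy_init(void)
{
	/* This call replays any init_fs/init_shared_fs deferred above. */
	(void)cleancache_register_ops(&toy_ops);
	return 0;
}
module_init(toy_init);
MODULE_LICENSE("GPL");

The single cleancache_register_ops() call from module_init is the whole
trick: it maps the remembered fake poolids to real tmem_pools, no matter
how long after mount the module is loaded.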



Thread overview: 16+ messages
2012-11-14 18:57 [PATCH v2] enable all tmem backends to be built and loaded as modules Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 1/8] mm: cleancache: lazy initialization to allow tmem backends to build/run " Konrad Rzeszutek Wilk
2012-11-16 23:10   ` Andrew Morton
2013-01-30 15:57     ` Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 2/8] mm: frontswap: " Konrad Rzeszutek Wilk
2012-11-16 23:16   ` Andrew Morton
2012-11-19  0:53     ` Bob Liu
2012-11-19 22:25       ` Andrew Morton
2012-11-27 21:26         ` Konrad Rzeszutek Wilk
2013-01-30 15:55           ` Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 3/8] frontswap: Make frontswap_init use a pointer for the ops Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 4/8] cleancache: Make cleancache_init " Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 5/8] staging: zcache2+ramster: enable ramster to be built/loaded as a module Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 6/8] staging: zcache2+ramster: enable zcache2 " Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 7/8] xen: tmem: enable Xen tmem shim " Konrad Rzeszutek Wilk
2012-11-14 18:57 ` [PATCH 8/8] xen/tmem: Remove the subsys call Konrad Rzeszutek Wilk
