* [PATCH 0/8] devmem/kmem/kcore fixes, cleanups and hwpoison checks
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Andi Kleen, KAMEZAWA Hiroyuki, Nick Piggin,
	Hugh Dickins, Linux Memory Management List

Andrew,

Here are some patches for /dev/mem, /dev/kmem and /proc/kcore.
Most of them have been individually reviewed on LKML.

bug fixes
	[PATCH 1/8] vfs: fix too big f_pos handling
	[PATCH 2/8] devmem: check vmalloc address on kmem read/write
	[PATCH 3/8] devmem: fix kmem write bug on memory holes

simplify vread/vwrite 
	[PATCH 4/8] resources: introduce generic page_is_ram()
	[PATCH 5/8] vmalloc: simplify vread()/vwrite()

check for corrupted page
	[PATCH 6/8] hwpoison: prevent /dev/kmem from accessing hwpoison pages
	[PATCH 7/8] hwpoison: prevent /dev/mem from accessing hwpoison pages
	[PATCH 8/8] hwpoison: prevent /dev/kcore from accessing hwpoison pages

Thanks,
Fengguang


* [PATCH 1/8] vfs: fix too big f_pos handling
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Heiko Carstens, KAMEZAWA Hiroyuki,
	Andi Kleen, Nick Piggin, Hugh Dickins,
	Linux Memory Management List

[-- Attachment #1: f_pos-fix --]
[-- Type: text/plain, Size: 4017 bytes --]

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Currently, rw_verify_area() checks whether f_pos is negative and, if
so, returns -EINVAL.

But some special files, such as /dev/mem, /dev/kmem and
/proc/<pid>/mem, have legitimate negative offsets (a 64-bit kernel
address interpreted as a signed loff_t has its sign bit set), so no
read/write access to those files (devices) is possible.

This patch introduces a flag, S_VERYBIG, that allows negative file
offsets on such inodes.
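
To illustrate (a minimal userspace sketch for a 64-bit build; the
device is real, but the probe address is an illustrative assumption,
chosen only because its sign bit is set):

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("/dev/kmem", O_RDONLY);
		char buf[16];
		ssize_t n;

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* As a signed loff_t this offset is negative, so
		 * rw_verify_area() rejects it with -EINVAL unless
		 * the inode is marked S_VERYBIG. */
		n = pread(fd, buf, sizeof(buf),
			  (off_t)0xffff880000000000UL);
		if (n < 0)
			perror("pread");
		return 0;
	}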

Changelog: v4->v5
 - clean up patches for /dev/mem.
 - rebased onto 2.6.32-rc1

Changelog: v3->v4
 - make changes in mem.c aligned.
 - change __negative_fpos_check() to return int.
 - fixed bug in the "pos" check.
 - added comments.

Changelog: v2->v3
 - fixed bug in rw_verify_area (it could not compile)

CC: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 drivers/char/mem.c |    4 ++++
 fs/proc/base.c     |    2 ++
 fs/read_write.c    |   22 ++++++++++++++++++++--
 include/linux/fs.h |    2 ++
 4 files changed, 28 insertions(+), 2 deletions(-)

--- linux-mm.orig/fs/read_write.c	2010-01-13 21:23:04.000000000 +0800
+++ linux-mm/fs/read_write.c	2010-01-13 21:23:52.000000000 +0800
@@ -205,6 +205,21 @@ bad:
 }
 #endif
 
+static int
+__negative_fpos_check(struct inode *inode, loff_t pos, size_t count)
+{
+	/*
+	 * pos or pos+count is negative here, check overflow.
+	 * too big "count" will be caught in rw_verify_area().
+	 */
+	if ((pos < 0) && (pos + count < pos))
+		return -EOVERFLOW;
+	/* If !VERYBIG inode, negative pos(pos+count) is not allowed */
+	if (!IS_VERYBIG(inode))
+		return -EINVAL;
+	return 0;
+}
+
 /*
  * rw_verify_area doesn't like huge counts. We limit
  * them to something that fits in "int" so that others
@@ -222,8 +237,11 @@ int rw_verify_area(int read_write, struc
 	if (unlikely((ssize_t) count < 0))
 		return retval;
 	pos = *ppos;
-	if (unlikely((pos < 0) || (loff_t) (pos + count) < 0))
-		return retval;
+	if (unlikely((pos < 0) || (loff_t) (pos + count) < 0)) {
+		retval = __negative_fpos_check(inode, pos, count);
+		if (retval)
+			return retval;
+	}
 
 	if (unlikely(inode->i_flock && mandatory_lock(inode))) {
 		retval = locks_mandatory_area(
--- linux-mm.orig/include/linux/fs.h	2010-01-13 21:23:04.000000000 +0800
+++ linux-mm/include/linux/fs.h	2010-01-13 21:31:02.000000000 +0800
@@ -235,6 +235,7 @@ struct inodes_stat_t {
 #define S_NOCMTIME	128	/* Do not update file c/mtime */
 #define S_SWAPFILE	256	/* Do not truncate: swapon got its bmaps */
 #define S_PRIVATE	512	/* Inode is fs-internal */
+#define S_VERYBIG	1024	/* Inode is huge: treat loff_t as unsigned */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system
@@ -269,6 +270,7 @@ struct inodes_stat_t {
 #define IS_NOCMTIME(inode)	((inode)->i_flags & S_NOCMTIME)
 #define IS_SWAPFILE(inode)	((inode)->i_flags & S_SWAPFILE)
 #define IS_PRIVATE(inode)	((inode)->i_flags & S_PRIVATE)
+#define IS_VERYBIG(inode)	((inode)->i_flags & S_VERYBIG)
 
 /* the read-only stuff doesn't really belong here, but any other place is
    probably as bad and I don't want to create yet another include file. */
--- linux-mm.orig/drivers/char/mem.c	2010-01-13 21:23:11.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2010-01-13 21:27:28.000000000 +0800
@@ -861,6 +861,10 @@ static int memory_open(struct inode *ino
 	if (dev->dev_info)
 		filp->f_mapping->backing_dev_info = dev->dev_info;
 
+	/* Is /dev/mem or /dev/kmem ? */
+	if (dev->dev_info == &directly_mappable_cdev_bdi)
+		inode->i_flags |= S_VERYBIG;
+
 	if (dev->fops->open)
 		return dev->fops->open(inode, filp);
 
--- linux-mm.orig/fs/proc/base.c	2010-01-13 21:23:04.000000000 +0800
+++ linux-mm/fs/proc/base.c	2010-01-13 21:27:51.000000000 +0800
@@ -861,6 +861,8 @@ static const struct file_operations proc
 static int mem_open(struct inode* inode, struct file* file)
 {
 	file->private_data = (void*)((long)current->self_exec_id);
+	/* this file is read only and we can catch out-of-range */
+	inode->i_flags |= S_VERYBIG;
 	return 0;
 }
 




* [PATCH 2/8] devmem: check vmalloc address on kmem read/write
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Kelly Bowa, Greg Kroah-Hartman, Hugh Dickins,
	stable, KAMEZAWA Hiroyuki, Andi Kleen, Nick Piggin,
	Linux Memory Management List

[-- Attachment #1: vmalloc-addr-fix.patch --]
[-- Type: text/plain, Size: 2742 bytes --]

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Check that the address passed to the kmem read/write vmalloc path is
a valid vmalloc or module address; otherwise vmalloc_to_page() will
BUG() on it.

This also aligns the kmem read/write implementation with mem(4):
"References to nonexistent locations cause errors to be returned." Here
we return -ENXIO (inspired by Hugh) if no bytes have been transferred
to/from user space, otherwise return the partial read/write result.
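
A small userspace sketch of the resulting semantics (hedged: the probe
address below is an arbitrary assumption, standing in for a kernel
virtual address above high_memory with no vmalloc/module mapping; it
also relies on patch 1/8 so the negative offset is accepted at all):

	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("/dev/kmem", O_RDONLY);
		char buf[64];
		ssize_t n;

		if (fd < 0) {
			perror("open");
			return 1;
		}
		n = pread(fd, buf, sizeof(buf),
			  (off_t)0xfffffe0000000000UL);
		if (n < 0 && errno == ENXIO)
			printf("nonexistent location -> ENXIO\n");
		else
			printf("read %zd bytes\n", n);
		return 0;
	}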

CC: Kelly Bowa <kmb@tuxedu.org>
CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: <stable@kernel.org>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |   28 ++++++++++++++++++----------
 1 file changed, 18 insertions(+), 10 deletions(-)

--- linux-mm.orig/drivers/char/mem.c	2010-01-11 10:22:35.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2010-01-11 10:32:32.000000000 +0800
@@ -395,6 +395,7 @@ static ssize_t read_kmem(struct file *fi
 	unsigned long p = *ppos;
 	ssize_t low_count, read, sz;
 	char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
+	int err = 0;
 
 	read = 0;
 	if (p < (unsigned long) high_memory) {
@@ -441,12 +442,16 @@ static ssize_t read_kmem(struct file *fi
 			return -ENOMEM;
 		while (count > 0) {
 			sz = size_inside_page(p, count);
+			if (!is_vmalloc_or_module_addr((void *)p)) {
+				err = -ENXIO;
+				break;
+			}
 			sz = vread(kbuf, (char *)p, sz);
 			if (!sz)
 				break;
 			if (copy_to_user(buf, kbuf, sz)) {
-				free_page((unsigned long)kbuf);
-				return -EFAULT;
+				err = -EFAULT;
+				break;
 			}
 			count -= sz;
 			buf += sz;
@@ -455,8 +460,8 @@ static ssize_t read_kmem(struct file *fi
 		}
 		free_page((unsigned long)kbuf);
 	}
- 	*ppos = p;
- 	return read;
+	*ppos = p;
+	return read ? read : err;
 }
 
 
@@ -520,6 +525,7 @@ static ssize_t write_kmem(struct file * 
 	ssize_t wrote = 0;
 	ssize_t virtr = 0;
 	char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
+	int err = 0;
 
 	if (p < (unsigned long) high_memory) {
 		unsigned long to_write = min_t(unsigned long, count,
@@ -540,12 +546,14 @@ static ssize_t write_kmem(struct file * 
 			unsigned long sz = size_inside_page(p, count);
 			unsigned long n;
 
+			if (!is_vmalloc_or_module_addr((void *)p)) {
+				err = -ENXIO;
+				break;
+			}
 			n = copy_from_user(kbuf, buf, sz);
 			if (n) {
-				if (wrote + virtr)
-					break;
-				free_page((unsigned long)kbuf);
-				return -EFAULT;
+				err = -EFAULT;
+				break;
 			}
 			sz = vwrite(kbuf, (char *)p, sz);
 			count -= sz;
@@ -556,8 +564,8 @@ static ssize_t write_kmem(struct file * 
 		free_page((unsigned long)kbuf);
 	}
 
- 	*ppos = p;
- 	return virtr + wrote;
+	*ppos = p;
+	return virtr + wrote ? : err;
 }
 #endif
 




* [PATCH 3/8] devmem: fix kmem write bug on memory holes
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Kelly Bowa, Andi Kleen,
	Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar,
	Tejun Heo, Nick Piggin, KAMEZAWA Hiroyuki, stable, Hugh Dickins,
	Linux Memory Management List

[-- Attachment #1: vwrite-fix.patch --]
[-- Type: text/plain, Size: 1105 bytes --]

write_kmem() used to assume that vwrite() always returns the full
buffer length. However, vwrite() can now return 0 to indicate a memory
hole. This creates a bug: "buf" is not advanced accordingly.

Fix it by simply ignoring the return value, and hence the memory hole.
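
For clarity, the failure mode against the (abridged) write_kmem()
loop; this is a paraphrase, not the literal kernel code:

	while (count > 0) {
		unsigned long sz = size_inside_page(p, count);

		/* old: sz = vwrite(kbuf, (char *)p, sz);
		 * on a memory hole vwrite() returns 0, so every
		 * advance below becomes "+= 0" and the loop makes
		 * no progress on that page */
		vwrite(kbuf, (char *)p, sz);	/* fix: keep sz intact */
		count -= sz;
		buf += sz;
		virtr += sz;
		p += sz;
	}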

CC: Kelly Bowa <kmb@tuxedu.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Nick Piggin <npiggin@suse.de>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: <stable@kernel.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-mm.orig/drivers/char/mem.c	2010-01-11 10:32:32.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2010-01-11 10:32:34.000000000 +0800
@@ -555,7 +555,7 @@ static ssize_t write_kmem(struct file * 
 				err = -EFAULT;
 				break;
 			}
-			sz = vwrite(kbuf, (char *)p, sz);
+			vwrite(kbuf, (char *)p, sz);
 			count -= sz;
 			buf += sz;
 			virtr += sz;




* [PATCH 4/8] resources: introduce generic page_is_ram()
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Chen Liqin, Lennox Wu, Ralf Baechle,
	linux-mips, KAMEZAWA Hiroyuki, Andi Kleen, Nick Piggin,
	Hugh Dickins, Linux Memory Management List

[-- Attachment #1: page-is-ram.patch --]
[-- Type: text/plain, Size: 2303 bytes --]

It's based on walk_system_ram_range(), for archs that don't have
their own page_is_ram(). walk_system_ram_range() stops at the first
"System RAM" range intersecting the pfn and returns the callback's
(nonzero) return value, or an error when no such range exists, so
comparing against the arbitrary sentinel value 24 is true exactly
when the pfn is backed by RAM.

The static versions in MIPS and SCORE are also made global.
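
The generic version is declared weak, so an architecture's own strong
definition simply takes precedence at link time; a minimal sketch of
the pattern (arch code paraphrased from the SCORE hunk below):

	/* kernel/resource.c: weak generic fallback */
	int __attribute__((weak)) page_is_ram(unsigned long pfn)
	{
		return 24 == walk_system_ram_range(pfn, 1, NULL, __is_ram);
	}

	/* arch/score/mm/init.c: the strong definition overrides the
	 * weak one, reducing the test to a simple pfn range check */
	int page_is_ram(unsigned long pagenr)
	{
		return pagenr >= min_low_pfn && pagenr < max_low_pfn;
	}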

CC: Chen Liqin <liqin.chen@sunplusct.com>
CC: Lennox Wu <lennox.wu@gmail.com>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: linux-mips@linux-mips.org
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> 
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 arch/mips/mm/init.c    |    2 +-
 arch/score/mm/init.c   |    2 +-
 include/linux/ioport.h |    2 ++
 kernel/resource.c      |   10 ++++++++++
 4 files changed, 14 insertions(+), 2 deletions(-)

--- linux-mm.orig/kernel/resource.c	2010-01-10 10:11:53.000000000 +0800
+++ linux-mm/kernel/resource.c	2010-01-10 10:15:33.000000000 +0800
@@ -297,6 +297,16 @@ int walk_system_ram_range(unsigned long 
 
 #endif
 
+static int __is_ram(unsigned long pfn, unsigned long nr_pages, void *arg)
+{
+	return 24;
+}
+
+int __attribute__((weak)) page_is_ram(unsigned long pfn)
+{
+	return 24 == walk_system_ram_range(pfn, 1, NULL, __is_ram);
+}
+
 /*
  * Find empty slot in the resource tree given range and alignment.
  */
--- linux-mm.orig/include/linux/ioport.h	2010-01-10 10:11:53.000000000 +0800
+++ linux-mm/include/linux/ioport.h	2010-01-10 10:11:54.000000000 +0800
@@ -188,5 +188,7 @@ extern int
 walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *));
 
+extern int page_is_ram(unsigned long pfn);
+
 #endif /* __ASSEMBLY__ */
 #endif	/* _LINUX_IOPORT_H */
--- linux-mm.orig/arch/score/mm/init.c	2010-01-10 10:35:38.000000000 +0800
+++ linux-mm/arch/score/mm/init.c	2010-01-10 10:38:04.000000000 +0800
@@ -59,7 +59,7 @@ static unsigned long setup_zero_page(voi
 }
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
-static int __init page_is_ram(unsigned long pagenr)
+int page_is_ram(unsigned long pagenr)
 {
 	if (pagenr >= min_low_pfn && pagenr < max_low_pfn)
 		return 1;
--- linux-mm.orig/arch/mips/mm/init.c	2010-01-10 10:37:22.000000000 +0800
+++ linux-mm/arch/mips/mm/init.c	2010-01-10 10:37:26.000000000 +0800
@@ -298,7 +298,7 @@ void __init fixrange_init(unsigned long 
 }
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
-static int __init page_is_ram(unsigned long pagenr)
+int page_is_ram(unsigned long pagenr)
 {
 	int i;
 




* [PATCH 5/8] vmalloc: simplify vread()/vwrite()
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Tejun Heo, Ingo Molnar, Nick Piggin,
	Andi Kleen, Hugh Dickins, Christoph Lameter, KAMEZAWA Hiroyuki,
	Linux Memory Management List

[-- Attachment #1: vread-vwrite-simplify.patch --]
[-- Type: text/plain, Size: 11866 bytes --]

vread()/vwrite() are only called from kcore/kmem to access one page at
a time, so the logic can be vastly simplified.

The changes are:
- remove the vmlist walk and rely solely on vmalloc_to_page()
- replace the VM_IOREMAP check with (page && page_is_ram(pfn))
- rename to vread_page()/vwrite_page()

The page_is_ram() check is necessary because kmap_atomic() is not
designed to work with non-RAM pages.

Note that even for a RAM page, we don't own the page, and cannot assume
it's a _PAGE_CACHE_WB page.
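
For reference, the single-page calling convention the new helpers
assume, abridged from the read_kmem() loop in the diff below (a
paraphrase, not the verbatim kernel code):

	while (count > 0) {
		/* size_inside_page() never crosses a page boundary,
		 * which is what lets vread_page()/vwrite_page()
		 * BUG_ON() a request larger than the rest of the page */
		sz = size_inside_page(p, count);
		if (!is_vmalloc_or_module_addr((void *)p)) {
			err = -ENXIO;
			break;
		}
		sz = vread_page(kbuf, (char *)p, sz);
		if (!sz)
			break;	/* not vmalloc'ed, or mapped to non-RAM */
		if (copy_to_user(buf, kbuf, sz)) {
			err = -EFAULT;
			break;
		}
		p += sz;
		buf += sz;
		count -= sz;
	}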

CC: Tejun Heo <tj@kernel.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Nick Piggin <npiggin@suse.de>
CC: Andi Kleen <andi@firstfloor.org> 
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c      |    8 -
 fs/proc/kcore.c         |    2 
 include/linux/vmalloc.h |    6 
 mm/vmalloc.c            |  230 ++++++++------------------------------
 4 files changed, 58 insertions(+), 188 deletions(-)

--- linux-mm.orig/mm/vmalloc.c	2010-01-13 21:23:05.000000000 +0800
+++ linux-mm/mm/vmalloc.c	2010-01-13 21:25:38.000000000 +0800
@@ -1646,232 +1646,102 @@ void *vmalloc_32_user(unsigned long size
 }
 EXPORT_SYMBOL(vmalloc_32_user);
 
-/*
- * small helper routine , copy contents to buf from addr.
- * If the page is not present, fill zero.
- */
-
-static int aligned_vread(char *buf, char *addr, unsigned long count)
-{
-	struct page *p;
-	int copied = 0;
-
-	while (count) {
-		unsigned long offset, length;
-
-		offset = (unsigned long)addr & ~PAGE_MASK;
-		length = PAGE_SIZE - offset;
-		if (length > count)
-			length = count;
-		p = vmalloc_to_page(addr);
-		/*
-		 * To do safe access to this _mapped_ area, we need
-		 * lock. But adding lock here means that we need to add
-		 * overhead of vmalloc()/vfree() calles for this _debug_
-		 * interface, rarely used. Instead of that, we'll use
-		 * kmap() and get small overhead in this access function.
-		 */
-		if (p) {
-			/*
-			 * we can expect USER0 is not used (see vread/vwrite's
-			 * function description)
-			 */
-			void *map = kmap_atomic(p, KM_USER0);
-			memcpy(buf, map + offset, length);
-			kunmap_atomic(map, KM_USER0);
-		} else
-			memset(buf, 0, length);
-
-		addr += length;
-		buf += length;
-		copied += length;
-		count -= length;
-	}
-	return copied;
-}
-
-static int aligned_vwrite(char *buf, char *addr, unsigned long count)
-{
-	struct page *p;
-	int copied = 0;
-
-	while (count) {
-		unsigned long offset, length;
-
-		offset = (unsigned long)addr & ~PAGE_MASK;
-		length = PAGE_SIZE - offset;
-		if (length > count)
-			length = count;
-		p = vmalloc_to_page(addr);
-		/*
-		 * To do safe access to this _mapped_ area, we need
-		 * lock. But adding lock here means that we need to add
-		 * overhead of vmalloc()/vfree() calles for this _debug_
-		 * interface, rarely used. Instead of that, we'll use
-		 * kmap() and get small overhead in this access function.
-		 */
-		if (p) {
-			/*
-			 * we can expect USER0 is not used (see vread/vwrite's
-			 * function description)
-			 */
-			void *map = kmap_atomic(p, KM_USER0);
-			memcpy(map + offset, buf, length);
-			kunmap_atomic(map, KM_USER0);
-		}
-		addr += length;
-		buf += length;
-		copied += length;
-		count -= length;
-	}
-	return copied;
-}
-
 /**
- *	vread() -  read vmalloc area in a safe way.
+ *	vread_page() -  read vmalloc area in a safe way.
  *	@buf:		buffer for reading data
  *	@addr:		vm address.
- *	@count:		number of bytes to be read.
+ *	@count:		number of bytes to read inside the page.
  *
- *	Returns # of bytes which addr and buf should be increased.
- *	(same number to @count). Returns 0 if [addr...addr+count) doesn't
- *	includes any intersect with alive vmalloc area.
+ *	Returns # of bytes copied on success.
+ *	Returns 0 if @addr is not vmalloc'ed, or is mapped to non-RAM.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from that area to a given buffer. If the given memory range
  *	of [addr...addr+count) includes some valid address, data is copied to
  *	proper area of @buf. If there are memory holes, they'll be zero-filled.
- *	IOREMAP area is treated as memory hole and no copy is done.
  *
- *	If [addr...addr+count) doesn't includes any intersects with alive
- *	vm_struct area, returns 0.
  *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
  *	the caller should guarantee KM_USER0 is not used.
  *
- *	Note: In usual ops, vread() is never necessary because the caller
+ *	Note: In usual ops, vread_page() is never necessary because the caller
  *	should know vmalloc() area is valid and can use memcpy().
  *	This is for routines which have to access vmalloc area without
- *	any informaion, as /dev/kmem.
+ *	any informaion, as /dev/kmem and /dev/kcore.
  *
  */
 
-long vread(char *buf, char *addr, unsigned long count)
+int vread_page(char *buf, char *addr, unsigned int count)
 {
-	struct vm_struct *tmp;
-	char *vaddr, *buf_start = buf;
-	unsigned long buflen = count;
-	unsigned long n;
-
-	/* Don't allow overflow */
-	if ((unsigned long) addr + count < count)
-		count = -(unsigned long) addr;
+	struct page *p;
+	void *map;
+	int offset = (unsigned long)addr & (PAGE_SIZE - 1);
 
-	read_lock(&vmlist_lock);
-	for (tmp = vmlist; count && tmp; tmp = tmp->next) {
-		vaddr = (char *) tmp->addr;
-		if (addr >= vaddr + tmp->size - PAGE_SIZE)
-			continue;
-		while (addr < vaddr) {
-			if (count == 0)
-				goto finished;
-			*buf = '\0';
-			buf++;
-			addr++;
-			count--;
-		}
-		n = vaddr + tmp->size - PAGE_SIZE - addr;
-		if (n > count)
-			n = count;
-		if (!(tmp->flags & VM_IOREMAP))
-			aligned_vread(buf, addr, n);
-		else /* IOREMAP area is treated as memory hole */
-			memset(buf, 0, n);
-		buf += n;
-		addr += n;
-		count -= n;
-	}
-finished:
-	read_unlock(&vmlist_lock);
+	/* Assume subpage access */
+	BUG_ON(count > PAGE_SIZE - offset);
 
-	if (buf == buf_start)
+	p = vmalloc_to_page(addr);
+	if (!p || !page_is_ram(page_to_pfn(p))) {
+		memset(buf, 0, count);
 		return 0;
-	/* zero-fill memory holes */
-	if (buf != buf_start + buflen)
-		memset(buf, 0, buflen - (buf - buf_start));
+	}
 
-	return buflen;
+	/*
+	 * To do safe access to this _mapped_ area, we need
+	 * lock. But adding lock here means that we need to add
+	 * overhead of vmalloc()/vfree() calles for this _debug_
+	 * interface, rarely used. Instead of that, we'll use
+	 * kmap() and get small overhead in this access function.
+	 */
+	map = kmap_atomic(p, KM_USER0);
+	memcpy(buf, map + offset, count);
+	kunmap_atomic(map, KM_USER0);
+
+	return count;
 }
 
 /**
- *	vwrite() -  write vmalloc area in a safe way.
+ *	vwrite_page() -  write vmalloc area in a safe way.
  *	@buf:		buffer for source data
  *	@addr:		vm address.
- *	@count:		number of bytes to be read.
+ *	@count:		number of bytes to write inside the page.
  *
- *	Returns # of bytes which addr and buf should be incresed.
- *	(same number to @count).
- *	If [addr...addr+count) doesn't includes any intersect with valid
- *	vmalloc area, returns 0.
+ *	Returns # of bytes copied on success.
+ *	Returns 0 if @addr is not vmalloc'ed, or is mapped to non-RAM.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from a buffer to the given addr. If specified range of
  *	[addr...addr+count) includes some valid address, data is copied from
  *	proper area of @buf. If there are memory holes, no copy to hole.
- *	IOREMAP area is treated as memory hole and no copy is done.
  *
- *	If [addr...addr+count) doesn't includes any intersects with alive
- *	vm_struct area, returns 0.
  *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
  *	the caller should guarantee KM_USER0 is not used.
  *
- *	Note: In usual ops, vwrite() is never necessary because the caller
+ *	Note: In usual ops, vwrite_page() is never necessary because the caller
  *	should know vmalloc() area is valid and can use memcpy().
  *	This is for routines which have to access vmalloc area without
  *	any informaion, as /dev/kmem.
- *
- *	The caller should guarantee KM_USER1 is not used.
  */
 
-long vwrite(char *buf, char *addr, unsigned long count)
+int vwrite_page(char *buf, char *addr, unsigned int count)
 {
-	struct vm_struct *tmp;
-	char *vaddr;
-	unsigned long n, buflen;
-	int copied = 0;
-
-	/* Don't allow overflow */
-	if ((unsigned long) addr + count < count)
-		count = -(unsigned long) addr;
-	buflen = count;
+	struct page *p;
+	void *map;
+	int offset = (unsigned long)addr & (PAGE_SIZE - 1);
 
-	read_lock(&vmlist_lock);
-	for (tmp = vmlist; count && tmp; tmp = tmp->next) {
-		vaddr = (char *) tmp->addr;
-		if (addr >= vaddr + tmp->size - PAGE_SIZE)
-			continue;
-		while (addr < vaddr) {
-			if (count == 0)
-				goto finished;
-			buf++;
-			addr++;
-			count--;
-		}
-		n = vaddr + tmp->size - PAGE_SIZE - addr;
-		if (n > count)
-			n = count;
-		if (!(tmp->flags & VM_IOREMAP)) {
-			aligned_vwrite(buf, addr, n);
-			copied++;
-		}
-		buf += n;
-		addr += n;
-		count -= n;
-	}
-finished:
-	read_unlock(&vmlist_lock);
-	if (!copied)
+	/* Assume subpage access */
+	BUG_ON(count > PAGE_SIZE - offset);
+
+	p = vmalloc_to_page(addr);
+	if (!p)
+		return 0;
+	if (!page_is_ram(page_to_pfn(p)))
 		return 0;
-	return buflen;
+
+	map = kmap_atomic(p, KM_USER0);
+	memcpy(map + offset, buf, count);
+	kunmap_atomic(map, KM_USER0);
+
+	return count;
 }
 
 /**
--- linux-mm.orig/drivers/char/mem.c	2010-01-13 21:23:58.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2010-01-13 21:26:10.000000000 +0800
@@ -394,7 +394,7 @@ static ssize_t read_kmem(struct file *fi
 {
 	unsigned long p = *ppos;
 	ssize_t low_count, read, sz;
-	char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
+	char *kbuf;	/* k-addr because vread_page() does kmap_atomic */
 	int err = 0;
 
 	read = 0;
@@ -446,7 +446,7 @@ static ssize_t read_kmem(struct file *fi
 				err = -ENXIO;
 				break;
 			}
-			sz = vread(kbuf, (char *)p, sz);
+			sz = vread_page(kbuf, (char *)p, sz);
 			if (!sz)
 				break;
 			if (copy_to_user(buf, kbuf, sz)) {
@@ -524,7 +524,7 @@ static ssize_t write_kmem(struct file * 
 	unsigned long p = *ppos;
 	ssize_t wrote = 0;
 	ssize_t virtr = 0;
-	char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
+	char *kbuf;	/* k-addr because vwrite_page() does kmap_atomic */
 	int err = 0;
 
 	if (p < (unsigned long) high_memory) {
@@ -555,7 +555,7 @@ static ssize_t write_kmem(struct file * 
 				err = -EFAULT;
 				break;
 			}
-			vwrite(kbuf, (char *)p, sz);
+			vwrite_page(kbuf, (char *)p, sz);
 			count -= sz;
 			buf += sz;
 			virtr += sz;
--- linux-mm.orig/fs/proc/kcore.c	2010-01-13 21:23:05.000000000 +0800
+++ linux-mm/fs/proc/kcore.c	2010-01-13 21:24:00.000000000 +0800
@@ -499,7 +499,7 @@ read_kcore(struct file *file, char __use
 			elf_buf = kzalloc(tsz, GFP_KERNEL);
 			if (!elf_buf)
 				return -ENOMEM;
-			vread(elf_buf, (char *)start, tsz);
+			vread_page(elf_buf, (char *)start, tsz);
 			/* we have to zero-fill user buffer even if no read */
 			if (copy_to_user(buffer, elf_buf, tsz)) {
 				kfree(elf_buf);
--- linux-mm.orig/include/linux/vmalloc.h	2010-01-13 21:23:05.000000000 +0800
+++ linux-mm/include/linux/vmalloc.h	2010-01-13 21:24:00.000000000 +0800
@@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned 
 extern struct vm_struct *alloc_vm_area(size_t size);
 extern void free_vm_area(struct vm_struct *area);
 
-/* for /dev/kmem */
-extern long vread(char *buf, char *addr, unsigned long count);
-extern long vwrite(char *buf, char *addr, unsigned long count);
+/* for /dev/kmem and /proc/kcore */
+extern int vread_page(char *buf, char *addr, unsigned int count);
+extern int vwrite_page(char *buf, char *addr, unsigned int count);
 
 /*
  *	Internals.  Dont't use..




* [PATCH 6/8] hwpoison: prevent /dev/kmem from accessing hwpoison pages
From: Wu Fengguang @ 2010-01-13 13:53 UTC
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Kelly Bowa, Greg KH, Andi Kleen,
	Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar,
	Tejun Heo, Nick Piggin, KAMEZAWA Hiroyuki, Hugh Dickins,
	Linux Memory Management List

[-- Attachment #1: hwpoison-dev-kmem.patch --]
[-- Type: text/plain, Size: 3761 bytes --]

When /dev/kmem read()/write() encounters a hwpoison page, stop the
transfer and return the amount of work done so far, or return -EIO
if nothing has been copied.
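
The low-memory half of the check, condensed from the diff below (the
virt_addr_valid() guard matters because xlate_dev_kmem_ptr() may hand
back an address outside the linear mapping, where virt_to_page() must
not be used):

	kbuf = xlate_dev_kmem_ptr((char *)p);

	/* refuse to touch a known-corrupted page */
	if (unlikely(virt_addr_valid(kbuf) &&
		     PageHWPoison(virt_to_page(kbuf))))
		return -EIO;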

CC: Kelly Bowa <kmb@tuxedu.org>
CC: Greg KH <greg@kroah.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Nick Piggin <npiggin@suse.de>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |   26 ++++++++++++++++++++------
 mm/vmalloc.c       |    8 ++++++++
 2 files changed, 28 insertions(+), 6 deletions(-)

--- linux-mm.orig/drivers/char/mem.c	2010-01-11 10:32:39.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2010-01-11 10:32:42.000000000 +0800
@@ -426,6 +426,9 @@ static ssize_t read_kmem(struct file *fi
 			 */
 			kbuf = xlate_dev_kmem_ptr((char *)p);
 
+			if (unlikely(virt_addr_valid(kbuf) &&
+				     PageHWPoison(virt_to_page(kbuf))))
+				return -EIO;
 			if (copy_to_user(buf, kbuf, sz))
 				return -EFAULT;
 			buf += sz;
@@ -447,8 +450,10 @@ static ssize_t read_kmem(struct file *fi
 				break;
 			}
 			sz = vread_page(kbuf, (char *)p, sz);
-			if (!sz)
+			if (sz <= 0) {
+				err = sz;
 				break;
+			}
 			if (copy_to_user(buf, kbuf, sz)) {
 				err = -EFAULT;
 				break;
@@ -471,6 +476,7 @@ do_write_kmem(unsigned long p, const cha
 {
 	ssize_t written, sz;
 	unsigned long copied;
+	int err = 0;
 
 	written = 0;
 #ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
@@ -497,13 +503,19 @@ do_write_kmem(unsigned long p, const cha
 		 */
 		ptr = xlate_dev_kmem_ptr((char *)p);
 
+		if (unlikely(virt_addr_valid(ptr) &&
+			     PageHWPoison(virt_to_page(ptr)))) {
+			err = -EIO;
+			break;
+		}
+
 		copied = copy_from_user(ptr, buf, sz);
 		if (copied) {
 			written += sz - copied;
-			if (written)
-				break;
-			return -EFAULT;
+			err = -EFAULT;
+			break;
 		}
+
 		buf += sz;
 		p += sz;
 		count -= sz;
@@ -511,7 +523,7 @@ do_write_kmem(unsigned long p, const cha
 	}
 
 	*ppos += written;
-	return written;
+	return written ? written : err;
 }
 
 
@@ -555,7 +567,9 @@ static ssize_t write_kmem(struct file * 
 				err = -EFAULT;
 				break;
 			}
-			vwrite_page(kbuf, (char *)p, sz);
+			err = vwrite_page(kbuf, (char *)p, sz);
+			if (err < 0)
+				break;
 			count -= sz;
 			buf += sz;
 			virtr += sz;
--- linux-mm.orig/mm/vmalloc.c	2010-01-11 10:32:39.000000000 +0800
+++ linux-mm/mm/vmalloc.c	2010-01-11 10:33:21.000000000 +0800
@@ -1654,6 +1654,7 @@ EXPORT_SYMBOL(vmalloc_32_user);
  *
  *	Returns # of bytes copied on success.
  *	Returns 0 if @addr is not vmalloc'ed, or is mapped to non-RAM.
+ *	Returns -EIO if the mapped page is corrupted.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from that area to a given buffer. If the given memory range
@@ -1684,6 +1685,10 @@ int vread_page(char *buf, char *addr, un
 		memset(buf, 0, count);
 		return 0;
 	}
+	if (PageHWPoison(p)) {
+		memset(buf, 0, count);
+		return -EIO;
+	}
 
 	/*
 	 * To do safe access to this _mapped_ area, we need
@@ -1707,6 +1712,7 @@ int vread_page(char *buf, char *addr, un
  *
  *	Returns # of bytes copied on success.
  *	Returns 0 if @addr is not vmalloc'ed, or is mapped to non-RAM.
+ *	Returns -EIO if the mapped page is corrupted.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from a buffer to the given addr. If specified range of
@@ -1736,6 +1742,8 @@ int vwrite_page(char *buf, char *addr, u
 		return 0;
 	if (!page_is_ram(page_to_pfn(p)))
 		return 0;
+	if (PageHWPoison(p))
+		return -EIO;
 
 	map = kmap_atomic(p, KM_USER0);
 	memcpy(map + offset, buf, count);



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 7/8] hwpoison: prevent /dev/mem from accessing hwpoison pages
  2010-01-13 13:53 ` Wu Fengguang
@ 2010-01-13 13:53   ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-13 13:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Kelly Bowa, Greg KH, KAMEZAWA Hiroyuki,
	Andi Kleen, Nick Piggin, Hugh Dickins,
	Linux Memory Management List

[-- Attachment #1: hwpoison-dev-mem.patch --]
[-- Type: text/plain, Size: 3425 bytes --]

Return EIO when user space tries to read/write/mmap hwpoison pages
via the /dev/mem interface.

The approach: rename range_is_allowed() to devmem_check_pfn_range(), and
add a PageHWPoison() test to it. This function will be called for the whole
mmap() range, or page by page for read()/write(). So it will fail the
mmap() request as a whole, and return partial results for read()/write().
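
As a rough illustration of the user-visible behaviour (a hypothetical
caller; fd is an open /dev/mem, len and phys_off are assumed to be set
up elsewhere):

	#include <errno.h>
	#include <stdio.h>
	#include <sys/mman.h>
	#include <sys/types.h>

	/* Hypothetical helper: the mapping fails as a whole if any pfn
	 * in the range is hwpoisoned. */
	static void *map_phys(int fd, off_t phys_off, size_t len)
	{
		void *map = mmap(NULL, len, PROT_READ, MAP_SHARED,
				 fd, phys_off);

		if (map == MAP_FAILED && errno == EIO)
			fprintf(stderr, "range contains a hwpoisoned page\n");
		return map;
	}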

CC: Kelly Bowa <kmb@tuxedu.org>
CC: Greg KH <greg@kroah.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |   39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

--- linux-mm.orig/drivers/char/mem.c	2009-12-29 10:47:00.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2009-12-29 10:54:07.000000000 +0800
@@ -89,31 +89,28 @@ static inline int valid_mmap_phys_addr_r
 }
 #endif
 
-#ifdef CONFIG_STRICT_DEVMEM
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
+static int devmem_check_pfn_range(unsigned long pfn, unsigned long bytes)
 {
 	u64 from = ((u64)pfn) << PAGE_SHIFT;
-	u64 to = from + size;
+	u64 to = from + bytes;
 	u64 cursor = from;
 
 	while (cursor < to) {
+#ifdef CONFIG_STRICT_DEVMEM
 		if (!devmem_is_allowed(pfn)) {
 			printk(KERN_INFO
 		"Program %s tried to access /dev/mem between %Lx->%Lx.\n",
 				current->comm, from, to);
-			return 0;
+			return -EPERM;
 		}
+#endif
+		if (pfn_valid(pfn) && PageHWPoison(pfn_to_page(pfn)))
+			return -EIO;
 		cursor += PAGE_SIZE;
 		pfn++;
 	}
-	return 1;
-}
-#else
-static inline int range_is_allowed(unsigned long pfn, unsigned long size)
-{
-	return 1;
+	return 0;
 }
-#endif
 
 void __attribute__((weak)) unxlate_dev_mem_ptr(unsigned long phys, void *addr)
 {
@@ -150,11 +147,13 @@ static ssize_t read_mem(struct file * fi
 
 	while (count > 0) {
 		unsigned long remaining;
+		int err;
 
 		sz = size_inside_page(p, count);
 
-		if (!range_is_allowed(p >> PAGE_SHIFT, count))
-			return -EPERM;
+		err = devmem_check_pfn_range(p >> PAGE_SHIFT, count);
+		if (err)
+			return err;
 
 		/*
 		 * On ia64 if a page has been mapped somewhere as
@@ -184,9 +183,10 @@ static ssize_t write_mem(struct file * f
 			 size_t count, loff_t *ppos)
 {
 	unsigned long p = *ppos;
-	ssize_t written, sz;
 	unsigned long copied;
+	ssize_t written, sz;
 	void *ptr;
+	int err;
 
 	if (!valid_phys_addr_range(p, count))
 		return -EFAULT;
@@ -208,8 +208,9 @@ static ssize_t write_mem(struct file * f
 	while (count > 0) {
 		sz = size_inside_page(p, count);
 
-		if (!range_is_allowed(p >> PAGE_SHIFT, sz))
-			return -EPERM;
+		err = devmem_check_pfn_range(p >> PAGE_SHIFT, sz);
+		if (err)
+			return err;
 
 		/*
 		 * On ia64 if a page has been mapped somewhere as
@@ -297,6 +298,7 @@ static const struct vm_operations_struct
 static int mmap_mem(struct file * file, struct vm_area_struct * vma)
 {
 	size_t size = vma->vm_end - vma->vm_start;
+	int err;
 
 	if (!valid_mmap_phys_addr_range(vma->vm_pgoff, size))
 		return -EINVAL;
@@ -304,8 +306,9 @@ static int mmap_mem(struct file * file, 
 	if (!private_mapping_ok(vma))
 		return -ENOSYS;
 
-	if (!range_is_allowed(vma->vm_pgoff, size))
-		return -EPERM;
+	err = devmem_check_pfn_range(vma->vm_pgoff, size);
+	if (err)
+		return err;
 
 	if (!phys_mem_access_prot_allowed(file, vma->vm_pgoff, size,
 						&vma->vm_page_prot))



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 8/8] hwpoison: prevent /dev/kcore from accessing hwpoison pages
  2010-01-13 13:53 ` Wu Fengguang
@ 2010-01-13 13:53   ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-13 13:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Wu Fengguang, LKML, Ingo Molnar, Andi Kleen, Pekka Enberg,
	KAMEZAWA Hiroyuki, Nick Piggin, Hugh Dickins,
	Linux Memory Management List

[-- Attachment #1: hwpoison-kcore.patch --]
[-- Type: text/plain, Size: 1452 bytes --]

Silently fill the buffer with zeros when encountering hwpoison pages
(accessing the hwpoison page content is deadly).

This patch does not cover X86_32 - which has a dumb kern_addr_valid().
It is unlikely that anyone running a 32bit kernel will care about the
hwpoison feature - its usable memory is limited.
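
For reference, this is roughly how the /proc/kcore read path consumes
kern_addr_valid() (paraphrased from fs/proc/kcore.c, not part of this
diff): a pfn that fails the check is simply presented to the reader as
zeros.

	/* Sketch of the read_kcore() side: */
	if (kern_addr_valid(start)) {
		if (copy_to_user(buffer, (char *)start, tsz))
			return -EFAULT;
	} else {
		/* invalid or hwpoisoned: zero-fill the user buffer */
		if (clear_user(buffer, tsz))
			return -EFAULT;
	}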

CC: Ingo Molnar <mingo@elte.hu>
CC: Andi Kleen <andi@firstfloor.org> 
CC: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 arch/x86/mm/init_64.c |   16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

--- linux-mm.orig/arch/x86/mm/init_64.c	2010-01-13 21:23:04.000000000 +0800
+++ linux-mm/arch/x86/mm/init_64.c	2010-01-13 21:25:32.000000000 +0800
@@ -825,6 +825,7 @@ int __init reserve_bootmem_generic(unsig
 int kern_addr_valid(unsigned long addr)
 {
 	unsigned long above = ((long)addr) >> __VIRTUAL_MASK_SHIFT;
+	unsigned long pfn;
 	pgd_t *pgd;
 	pud_t *pud;
 	pmd_t *pmd;
@@ -845,14 +846,23 @@ int kern_addr_valid(unsigned long addr)
 	if (pmd_none(*pmd))
 		return 0;
 
-	if (pmd_large(*pmd))
-		return pfn_valid(pmd_pfn(*pmd));
+	if (pmd_large(*pmd)) {
+		pfn = pmd_pfn(*pmd);
+		pfn += pte_index(addr);
+		goto check_pfn;
+	}
 
 	pte = pte_offset_kernel(pmd, addr);
 	if (pte_none(*pte))
 		return 0;
 
-	return pfn_valid(pte_pfn(*pte));
+	pfn = pte_pfn(*pte);
+check_pfn:
+	if (!pfn_valid(pfn))
+		return 0;
+	if (PageHWPoison(pfn_to_page(pfn)))
+		return 0;
+	return 1;
 }
 
 /*



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 8/8] hwpoison: prevent /dev/kcore from accessing hwpoison pages
  2010-01-13 13:53   ` Wu Fengguang
@ 2010-01-13 14:23     ` Américo Wang
  -1 siblings, 0 replies; 41+ messages in thread
From: Américo Wang @ 2010-01-13 14:23 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, LKML, Ingo Molnar, Andi Kleen, Pekka Enberg,
	KAMEZAWA Hiroyuki, Nick Piggin, Hugh Dickins,
	Linux Memory Management List


Your $subject, I think you mean /proc/kcore...

On Wed, Jan 13, 2010 at 09:53:13PM +0800, Wu Fengguang wrote:
>Silently fill the buffer with zeros when encountering hwpoison pages
>(accessing the hwpoison page content is deadly).
>
>This patch does not cover X86_32 - which has a dumb kern_addr_valid().
>It is unlikely that anyone running a 32bit kernel will care about the
>hwpoison feature - its usable memory is limited.
>
>CC: Ingo Molnar <mingo@elte.hu>
>CC: Andi Kleen <andi@firstfloor.org> 
>CC: Pekka Enberg <penberg@cs.helsinki.fi>
>Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>

This patch looks fine to me.
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>

>---
> arch/x86/mm/init_64.c |   16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
>--- linux-mm.orig/arch/x86/mm/init_64.c	2010-01-13 21:23:04.000000000 +0800
>+++ linux-mm/arch/x86/mm/init_64.c	2010-01-13 21:25:32.000000000 +0800
>@@ -825,6 +825,7 @@ int __init reserve_bootmem_generic(unsig
> int kern_addr_valid(unsigned long addr)
> {
> 	unsigned long above = ((long)addr) >> __VIRTUAL_MASK_SHIFT;
>+	unsigned long pfn;
> 	pgd_t *pgd;
> 	pud_t *pud;
> 	pmd_t *pmd;
>@@ -845,14 +846,23 @@ int kern_addr_valid(unsigned long addr)
> 	if (pmd_none(*pmd))
> 		return 0;
> 
>-	if (pmd_large(*pmd))
>-		return pfn_valid(pmd_pfn(*pmd));
>+	if (pmd_large(*pmd)) {
>+		pfn = pmd_pfn(*pmd);
>+		pfn += pte_index(addr);
>+		goto check_pfn;
>+	}
> 
> 	pte = pte_offset_kernel(pmd, addr);
> 	if (pte_none(*pte))
> 		return 0;
> 
>-	return pfn_valid(pte_pfn(*pte));
>+	pfn = pte_pfn(*pte);
>+check_pfn:
>+	if (!pfn_valid(pfn))
>+		return 0;
>+	if (PageHWPoison(pfn_to_page(pfn)))
>+		return 0;
>+	return 1;
> }
> 
> /*
>
>

-- 
Live like a child, think like the god.
 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/8] resources: introduce generic page_is_ram()
  2010-01-13 13:53   ` Wu Fengguang
@ 2010-01-13 14:29     ` Américo Wang
  -1 siblings, 0 replies; 41+ messages in thread
From: Américo Wang @ 2010-01-13 14:29 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, LKML, Chen Liqin, Lennox Wu, Ralf Baechle,
	linux-mips, KAMEZAWA Hiroyuki, Andi Kleen, Nick Piggin,
	Hugh Dickins, Linux Memory Management List

On Wed, Jan 13, 2010 at 09:53:09PM +0800, Wu Fengguang wrote:
>It's based on walk_system_ram_range(), for archs that don't have
>their own page_is_ram().
>
>The static versions in MIPS and SCORE are also made global.
>
>CC: Chen Liqin <liqin.chen@sunplusct.com>
>CC: Lennox Wu <lennox.wu@gmail.com>
>CC: Ralf Baechle <ralf@linux-mips.org>
>CC: linux-mips@linux-mips.org
>CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> 
>Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
>---
> arch/mips/mm/init.c    |    2 +-
> arch/score/mm/init.c   |    2 +-
> include/linux/ioport.h |    2 ++
> kernel/resource.c      |   10 ++++++++++
> 4 files changed, 14 insertions(+), 2 deletions(-)
>
>--- linux-mm.orig/kernel/resource.c	2010-01-10 10:11:53.000000000 +0800
>+++ linux-mm/kernel/resource.c	2010-01-10 10:15:33.000000000 +0800
>@@ -297,6 +297,16 @@ int walk_system_ram_range(unsigned long 
> 
> #endif
> 
>+static int __is_ram(unsigned long pfn, unsigned long nr_pages, void *arg)
>+{
>+	return 24;
>+}
>+
>+int __attribute__((weak)) page_is_ram(unsigned long pfn)
>+{
>+	return 24 == walk_system_ram_range(pfn, 1, NULL, __is_ram);
>+}


Why do you choose 24 instead of using a macro expressing its meaning?


>+
> /*
>  * Find empty slot in the resource tree given range and alignment.
>  */
>--- linux-mm.orig/include/linux/ioport.h	2010-01-10 10:11:53.000000000 +0800
>+++ linux-mm/include/linux/ioport.h	2010-01-10 10:11:54.000000000 +0800
>@@ -188,5 +188,7 @@ extern int
> walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
> 		void *arg, int (*func)(unsigned long, unsigned long, void *));
> 
>+extern int page_is_ram(unsigned long pfn);
>+
> #endif /* __ASSEMBLY__ */
> #endif	/* _LINUX_IOPORT_H */
>--- linux-mm.orig/arch/score/mm/init.c	2010-01-10 10:35:38.000000000 +0800
>+++ linux-mm/arch/score/mm/init.c	2010-01-10 10:38:04.000000000 +0800
>@@ -59,7 +59,7 @@ static unsigned long setup_zero_page(voi
> }
> 
> #ifndef CONFIG_NEED_MULTIPLE_NODES
>-static int __init page_is_ram(unsigned long pagenr)
>+int page_is_ram(unsigned long pagenr)
> {
> 	if (pagenr >= min_low_pfn && pagenr < max_low_pfn)
> 		return 1;
>--- linux-mm.orig/arch/mips/mm/init.c	2010-01-10 10:37:22.000000000 +0800
>+++ linux-mm/arch/mips/mm/init.c	2010-01-10 10:37:26.000000000 +0800
>@@ -298,7 +298,7 @@ void __init fixrange_init(unsigned long 
> }
> 
> #ifndef CONFIG_NEED_MULTIPLE_NODES
>-static int __init page_is_ram(unsigned long pagenr)
>+int page_is_ram(unsigned long pagenr)
> {
> 	int i;
> 
>
>

-- 
Live like a child, think like the god.
 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 4/8] resources: introduce generic page_is_ram()
  2010-01-13 14:29     ` Américo Wang
@ 2010-01-14  3:29       ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-14  3:29 UTC (permalink / raw)
  To: Américo Wang
  Cc: Andrew Morton, LKML, Chen Liqin, Lennox Wu, Ralf Baechle,
	linux-mips, KAMEZAWA Hiroyuki, Andi Kleen, Nick Piggin,
	Hugh Dickins, Linux Memory Management List

On Wed, Jan 13, 2010 at 10:29:23PM +0800, Américo Wang wrote:
> On Wed, Jan 13, 2010 at 09:53:09PM +0800, Wu Fengguang wrote:
> >It's based on walk_system_ram_range(), for archs that don't have
> >their own page_is_ram().
> >
> >The static versions in MIPS and SCORE are also made global.
> >
> >CC: Chen Liqin <liqin.chen@sunplusct.com>
> >CC: Lennox Wu <lennox.wu@gmail.com>
> >CC: Ralf Baechle <ralf@linux-mips.org>
> >CC: linux-mips@linux-mips.org
> >CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> 
> >Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> >---
> > arch/mips/mm/init.c    |    2 +-
> > arch/score/mm/init.c   |    2 +-
> > include/linux/ioport.h |    2 ++
> > kernel/resource.c      |   10 ++++++++++
> > 4 files changed, 14 insertions(+), 2 deletions(-)
> >
> >--- linux-mm.orig/kernel/resource.c	2010-01-10 10:11:53.000000000 +0800
> >+++ linux-mm/kernel/resource.c	2010-01-10 10:15:33.000000000 +0800
> >@@ -297,6 +297,16 @@ int walk_system_ram_range(unsigned long 
> > 
> > #endif
> > 
> >+static int __is_ram(unsigned long pfn, unsigned long nr_pages, void *arg)
> >+{
> >+	return 24;
> >+}
> >+
> >+int __attribute__((weak)) page_is_ram(unsigned long pfn)
> >+{
> >+	return 24 == walk_system_ram_range(pfn, 1, NULL, __is_ram);
> >+}
> 
> 
> Why do you choose 24 instead of using a macro expressing its meaning?

Hmm, I thought they were close enough to be obvious.
Anyway, this should look better:

resources: introduce generic page_is_ram()

It's based on walk_system_ram_range(), for archs that don't have
their own page_is_ram().

The static versions in MIPS and SCORE are also made global.

CC: Chen Liqin <liqin.chen@sunplusct.com>
CC: Lennox Wu <lennox.wu@gmail.com>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Américo Wang <xiyou.wangcong@gmail.com>
CC: linux-mips@linux-mips.org
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> 
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 arch/mips/mm/init.c    |    2 +-
 arch/score/mm/init.c   |    2 +-
 include/linux/ioport.h |    2 ++
 kernel/resource.c      |   11 +++++++++++
 4 files changed, 15 insertions(+), 2 deletions(-)

--- linux-mm.orig/kernel/resource.c	2010-01-13 21:27:28.000000000 +0800
+++ linux-mm/kernel/resource.c	2010-01-14 11:28:20.000000000 +0800
@@ -297,6 +297,17 @@ int walk_system_ram_range(unsigned long 
 
 #endif
 
+#define PAGE_IS_RAM	24
+static int __is_ram(unsigned long pfn, unsigned long nr_pages, void *arg)
+{
+	return PAGE_IS_RAM;
+}
+int __attribute__((weak)) page_is_ram(unsigned long pfn)
+{
+	return PAGE_IS_RAM == walk_system_ram_range(pfn, 1, NULL, __is_ram);
+}
+#undef PAGE_IS_RAM
+
 /*
  * Find empty slot in the resource tree given range and alignment.
  */
--- linux-mm.orig/include/linux/ioport.h	2010-01-13 21:27:28.000000000 +0800
+++ linux-mm/include/linux/ioport.h	2010-01-13 21:44:50.000000000 +0800
@@ -188,5 +188,7 @@ extern int
 walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 		void *arg, int (*func)(unsigned long, unsigned long, void *));
 
+extern int page_is_ram(unsigned long pfn);
+
 #endif /* __ASSEMBLY__ */
 #endif	/* _LINUX_IOPORT_H */
--- linux-mm.orig/arch/score/mm/init.c	2010-01-13 21:27:28.000000000 +0800
+++ linux-mm/arch/score/mm/init.c	2010-01-13 21:44:50.000000000 +0800
@@ -59,7 +59,7 @@ static unsigned long setup_zero_page(voi
 }
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
-static int __init page_is_ram(unsigned long pagenr)
+int page_is_ram(unsigned long pagenr)
 {
 	if (pagenr >= min_low_pfn && pagenr < max_low_pfn)
 		return 1;
--- linux-mm.orig/arch/mips/mm/init.c	2010-01-13 21:27:28.000000000 +0800
+++ linux-mm/arch/mips/mm/init.c	2010-01-13 21:44:50.000000000 +0800
@@ -298,7 +298,7 @@ void __init fixrange_init(unsigned long 
 }
 
 #ifndef CONFIG_NEED_MULTIPLE_NODES
-static int __init page_is_ram(unsigned long pagenr)
+int page_is_ram(unsigned long pagenr)
 {
 	int i;
 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-13 13:53   ` Wu Fengguang
@ 2010-01-14 12:45     ` Nick Piggin
  -1 siblings, 0 replies; 41+ messages in thread
From: Nick Piggin @ 2010-01-14 12:45 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, LKML, Tejun Heo, Ingo Molnar, Andi Kleen,
	Hugh Dickins, Christoph Lameter, KAMEZAWA Hiroyuki,
	Linux Memory Management List

On Wed, Jan 13, 2010 at 09:53:10PM +0800, Wu Fengguang wrote:
> vread()/vwrite() is only called from kcore/kmem to access one page at a time.
> So the logic can be vastly simplified.
> 
> The changes are:
> - remove the vmlist walk and rely solely on vmalloc_to_page()
> - replace the VM_IOREMAP check with (page && page_is_ram(pfn))
> - rename to vread_page()/vwrite_page()
> 
> The page_is_ram() check is necessary because kmap_atomic() is not
> designed to work with non-RAM pages.

I don't know if you can really do this. Previously vmlist_lock would be
taken, which will prevent these vm areas from being freed.

 
> Note that even for a RAM page, we don't own the page, and cannot assume
> it's a _PAGE_CACHE_WB page.

So why is this not a problem for your patch? I don't see how you handle
it.

What's the problem with the current code, exactly? I would prefer that
you continue using the same vmlist locking and checking for validating
addresses.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-14 12:45     ` Nick Piggin
@ 2010-01-18 13:35       ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-18 13:35 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, LKML, Tejun Heo, Ingo Molnar, Andi Kleen,
	Hugh Dickins, Christoph Lameter, KAMEZAWA Hiroyuki,
	Linux Memory Management List

On Thu, Jan 14, 2010 at 05:45:26AM -0700, Nick Piggin wrote:
> On Wed, Jan 13, 2010 at 09:53:10PM +0800, Wu Fengguang wrote:
> > vread()/vwrite() is only called from kcore/kmem to access one page at a time.
> > So the logic can be vastly simplified.
> > 
> > The changes are:
> > - remove the vmlist walk and rely solely on vmalloc_to_page()
> > - replace the VM_IOREMAP check with (page && page_is_ram(pfn))
> > - rename to vread_page()/vwrite_page()
> > 
> > The page_is_ram() check is necessary because kmap_atomic() is not
> > designed to work with non-RAM pages.
> 
> I don't know if you can really do this. Previously vmlist_lock would be
> taken, which will prevent these vm areas from being freed.
>  
> > Note that even for a RAM page, we don't own the page, and cannot assume
> > it's a _PAGE_CACHE_WB page.
> 
> So why is this not a problem for your patch? I don't see how you handle
> it.

Sorry, I didn't handle it. I was just hoping to catch the attention of
someone (i.e. you :).

It's not a problem for x86_64 at all. For other architectures, I wonder
if any driver will vmalloc HIGHMEM pages with a !_PAGE_CACHE_WB
attribute.

So I noted the possible problem and left it alone.

> What's the problem with the current code, exactly? I would prefer that

- unnecessary complexity to handle multi-page case, since it's always
  called to access one single page;

- the kmap_atomic() cache consistency problem, which I expressed some
  concern (without further action)

> you continue using the same vmlist locking and checking for validating
> addresses.

It's a reasonable suggestion. Kame, would you agree on killing the
kmap_atomic() and reverting to the vmlist walk?

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-18 13:35       ` Wu Fengguang
@ 2010-01-18 14:23         ` Nick Piggin
  -1 siblings, 0 replies; 41+ messages in thread
From: Nick Piggin @ 2010-01-18 14:23 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, LKML, Tejun Heo, Ingo Molnar, Andi Kleen,
	Hugh Dickins, Christoph Lameter, KAMEZAWA Hiroyuki,
	Linux Memory Management List

On Mon, Jan 18, 2010 at 09:35:12PM +0800, Wu Fengguang wrote:
> On Thu, Jan 14, 2010 at 05:45:26AM -0700, Nick Piggin wrote:
> > On Wed, Jan 13, 2010 at 09:53:10PM +0800, Wu Fengguang wrote:
> > > vread()/vwrite() is only called from kcore/kmem to access one page at a time.
> > > So the logic can be vastly simplified.
> > > 
> > > The changes are:
> > > - remove the vmlist walk and rely solely on vmalloc_to_page()
> > > - replace the VM_IOREMAP check with (page && page_is_ram(pfn))
> > > - rename to vread_page()/vwrite_page()
> > > 
> > > The page_is_ram() check is necessary because kmap_atomic() is not
> > > designed to work with non-RAM pages.
> > 
> > I don't know if you can really do this. Previously vmlist_lock would be
> > taken, which will prevent these vm areas from being freed.
> >  
> > > Note that even for a RAM page, we don't own the page, and cannot assume
> > > it's a _PAGE_CACHE_WB page.
> > 
> > So why is this not a problem for your patch? I don't see how you handle
> > it.
> 
> Sorry, I didn't handle it. I was just hoping to catch the attention of
> someone (i.e. you :).
> 
> It's not a problem for x86_64 at all. For other architectures, I wonder
> if any driver will vmalloc HIGHMEM pages with a !_PAGE_CACHE_WB
> attribute.
> 
> So I noted the possible problem and left it alone.

Well it doesn't need to be vmalloc. Any kind of vmap like ioremap. And
these can be accompanied by changing the caching attribute. Like agp
code, for an example. But I don't know if that ever becomes a problem
in practice.


> > What's the problem with the current code, exactly? I would prefer that
> 
> - unnecessary complexity to handle multi-page case, since it's always
>   called to access one single page;

Fair point there. It just wasn't clear what exactly your rationale was,
because this was in a set of other patches.
 
> - the kmap_atomic() cache consistency problem, which I expressed some
>   concern (without further action)

Which kmap_atomic problem? Can you explain again? Virtual cache aliasing
problem you mean? Or caching attribute conflicts?

The whole thing looks stupid though, apparently kmap is used to avoid "the
lock". But the lock is already held. We should just use the vmap
address.
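
A rough sketch of that alternative (a hypothetical helper, not actual
kernel code, assuming vmlist_lock is held so the vm area cannot be
unmapped underneath us):

	#include <linux/string.h>

	/* Copy through the existing vmap alias instead of kmapping the
	 * underlying struct page; the alias stays valid while
	 * vmlist_lock pins the vm_struct. */
	static long vread_via_alias(char *buf, char *addr, unsigned long count)
	{
		memcpy(buf, addr, count);
		return count;
	}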


> > you continue using the same vmlist locking and checking for validating
> > addresses.
> 
> It's a reasonable suggestion. Kame, would you agree on killing the
> kmap_atomic() and reverting to the vmlist walk?

Yes, vmlist locking is always required to have a pin on the pages, and
IMO it should be quite easy to check for IOREMAP, so we should leave
that check there to avoid the possibility of regressions.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-18 14:23         ` Nick Piggin
@ 2010-01-19  1:33           ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-19  1:33 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, LKML, Tejun Heo, Ingo Molnar, Andi Kleen,
	Hugh Dickins, Christoph Lameter, KAMEZAWA Hiroyuki,
	Linux Memory Management List

On Mon, Jan 18, 2010 at 07:23:59AM -0700, Nick Piggin wrote:
> On Mon, Jan 18, 2010 at 09:35:12PM +0800, Wu Fengguang wrote:
> > On Thu, Jan 14, 2010 at 05:45:26AM -0700, Nick Piggin wrote:
> > > On Wed, Jan 13, 2010 at 09:53:10PM +0800, Wu Fengguang wrote:
> > > > vread()/vwrite() is only called from kcore/kmem to access one page at a time.
> > > > So the logic can be vastly simplified.
> > > > 
> > > > The changes are:
> > > > - remove the vmlist walk and rely solely on vmalloc_to_page()
> > > > - replace the VM_IOREMAP check with (page && page_is_ram(pfn))
> > > > - rename to vread_page()/vwrite_page()
> > > > 
> > > > The page_is_ram() check is necessary because kmap_atomic() is not
> > > > designed to work with non-RAM pages.
> > > 
> > > I don't know if you can really do this. Previously vmlist_lock would be
> > > taken, which will prevent these vm areas from being freed.
> > >  
> > > > Note that even for a RAM page, we don't own the page, and cannot assume
> > > > it's a _PAGE_CACHE_WB page.
> > > 
> > > So why is this not a problem for your patch? I don't see how you handle
> > > it.
> > 
> > Sorry, I didn't handle it. I was just hoping to catch the attention of
> > someone (i.e. you :).
> > 
> > It's not a problem for x86_64 at all. For other architectures, I wonder
> > if any driver will vmalloc HIGHMEM pages with a !_PAGE_CACHE_WB
> > attribute.
> > 
> > So I noted the possible problem and left it alone.
> 
> Well, it doesn't need to be vmalloc. Any kind of vmap, like ioremap. And
> these can be accompanied by a change of the caching attribute, like in the
> agp code, for example. But I don't know if that ever becomes a problem
> in practice.

Yes, vmap in general can change the caching attribute. However, I only care
about vmaps that map RAM pages, since my patch treats non-RAM pages as
holes and won't access them.
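
For reference, a minimal sketch of the proposed single-page helper
(vread_page() is the name from the patch description, not an existing
kernel API; bounds checks are omitted and addr/count are assumed to stay
within one page):

	static long vread_page(char *buf, char *addr, unsigned long count)
	{
		struct page *page = vmalloc_to_page(addr);
		void *vaddr;

		/* unmapped, or not plain RAM (e.g. MMIO behind a vmap): hole */
		if (!page || !page_is_ram(page_to_pfn(page)))
			return 0;	/* caller zero-fills */

		vaddr = kmap_atomic(page, KM_USER0);
		memcpy(buf, vaddr + offset_in_page(addr), count);
		kunmap_atomic(vaddr, KM_USER0);
		return count;
	}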

> > > What's the problem with the current code, exactly? I would prefer that
> > 
> > - unnecessary complexity to handle the multi-page case, since it's always
> >   called to access one single page;
> 
> Fair point there. It just wasn't clear what exactly your rationale was,
> because this was in a set of other patches.
>  
> > - the kmap_atomic() cache consistency problem, about which I expressed
> >   some concern (without further action)
> 
> Which kmap_atomic problem? Can you explain again? Virtual cache aliasing
> problem you mean? Or caching attribute conflicts?

kmap_atomic() assumes you own the page and always uses _PAGE_CACHE_WB.
So there may be conflicts if the page is !_PAGE_CACHE_WB.
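
For reference, a simplified sketch of why that is so on 32-bit highmem,
loosely following arch/x86/mm/highmem_32.c of this era (debug checks
trimmed):

	void *kmap_atomic(struct page *page, enum km_type type)
	{
		enum fixed_addresses idx;
		unsigned long vaddr;

		pagefault_disable();
		if (!PageHighMem(page))
			return page_address(page);	/* lowmem: direct map */

		idx = type + KM_TYPE_NR * smp_processor_id();
		vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
		/* kmap_prot is PAGE_KERNEL, i.e. always write-back cached:
		 * a page mapped !_PAGE_CACHE_WB elsewhere gets a conflicting
		 * alias here. */
		set_pte(kmap_pte - idx, mk_pte(page, kmap_prot));
		arch_flush_lazy_mmu_mode();
		return (void *)vaddr;
	}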

> The whole thing looks stupid though, apparently kmap is used to avoid "the
> lock". But the lock is already held. We should just use the vmap
> address.

Yes. I wonder why Kame introduced kmap_atomic() in d0107eb07 -- given
that he at the same time fixed the order of removing vm_struct and
vmap in dd32c279983b.

> > > you continue using the same vmlist locking and checking for validating
> > > addresses.
> > 
> > It's a reasonable suggestion. Kame, would you agree on killing the
> > kmap_atomic() and reverting to the vmlist walk?
> 
> Yes, vmlist locking is always required to have a pin on the pages, and
> IMO it should be quite easy to check for IOREMAP, so we should leave
> that check there to avoid the possibility of regressions.

I have no problem with that, if Kame can dismiss my question :)

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-19  1:33           ` Wu Fengguang
@ 2010-01-19  2:23             ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-01-19  2:23 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Nick Piggin, Andrew Morton, LKML, Tejun Heo, Ingo Molnar,
	Andi Kleen, Hugh Dickins, Christoph Lameter,
	Linux Memory Management List

On Tue, 19 Jan 2010 09:33:03 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> > The whole thing looks stupid though, apparently kmap is used to avoid "the
> > lock". But the lock is already held. We should just use the vmap
> > address.
> 
> Yes. I wonder why Kame introduced kmap_atomic() in d0107eb07 -- given
> that he at the same time fixed the order of removing vm_struct and
> vmap in dd32c279983b.
> 
Hmm... I must check my thinking again before answering.

vmalloc/vmap is constructed in 2 layers:
	- vmalloc layer ... guarded by vmlist_lock.
	- vmap layer    ... guarded by purge_lock, etc.

Now, let's see how vmalloc() works. It does its job in 2 steps.
vmalloc():
	- allocate a vmalloc area and add it to the list under vmlist_lock.
	- map pages.
vfree():
	- remove the vmalloc area from the list under vmlist_lock.
	- unmap pages under purge_lock.

Now, vread()/vwrite() just take vmlist_lock and don't take purge_lock.
They walk the page table, find the pte entry and its page, kmap it and access it.

Oh, yes, it seems it's safe without kmap. But my concern is the percpu allocator.

It uses get_vm_area(), controls the mapped pages by itself, and maps/unmaps
pages with its own logic. vmalloc.c is just used to allocate/free the
virtual address range.

Now, vread()/vwrite() just hold vmlist_lock and walk the page table with
no guarantee that the found page is stably mapped. So, I used kmap.
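
For reference, a minimal sketch of the usage pattern this concern is about,
loosely modeled on the percpu allocator; nr and pages[] stand in for
caller-owned pages:

	/* vmalloc.c only hands out the address range ... */
	struct vm_struct *area = get_vm_area(nr * PAGE_SIZE, VM_ALLOC);

	/* ... the caller populates it itself ... */
	map_kernel_range_noflush((unsigned long)area->addr,
				 nr * PAGE_SIZE, PAGE_KERNEL, pages);

	/* ... and may depopulate it again without touching vmlist_lock,
	 * so a concurrent vread() could find a pte that is going away. */
	unmap_kernel_range_noflush((unsigned long)area->addr, nr * PAGE_SIZE);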

If I missed something, I'm very sorry for adding such a kmap.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-19  2:23             ` KAMEZAWA Hiroyuki
@ 2010-01-21  5:05               ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-21  5:05 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Nick Piggin, Andrew Morton, LKML, Tejun Heo, Ingo Molnar,
	Andi Kleen, Hugh Dickins, Christoph Lameter,
	Linux Memory Management List

On Mon, Jan 18, 2010 at 07:23:43PM -0700, KAMEZAWA Hiroyuki wrote:
> On Tue, 19 Jan 2010 09:33:03 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > The whole thing looks stupid though, apparently kmap is used to avoid "the
> > > lock". But the lock is already held. We should just use the vmap
> > > address.
> > 
> > Yes. I wonder why Kame introduced kmap_atomic() in d0107eb07 -- given
> > that he at the same time fixed the order of removing vm_struct and
> > vmap in dd32c279983b.
> > 
> Hmm... I must check my thinking again before answering.
> 
> vmalloc/vmap is constructed in 2 layers:
> 	- vmalloc layer ... guarded by vmlist_lock.
> 	- vmap layer    ... guarded by purge_lock, etc.
> 
> Now, let's see how vmalloc() works. It does its job in 2 steps.
> vmalloc():
> 	- allocate a vmalloc area and add it to the list under vmlist_lock.
> 	- map pages.
> vfree():
> 	- remove the vmalloc area from the list under vmlist_lock.
> 	- unmap pages under purge_lock.
> 
> Now, vread()/vwrite() just take vmlist_lock and don't take purge_lock.
> They walk the page table, find the pte entry and its page, kmap it and access it.
> 
> Oh, yes, it seems it's safe without kmap. But my concern is the percpu allocator.
> 
> It uses get_vm_area(), controls the mapped pages by itself, and maps/unmaps
> pages with its own logic. vmalloc.c is just used to allocate/free the
> virtual address range.
> 
> Now, vread()/vwrite() just hold vmlist_lock and walk the page table with
> no guarantee that the found page is stably mapped. So, I used kmap.
> 
> If I missed something, I'm very sorry for adding such a kmap.

Ah, thanks for the explanation!

I did some auditing and found that:

- set_memory_uc(), set_memory_array_uc(), set_pages_uc() and
  set_pages_array_uc() are called by EFI code and various video drivers;
  none of them touch HIGHMEM RAM

- Kame: ioremap() won't allow remap of physical RAM

So kmap_atomic() is safe. Shall we just settle on this patch?
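
For reference, the check being relied on here, roughly as it appears in
__ioremap_caller() in arch/x86/mm/ioremap.c of this era (simplified; loop
bounds elided):

	/* refuse to ioremap normal RAM, so an ioremap'ed vmap can never
	 * alias a page that kmap_atomic() would map write-back */
	for (pfn = phys_addr >> PAGE_SHIFT; pfn <= last_pfn; pfn++) {
		int is_ram = page_is_ram(pfn);

		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
			return NULL;
	}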

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-21  5:05               ` Wu Fengguang
@ 2010-01-21  5:21                 ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 41+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-01-21  5:21 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Nick Piggin, Andrew Morton, LKML, Tejun Heo, Ingo Molnar,
	Andi Kleen, Hugh Dickins, Christoph Lameter,
	Linux Memory Management List

On Thu, 21 Jan 2010 13:05:21 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> On Mon, Jan 18, 2010 at 07:23:43PM -0700, KAMEZAWA Hiroyuki wrote:
> > On Tue, 19 Jan 2010 09:33:03 +0800
> > Wu Fengguang <fengguang.wu@intel.com> wrote:
> > > > The whole thing looks stupid though, apparently kmap is used to avoid "the
> > > > lock". But the lock is already held. We should just use the vmap
> > > > address.
> > > 
> > > Yes. I wonder why Kame introduced kmap_atomic() in d0107eb07 -- given
> > > that he at the same time fixed the order of removing vm_struct and
> > > vmap in dd32c279983b.
> > > 
> > Hmm... I must check my thinking again before answering.
> > 
> > vmalloc/vmap is constructed in 2 layers:
> > 	- vmalloc layer ... guarded by vmlist_lock.
> > 	- vmap layer    ... guarded by purge_lock, etc.
> > 
> > Now, let's see how vmalloc() works. It does its job in 2 steps.
> > vmalloc():
> > 	- allocate a vmalloc area and add it to the list under vmlist_lock.
> > 	- map pages.
> > vfree():
> > 	- remove the vmalloc area from the list under vmlist_lock.
> > 	- unmap pages under purge_lock.
> > 
> > Now, vread()/vwrite() just take vmlist_lock and don't take purge_lock.
> > They walk the page table, find the pte entry and its page, kmap it and access it.
> > 
> > Oh, yes, it seems it's safe without kmap. But my concern is the percpu allocator.
> > 
> > It uses get_vm_area(), controls the mapped pages by itself, and maps/unmaps
> > pages with its own logic. vmalloc.c is just used to allocate/free the
> > virtual address range.
> > 
> > Now, vread()/vwrite() just hold vmlist_lock and walk the page table with
> > no guarantee that the found page is stably mapped. So, I used kmap.
> > 
> > If I missed something, I'm very sorry for adding such a kmap.
> 
> Ah, thanks for the explanation!
> 
> I did some auditing and found that:
> 
> - set_memory_uc(), set_memory_array_uc(), set_pages_uc() and
>   set_pages_array_uc() are called by EFI code and various video drivers;
>   none of them touch HIGHMEM RAM
> 
> - Kame: ioremap() won't allow remap of physical RAM
> 
> So kmap_atomic() is safe. Shall we just settle on this patch?
> 
I recommend you keep the check on VM_IOREMAP. That was checked since long
before I started looking at Linux. Some _unknown_ driver can call
get_vm_area() and map arbitrary pages there.

I'm sorry I couldn't track the discussion correctly.

Thanks,
-Kame





^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 5/8] vmalloc: simplify vread()/vwrite()
  2010-01-21  5:21                 ` KAMEZAWA Hiroyuki
@ 2010-01-21  5:49                   ` Wu Fengguang
  -1 siblings, 0 replies; 41+ messages in thread
From: Wu Fengguang @ 2010-01-21  5:49 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Nick Piggin, Andrew Morton, LKML, Tejun Heo, Ingo Molnar,
	Andi Kleen, Hugh Dickins, Christoph Lameter,
	Linux Memory Management List

On Wed, Jan 20, 2010 at 10:21:06PM -0700, KAMEZAWA Hiroyuki wrote:
> On Thu, 21 Jan 2010 13:05:21 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > On Mon, Jan 18, 2010 at 07:23:43PM -0700, KAMEZAWA Hiroyuki wrote:

> > I did some auditing and found that:
> > 
> > - set_memory_uc(), set_memory_array_uc(), set_pages_uc() and
> >   set_pages_array_uc() are called by EFI code and various video drivers;
> >   none of them touch HIGHMEM RAM
> > 
> > - Kame: ioremap() won't allow remap of physical RAM
> > 
> > So kmap_atomic() is safe. Shall we just settle on this patch?
> > 
> I recommend you keep the check on VM_IOREMAP. That was checked since long
> before I started looking at Linux. Some _unknown_ driver can call
> get_vm_area() and map arbitrary pages there.

OK, I'll turn this patch into a less radical one.

Thanks,
Fengguang

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2010-01-21  5:49 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-13 13:53 [PATCH 0/8] devmem/kmem/kcore fixes, cleanups and hwpoison checks Wu Fengguang
2010-01-13 13:53 ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 1/8] vfs: fix too big f_pos handling Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 2/8] devmem: check vmalloc address on kmem read/write Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 3/8] devmem: fix kmem write bug on memory holes Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 4/8] resources: introduce generic page_is_ram() Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 14:29   ` Américo Wang
2010-01-13 14:29     ` Américo Wang
2010-01-14  3:29     ` Wu Fengguang
2010-01-14  3:29       ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 5/8] vmalloc: simplify vread()/vwrite() Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-14 12:45   ` Nick Piggin
2010-01-14 12:45     ` Nick Piggin
2010-01-18 13:35     ` Wu Fengguang
2010-01-18 13:35       ` Wu Fengguang
2010-01-18 14:23       ` Nick Piggin
2010-01-18 14:23         ` Nick Piggin
2010-01-19  1:33         ` Wu Fengguang
2010-01-19  1:33           ` Wu Fengguang
2010-01-19  2:23           ` KAMEZAWA Hiroyuki
2010-01-19  2:23             ` KAMEZAWA Hiroyuki
2010-01-21  5:05             ` Wu Fengguang
2010-01-21  5:05               ` Wu Fengguang
2010-01-21  5:21               ` KAMEZAWA Hiroyuki
2010-01-21  5:21                 ` KAMEZAWA Hiroyuki
2010-01-21  5:49                 ` Wu Fengguang
2010-01-21  5:49                   ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 6/8] hwpoison: prevent /dev/kmem from accessing hwpoison pages Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 7/8] hwpoison: prevent /dev/mem " Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 13:53 ` [PATCH 8/8] hwpoison: prevent /dev/kcore " Wu Fengguang
2010-01-13 13:53   ` Wu Fengguang
2010-01-13 14:23   ` Américo Wang
2010-01-13 14:23     ` Américo Wang
