All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] /proc/kmem fixes and hwpoison bits v2
@ 2009-09-16  1:39 Wu Fengguang
  2009-09-16  1:39   ` Wu Fengguang
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  1:39 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki
  Cc: Benjamin Herrenschmidt, Christoph Lameter, Ingo Molnar,
	Tejun Heo, Nick Piggin, Wu Fengguang, LKML, linux-mm

Hi Andrew and Kame,

Here are the 3 updated bug fix/cleanup patches for kmem.  Comments are welcome.

Changes since v1:
- change vread()/vwrite() prototype (proposed by Kame)
- include Kame's is_vmalloc_or_module_addr() check and use Hugh's -ENXIO.
- I decided not to use __GFP_ZERO, since the buf will be reused in the loop,
  so it would be better for vread to zero-fill it each time.
- removed the hwpoison checks for vmalloc pages. It seems that vread/vwrite
  could be simplified to handle one single page, and the hwpoison bits can be
  considered after that.

Thanks,
Fengguang
--


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
  2009-09-16  1:39 [PATCH 0/3] /proc/kmem fixes and hwpoison bits v2 Wu Fengguang
@ 2009-09-16  1:39   ` Wu Fengguang
  2009-09-16  1:39 ` [PATCH 2/3] devmem: check vmalloc address on kmem read/write Wu Fengguang
  2009-09-16  1:39   ` Wu Fengguang
  2 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  1:39 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki
  Cc: Benjamin Herrenschmidt, Andi Kleen, Christoph Lameter,
	Ingo Molnar, Tejun Heo, Nick Piggin, Wu Fengguang, LKML,
	linux-mm

[-- Attachment #1: vwrite-ret.patch --]
[-- Type: text/plain, Size: 9598 bytes --]

Silently ignore all vmalloc area holes in vread()/vwrite(),
and report success (or error code in future) to the caller.

The original intention is to fix a vwrite() related bug, where
it could return 0 which cannot be handled correctly by its caller
write_kmem(). Then KAMEZAWA recommends to change the prototype
to make the semantics clear.

CC: Andi Kleen <andi@firstfloor.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Nick Piggin <npiggin@suse.de>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c      |   24 ++++++++--------
 fs/proc/kcore.c         |    5 ++-
 include/linux/vmalloc.h |    6 ++--
 mm/nommu.c              |    8 ++---
 mm/vmalloc.c            |   55 +++++++++++---------------------------
 5 files changed, 40 insertions(+), 58 deletions(-)

--- linux-mm.orig/mm/vmalloc.c	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/mm/vmalloc.c	2009-09-16 09:24:11.000000000 +0800
@@ -1655,7 +1655,6 @@ EXPORT_SYMBOL(vmalloc_32_user);
 static int aligned_vread(char *buf, char *addr, unsigned long count)
 {
 	struct page *p;
-	int copied = 0;
 
 	while (count) {
 		unsigned long offset, length;
@@ -1685,16 +1684,14 @@ static int aligned_vread(char *buf, char
 
 		addr += length;
 		buf += length;
-		copied += length;
 		count -= length;
 	}
-	return copied;
+	return 0;
 }
 
 static int aligned_vwrite(char *buf, char *addr, unsigned long count)
 {
 	struct page *p;
-	int copied = 0;
 
 	while (count) {
 		unsigned long offset, length;
@@ -1722,10 +1719,9 @@ static int aligned_vwrite(char *buf, cha
 		}
 		addr += length;
 		buf += length;
-		copied += length;
 		count -= length;
 	}
-	return copied;
+	return 0;
 }
 
 /**
@@ -1734,9 +1730,7 @@ static int aligned_vwrite(char *buf, cha
  *	@addr:		vm address.
  *	@count:		number of bytes to be read.
  *
- *	Returns # of bytes which addr and buf should be increased.
- *	(same number to @count). Returns 0 if [addr...addr+count) doesn't
- *	includes any intersect with alive vmalloc area.
+ *	Returns 0 on success.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from that area to a given buffer. If the given memory range
@@ -1744,23 +1738,21 @@ static int aligned_vwrite(char *buf, cha
  *	proper area of @buf. If there are memory holes, they'll be zero-filled.
  *	IOREMAP area is treated as memory hole and no copy is done.
  *
- *	If [addr...addr+count) doesn't includes any intersects with alive
- *	vm_struct area, returns 0.
  *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
  *	the caller should guarantee KM_USER0 is not used.
  *
  *	Note: In usual ops, vread() is never necessary because the caller
  *	should know vmalloc() area is valid and can use memcpy().
  *	This is for routines which have to access vmalloc area without
- *	any informaion, as /dev/kmem.
+ *	any information, as /dev/kmem and /dev/kcore.
  *
  */
 
-long vread(char *buf, char *addr, unsigned long count)
+int vread(char *buf, char *addr, unsigned long count)
 {
 	struct vm_struct *tmp;
-	char *vaddr, *buf_start = buf;
-	unsigned long buflen = count;
+	char *vaddr;
+	char *buf_end = buf + count;
 	unsigned long n;
 
 	/* Don't allow overflow */
@@ -1794,13 +1786,11 @@ long vread(char *buf, char *addr, unsign
 finished:
 	read_unlock(&vmlist_lock);
 
-	if (buf == buf_start)
-		return 0;
 	/* zero-fill memory holes */
-	if (buf != buf_start + buflen)
-		memset(buf, 0, buflen - (buf - buf_start));
+	if (buf != buf_end)
+		memset(buf, 0, buf_end - buf);
 
-	return buflen;
+	return 0;
 }
 
 /**
@@ -1809,10 +1799,7 @@ finished:
  *	@addr:		vm address.
  *	@count:		number of bytes to be read.
  *
- *	Returns # of bytes which addr and buf should be incresed.
- *	(same number to @count).
- *	If [addr...addr+count) doesn't includes any intersect with valid
- *	vmalloc area, returns 0.
+ *	Returns 0 on success.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from a buffer to the given addr. If specified range of
@@ -1820,30 +1807,24 @@ finished:
  *	proper area of @buf. If there are memory holes, no copy to hole.
  *	IOREMAP area is treated as memory hole and no copy is done.
  *
- *	If [addr...addr+count) doesn't includes any intersects with alive
- *	vm_struct area, returns 0.
  *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
  *	the caller should guarantee KM_USER0 is not used.
  *
  *	Note: In usual ops, vwrite() is never necessary because the caller
  *	should know vmalloc() area is valid and can use memcpy().
  *	This is for routines which have to access vmalloc area without
- *	any informaion, as /dev/kmem.
- *
- *	The caller should guarantee KM_USER1 is not used.
+ *	any information, as /dev/kmem.
  */
 
-long vwrite(char *buf, char *addr, unsigned long count)
+int vwrite(char *buf, char *addr, unsigned long count)
 {
 	struct vm_struct *tmp;
 	char *vaddr;
-	unsigned long n, buflen;
-	int copied = 0;
+	unsigned long n;
 
 	/* Don't allow overflow */
 	if ((unsigned long) addr + count < count)
 		count = -(unsigned long) addr;
-	buflen = count;
 
 	read_lock(&vmlist_lock);
 	for (tmp = vmlist; count && tmp; tmp = tmp->next) {
@@ -1860,19 +1841,15 @@ long vwrite(char *buf, char *addr, unsig
 		n = vaddr + tmp->size - PAGE_SIZE - addr;
 		if (n > count)
 			n = count;
-		if (!(tmp->flags & VM_IOREMAP)) {
+		if (!(tmp->flags & VM_IOREMAP))
 			aligned_vwrite(buf, addr, n);
-			copied++;
-		}
 		buf += n;
 		addr += n;
 		count -= n;
 	}
 finished:
 	read_unlock(&vmlist_lock);
-	if (!copied)
-		return 0;
-	return buflen;
+	return 0;
 }
 
 /**
--- linux-mm.orig/include/linux/vmalloc.h	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/include/linux/vmalloc.h	2009-09-16 08:52:17.000000000 +0800
@@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned 
 extern struct vm_struct *alloc_vm_area(size_t size);
 extern void free_vm_area(struct vm_struct *area);
 
-/* for /dev/kmem */
-extern long vread(char *buf, char *addr, unsigned long count);
-extern long vwrite(char *buf, char *addr, unsigned long count);
+/* for /dev/kmem and /dev/kcore */
+extern int vread(char *buf, char *addr, unsigned long count);
+extern int vwrite(char *buf, char *addr, unsigned long count);
 
 /*
  *	Internals.  Dont't use..
--- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2009-09-16 09:23:00.000000000 +0800
@@ -396,6 +396,7 @@ static ssize_t read_kmem(struct file *fi
 	unsigned long p = *ppos;
 	ssize_t low_count, read, sz;
 	char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
+	int err = 0;
 
 	read = 0;
 	if (p < (unsigned long) high_memory) {
@@ -442,12 +443,12 @@ static ssize_t read_kmem(struct file *fi
 			return -ENOMEM;
 		while (count > 0) {
 			sz = size_inside_page(p, count);
-			sz = vread(kbuf, (char *)p, sz);
-			if (!sz)
+			err = vread(kbuf, (char *)p, sz);
+			if (err)
 				break;
 			if (copy_to_user(buf, kbuf, sz)) {
-				free_page((unsigned long)kbuf);
-				return -EFAULT;
+				err = -EFAULT;
+				break;
 			}
 			count -= sz;
 			buf += sz;
@@ -457,7 +458,7 @@ static ssize_t read_kmem(struct file *fi
 		free_page((unsigned long)kbuf);
 	}
  	*ppos = p;
- 	return read;
+	return read ? read : err;
 }
 
 
@@ -521,6 +522,7 @@ static ssize_t write_kmem(struct file * 
 	ssize_t wrote = 0;
 	ssize_t virtr = 0;
 	char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
+	int err = 0;
 
 	if (p < (unsigned long) high_memory) {
 		unsigned long to_write = min_t(unsigned long, count,
@@ -543,12 +545,12 @@ static ssize_t write_kmem(struct file * 
 
 			n = copy_from_user(kbuf, buf, sz);
 			if (n) {
-				if (wrote + virtr)
-					break;
-				free_page((unsigned long)kbuf);
-				return -EFAULT;
+				err = -EFAULT;
+				break;
 			}
-			sz = vwrite(kbuf, (char *)p, sz);
+			err = vwrite(kbuf, (char *)p, sz);
+			if (err)
+				break;
 			count -= sz;
 			buf += sz;
 			virtr += sz;
@@ -558,7 +560,7 @@ static ssize_t write_kmem(struct file * 
 	}
 
  	*ppos = p;
- 	return virtr + wrote;
+	return virtr + wrote ? : err;
 }
 #endif
 
--- linux-mm.orig/fs/proc/kcore.c	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/fs/proc/kcore.c	2009-09-16 08:52:17.000000000 +0800
@@ -492,11 +492,14 @@ read_kcore(struct file *file, char __use
 				return -EFAULT;
 		} else if (is_vmalloc_or_module_addr((void *)start)) {
 			char * elf_buf;
+			int err;
 
 			elf_buf = kzalloc(tsz, GFP_KERNEL);
 			if (!elf_buf)
 				return -ENOMEM;
-			vread(elf_buf, (char *)start, tsz);
+			err = vread(elf_buf, (char *)start, tsz);
+			if (err)
+				return err;
 			/* we have to zero-fill user buffer even if no read */
 			if (copy_to_user(buffer, elf_buf, tsz)) {
 				kfree(elf_buf);
--- linux-mm.orig/mm/nommu.c	2009-09-16 09:01:36.000000000 +0800
+++ linux-mm/mm/nommu.c	2009-09-16 09:02:02.000000000 +0800
@@ -263,20 +263,20 @@ unsigned long vmalloc_to_pfn(const void 
 }
 EXPORT_SYMBOL(vmalloc_to_pfn);
 
-long vread(char *buf, char *addr, unsigned long count)
+int vread(char *buf, char *addr, unsigned long count)
 {
 	memcpy(buf, addr, count);
-	return count;
+	return 0;
 }
 
-long vwrite(char *buf, char *addr, unsigned long count)
+int vwrite(char *buf, char *addr, unsigned long count)
 {
 	/* Don't allow overflow */
 	if ((unsigned long) addr + count < count)
 		count = -(unsigned long) addr;
 
 	memcpy(addr, buf, count);
-	return(count);
+	return 0;
 }
 
 /*

-- 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
@ 2009-09-16  1:39   ` Wu Fengguang
  0 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  1:39 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki
  Cc: Benjamin Herrenschmidt, Andi Kleen, Christoph Lameter,
	Ingo Molnar, Tejun Heo, Nick Piggin, Wu Fengguang, LKML,
	linux-mm

[-- Attachment #1: vwrite-ret.patch --]
[-- Type: text/plain, Size: 10069 bytes --]

Silently ignore all vmalloc area holes in vread()/vwrite(),
and report success (or error code in future) to the caller.

The original intention is to fix a vwrite() related bug, where
it could return 0 which cannot be handled correctly by its caller
write_kmem(). Then KAMEZAWA recommends to change the prototype
to make the semantics clear.

CC: Andi Kleen <andi@firstfloor.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Nick Piggin <npiggin@suse.de>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c      |   24 ++++++++--------
 fs/proc/kcore.c         |    5 ++-
 include/linux/vmalloc.h |    6 ++--
 mm/nommu.c              |    8 ++---
 mm/vmalloc.c            |   55 +++++++++++---------------------------
 5 files changed, 40 insertions(+), 58 deletions(-)

--- linux-mm.orig/mm/vmalloc.c	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/mm/vmalloc.c	2009-09-16 09:24:11.000000000 +0800
@@ -1655,7 +1655,6 @@ EXPORT_SYMBOL(vmalloc_32_user);
 static int aligned_vread(char *buf, char *addr, unsigned long count)
 {
 	struct page *p;
-	int copied = 0;
 
 	while (count) {
 		unsigned long offset, length;
@@ -1685,16 +1684,14 @@ static int aligned_vread(char *buf, char
 
 		addr += length;
 		buf += length;
-		copied += length;
 		count -= length;
 	}
-	return copied;
+	return 0;
 }
 
 static int aligned_vwrite(char *buf, char *addr, unsigned long count)
 {
 	struct page *p;
-	int copied = 0;
 
 	while (count) {
 		unsigned long offset, length;
@@ -1722,10 +1719,9 @@ static int aligned_vwrite(char *buf, cha
 		}
 		addr += length;
 		buf += length;
-		copied += length;
 		count -= length;
 	}
-	return copied;
+	return 0;
 }
 
 /**
@@ -1734,9 +1730,7 @@ static int aligned_vwrite(char *buf, cha
  *	@addr:		vm address.
  *	@count:		number of bytes to be read.
  *
- *	Returns # of bytes which addr and buf should be increased.
- *	(same number to @count). Returns 0 if [addr...addr+count) doesn't
- *	includes any intersect with alive vmalloc area.
+ *	Returns 0 on success.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from that area to a given buffer. If the given memory range
@@ -1744,23 +1738,21 @@ static int aligned_vwrite(char *buf, cha
  *	proper area of @buf. If there are memory holes, they'll be zero-filled.
  *	IOREMAP area is treated as memory hole and no copy is done.
  *
- *	If [addr...addr+count) doesn't includes any intersects with alive
- *	vm_struct area, returns 0.
  *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
  *	the caller should guarantee KM_USER0 is not used.
  *
  *	Note: In usual ops, vread() is never necessary because the caller
  *	should know vmalloc() area is valid and can use memcpy().
  *	This is for routines which have to access vmalloc area without
- *	any informaion, as /dev/kmem.
+ *	any information, as /dev/kmem and /dev/kcore.
  *
  */
 
-long vread(char *buf, char *addr, unsigned long count)
+int vread(char *buf, char *addr, unsigned long count)
 {
 	struct vm_struct *tmp;
-	char *vaddr, *buf_start = buf;
-	unsigned long buflen = count;
+	char *vaddr;
+	char *buf_end = buf + count;
 	unsigned long n;
 
 	/* Don't allow overflow */
@@ -1794,13 +1786,11 @@ long vread(char *buf, char *addr, unsign
 finished:
 	read_unlock(&vmlist_lock);
 
-	if (buf == buf_start)
-		return 0;
 	/* zero-fill memory holes */
-	if (buf != buf_start + buflen)
-		memset(buf, 0, buflen - (buf - buf_start));
+	if (buf != buf_end)
+		memset(buf, 0, buf_end - buf);
 
-	return buflen;
+	return 0;
 }
 
 /**
@@ -1809,10 +1799,7 @@ finished:
  *	@addr:		vm address.
  *	@count:		number of bytes to be read.
  *
- *	Returns # of bytes which addr and buf should be incresed.
- *	(same number to @count).
- *	If [addr...addr+count) doesn't includes any intersect with valid
- *	vmalloc area, returns 0.
+ *	Returns 0 on success.
  *
  *	This function checks that addr is a valid vmalloc'ed area, and
  *	copy data from a buffer to the given addr. If specified range of
@@ -1820,30 +1807,24 @@ finished:
  *	proper area of @buf. If there are memory holes, no copy to hole.
  *	IOREMAP area is treated as memory hole and no copy is done.
  *
- *	If [addr...addr+count) doesn't includes any intersects with alive
- *	vm_struct area, returns 0.
  *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
  *	the caller should guarantee KM_USER0 is not used.
  *
  *	Note: In usual ops, vwrite() is never necessary because the caller
  *	should know vmalloc() area is valid and can use memcpy().
  *	This is for routines which have to access vmalloc area without
- *	any informaion, as /dev/kmem.
- *
- *	The caller should guarantee KM_USER1 is not used.
+ *	any information, as /dev/kmem.
  */
 
-long vwrite(char *buf, char *addr, unsigned long count)
+int vwrite(char *buf, char *addr, unsigned long count)
 {
 	struct vm_struct *tmp;
 	char *vaddr;
-	unsigned long n, buflen;
-	int copied = 0;
+	unsigned long n;
 
 	/* Don't allow overflow */
 	if ((unsigned long) addr + count < count)
 		count = -(unsigned long) addr;
-	buflen = count;
 
 	read_lock(&vmlist_lock);
 	for (tmp = vmlist; count && tmp; tmp = tmp->next) {
@@ -1860,19 +1841,15 @@ long vwrite(char *buf, char *addr, unsig
 		n = vaddr + tmp->size - PAGE_SIZE - addr;
 		if (n > count)
 			n = count;
-		if (!(tmp->flags & VM_IOREMAP)) {
+		if (!(tmp->flags & VM_IOREMAP))
 			aligned_vwrite(buf, addr, n);
-			copied++;
-		}
 		buf += n;
 		addr += n;
 		count -= n;
 	}
 finished:
 	read_unlock(&vmlist_lock);
-	if (!copied)
-		return 0;
-	return buflen;
+	return 0;
 }
 
 /**
--- linux-mm.orig/include/linux/vmalloc.h	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/include/linux/vmalloc.h	2009-09-16 08:52:17.000000000 +0800
@@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned 
 extern struct vm_struct *alloc_vm_area(size_t size);
 extern void free_vm_area(struct vm_struct *area);
 
-/* for /dev/kmem */
-extern long vread(char *buf, char *addr, unsigned long count);
-extern long vwrite(char *buf, char *addr, unsigned long count);
+/* for /dev/kmem and /dev/kcore */
+extern int vread(char *buf, char *addr, unsigned long count);
+extern int vwrite(char *buf, char *addr, unsigned long count);
 
 /*
  *	Internals.  Dont't use..
--- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2009-09-16 09:23:00.000000000 +0800
@@ -396,6 +396,7 @@ static ssize_t read_kmem(struct file *fi
 	unsigned long p = *ppos;
 	ssize_t low_count, read, sz;
 	char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
+	int err = 0;
 
 	read = 0;
 	if (p < (unsigned long) high_memory) {
@@ -442,12 +443,12 @@ static ssize_t read_kmem(struct file *fi
 			return -ENOMEM;
 		while (count > 0) {
 			sz = size_inside_page(p, count);
-			sz = vread(kbuf, (char *)p, sz);
-			if (!sz)
+			err = vread(kbuf, (char *)p, sz);
+			if (err)
 				break;
 			if (copy_to_user(buf, kbuf, sz)) {
-				free_page((unsigned long)kbuf);
-				return -EFAULT;
+				err = -EFAULT;
+				break;
 			}
 			count -= sz;
 			buf += sz;
@@ -457,7 +458,7 @@ static ssize_t read_kmem(struct file *fi
 		free_page((unsigned long)kbuf);
 	}
  	*ppos = p;
- 	return read;
+	return read ? read : err;
 }
 
 
@@ -521,6 +522,7 @@ static ssize_t write_kmem(struct file * 
 	ssize_t wrote = 0;
 	ssize_t virtr = 0;
 	char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
+	int err = 0;
 
 	if (p < (unsigned long) high_memory) {
 		unsigned long to_write = min_t(unsigned long, count,
@@ -543,12 +545,12 @@ static ssize_t write_kmem(struct file * 
 
 			n = copy_from_user(kbuf, buf, sz);
 			if (n) {
-				if (wrote + virtr)
-					break;
-				free_page((unsigned long)kbuf);
-				return -EFAULT;
+				err = -EFAULT;
+				break;
 			}
-			sz = vwrite(kbuf, (char *)p, sz);
+			err = vwrite(kbuf, (char *)p, sz);
+			if (err)
+				break;
 			count -= sz;
 			buf += sz;
 			virtr += sz;
@@ -558,7 +560,7 @@ static ssize_t write_kmem(struct file * 
 	}
 
  	*ppos = p;
- 	return virtr + wrote;
+	return virtr + wrote ? : err;
 }
 #endif
 
--- linux-mm.orig/fs/proc/kcore.c	2009-09-16 08:52:12.000000000 +0800
+++ linux-mm/fs/proc/kcore.c	2009-09-16 08:52:17.000000000 +0800
@@ -492,11 +492,14 @@ read_kcore(struct file *file, char __use
 				return -EFAULT;
 		} else if (is_vmalloc_or_module_addr((void *)start)) {
 			char * elf_buf;
+			int err;
 
 			elf_buf = kzalloc(tsz, GFP_KERNEL);
 			if (!elf_buf)
 				return -ENOMEM;
-			vread(elf_buf, (char *)start, tsz);
+			err = vread(elf_buf, (char *)start, tsz);
+			if (err)
+				return err;
 			/* we have to zero-fill user buffer even if no read */
 			if (copy_to_user(buffer, elf_buf, tsz)) {
 				kfree(elf_buf);
--- linux-mm.orig/mm/nommu.c	2009-09-16 09:01:36.000000000 +0800
+++ linux-mm/mm/nommu.c	2009-09-16 09:02:02.000000000 +0800
@@ -263,20 +263,20 @@ unsigned long vmalloc_to_pfn(const void 
 }
 EXPORT_SYMBOL(vmalloc_to_pfn);
 
-long vread(char *buf, char *addr, unsigned long count)
+int vread(char *buf, char *addr, unsigned long count)
 {
 	memcpy(buf, addr, count);
-	return count;
+	return 0;
 }
 
-long vwrite(char *buf, char *addr, unsigned long count)
+int vwrite(char *buf, char *addr, unsigned long count)
 {
 	/* Don't allow overflow */
 	if ((unsigned long) addr + count < count)
 		count = -(unsigned long) addr;
 
 	memcpy(addr, buf, count);
-	return(count);
+	return 0;
 }
 
 /*

-- 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/3] devmem: check vmalloc address on kmem read/write
  2009-09-16  1:39 [PATCH 0/3] /proc/kmem fixes and hwpoison bits v2 Wu Fengguang
  2009-09-16  1:39   ` Wu Fengguang
@ 2009-09-16  1:39 ` Wu Fengguang
  2009-09-16  2:50     ` KAMEZAWA Hiroyuki
  2009-09-16  1:39   ` Wu Fengguang
  2 siblings, 1 reply; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  1:39 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki
  Cc: Benjamin Herrenschmidt, Greg Kroah-Hartman, Hugh Dickins,
	Wu Fengguang, Christoph Lameter, Ingo Molnar, Tejun Heo,
	Nick Piggin, LKML, linux-mm

[-- Attachment #1: vmalloc-addr-fix.patch --]
[-- Type: text/plain, Size: 1389 bytes --]

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Otherwise vmalloc_to_page() will BUG().

This also makes the kmem read/write implementation aligned with mem(4):
"References to nonexistent locations cause errors to be returned." Here
we return -ENXIO (inspired by Hugh) if no bytes have been transfered
to/from user space, otherwise return partial read/write results.

CC: Greg Kroah-Hartman <gregkh@suse.de>
CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:17.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2009-09-16 09:15:03.000000000 +0800
@@ -443,6 +443,10 @@ static ssize_t read_kmem(struct file *fi
 			return -ENOMEM;
 		while (count > 0) {
 			sz = size_inside_page(p, count);
+			if (!is_vmalloc_or_module_addr((void *)p)) {
+				err = -ENXIO;
+				break;
+			}
 			err = vread(kbuf, (char *)p, sz);
 			if (err)
 				break;
@@ -543,6 +547,10 @@ static ssize_t write_kmem(struct file * 
 			unsigned long sz = size_inside_page(p, count);
 			unsigned long n;
 
+			if (!is_vmalloc_or_module_addr((void *)p)) {
+				err = -ENXIO;
+				break;
+			}
 			n = copy_from_user(kbuf, buf, sz);
 			if (n) {
 				err = -EFAULT;

-- 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages
  2009-09-16  1:39 [PATCH 0/3] /proc/kmem fixes and hwpoison bits v2 Wu Fengguang
@ 2009-09-16  1:39   ` Wu Fengguang
  2009-09-16  1:39 ` [PATCH 2/3] devmem: check vmalloc address on kmem read/write Wu Fengguang
  2009-09-16  1:39   ` Wu Fengguang
  2 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  1:39 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki
  Cc: Benjamin Herrenschmidt, Greg KH, Andi Kleen, Christoph Lameter,
	Ingo Molnar, Tejun Heo, Nick Piggin, Wu Fengguang, LKML,
	linux-mm

[-- Attachment #1: kmem-dev-kmem.patch --]
[-- Type: text/plain, Size: 1896 bytes --]

When /dev/kmem read()/write() encounters hwpoison page, stop it
and return the amount of work done till now.

Vmalloc pages are not checked for now, to avoid conflicts with
ongoing vread/vwrite works.

CC: Greg KH <greg@kroah.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Nick Piggin <npiggin@suse.de>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

--- linux-mm.orig/drivers/char/mem.c	2009-09-16 09:25:34.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2009-09-16 09:25:43.000000000 +0800
@@ -427,6 +427,9 @@ static ssize_t read_kmem(struct file *fi
 			 */
 			kbuf = xlate_dev_kmem_ptr((char *)p);
 
+			if (unlikely(virt_addr_valid(kbuf) &&
+				     PageHWPoison(virt_to_page(kbuf))))
+				return -EIO;
 			if (copy_to_user(buf, kbuf, sz))
 				return -EFAULT;
 			buf += sz;
@@ -472,6 +475,7 @@ do_write_kmem(unsigned long p, const cha
 {
 	ssize_t written, sz;
 	unsigned long copied;
+	int err = 0;
 
 	written = 0;
 #ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
@@ -498,13 +502,19 @@ do_write_kmem(unsigned long p, const cha
 		 */
 		ptr = xlate_dev_kmem_ptr((char *)p);
 
+		if (unlikely(virt_addr_valid(ptr) &&
+			     PageHWPoison(virt_to_page(ptr)))) {
+			err = -EIO;
+			break;
+		}
+
 		copied = copy_from_user(ptr, buf, sz);
 		if (copied) {
 			written += sz - copied;
-			if (written)
-				break;
-			return -EFAULT;
+			err = -EFAULT;
+			break;
 		}
+
 		buf += sz;
 		p += sz;
 		count -= sz;
@@ -512,7 +522,7 @@ do_write_kmem(unsigned long p, const cha
 	}
 
 	*ppos += written;
-	return written;
+	return written ? written : err;
 }
 
 

-- 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages
@ 2009-09-16  1:39   ` Wu Fengguang
  0 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  1:39 UTC (permalink / raw)
  To: Andrew Morton, KAMEZAWA Hiroyuki
  Cc: Benjamin Herrenschmidt, Greg KH, Andi Kleen, Christoph Lameter,
	Ingo Molnar, Tejun Heo, Nick Piggin, Wu Fengguang, LKML,
	linux-mm

[-- Attachment #1: kmem-dev-kmem.patch --]
[-- Type: text/plain, Size: 2367 bytes --]

When /dev/kmem read()/write() encounters hwpoison page, stop it
and return the amount of work done till now.

Vmalloc pages are not checked for now, to avoid conflicts with
ongoing vread/vwrite works.

CC: Greg KH <greg@kroah.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Christoph Lameter <cl@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Tejun Heo <tj@kernel.org>
CC: Nick Piggin <npiggin@suse.de>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
---
 drivers/char/mem.c |   18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

--- linux-mm.orig/drivers/char/mem.c	2009-09-16 09:25:34.000000000 +0800
+++ linux-mm/drivers/char/mem.c	2009-09-16 09:25:43.000000000 +0800
@@ -427,6 +427,9 @@ static ssize_t read_kmem(struct file *fi
 			 */
 			kbuf = xlate_dev_kmem_ptr((char *)p);
 
+			if (unlikely(virt_addr_valid(kbuf) &&
+				     PageHWPoison(virt_to_page(kbuf))))
+				return -EIO;
 			if (copy_to_user(buf, kbuf, sz))
 				return -EFAULT;
 			buf += sz;
@@ -472,6 +475,7 @@ do_write_kmem(unsigned long p, const cha
 {
 	ssize_t written, sz;
 	unsigned long copied;
+	int err = 0;
 
 	written = 0;
 #ifdef __ARCH_HAS_NO_PAGE_ZERO_MAPPED
@@ -498,13 +502,19 @@ do_write_kmem(unsigned long p, const cha
 		 */
 		ptr = xlate_dev_kmem_ptr((char *)p);
 
+		if (unlikely(virt_addr_valid(ptr) &&
+			     PageHWPoison(virt_to_page(ptr)))) {
+			err = -EIO;
+			break;
+		}
+
 		copied = copy_from_user(ptr, buf, sz);
 		if (copied) {
 			written += sz - copied;
-			if (written)
-				break;
-			return -EFAULT;
+			err = -EFAULT;
+			break;
 		}
+
 		buf += sz;
 		p += sz;
 		count -= sz;
@@ -512,7 +522,7 @@ do_write_kmem(unsigned long p, const cha
 	}
 
 	*ppos += written;
-	return written;
+	return written ? written : err;
 }
 
 

-- 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
  2009-09-16  1:39   ` Wu Fengguang
@ 2009-09-16  2:14     ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-09-16  2:14 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Benjamin Herrenschmidt, Andi Kleen,
	Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML,
	linux-mm

On Wed, 16 Sep 2009 09:39:40 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> Silently ignore all vmalloc area holes in vread()/vwrite(),
> and report success (or error code in future) to the caller.
> 
> The original intention is to fix a vwrite() related bug, where
> it could return 0 which cannot be handled correctly by its caller
> write_kmem(). Then KAMEZAWA recommends to change the prototype
> to make the semantics clear.
> 
> CC: Andi Kleen <andi@firstfloor.org>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Christoph Lameter <cl@linux-foundation.org>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Tejun Heo <tj@kernel.org>
> CC: Nick Piggin <npiggin@suse.de>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  drivers/char/mem.c      |   24 ++++++++--------
>  fs/proc/kcore.c         |    5 ++-
>  include/linux/vmalloc.h |    6 ++--
>  mm/nommu.c              |    8 ++---
>  mm/vmalloc.c            |   55 +++++++++++---------------------------
>  5 files changed, 40 insertions(+), 58 deletions(-)
> 
> --- linux-mm.orig/mm/vmalloc.c	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/mm/vmalloc.c	2009-09-16 09:24:11.000000000 +0800
> @@ -1655,7 +1655,6 @@ EXPORT_SYMBOL(vmalloc_32_user);
>  static int aligned_vread(char *buf, char *addr, unsigned long count)
>  {
>  	struct page *p;
> -	int copied = 0;
>  
>  	while (count) {
>  		unsigned long offset, length;
> @@ -1685,16 +1684,14 @@ static int aligned_vread(char *buf, char
>  
>  		addr += length;
>  		buf += length;
> -		copied += length;
>  		count -= length;
>  	}
> -	return copied;
> +	return 0;
>  }
>  
>  static int aligned_vwrite(char *buf, char *addr, unsigned long count)
>  {
>  	struct page *p;
> -	int copied = 0;
>  
>  	while (count) {
>  		unsigned long offset, length;
> @@ -1722,10 +1719,9 @@ static int aligned_vwrite(char *buf, cha
>  		}
>  		addr += length;
>  		buf += length;
> -		copied += length;
>  		count -= length;
>  	}
> -	return copied;
> +	return 0;
>  }
>  
>  /**
> @@ -1734,9 +1730,7 @@ static int aligned_vwrite(char *buf, cha
>   *	@addr:		vm address.
>   *	@count:		number of bytes to be read.
>   *
> - *	Returns # of bytes which addr and buf should be increased.
> - *	(same number to @count). Returns 0 if [addr...addr+count) doesn't
> - *	includes any intersect with alive vmalloc area.
> + *	Returns 0 on success.

Seems to return 0 _always_. My version retunrs false if it's out-of-range.
plz return false if out-of-vmalloc-range.
Now, use vmalloc area is exported via /proc/vmallocinfo. Then, it's easy
to know "valid" vmalloc area. But users cannot know memory holes in
"valid" vmalloc area. So, vread() returns success even if it founds holes.

Another thought:
Should we purge vread/vwrite and just use copy_to/from_user ?
To do that, /proc/kcore should be revisited...
It has to copy pages one by one in aligned manner.

Anyway, it's merge window and not good season for this semantic changes of
a function.

>   *
>   *	This function checks that addr is a valid vmalloc'ed area, and
>   *	copy data from that area to a given buffer. If the given memory range
> @@ -1744,23 +1738,21 @@ static int aligned_vwrite(char *buf, cha
>   *	proper area of @buf. If there are memory holes, they'll be zero-filled.
>   *	IOREMAP area is treated as memory hole and no copy is done.
>   *
> - *	If [addr...addr+count) doesn't includes any intersects with alive
> - *	vm_struct area, returns 0.
>   *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
>   *	the caller should guarantee KM_USER0 is not used.
>   *
>   *	Note: In usual ops, vread() is never necessary because the caller
>   *	should know vmalloc() area is valid and can use memcpy().
>   *	This is for routines which have to access vmalloc area without
> - *	any informaion, as /dev/kmem.
> + *	any information, as /dev/kmem and /dev/kcore.
>   *
>   */
>  
> -long vread(char *buf, char *addr, unsigned long count)
> +int vread(char *buf, char *addr, unsigned long count)
>  {
>  	struct vm_struct *tmp;
> -	char *vaddr, *buf_start = buf;
> -	unsigned long buflen = count;
> +	char *vaddr;
> +	char *buf_end = buf + count;
>  	unsigned long n;
>  
>  	/* Don't allow overflow */
> @@ -1794,13 +1786,11 @@ long vread(char *buf, char *addr, unsign
>  finished:
>  	read_unlock(&vmlist_lock);
>  
> -	if (buf == buf_start)
> -		return 0;
>  	/* zero-fill memory holes */
> -	if (buf != buf_start + buflen)
> -		memset(buf, 0, buflen - (buf - buf_start));
> +	if (buf != buf_end)
> +		memset(buf, 0, buf_end - buf);
>  
> -	return buflen;
> +	return 0;
>  }
>  
>  /**
> @@ -1809,10 +1799,7 @@ finished:
>   *	@addr:		vm address.
>   *	@count:		number of bytes to be read.
>   *
> - *	Returns # of bytes which addr and buf should be incresed.
> - *	(same number to @count).
> - *	If [addr...addr+count) doesn't includes any intersect with valid
> - *	vmalloc area, returns 0.
> + *	Returns 0 on success.
>   *
>   *	This function checks that addr is a valid vmalloc'ed area, and
>   *	copy data from a buffer to the given addr. If specified range of
> @@ -1820,30 +1807,24 @@ finished:
>   *	proper area of @buf. If there are memory holes, no copy to hole.
>   *	IOREMAP area is treated as memory hole and no copy is done.
>   *
> - *	If [addr...addr+count) doesn't includes any intersects with alive
> - *	vm_struct area, returns 0.
>   *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
>   *	the caller should guarantee KM_USER0 is not used.
>   *
>   *	Note: In usual ops, vwrite() is never necessary because the caller
>   *	should know vmalloc() area is valid and can use memcpy().
>   *	This is for routines which have to access vmalloc area without
> - *	any informaion, as /dev/kmem.
> - *
> - *	The caller should guarantee KM_USER1 is not used.
> + *	any information, as /dev/kmem.
>   */
>  
> -long vwrite(char *buf, char *addr, unsigned long count)
> +int vwrite(char *buf, char *addr, unsigned long count)
>  {
>  	struct vm_struct *tmp;
>  	char *vaddr;
> -	unsigned long n, buflen;
> -	int copied = 0;
> +	unsigned long n;
>  
>  	/* Don't allow overflow */
>  	if ((unsigned long) addr + count < count)
>  		count = -(unsigned long) addr;
> -	buflen = count;
>  
>  	read_lock(&vmlist_lock);
>  	for (tmp = vmlist; count && tmp; tmp = tmp->next) {
> @@ -1860,19 +1841,15 @@ long vwrite(char *buf, char *addr, unsig
>  		n = vaddr + tmp->size - PAGE_SIZE - addr;
>  		if (n > count)
>  			n = count;
> -		if (!(tmp->flags & VM_IOREMAP)) {
> +		if (!(tmp->flags & VM_IOREMAP))
>  			aligned_vwrite(buf, addr, n);
> -			copied++;
> -		}
>  		buf += n;
>  		addr += n;
>  		count -= n;
>  	}
>  finished:
>  	read_unlock(&vmlist_lock);
> -	if (!copied)
> -		return 0;
> -	return buflen;
> +	return 0;
>  }
>  
>  /**
> --- linux-mm.orig/include/linux/vmalloc.h	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/include/linux/vmalloc.h	2009-09-16 08:52:17.000000000 +0800
> @@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned 
>  extern struct vm_struct *alloc_vm_area(size_t size);
>  extern void free_vm_area(struct vm_struct *area);
>  
> -/* for /dev/kmem */
> -extern long vread(char *buf, char *addr, unsigned long count);
> -extern long vwrite(char *buf, char *addr, unsigned long count);
> +/* for /dev/kmem and /dev/kcore */
> +extern int vread(char *buf, char *addr, unsigned long count);
> +extern int vwrite(char *buf, char *addr, unsigned long count);
>  
>  /*
>   *	Internals.  Dont't use..
> --- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/drivers/char/mem.c	2009-09-16 09:23:00.000000000 +0800
> @@ -396,6 +396,7 @@ static ssize_t read_kmem(struct file *fi
>  	unsigned long p = *ppos;
>  	ssize_t low_count, read, sz;
>  	char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
> +	int err = 0;
>  
>  	read = 0;
>  	if (p < (unsigned long) high_memory) {
> @@ -442,12 +443,12 @@ static ssize_t read_kmem(struct file *fi
>  			return -ENOMEM;
>  		while (count > 0) {
>  			sz = size_inside_page(p, count);
> -			sz = vread(kbuf, (char *)p, sz);
> -			if (!sz)
> +			err = vread(kbuf, (char *)p, sz);
> +			if (err)
>  				break;
>  			if (copy_to_user(buf, kbuf, sz)) {
> -				free_page((unsigned long)kbuf);
> -				return -EFAULT;
> +				err = -EFAULT;
> +				break;
>  			}
>  			count -= sz;
>  			buf += sz;
> @@ -457,7 +458,7 @@ static ssize_t read_kmem(struct file *fi
>  		free_page((unsigned long)kbuf);
>  	}
>   	*ppos = p;
> - 	return read;
> +	return read ? read : err;
>  }
>  
>  
> @@ -521,6 +522,7 @@ static ssize_t write_kmem(struct file * 
>  	ssize_t wrote = 0;
>  	ssize_t virtr = 0;
>  	char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
> +	int err = 0;
>  
>  	if (p < (unsigned long) high_memory) {
>  		unsigned long to_write = min_t(unsigned long, count,
> @@ -543,12 +545,12 @@ static ssize_t write_kmem(struct file * 
>  
>  			n = copy_from_user(kbuf, buf, sz);
>  			if (n) {
> -				if (wrote + virtr)
> -					break;
> -				free_page((unsigned long)kbuf);
> -				return -EFAULT;
> +				err = -EFAULT;
> +				break;
>  			}
> -			sz = vwrite(kbuf, (char *)p, sz);
> +			err = vwrite(kbuf, (char *)p, sz);
> +			if (err)
> +				break;
>  			count -= sz;
>  			buf += sz;
>  			virtr += sz;
> @@ -558,7 +560,7 @@ static ssize_t write_kmem(struct file * 
>  	}
>  
>   	*ppos = p;
> - 	return virtr + wrote;
> +	return virtr + wrote ? : err;
>  }
>  #endif
>  
> --- linux-mm.orig/fs/proc/kcore.c	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/fs/proc/kcore.c	2009-09-16 08:52:17.000000000 +0800
> @@ -492,11 +492,14 @@ read_kcore(struct file *file, char __use
>  				return -EFAULT;
>  		} else if (is_vmalloc_or_module_addr((void *)start)) {
>  			char * elf_buf;
> +			int err;
>  
>  			elf_buf = kzalloc(tsz, GFP_KERNEL);
>  			if (!elf_buf)
>  				return -ENOMEM;
> -			vread(elf_buf, (char *)start, tsz);
> +			err = vread(elf_buf, (char *)start, tsz);
> +			if (err)
> +				return err;

This is wrong.

/proc/kcore provides "coredump" image of kernel and it never faults.
Then, it doesn't check error code intentionally and just return
zero-filled buffer of reuqested length.

Thanks,
-Kame


>  			/* we have to zero-fill user buffer even if no read */
>  			if (copy_to_user(buffer, elf_buf, tsz)) {
>  				kfree(elf_buf);
> --- linux-mm.orig/mm/nommu.c	2009-09-16 09:01:36.000000000 +0800
> +++ linux-mm/mm/nommu.c	2009-09-16 09:02:02.000000000 +0800
> @@ -263,20 +263,20 @@ unsigned long vmalloc_to_pfn(const void 
>  }
>  EXPORT_SYMBOL(vmalloc_to_pfn);
>  
> -long vread(char *buf, char *addr, unsigned long count)
> +int vread(char *buf, char *addr, unsigned long count)
>  {
>  	memcpy(buf, addr, count);
> -	return count;
> +	return 0;
>  }
>  
> -long vwrite(char *buf, char *addr, unsigned long count)
> +int vwrite(char *buf, char *addr, unsigned long count)
>  {
>  	/* Don't allow overflow */
>  	if ((unsigned long) addr + count < count)
>  		count = -(unsigned long) addr;
>  
>  	memcpy(addr, buf, count);
> -	return(count);
> +	return 0;
>  }
>  
>  /*
> 
> -- 
> 
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
@ 2009-09-16  2:14     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-09-16  2:14 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Benjamin Herrenschmidt, Andi Kleen,
	Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML,
	linux-mm

On Wed, 16 Sep 2009 09:39:40 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> Silently ignore all vmalloc area holes in vread()/vwrite(),
> and report success (or error code in future) to the caller.
> 
> The original intention is to fix a vwrite() related bug, where
> it could return 0 which cannot be handled correctly by its caller
> write_kmem(). Then KAMEZAWA recommends to change the prototype
> to make the semantics clear.
> 
> CC: Andi Kleen <andi@firstfloor.org>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Christoph Lameter <cl@linux-foundation.org>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Tejun Heo <tj@kernel.org>
> CC: Nick Piggin <npiggin@suse.de>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  drivers/char/mem.c      |   24 ++++++++--------
>  fs/proc/kcore.c         |    5 ++-
>  include/linux/vmalloc.h |    6 ++--
>  mm/nommu.c              |    8 ++---
>  mm/vmalloc.c            |   55 +++++++++++---------------------------
>  5 files changed, 40 insertions(+), 58 deletions(-)
> 
> --- linux-mm.orig/mm/vmalloc.c	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/mm/vmalloc.c	2009-09-16 09:24:11.000000000 +0800
> @@ -1655,7 +1655,6 @@ EXPORT_SYMBOL(vmalloc_32_user);
>  static int aligned_vread(char *buf, char *addr, unsigned long count)
>  {
>  	struct page *p;
> -	int copied = 0;
>  
>  	while (count) {
>  		unsigned long offset, length;
> @@ -1685,16 +1684,14 @@ static int aligned_vread(char *buf, char
>  
>  		addr += length;
>  		buf += length;
> -		copied += length;
>  		count -= length;
>  	}
> -	return copied;
> +	return 0;
>  }
>  
>  static int aligned_vwrite(char *buf, char *addr, unsigned long count)
>  {
>  	struct page *p;
> -	int copied = 0;
>  
>  	while (count) {
>  		unsigned long offset, length;
> @@ -1722,10 +1719,9 @@ static int aligned_vwrite(char *buf, cha
>  		}
>  		addr += length;
>  		buf += length;
> -		copied += length;
>  		count -= length;
>  	}
> -	return copied;
> +	return 0;
>  }
>  
>  /**
> @@ -1734,9 +1730,7 @@ static int aligned_vwrite(char *buf, cha
>   *	@addr:		vm address.
>   *	@count:		number of bytes to be read.
>   *
> - *	Returns # of bytes which addr and buf should be increased.
> - *	(same number to @count). Returns 0 if [addr...addr+count) doesn't
> - *	includes any intersect with alive vmalloc area.
> + *	Returns 0 on success.

Seems to return 0 _always_. My version retunrs false if it's out-of-range.
plz return false if out-of-vmalloc-range.
Now, use vmalloc area is exported via /proc/vmallocinfo. Then, it's easy
to know "valid" vmalloc area. But users cannot know memory holes in
"valid" vmalloc area. So, vread() returns success even if it founds holes.

Another thought:
Should we purge vread/vwrite and just use copy_to/from_user ?
To do that, /proc/kcore should be revisited...
It has to copy pages one by one in aligned manner.

Anyway, it's merge window and not good season for this semantic changes of
a function.

>   *
>   *	This function checks that addr is a valid vmalloc'ed area, and
>   *	copy data from that area to a given buffer. If the given memory range
> @@ -1744,23 +1738,21 @@ static int aligned_vwrite(char *buf, cha
>   *	proper area of @buf. If there are memory holes, they'll be zero-filled.
>   *	IOREMAP area is treated as memory hole and no copy is done.
>   *
> - *	If [addr...addr+count) doesn't includes any intersects with alive
> - *	vm_struct area, returns 0.
>   *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
>   *	the caller should guarantee KM_USER0 is not used.
>   *
>   *	Note: In usual ops, vread() is never necessary because the caller
>   *	should know vmalloc() area is valid and can use memcpy().
>   *	This is for routines which have to access vmalloc area without
> - *	any informaion, as /dev/kmem.
> + *	any information, as /dev/kmem and /dev/kcore.
>   *
>   */
>  
> -long vread(char *buf, char *addr, unsigned long count)
> +int vread(char *buf, char *addr, unsigned long count)
>  {
>  	struct vm_struct *tmp;
> -	char *vaddr, *buf_start = buf;
> -	unsigned long buflen = count;
> +	char *vaddr;
> +	char *buf_end = buf + count;
>  	unsigned long n;
>  
>  	/* Don't allow overflow */
> @@ -1794,13 +1786,11 @@ long vread(char *buf, char *addr, unsign
>  finished:
>  	read_unlock(&vmlist_lock);
>  
> -	if (buf == buf_start)
> -		return 0;
>  	/* zero-fill memory holes */
> -	if (buf != buf_start + buflen)
> -		memset(buf, 0, buflen - (buf - buf_start));
> +	if (buf != buf_end)
> +		memset(buf, 0, buf_end - buf);
>  
> -	return buflen;
> +	return 0;
>  }
>  
>  /**
> @@ -1809,10 +1799,7 @@ finished:
>   *	@addr:		vm address.
>   *	@count:		number of bytes to be read.
>   *
> - *	Returns # of bytes which addr and buf should be incresed.
> - *	(same number to @count).
> - *	If [addr...addr+count) doesn't includes any intersect with valid
> - *	vmalloc area, returns 0.
> + *	Returns 0 on success.
>   *
>   *	This function checks that addr is a valid vmalloc'ed area, and
>   *	copy data from a buffer to the given addr. If specified range of
> @@ -1820,30 +1807,24 @@ finished:
>   *	proper area of @buf. If there are memory holes, no copy to hole.
>   *	IOREMAP area is treated as memory hole and no copy is done.
>   *
> - *	If [addr...addr+count) doesn't includes any intersects with alive
> - *	vm_struct area, returns 0.
>   *	@buf should be kernel's buffer. Because	this function uses KM_USER0,
>   *	the caller should guarantee KM_USER0 is not used.
>   *
>   *	Note: In usual ops, vwrite() is never necessary because the caller
>   *	should know vmalloc() area is valid and can use memcpy().
>   *	This is for routines which have to access vmalloc area without
> - *	any informaion, as /dev/kmem.
> - *
> - *	The caller should guarantee KM_USER1 is not used.
> + *	any information, as /dev/kmem.
>   */
>  
> -long vwrite(char *buf, char *addr, unsigned long count)
> +int vwrite(char *buf, char *addr, unsigned long count)
>  {
>  	struct vm_struct *tmp;
>  	char *vaddr;
> -	unsigned long n, buflen;
> -	int copied = 0;
> +	unsigned long n;
>  
>  	/* Don't allow overflow */
>  	if ((unsigned long) addr + count < count)
>  		count = -(unsigned long) addr;
> -	buflen = count;
>  
>  	read_lock(&vmlist_lock);
>  	for (tmp = vmlist; count && tmp; tmp = tmp->next) {
> @@ -1860,19 +1841,15 @@ long vwrite(char *buf, char *addr, unsig
>  		n = vaddr + tmp->size - PAGE_SIZE - addr;
>  		if (n > count)
>  			n = count;
> -		if (!(tmp->flags & VM_IOREMAP)) {
> +		if (!(tmp->flags & VM_IOREMAP))
>  			aligned_vwrite(buf, addr, n);
> -			copied++;
> -		}
>  		buf += n;
>  		addr += n;
>  		count -= n;
>  	}
>  finished:
>  	read_unlock(&vmlist_lock);
> -	if (!copied)
> -		return 0;
> -	return buflen;
> +	return 0;
>  }
>  
>  /**
> --- linux-mm.orig/include/linux/vmalloc.h	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/include/linux/vmalloc.h	2009-09-16 08:52:17.000000000 +0800
> @@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned 
>  extern struct vm_struct *alloc_vm_area(size_t size);
>  extern void free_vm_area(struct vm_struct *area);
>  
> -/* for /dev/kmem */
> -extern long vread(char *buf, char *addr, unsigned long count);
> -extern long vwrite(char *buf, char *addr, unsigned long count);
> +/* for /dev/kmem and /dev/kcore */
> +extern int vread(char *buf, char *addr, unsigned long count);
> +extern int vwrite(char *buf, char *addr, unsigned long count);
>  
>  /*
>   *	Internals.  Dont't use..
> --- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/drivers/char/mem.c	2009-09-16 09:23:00.000000000 +0800
> @@ -396,6 +396,7 @@ static ssize_t read_kmem(struct file *fi
>  	unsigned long p = *ppos;
>  	ssize_t low_count, read, sz;
>  	char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
> +	int err = 0;
>  
>  	read = 0;
>  	if (p < (unsigned long) high_memory) {
> @@ -442,12 +443,12 @@ static ssize_t read_kmem(struct file *fi
>  			return -ENOMEM;
>  		while (count > 0) {
>  			sz = size_inside_page(p, count);
> -			sz = vread(kbuf, (char *)p, sz);
> -			if (!sz)
> +			err = vread(kbuf, (char *)p, sz);
> +			if (err)
>  				break;
>  			if (copy_to_user(buf, kbuf, sz)) {
> -				free_page((unsigned long)kbuf);
> -				return -EFAULT;
> +				err = -EFAULT;
> +				break;
>  			}
>  			count -= sz;
>  			buf += sz;
> @@ -457,7 +458,7 @@ static ssize_t read_kmem(struct file *fi
>  		free_page((unsigned long)kbuf);
>  	}
>   	*ppos = p;
> - 	return read;
> +	return read ? read : err;
>  }
>  
>  
> @@ -521,6 +522,7 @@ static ssize_t write_kmem(struct file * 
>  	ssize_t wrote = 0;
>  	ssize_t virtr = 0;
>  	char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
> +	int err = 0;
>  
>  	if (p < (unsigned long) high_memory) {
>  		unsigned long to_write = min_t(unsigned long, count,
> @@ -543,12 +545,12 @@ static ssize_t write_kmem(struct file * 
>  
>  			n = copy_from_user(kbuf, buf, sz);
>  			if (n) {
> -				if (wrote + virtr)
> -					break;
> -				free_page((unsigned long)kbuf);
> -				return -EFAULT;
> +				err = -EFAULT;
> +				break;
>  			}
> -			sz = vwrite(kbuf, (char *)p, sz);
> +			err = vwrite(kbuf, (char *)p, sz);
> +			if (err)
> +				break;
>  			count -= sz;
>  			buf += sz;
>  			virtr += sz;
> @@ -558,7 +560,7 @@ static ssize_t write_kmem(struct file * 
>  	}
>  
>   	*ppos = p;
> - 	return virtr + wrote;
> +	return virtr + wrote ? : err;
>  }
>  #endif
>  
> --- linux-mm.orig/fs/proc/kcore.c	2009-09-16 08:52:12.000000000 +0800
> +++ linux-mm/fs/proc/kcore.c	2009-09-16 08:52:17.000000000 +0800
> @@ -492,11 +492,14 @@ read_kcore(struct file *file, char __use
>  				return -EFAULT;
>  		} else if (is_vmalloc_or_module_addr((void *)start)) {
>  			char * elf_buf;
> +			int err;
>  
>  			elf_buf = kzalloc(tsz, GFP_KERNEL);
>  			if (!elf_buf)
>  				return -ENOMEM;
> -			vread(elf_buf, (char *)start, tsz);
> +			err = vread(elf_buf, (char *)start, tsz);
> +			if (err)
> +				return err;

This is wrong.

/proc/kcore provides "coredump" image of kernel and it never faults.
Then, it doesn't check error code intentionally and just return
zero-filled buffer of reuqested length.

Thanks,
-Kame


>  			/* we have to zero-fill user buffer even if no read */
>  			if (copy_to_user(buffer, elf_buf, tsz)) {
>  				kfree(elf_buf);
> --- linux-mm.orig/mm/nommu.c	2009-09-16 09:01:36.000000000 +0800
> +++ linux-mm/mm/nommu.c	2009-09-16 09:02:02.000000000 +0800
> @@ -263,20 +263,20 @@ unsigned long vmalloc_to_pfn(const void 
>  }
>  EXPORT_SYMBOL(vmalloc_to_pfn);
>  
> -long vread(char *buf, char *addr, unsigned long count)
> +int vread(char *buf, char *addr, unsigned long count)
>  {
>  	memcpy(buf, addr, count);
> -	return count;
> +	return 0;
>  }
>  
> -long vwrite(char *buf, char *addr, unsigned long count)
> +int vwrite(char *buf, char *addr, unsigned long count)
>  {
>  	/* Don't allow overflow */
>  	if ((unsigned long) addr + count < count)
>  		count = -(unsigned long) addr;
>  
>  	memcpy(addr, buf, count);
> -	return(count);
> +	return 0;
>  }
>  
>  /*
> 
> -- 
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
  2009-09-16  2:14     ` KAMEZAWA Hiroyuki
@ 2009-09-16  2:39       ` Wu Fengguang
  -1 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  2:39 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Andrew Morton, Benjamin Herrenschmidt, Andi Kleen,
	Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML,
	linux-mm

On Wed, Sep 16, 2009 at 10:14:25AM +0800, KAMEZAWA Hiroyuki wrote:
> On Wed, 16 Sep 2009 09:39:40 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > Silently ignore all vmalloc area holes in vread()/vwrite(),
> > and report success (or error code in future) to the caller.
> >
> > The original intention is to fix a vwrite() related bug, where
> > it could return 0 which cannot be handled correctly by its caller
> > write_kmem(). Then KAMEZAWA recommends to change the prototype
> > to make the semantics clear.
> >
> > CC: Andi Kleen <andi@firstfloor.org>
> > CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > CC: Christoph Lameter <cl@linux-foundation.org>
> > CC: Ingo Molnar <mingo@elte.hu>
> > CC: Tejun Heo <tj@kernel.org>
> > CC: Nick Piggin <npiggin@suse.de>
> > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> >  drivers/char/mem.c      |   24 ++++++++--------
> >  fs/proc/kcore.c         |    5 ++-
> >  include/linux/vmalloc.h |    6 ++--
> >  mm/nommu.c              |    8 ++---
> >  mm/vmalloc.c            |   55 +++++++++++---------------------------
> >  5 files changed, 40 insertions(+), 58 deletions(-)
> >
> > --- linux-mm.orig/mm/vmalloc.c        2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/mm/vmalloc.c     2009-09-16 09:24:11.000000000 +0800
> > @@ -1655,7 +1655,6 @@ EXPORT_SYMBOL(vmalloc_32_user);
> >  static int aligned_vread(char *buf, char *addr, unsigned long count)
> >  {
> >       struct page *p;
> > -     int copied = 0;
> >
> >       while (count) {
> >               unsigned long offset, length;
> > @@ -1685,16 +1684,14 @@ static int aligned_vread(char *buf, char
> >
> >               addr += length;
> >               buf += length;
> > -             copied += length;
> >               count -= length;
> >       }
> > -     return copied;
> > +     return 0;
> >  }
> >
> >  static int aligned_vwrite(char *buf, char *addr, unsigned long count)
> >  {
> >       struct page *p;
> > -     int copied = 0;
> >
> >       while (count) {
> >               unsigned long offset, length;
> > @@ -1722,10 +1719,9 @@ static int aligned_vwrite(char *buf, cha
> >               }
> >               addr += length;
> >               buf += length;
> > -             copied += length;
> >               count -= length;
> >       }
> > -     return copied;
> > +     return 0;
> >  }
> >
> >  /**
> > @@ -1734,9 +1730,7 @@ static int aligned_vwrite(char *buf, cha
> >   *   @addr:          vm address.
> >   *   @count:         number of bytes to be read.
> >   *
> > - *   Returns # of bytes which addr and buf should be increased.
> > - *   (same number to @count). Returns 0 if [addr...addr+count) doesn't
> > - *   includes any intersect with alive vmalloc area.
> > + *   Returns 0 on success.
> 
> Seems to return 0 _always_. My version retunrs false if it's out-of-range.
> plz return false if out-of-vmalloc-range.
> Now, use vmalloc area is exported via /proc/vmallocinfo. Then, it's easy
> to know "valid" vmalloc area. But users cannot know memory holes in
> "valid" vmalloc area. So, vread() returns success even if it founds holes.

OK, that's fair enough.

> Another thought:
> Should we purge vread/vwrite and just use copy_to/from_user ?

I like its simplicity. Though I'm afraid I don't know enough to make useful comments.

> To do that, /proc/kcore should be revisited...
> It has to copy pages one by one in aligned manner.

You know it used to not rely on vread() ;)

> Anyway, it's merge window and not good season for this semantic changes of
> a function.

Agreed. I'll do a bug fix only patch, and leave other decisions to you.

> >   *
> >   *   This function checks that addr is a valid vmalloc'ed area, and
> >   *   copy data from that area to a given buffer. If the given memory range
> > @@ -1744,23 +1738,21 @@ static int aligned_vwrite(char *buf, cha
> >   *   proper area of @buf. If there are memory holes, they'll be zero-filled.
> >   *   IOREMAP area is treated as memory hole and no copy is done.
> >   *
> > - *   If [addr...addr+count) doesn't includes any intersects with alive
> > - *   vm_struct area, returns 0.
> >   *   @buf should be kernel's buffer. Because this function uses KM_USER0,
> >   *   the caller should guarantee KM_USER0 is not used.
> >   *
> >   *   Note: In usual ops, vread() is never necessary because the caller
> >   *   should know vmalloc() area is valid and can use memcpy().
> >   *   This is for routines which have to access vmalloc area without
> > - *   any informaion, as /dev/kmem.
> > + *   any information, as /dev/kmem and /dev/kcore.
> >   *
> >   */
> >
> > -long vread(char *buf, char *addr, unsigned long count)
> > +int vread(char *buf, char *addr, unsigned long count)
> >  {
> >       struct vm_struct *tmp;
> > -     char *vaddr, *buf_start = buf;
> > -     unsigned long buflen = count;
> > +     char *vaddr;
> > +     char *buf_end = buf + count;
> >       unsigned long n;
> >
> >       /* Don't allow overflow */
> > @@ -1794,13 +1786,11 @@ long vread(char *buf, char *addr, unsign
> >  finished:
> >       read_unlock(&vmlist_lock);
> >
> > -     if (buf == buf_start)
> > -             return 0;
> >       /* zero-fill memory holes */
> > -     if (buf != buf_start + buflen)
> > -             memset(buf, 0, buflen - (buf - buf_start));
> > +     if (buf != buf_end)
> > +             memset(buf, 0, buf_end - buf);
> >
> > -     return buflen;
> > +     return 0;
> >  }
> >
> >  /**
> > @@ -1809,10 +1799,7 @@ finished:
> >   *   @addr:          vm address.
> >   *   @count:         number of bytes to be read.
> >   *
> > - *   Returns # of bytes which addr and buf should be incresed.
> > - *   (same number to @count).
> > - *   If [addr...addr+count) doesn't includes any intersect with valid
> > - *   vmalloc area, returns 0.
> > + *   Returns 0 on success.
> >   *
> >   *   This function checks that addr is a valid vmalloc'ed area, and
> >   *   copy data from a buffer to the given addr. If specified range of
> > @@ -1820,30 +1807,24 @@ finished:
> >   *   proper area of @buf. If there are memory holes, no copy to hole.
> >   *   IOREMAP area is treated as memory hole and no copy is done.
> >   *
> > - *   If [addr...addr+count) doesn't includes any intersects with alive
> > - *   vm_struct area, returns 0.
> >   *   @buf should be kernel's buffer. Because this function uses KM_USER0,
> >   *   the caller should guarantee KM_USER0 is not used.
> >   *
> >   *   Note: In usual ops, vwrite() is never necessary because the caller
> >   *   should know vmalloc() area is valid and can use memcpy().
> >   *   This is for routines which have to access vmalloc area without
> > - *   any informaion, as /dev/kmem.
> > - *
> > - *   The caller should guarantee KM_USER1 is not used.
> > + *   any information, as /dev/kmem.
> >   */
> >
> > -long vwrite(char *buf, char *addr, unsigned long count)
> > +int vwrite(char *buf, char *addr, unsigned long count)
> >  {
> >       struct vm_struct *tmp;
> >       char *vaddr;
> > -     unsigned long n, buflen;
> > -     int copied = 0;
> > +     unsigned long n;
> >
> >       /* Don't allow overflow */
> >       if ((unsigned long) addr + count < count)
> >               count = -(unsigned long) addr;
> > -     buflen = count;
> >
> >       read_lock(&vmlist_lock);
> >       for (tmp = vmlist; count && tmp; tmp = tmp->next) {
> > @@ -1860,19 +1841,15 @@ long vwrite(char *buf, char *addr, unsig
> >               n = vaddr + tmp->size - PAGE_SIZE - addr;
> >               if (n > count)
> >                       n = count;
> > -             if (!(tmp->flags & VM_IOREMAP)) {
> > +             if (!(tmp->flags & VM_IOREMAP))
> >                       aligned_vwrite(buf, addr, n);
> > -                     copied++;
> > -             }
> >               buf += n;
> >               addr += n;
> >               count -= n;
> >       }
> >  finished:
> >       read_unlock(&vmlist_lock);
> > -     if (!copied)
> > -             return 0;
> > -     return buflen;
> > +     return 0;
> >  }
> >
> >  /**
> > --- linux-mm.orig/include/linux/vmalloc.h     2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/include/linux/vmalloc.h  2009-09-16 08:52:17.000000000 +0800
> > @@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned
> >  extern struct vm_struct *alloc_vm_area(size_t size);
> >  extern void free_vm_area(struct vm_struct *area);
> >
> > -/* for /dev/kmem */
> > -extern long vread(char *buf, char *addr, unsigned long count);
> > -extern long vwrite(char *buf, char *addr, unsigned long count);
> > +/* for /dev/kmem and /dev/kcore */
> > +extern int vread(char *buf, char *addr, unsigned long count);
> > +extern int vwrite(char *buf, char *addr, unsigned long count);
> >
> >  /*
> >   *   Internals.  Dont't use..
> > --- linux-mm.orig/drivers/char/mem.c  2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/drivers/char/mem.c       2009-09-16 09:23:00.000000000 +0800
> > @@ -396,6 +396,7 @@ static ssize_t read_kmem(struct file *fi
> >       unsigned long p = *ppos;
> >       ssize_t low_count, read, sz;
> >       char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
> > +     int err = 0;
> >
> >       read = 0;
> >       if (p < (unsigned long) high_memory) {
> > @@ -442,12 +443,12 @@ static ssize_t read_kmem(struct file *fi
> >                       return -ENOMEM;
> >               while (count > 0) {
> >                       sz = size_inside_page(p, count);
> > -                     sz = vread(kbuf, (char *)p, sz);
> > -                     if (!sz)
> > +                     err = vread(kbuf, (char *)p, sz);
> > +                     if (err)
> >                               break;
> >                       if (copy_to_user(buf, kbuf, sz)) {
> > -                             free_page((unsigned long)kbuf);
> > -                             return -EFAULT;
> > +                             err = -EFAULT;
> > +                             break;
> >                       }
> >                       count -= sz;
> >                       buf += sz;
> > @@ -457,7 +458,7 @@ static ssize_t read_kmem(struct file *fi
> >               free_page((unsigned long)kbuf);
> >       }
> >       *ppos = p;
> > -     return read;
> > +     return read ? read : err;
> >  }
> >
> >
> > @@ -521,6 +522,7 @@ static ssize_t write_kmem(struct file *
> >       ssize_t wrote = 0;
> >       ssize_t virtr = 0;
> >       char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
> > +     int err = 0;
> >
> >       if (p < (unsigned long) high_memory) {
> >               unsigned long to_write = min_t(unsigned long, count,
> > @@ -543,12 +545,12 @@ static ssize_t write_kmem(struct file *
> >
> >                       n = copy_from_user(kbuf, buf, sz);
> >                       if (n) {
> > -                             if (wrote + virtr)
> > -                                     break;
> > -                             free_page((unsigned long)kbuf);
> > -                             return -EFAULT;
> > +                             err = -EFAULT;
> > +                             break;
> >                       }
> > -                     sz = vwrite(kbuf, (char *)p, sz);
> > +                     err = vwrite(kbuf, (char *)p, sz);
> > +                     if (err)
> > +                             break;
> >                       count -= sz;
> >                       buf += sz;
> >                       virtr += sz;
> > @@ -558,7 +560,7 @@ static ssize_t write_kmem(struct file *
> >       }
> >
> >       *ppos = p;
> > -     return virtr + wrote;
> > +     return virtr + wrote ? : err;
> >  }
> >  #endif
> >
> > --- linux-mm.orig/fs/proc/kcore.c     2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/fs/proc/kcore.c  2009-09-16 08:52:17.000000000 +0800
> > @@ -492,11 +492,14 @@ read_kcore(struct file *file, char __use
> >                               return -EFAULT;
> >               } else if (is_vmalloc_or_module_addr((void *)start)) {
> >                       char * elf_buf;
> > +                     int err;
> >
> >                       elf_buf = kzalloc(tsz, GFP_KERNEL);
> >                       if (!elf_buf)
> >                               return -ENOMEM;
> > -                     vread(elf_buf, (char *)start, tsz);
> > +                     err = vread(elf_buf, (char *)start, tsz);
> > +                     if (err)
> > +                             return err;
> 
> This is wrong.
> 
> /proc/kcore provides "coredump" image of kernel and it never faults.
> Then, it doesn't check error code intentionally and just return
> zero-filled buffer of reuqested length.

Ah OK. Will revert this change.

Thanks,
Fengguang

> >                       /* we have to zero-fill user buffer even if no read */
> >                       if (copy_to_user(buffer, elf_buf, tsz)) {
> >                               kfree(elf_buf);
> > --- linux-mm.orig/mm/nommu.c  2009-09-16 09:01:36.000000000 +0800
> > +++ linux-mm/mm/nommu.c       2009-09-16 09:02:02.000000000 +0800
> > @@ -263,20 +263,20 @@ unsigned long vmalloc_to_pfn(const void
> >  }
> >  EXPORT_SYMBOL(vmalloc_to_pfn);
> >
> > -long vread(char *buf, char *addr, unsigned long count)
> > +int vread(char *buf, char *addr, unsigned long count)
> >  {
> >       memcpy(buf, addr, count);
> > -     return count;
> > +     return 0;
> >  }
> >
> > -long vwrite(char *buf, char *addr, unsigned long count)
> > +int vwrite(char *buf, char *addr, unsigned long count)
> >  {
> >       /* Don't allow overflow */
> >       if ((unsigned long) addr + count < count)
> >               count = -(unsigned long) addr;
> >
> >       memcpy(addr, buf, count);
> > -     return(count);
> > +     return 0;
> >  }
> >
> >  /*
> >
> > --
> >
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
@ 2009-09-16  2:39       ` Wu Fengguang
  0 siblings, 0 replies; 14+ messages in thread
From: Wu Fengguang @ 2009-09-16  2:39 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Andrew Morton, Benjamin Herrenschmidt, Andi Kleen,
	Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML,
	linux-mm

On Wed, Sep 16, 2009 at 10:14:25AM +0800, KAMEZAWA Hiroyuki wrote:
> On Wed, 16 Sep 2009 09:39:40 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
> 
> > Silently ignore all vmalloc area holes in vread()/vwrite(),
> > and report success (or error code in future) to the caller.
> >
> > The original intention is to fix a vwrite() related bug, where
> > it could return 0 which cannot be handled correctly by its caller
> > write_kmem(). Then KAMEZAWA recommends to change the prototype
> > to make the semantics clear.
> >
> > CC: Andi Kleen <andi@firstfloor.org>
> > CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > CC: Christoph Lameter <cl@linux-foundation.org>
> > CC: Ingo Molnar <mingo@elte.hu>
> > CC: Tejun Heo <tj@kernel.org>
> > CC: Nick Piggin <npiggin@suse.de>
> > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> >  drivers/char/mem.c      |   24 ++++++++--------
> >  fs/proc/kcore.c         |    5 ++-
> >  include/linux/vmalloc.h |    6 ++--
> >  mm/nommu.c              |    8 ++---
> >  mm/vmalloc.c            |   55 +++++++++++---------------------------
> >  5 files changed, 40 insertions(+), 58 deletions(-)
> >
> > --- linux-mm.orig/mm/vmalloc.c        2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/mm/vmalloc.c     2009-09-16 09:24:11.000000000 +0800
> > @@ -1655,7 +1655,6 @@ EXPORT_SYMBOL(vmalloc_32_user);
> >  static int aligned_vread(char *buf, char *addr, unsigned long count)
> >  {
> >       struct page *p;
> > -     int copied = 0;
> >
> >       while (count) {
> >               unsigned long offset, length;
> > @@ -1685,16 +1684,14 @@ static int aligned_vread(char *buf, char
> >
> >               addr += length;
> >               buf += length;
> > -             copied += length;
> >               count -= length;
> >       }
> > -     return copied;
> > +     return 0;
> >  }
> >
> >  static int aligned_vwrite(char *buf, char *addr, unsigned long count)
> >  {
> >       struct page *p;
> > -     int copied = 0;
> >
> >       while (count) {
> >               unsigned long offset, length;
> > @@ -1722,10 +1719,9 @@ static int aligned_vwrite(char *buf, cha
> >               }
> >               addr += length;
> >               buf += length;
> > -             copied += length;
> >               count -= length;
> >       }
> > -     return copied;
> > +     return 0;
> >  }
> >
> >  /**
> > @@ -1734,9 +1730,7 @@ static int aligned_vwrite(char *buf, cha
> >   *   @addr:          vm address.
> >   *   @count:         number of bytes to be read.
> >   *
> > - *   Returns # of bytes which addr and buf should be increased.
> > - *   (same number to @count). Returns 0 if [addr...addr+count) doesn't
> > - *   includes any intersect with alive vmalloc area.
> > + *   Returns 0 on success.
> 
> Seems to return 0 _always_. My version retunrs false if it's out-of-range.
> plz return false if out-of-vmalloc-range.
> Now, use vmalloc area is exported via /proc/vmallocinfo. Then, it's easy
> to know "valid" vmalloc area. But users cannot know memory holes in
> "valid" vmalloc area. So, vread() returns success even if it founds holes.

OK, that's fair enough.

> Another thought:
> Should we purge vread/vwrite and just use copy_to/from_user ?

I like its simplicity. Though I'm afraid I don't know enough to make useful comments.

> To do that, /proc/kcore should be revisited...
> It has to copy pages one by one in aligned manner.

You know it used to not rely on vread() ;)

> Anyway, it's merge window and not good season for this semantic changes of
> a function.

Agreed. I'll do a bug fix only patch, and leave other decisions to you.

> >   *
> >   *   This function checks that addr is a valid vmalloc'ed area, and
> >   *   copy data from that area to a given buffer. If the given memory range
> > @@ -1744,23 +1738,21 @@ static int aligned_vwrite(char *buf, cha
> >   *   proper area of @buf. If there are memory holes, they'll be zero-filled.
> >   *   IOREMAP area is treated as memory hole and no copy is done.
> >   *
> > - *   If [addr...addr+count) doesn't includes any intersects with alive
> > - *   vm_struct area, returns 0.
> >   *   @buf should be kernel's buffer. Because this function uses KM_USER0,
> >   *   the caller should guarantee KM_USER0 is not used.
> >   *
> >   *   Note: In usual ops, vread() is never necessary because the caller
> >   *   should know vmalloc() area is valid and can use memcpy().
> >   *   This is for routines which have to access vmalloc area without
> > - *   any informaion, as /dev/kmem.
> > + *   any information, as /dev/kmem and /dev/kcore.
> >   *
> >   */
> >
> > -long vread(char *buf, char *addr, unsigned long count)
> > +int vread(char *buf, char *addr, unsigned long count)
> >  {
> >       struct vm_struct *tmp;
> > -     char *vaddr, *buf_start = buf;
> > -     unsigned long buflen = count;
> > +     char *vaddr;
> > +     char *buf_end = buf + count;
> >       unsigned long n;
> >
> >       /* Don't allow overflow */
> > @@ -1794,13 +1786,11 @@ long vread(char *buf, char *addr, unsign
> >  finished:
> >       read_unlock(&vmlist_lock);
> >
> > -     if (buf == buf_start)
> > -             return 0;
> >       /* zero-fill memory holes */
> > -     if (buf != buf_start + buflen)
> > -             memset(buf, 0, buflen - (buf - buf_start));
> > +     if (buf != buf_end)
> > +             memset(buf, 0, buf_end - buf);
> >
> > -     return buflen;
> > +     return 0;
> >  }
> >
> >  /**
> > @@ -1809,10 +1799,7 @@ finished:
> >   *   @addr:          vm address.
> >   *   @count:         number of bytes to be read.
> >   *
> > - *   Returns # of bytes which addr and buf should be incresed.
> > - *   (same number to @count).
> > - *   If [addr...addr+count) doesn't includes any intersect with valid
> > - *   vmalloc area, returns 0.
> > + *   Returns 0 on success.
> >   *
> >   *   This function checks that addr is a valid vmalloc'ed area, and
> >   *   copy data from a buffer to the given addr. If specified range of
> > @@ -1820,30 +1807,24 @@ finished:
> >   *   proper area of @buf. If there are memory holes, no copy to hole.
> >   *   IOREMAP area is treated as memory hole and no copy is done.
> >   *
> > - *   If [addr...addr+count) doesn't includes any intersects with alive
> > - *   vm_struct area, returns 0.
> >   *   @buf should be kernel's buffer. Because this function uses KM_USER0,
> >   *   the caller should guarantee KM_USER0 is not used.
> >   *
> >   *   Note: In usual ops, vwrite() is never necessary because the caller
> >   *   should know vmalloc() area is valid and can use memcpy().
> >   *   This is for routines which have to access vmalloc area without
> > - *   any informaion, as /dev/kmem.
> > - *
> > - *   The caller should guarantee KM_USER1 is not used.
> > + *   any information, as /dev/kmem.
> >   */
> >
> > -long vwrite(char *buf, char *addr, unsigned long count)
> > +int vwrite(char *buf, char *addr, unsigned long count)
> >  {
> >       struct vm_struct *tmp;
> >       char *vaddr;
> > -     unsigned long n, buflen;
> > -     int copied = 0;
> > +     unsigned long n;
> >
> >       /* Don't allow overflow */
> >       if ((unsigned long) addr + count < count)
> >               count = -(unsigned long) addr;
> > -     buflen = count;
> >
> >       read_lock(&vmlist_lock);
> >       for (tmp = vmlist; count && tmp; tmp = tmp->next) {
> > @@ -1860,19 +1841,15 @@ long vwrite(char *buf, char *addr, unsig
> >               n = vaddr + tmp->size - PAGE_SIZE - addr;
> >               if (n > count)
> >                       n = count;
> > -             if (!(tmp->flags & VM_IOREMAP)) {
> > +             if (!(tmp->flags & VM_IOREMAP))
> >                       aligned_vwrite(buf, addr, n);
> > -                     copied++;
> > -             }
> >               buf += n;
> >               addr += n;
> >               count -= n;
> >       }
> >  finished:
> >       read_unlock(&vmlist_lock);
> > -     if (!copied)
> > -             return 0;
> > -     return buflen;
> > +     return 0;
> >  }
> >
> >  /**
> > --- linux-mm.orig/include/linux/vmalloc.h     2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/include/linux/vmalloc.h  2009-09-16 08:52:17.000000000 +0800
> > @@ -104,9 +104,9 @@ extern void unmap_kernel_range(unsigned
> >  extern struct vm_struct *alloc_vm_area(size_t size);
> >  extern void free_vm_area(struct vm_struct *area);
> >
> > -/* for /dev/kmem */
> > -extern long vread(char *buf, char *addr, unsigned long count);
> > -extern long vwrite(char *buf, char *addr, unsigned long count);
> > +/* for /dev/kmem and /dev/kcore */
> > +extern int vread(char *buf, char *addr, unsigned long count);
> > +extern int vwrite(char *buf, char *addr, unsigned long count);
> >
> >  /*
> >   *   Internals.  Dont't use..
> > --- linux-mm.orig/drivers/char/mem.c  2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/drivers/char/mem.c       2009-09-16 09:23:00.000000000 +0800
> > @@ -396,6 +396,7 @@ static ssize_t read_kmem(struct file *fi
> >       unsigned long p = *ppos;
> >       ssize_t low_count, read, sz;
> >       char * kbuf; /* k-addr because vread() takes vmlist_lock rwlock */
> > +     int err = 0;
> >
> >       read = 0;
> >       if (p < (unsigned long) high_memory) {
> > @@ -442,12 +443,12 @@ static ssize_t read_kmem(struct file *fi
> >                       return -ENOMEM;
> >               while (count > 0) {
> >                       sz = size_inside_page(p, count);
> > -                     sz = vread(kbuf, (char *)p, sz);
> > -                     if (!sz)
> > +                     err = vread(kbuf, (char *)p, sz);
> > +                     if (err)
> >                               break;
> >                       if (copy_to_user(buf, kbuf, sz)) {
> > -                             free_page((unsigned long)kbuf);
> > -                             return -EFAULT;
> > +                             err = -EFAULT;
> > +                             break;
> >                       }
> >                       count -= sz;
> >                       buf += sz;
> > @@ -457,7 +458,7 @@ static ssize_t read_kmem(struct file *fi
> >               free_page((unsigned long)kbuf);
> >       }
> >       *ppos = p;
> > -     return read;
> > +     return read ? read : err;
> >  }
> >
> >
> > @@ -521,6 +522,7 @@ static ssize_t write_kmem(struct file *
> >       ssize_t wrote = 0;
> >       ssize_t virtr = 0;
> >       char * kbuf; /* k-addr because vwrite() takes vmlist_lock rwlock */
> > +     int err = 0;
> >
> >       if (p < (unsigned long) high_memory) {
> >               unsigned long to_write = min_t(unsigned long, count,
> > @@ -543,12 +545,12 @@ static ssize_t write_kmem(struct file *
> >
> >                       n = copy_from_user(kbuf, buf, sz);
> >                       if (n) {
> > -                             if (wrote + virtr)
> > -                                     break;
> > -                             free_page((unsigned long)kbuf);
> > -                             return -EFAULT;
> > +                             err = -EFAULT;
> > +                             break;
> >                       }
> > -                     sz = vwrite(kbuf, (char *)p, sz);
> > +                     err = vwrite(kbuf, (char *)p, sz);
> > +                     if (err)
> > +                             break;
> >                       count -= sz;
> >                       buf += sz;
> >                       virtr += sz;
> > @@ -558,7 +560,7 @@ static ssize_t write_kmem(struct file *
> >       }
> >
> >       *ppos = p;
> > -     return virtr + wrote;
> > +     return virtr + wrote ? : err;
> >  }
> >  #endif
> >
> > --- linux-mm.orig/fs/proc/kcore.c     2009-09-16 08:52:12.000000000 +0800
> > +++ linux-mm/fs/proc/kcore.c  2009-09-16 08:52:17.000000000 +0800
> > @@ -492,11 +492,14 @@ read_kcore(struct file *file, char __use
> >                               return -EFAULT;
> >               } else if (is_vmalloc_or_module_addr((void *)start)) {
> >                       char * elf_buf;
> > +                     int err;
> >
> >                       elf_buf = kzalloc(tsz, GFP_KERNEL);
> >                       if (!elf_buf)
> >                               return -ENOMEM;
> > -                     vread(elf_buf, (char *)start, tsz);
> > +                     err = vread(elf_buf, (char *)start, tsz);
> > +                     if (err)
> > +                             return err;
> 
> This is wrong.
> 
> /proc/kcore provides "coredump" image of kernel and it never faults.
> Then, it doesn't check error code intentionally and just return
> zero-filled buffer of reuqested length.

Ah OK. Will revert this change.

Thanks,
Fengguang

> >                       /* we have to zero-fill user buffer even if no read */
> >                       if (copy_to_user(buffer, elf_buf, tsz)) {
> >                               kfree(elf_buf);
> > --- linux-mm.orig/mm/nommu.c  2009-09-16 09:01:36.000000000 +0800
> > +++ linux-mm/mm/nommu.c       2009-09-16 09:02:02.000000000 +0800
> > @@ -263,20 +263,20 @@ unsigned long vmalloc_to_pfn(const void
> >  }
> >  EXPORT_SYMBOL(vmalloc_to_pfn);
> >
> > -long vread(char *buf, char *addr, unsigned long count)
> > +int vread(char *buf, char *addr, unsigned long count)
> >  {
> >       memcpy(buf, addr, count);
> > -     return count;
> > +     return 0;
> >  }
> >
> > -long vwrite(char *buf, char *addr, unsigned long count)
> > +int vwrite(char *buf, char *addr, unsigned long count)
> >  {
> >       /* Don't allow overflow */
> >       if ((unsigned long) addr + count < count)
> >               count = -(unsigned long) addr;
> >
> >       memcpy(addr, buf, count);
> > -     return(count);
> > +     return 0;
> >  }
> >
> >  /*
> >
> > --
> >
> >

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] devmem: check vmalloc address on kmem read/write
  2009-09-16  1:39 ` [PATCH 2/3] devmem: check vmalloc address on kmem read/write Wu Fengguang
@ 2009-09-16  2:50     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-09-16  2:50 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Benjamin Herrenschmidt, Greg Kroah-Hartman,
	Hugh Dickins, Christoph Lameter, Ingo Molnar, Tejun Heo,
	Nick Piggin, LKML, linux-mm

On Wed, 16 Sep 2009 09:39:41 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Otherwise vmalloc_to_page() will BUG().
> 
> This also makes the kmem read/write implementation aligned with mem(4):
> "References to nonexistent locations cause errors to be returned." Here
> we return -ENXIO (inspired by Hugh) if no bytes have been transfered
> to/from user space, otherwise return partial read/write results.
> 
seems reasonable.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

> CC: Greg Kroah-Hartman <gregkh@suse.de>
> CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  drivers/char/mem.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> --- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:17.000000000 +0800
> +++ linux-mm/drivers/char/mem.c	2009-09-16 09:15:03.000000000 +0800
> @@ -443,6 +443,10 @@ static ssize_t read_kmem(struct file *fi
>  			return -ENOMEM;
>  		while (count > 0) {
>  			sz = size_inside_page(p, count);
> +			if (!is_vmalloc_or_module_addr((void *)p)) {
> +				err = -ENXIO;
> +				break;
> +			}
>  			err = vread(kbuf, (char *)p, sz);
>  			if (err)
>  				break;
> @@ -543,6 +547,10 @@ static ssize_t write_kmem(struct file * 
>  			unsigned long sz = size_inside_page(p, count);
>  			unsigned long n;
>  
> +			if (!is_vmalloc_or_module_addr((void *)p)) {
> +				err = -ENXIO;
> +				break;
> +			}
>  			n = copy_from_user(kbuf, buf, sz);
>  			if (n) {
>  				err = -EFAULT;
> 
> -- 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] devmem: check vmalloc address on kmem read/write
@ 2009-09-16  2:50     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-09-16  2:50 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Benjamin Herrenschmidt, Greg Kroah-Hartman,
	Hugh Dickins, Christoph Lameter, Ingo Molnar, Tejun Heo,
	Nick Piggin, LKML, linux-mm

On Wed, 16 Sep 2009 09:39:41 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Otherwise vmalloc_to_page() will BUG().
> 
> This also makes the kmem read/write implementation aligned with mem(4):
> "References to nonexistent locations cause errors to be returned." Here
> we return -ENXIO (inspired by Hugh) if no bytes have been transfered
> to/from user space, otherwise return partial read/write results.
> 
seems reasonable.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

> CC: Greg Kroah-Hartman <gregkh@suse.de>
> CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  drivers/char/mem.c |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> --- linux-mm.orig/drivers/char/mem.c	2009-09-16 08:52:17.000000000 +0800
> +++ linux-mm/drivers/char/mem.c	2009-09-16 09:15:03.000000000 +0800
> @@ -443,6 +443,10 @@ static ssize_t read_kmem(struct file *fi
>  			return -ENOMEM;
>  		while (count > 0) {
>  			sz = size_inside_page(p, count);
> +			if (!is_vmalloc_or_module_addr((void *)p)) {
> +				err = -ENXIO;
> +				break;
> +			}
>  			err = vread(kbuf, (char *)p, sz);
>  			if (err)
>  				break;
> @@ -543,6 +547,10 @@ static ssize_t write_kmem(struct file * 
>  			unsigned long sz = size_inside_page(p, count);
>  			unsigned long n;
>  
> +			if (!is_vmalloc_or_module_addr((void *)p)) {
> +				err = -ENXIO;
> +				break;
> +			}
>  			n = copy_from_user(kbuf, buf, sz);
>  			if (n) {
>  				err = -EFAULT;
> 
> -- 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
  2009-09-16  1:39   ` Wu Fengguang
@ 2010-09-14  4:42     ` KAMEZAWA Hiroyuki
  -1 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-14  4:42 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Benjamin Herrenschmidt, Andi Kleen,
	Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML,
	linux-mm

On Wed, 16 Sep 2009 09:39:40 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> Silently ignore all vmalloc area holes in vread()/vwrite(),
> and report success (or error code in future) to the caller.
> 
> The original intention is to fix a vwrite() related bug, where
> it could return 0 which cannot be handled correctly by its caller
> write_kmem(). Then KAMEZAWA recommends to change the prototype
> to make the semantics clear.
> 
> CC: Andi Kleen <andi@firstfloor.org>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Christoph Lameter <cl@linux-foundation.org>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Tejun Heo <tj@kernel.org>
> CC: Nick Piggin <npiggin@suse.de>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>

Okay, maybe it's good to change for now.

I sometimes wonder we should use copy_to/from_user to access vmalloc area...
but we have IO_REMAP. So, maybe vread/vwrite is required one.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code
@ 2010-09-14  4:42     ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2010-09-14  4:42 UTC (permalink / raw)
  To: Wu Fengguang
  Cc: Andrew Morton, Benjamin Herrenschmidt, Andi Kleen,
	Christoph Lameter, Ingo Molnar, Tejun Heo, Nick Piggin, LKML,
	linux-mm

On Wed, 16 Sep 2009 09:39:40 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:

> Silently ignore all vmalloc area holes in vread()/vwrite(),
> and report success (or error code in future) to the caller.
> 
> The original intention is to fix a vwrite() related bug, where
> it could return 0 which cannot be handled correctly by its caller
> write_kmem(). Then KAMEZAWA recommends to change the prototype
> to make the semantics clear.
> 
> CC: Andi Kleen <andi@firstfloor.org>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Christoph Lameter <cl@linux-foundation.org>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Tejun Heo <tj@kernel.org>
> CC: Nick Piggin <npiggin@suse.de>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>

Okay, maybe it's good to change for now.

I sometimes wonder we should use copy_to/from_user to access vmalloc area...
but we have IO_REMAP. So, maybe vread/vwrite is required one.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Thanks,
-Kame

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-09-14  4:47 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-16  1:39 [PATCH 0/3] /proc/kmem fixes and hwpoison bits v2 Wu Fengguang
2009-09-16  1:39 ` [PATCH 1/3] devmem: change vread()/vwrite() prototype to return success or error code Wu Fengguang
2009-09-16  1:39   ` Wu Fengguang
2009-09-16  2:14   ` KAMEZAWA Hiroyuki
2009-09-16  2:14     ` KAMEZAWA Hiroyuki
2009-09-16  2:39     ` Wu Fengguang
2009-09-16  2:39       ` Wu Fengguang
2010-09-14  4:42   ` KAMEZAWA Hiroyuki
2010-09-14  4:42     ` KAMEZAWA Hiroyuki
2009-09-16  1:39 ` [PATCH 2/3] devmem: check vmalloc address on kmem read/write Wu Fengguang
2009-09-16  2:50   ` KAMEZAWA Hiroyuki
2009-09-16  2:50     ` KAMEZAWA Hiroyuki
2009-09-16  1:39 ` [PATCH 3/3] HWPOISON: prevent /dev/kmem users from accessing hwpoison pages Wu Fengguang
2009-09-16  1:39   ` Wu Fengguang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.