linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
@ 2003-08-10 10:33 Manfred Spraul
  2003-08-11  8:56 ` Ingo Molnar
  2003-08-11  9:00 ` Ingo Molnar
  0 siblings, 2 replies; 8+ messages in thread
From: Manfred Spraul @ 2003-08-10 10:33 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andrew Morton

Hi Ingo,

I'm running into crashes in copy_mount_options with 
CONFIG_DEBUG_PAGEALLOC and 4GB in 2.6.0-test3-mm1:

The functions in mm/usercopy assume that no exception handling is 
required if fs is KERNEL_DS. This is not true: at least the mount 
options copy and the i386 traps handler assume exception handling with 
fs==KERNEL_DS.

How should this be fixed? I don't see a simple, portable way to 
implement exception handling for the kernel address space.

--
    Manfred


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-10 10:33 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1 Manfred Spraul
@ 2003-08-11  8:56 ` Ingo Molnar
  2003-08-11 15:22   ` Manfred Spraul
  2003-08-11  9:00 ` Ingo Molnar
  1 sibling, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2003-08-11  8:56 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, Andrew Morton, Alexander Viro


On Sun, 10 Aug 2003, Manfred Spraul wrote:

> The functions in mm/usercopy assume that no exception handling is
> required if fs is KERNEL_DS. This is not true: at least the mount
> options copy and the i386 traps handler assume exception handling with
> fs==KERNEL_DS.

precisely how does copy_mount_options() break under 4G/4G?

one thing that is known is that copy_mount_options() handles a strange mix
of string-type and data-type pointers. Right now the tools rely on the
string copy getting a fault and aborting the copy early.  The
copy_mount_options() function takes a pointer and copies a full page -
even if we pass a string pointer to it which is at the end of the BSS, and
which thus cannot be copied as a full page.

the attached patch sort out some of this, but it's an incomplete (and thus
incorrect) patch, because some filesystems want a type parameter that is a
binary blob not a string. Some others call it with a string - which can
fault if copied as a PAGE_SIZE object.

in any case, this detail of fault handling is handled correctly by the
4G/4G patch. If there is any other hole, please let me know.

(in theory it's possible that kernel-internal mounts pass in a pointer
where pointer + PAGE_SIZE is not a valid kernel address - if this happens
then we'd get a hard crash.)

	Ingo

--- fs/namespace.c.orig
+++ fs/namespace.c
@@ -863,27 +863,40 @@
 }
 
 asmlinkage long sys_mount(char __user * dev_name, char __user * dir_name,
-			  char __user * type, unsigned long flags,
+			  char __user * type_name, unsigned long flags,
 			  void __user * data)
 {
 	int retval;
 	unsigned long data_page;
-	unsigned long type_page;
-	unsigned long dev_page;
-	char *dir_page;
+	char empty[] = "";
+	char *dev_page = empty, *type_page = empty, *dir_page = empty;
 
-	retval = copy_mount_options (type, &type_page);
-	if (retval < 0)
-		return retval;
+	/*
+	 * Copy those paramters that are string via the string
+	 * functions.
+	 *
+	 * only the data page is copied as a full page.
+	 */
+	if (type_name) {
+		type_page = getname(type_name);
+		retval = PTR_ERR(type_page);
+		if (IS_ERR(type_page))
+			return retval;
+	}
 
-	dir_page = getname(dir_name);
-	retval = PTR_ERR(dir_page);
-	if (IS_ERR(dir_page))
-		goto out1;
+	if (dir_name) {
+		dir_page = getname(dir_name);
+		retval = PTR_ERR(dir_page);
+		if (IS_ERR(dir_page))
+			goto out1;
+	}
 
-	retval = copy_mount_options (dev_name, &dev_page);
-	if (retval < 0)
-		goto out2;
+	if (dev_name) {
+		dev_page = getname(dev_name);
+		retval = PTR_ERR(dev_page);
+		if (IS_ERR(dev_page))
+			goto out2;
+	}
 
 	retval = copy_mount_options (data, &data_page);
 	if (retval < 0)
@@ -896,11 +909,14 @@
 	free_page(data_page);
 
 out3:
-	free_page(dev_page);
+	if (dev_name)
+		putname(dev_page);
 out2:
-	putname(dir_page);
+	if (dir_name)
+		putname(dir_page);
 out1:
-	free_page(type_page);
+	if (type_name)
+		putname(type_page);
 	return retval;
 }
 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-10 10:33 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1 Manfred Spraul
  2003-08-11  8:56 ` Ingo Molnar
@ 2003-08-11  9:00 ` Ingo Molnar
  1 sibling, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2003-08-11  9:00 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, Andrew Morton


On Sun, 10 Aug 2003, Manfred Spraul wrote:

> The functions in mm/usercopy assume that no exception handling is
> required if fs is KERNEL_DS. This is not true: at least the mount
> options copy and the i386 traps handler assume exception handling with
> fs==KERNEL_DS.

hm, what do you mean by the i386 traps handler? The boot-time WP detection
fault thing?

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-11  8:56 ` Ingo Molnar
@ 2003-08-11 15:22   ` Manfred Spraul
  2003-08-11 15:24     ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Manfred Spraul @ 2003-08-11 15:22 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andrew Morton, Alexander Viro

Ingo Molnar wrote:

>(in theory it's possible that kernel-internal mounts pass in a pointer
>where pointer + PAGE_SIZE is not a valid kernel address - if this happens
>then we'd get a hard crash.)
>  
>
Exactly that happens.
I'm running with CONFIG_PAGE_DEBUG, i.e. unallocated pages are marked as 
non-present in the linear mapping.

Regarding the i386 trap handler: show_registers tries to hexdump the 
current instructions. It did a __get_user, to avoid causing a fault when 
%eip is invalid.
Now it contains:

>      if ((user_mode(regs) && get_user(c, eip)) ||
>           (!user_mode(regs) && __direct_get_user(c, eip))) {
>              printk(" Bad EIP value.");
>              break;
>      }

I.e. it's already fixed.

--
    Manfred


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-11 15:22   ` Manfred Spraul
@ 2003-08-11 15:24     ` Ingo Molnar
  2003-08-11 16:05       ` Manfred Spraul
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2003-08-11 15:24 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, Andrew Morton, Alexander Viro


On Mon, 11 Aug 2003, Manfred Spraul wrote:

> >(in theory it's possible that kernel-internal mounts pass in a pointer
> >where pointer + PAGE_SIZE is not a valid kernel address - if this happens
> >then we'd get a hard crash.)

> Exactly that happens.
> I'm running with CONFIG_PAGE_DEBUG, i.e. unallocated pages are marked as 
> non-present in the linear mapping.

this is not a bug technically, unless the mount options are in the last
linearly mapped page. It is a bug to copy those unallocated bytes, but
they do not get to relied upon. Note that the non-4G code copies them just
as much.

> Regarding the i386 trap handler: show_registers tries to hexdump the 
> current instructions. It did a __get_user, to avoid causing a fault when 
> %eip is invalid.
> Now it contains:
> 
> >      if ((user_mode(regs) && get_user(c, eip)) ||
> >           (!user_mode(regs) && __direct_get_user(c, eip))) {
> >              printk(" Bad EIP value.");
> >              break;
> >      }
> 
> I.e. it's already fixed.

yeah, this was one of the latest patches that went into Andrew's tree.  
(This fix enabled us to get rid of the do_page_fault() hacks.)

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-11 15:24     ` Ingo Molnar
@ 2003-08-11 16:05       ` Manfred Spraul
  2003-08-11 17:19         ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Manfred Spraul @ 2003-08-11 16:05 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andrew Morton, Alexander Viro

Ingo Molnar wrote:

> Exactly that happens.
>
>>I'm running with CONFIG_PAGE_DEBUG, i.e. unallocated pages are marked as 
>>non-present in the linear mapping.
>>    
>>
>
>this is not a bug technically, unless the mount options are in the last
>linearly mapped page. It is a bug to copy those unallocated bytes, but
>they do not get to relied upon. Note that the non-4G code copies them just
>as much.
>  
>
Or unless there are holes in the memory map, or unless pages were 
unmapped from the kernel linear mapping for GART.
IMHO the currect code is unacceptable.
It is possible to use direct_copy_ instead of memcpy for fs==KERNEL_DS 
in mm/usercopy.c?

--
    Manfred


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-11 16:05       ` Manfred Spraul
@ 2003-08-11 17:19         ` Ingo Molnar
  2003-08-11 18:01           ` Manfred Spraul
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2003-08-11 17:19 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel, Andrew Morton, Alexander Viro


On Mon, 11 Aug 2003, Manfred Spraul wrote:

> > this is not a bug technically, unless the mount options are in the
> > last linearly mapped page. It is a bug to copy those unallocated
> > bytes, but they do not get to relied upon. Note that the non-4G code 
> > copies them just as much.
>
> Or unless there are holes in the memory map, or unless pages were 
> unmapped from the kernel linear mapping for GART.
> IMHO the currect code is unacceptable.

IMHO the mount-options API is quite unclean - since when do we guarantee
-EFAULT semantics?

> It is possible to use direct_copy_ instead of memcpy for fs==KERNEL_DS
> in mm/usercopy.c?

yes. But i still think we should fix the API too.

	Ingo

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1
  2003-08-11 17:19         ` Ingo Molnar
@ 2003-08-11 18:01           ` Manfred Spraul
  0 siblings, 0 replies; 8+ messages in thread
From: Manfred Spraul @ 2003-08-11 18:01 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: linux-kernel, Andrew Morton, Alexander Viro

Ingo Molnar wrote:

>IMHO the mount-options API is quite unclean - since when do we guarantee
>-EFAULT semantics?
>  
>
I don't know - I found copy_mount_options() with similar code in 1.0.9 :-(

I agree that it's ugly as hell, but it's user space ABI.

--
    Manfred



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2003-08-11 18:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-08-10 10:33 4GB+DEBUG_PAGEALLOC oopses with 2.6.0-test3-mm1 Manfred Spraul
2003-08-11  8:56 ` Ingo Molnar
2003-08-11 15:22   ` Manfred Spraul
2003-08-11 15:24     ` Ingo Molnar
2003-08-11 16:05       ` Manfred Spraul
2003-08-11 17:19         ` Ingo Molnar
2003-08-11 18:01           ` Manfred Spraul
2003-08-11  9:00 ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).