All of lore.kernel.org
 help / color / mirror / Atom feed
* Revision 2136 breaks two-disk configuarion
@ 2009-04-27  5:25 Pavel Roskin
  2009-04-27  5:57 ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Roskin @ 2009-04-27  5:25 UTC (permalink / raw)
  To: grub-devel, David S. Miller

Hello!

Since the revision 2136, I'm getting a non-fatal error message:

GRUB loading.
Welcome to GRUB!

alloc magic is broken at 0x7fdaf20
Aborted. Press any key to exit.

After pressing enter, the menu appears and I can boot from it.  The
description of the revision is:

       * disk/fs_uuid.c (grub_fs_uuid_close): Call grub_disk_close()
       on disk->data.

The patch looks wrong to me.  There are no calls to grub_disk_open() in
disk/fs_uuid.c.  Other callers to grub_disk_close() call
grub_disk_open() as well.

# cat /boot/grub/device.map 
(hd0)   /dev/sda
(hd1)   /dev/sdb

The root is on /dev/sdb1.  GRUB is installed on /dev/sda.  Having two
disks is needed to reproduce the problem.

-- 
Regards,
Pavel Roskin



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Revision 2136 breaks two-disk configuarion
  2009-04-27  5:25 Revision 2136 breaks two-disk configuarion Pavel Roskin
@ 2009-04-27  5:57 ` David Miller
  2009-04-27 22:26   ` Pavel Roskin
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2009-04-27  5:57 UTC (permalink / raw)
  To: grub-devel, proski

From: Pavel Roskin <proski@gnu.org>
Date: Mon, 27 Apr 2009 01:25:37 -0400

> Hello!
> 
> Since the revision 2136, I'm getting a non-fatal error message:
> 
> GRUB loading.
> Welcome to GRUB!
> 
> alloc magic is broken at 0x7fdaf20
> Aborted. Press any key to exit.
> 
> After pressing enter, the menu appears and I can boot from it.  The
> description of the revision is:
> 
>        * disk/fs_uuid.c (grub_fs_uuid_close): Call grub_disk_close()
>        on disk->data.
> 
> The patch looks wrong to me.  There are no calls to grub_disk_open() in
> disk/fs_uuid.c.  Other callers to grub_disk_close() call
> grub_disk_open() as well.

grub_disk_open() isn't used, but a grub_device_open() does occur
during the iterator that has us find the FS_UUID device.  This
happens search_fs_uuid().

If we don't do this, we leave a device reference open and dangling.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Revision 2136 breaks two-disk configuarion
  2009-04-27  5:57 ` David Miller
@ 2009-04-27 22:26   ` Pavel Roskin
  2009-04-28  1:37     ` [PATCH] " Pavel Roskin
  2009-04-28  8:59     ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Pavel Roskin @ 2009-04-27 22:26 UTC (permalink / raw)
  To: The development of GRUB 2

On Sun, 2009-04-26 at 22:57 -0700, David Miller wrote:

> grub_disk_open() isn't used, but a grub_device_open() does occur
> during the iterator that has us find the FS_UUID device.  This
> happens search_fs_uuid().
> 
> If we don't do this, we leave a device reference open and dangling.

Then we should be using grub_device_close().  I've made a patch that
keeps dev, not disk in the data field.  Unfortunately, the problem
persists.  I could easily reproduce it on another machine by using a
flash drive and qemu.

It's entirely possible that the problem is elsewhere.  But I have no
experience debugging memory problems in GRUB, so it will take time
before I find out.

Here's the patch.  It doesn't make the memory problem go away, but it
makes the code nicer by using the same abstraction to open and close the
disk.

diff --git a/disk/fs_uuid.c b/disk/fs_uuid.c
index 9d83bb8..f590ad2 100644
--- a/disk/fs_uuid.c
+++ b/disk/fs_uuid.c
@@ -89,7 +89,7 @@ grub_fs_uuid_open (const char *name, grub_disk_t disk)
   disk->total_sectors = dev->disk->total_sectors;
   disk->has_partitions = 0;
   disk->partition = dev->disk->partition;
-  disk->data = dev->disk;
+  disk->data = dev;
 
   return GRUB_ERR_NONE;
 }
@@ -97,24 +97,24 @@ grub_fs_uuid_open (const char *name, grub_disk_t disk)
 static void
 grub_fs_uuid_close (grub_disk_t disk __attribute((unused)))
 {
-  grub_disk_t parent = disk->data;
-  grub_disk_close (parent);
+  grub_device_t parent = disk->data;
+  grub_device_close (parent);
 }
 
 static grub_err_t
 grub_fs_uuid_read (grub_disk_t disk, grub_disk_addr_t sector,
 		   grub_size_t size, char *buf)
 {
-  grub_disk_t parent = disk->data;
-  return parent->dev->read (parent, sector, size, buf);
+  grub_device_t parent = disk->data;
+  return parent->disk->dev->read (parent->disk, sector, size, buf);
 }
 
 static grub_err_t
 grub_fs_uuid_write (grub_disk_t disk, grub_disk_addr_t sector,
 		    grub_size_t size, const char *buf)
 {
-  grub_disk_t parent = disk->data;
-  return parent->dev->write (parent, sector, size, buf);
+  grub_device_t parent = disk->data;
+  return parent->disk->dev->write (parent->disk, sector, size, buf);
 }
 
 static struct grub_disk_dev grub_fs_uuid_dev =

-- 
Regards,
Pavel Roskin



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] Re: Revision 2136 breaks two-disk configuarion
  2009-04-27 22:26   ` Pavel Roskin
@ 2009-04-28  1:37     ` Pavel Roskin
  2009-04-28  1:45       ` David Miller
  2009-04-28  8:59     ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Pavel Roskin @ 2009-04-28  1:37 UTC (permalink / raw)
  To: The development of GRUB 2

On Mon, 2009-04-27 at 18:26 -0400, Pavel Roskin wrote:

> It's entirely possible that the problem is elsewhere.  But I have no
> experience debugging memory problems in GRUB, so it will take time
> before I find out.

Done!  disk->partition should not be copied by reference.  This patch
fixes the broken magic problem.  The issue with mixing device and disk
functions could be addressed separately.

ChangeLog:
	disk/fs_uuid.c (grub_fs_uuid_open): Allocate memory to copy
	parent's partition, don't copy it by reference, as it gets freed
	on close.
---

 disk/fs_uuid.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)


diff --git a/disk/fs_uuid.c b/disk/fs_uuid.c
index 9d83bb8..9636ce9 100644
--- a/disk/fs_uuid.c
+++ b/disk/fs_uuid.c
@@ -25,6 +25,7 @@
 #include <grub/types.h>
 
 #include <grub/fs.h>
+#include <grub/partition.h>
 
 static grub_device_t
 search_fs_uuid (const char *key, unsigned long *count)
@@ -88,7 +89,16 @@ grub_fs_uuid_open (const char *name, grub_disk_t disk)
 
   disk->total_sectors = dev->disk->total_sectors;
   disk->has_partitions = 0;
-  disk->partition = dev->disk->partition;
+  if (dev->disk->partition)
+    {
+      disk->partition = grub_malloc (sizeof (*disk->partition));
+      if (disk->partition)
+	grub_memcpy (disk->partition, dev->disk->partition,
+		     sizeof (*disk->partition));
+    }
+  else
+    disk->partition = NULL;
+
   disk->data = dev->disk;
 
   return GRUB_ERR_NONE;

-- 
Regards,
Pavel Roskin



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Re: Revision 2136 breaks two-disk configuarion
  2009-04-28  1:37     ` [PATCH] " Pavel Roskin
@ 2009-04-28  1:45       ` David Miller
  2009-04-28  3:50         ` Pavel Roskin
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2009-04-28  1:45 UTC (permalink / raw)
  To: grub-devel, proski

From: Pavel Roskin <proski@gnu.org>
Date: Mon, 27 Apr 2009 21:37:52 -0400

> On Mon, 2009-04-27 at 18:26 -0400, Pavel Roskin wrote:
> 
>> It's entirely possible that the problem is elsewhere.  But I have no
>> experience debugging memory problems in GRUB, so it will take time
>> before I find out.
> 
> Done!  disk->partition should not be copied by reference.  This patch
> fixes the broken magic problem.  The issue with mixing device and disk
> functions could be addressed separately.
> 
> ChangeLog:
> 	disk/fs_uuid.c (grub_fs_uuid_open): Allocate memory to copy
> 	parent's partition, don't copy it by reference, as it gets freed
> 	on close.

Thanks for fixing this Pavel.  I suspect this bug is why the close was
left as a NOP function all of this time.

Please commit this as it seems you haven't already :-)

I'll take a look at your other patch too.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Re: Revision 2136 breaks two-disk configuarion
  2009-04-28  1:45       ` David Miller
@ 2009-04-28  3:50         ` Pavel Roskin
  2009-04-28  8:59           ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Pavel Roskin @ 2009-04-28  3:50 UTC (permalink / raw)
  To: The development of GRUB 2

On Mon, 2009-04-27 at 18:45 -0700, David Miller wrote:

> Thanks for fixing this Pavel.  I suspect this bug is why the close was
> left as a NOP function all of this time.

Maybe.

> Please commit this as it seems you haven't already :-)

I was about to commit it when I realized that the same could be done
using a reference counter in struct grub_partition.  It may be an
overkill for this purpose, but maybe there will be more users of the
partition table in the future.  Besides, this patch could set an
example.  Please have a cursory look to make sure it's a good example.

ChangeLog:
	* grub/partition.h (struct grub_partition): Add refcount.
	(grub_partition_ref): New inline function.
	(grub_partition_unref): Likewise.
	* kern/disk.c (grub_disk_close): Use grub_partition_unref().
	* disk/fs_uuid.c (grub_fs_uuid_open): Use grub_partition_ref().

diff --git a/disk/fs_uuid.c b/disk/fs_uuid.c
index 9d83bb8..12086b2 100644
--- a/disk/fs_uuid.c
+++ b/disk/fs_uuid.c
@@ -25,6 +25,7 @@
 #include <grub/types.h>
 
 #include <grub/fs.h>
+#include <grub/partition.h>
 
 static grub_device_t
 search_fs_uuid (const char *key, unsigned long *count)
@@ -88,7 +89,7 @@ grub_fs_uuid_open (const char *name, grub_disk_t disk)
 
   disk->total_sectors = dev->disk->total_sectors;
   disk->has_partitions = 0;
-  disk->partition = dev->disk->partition;
+  disk->partition = grub_partition_ref (dev->disk->partition);
   disk->data = dev->disk;
 
   return GRUB_ERR_NONE;
diff --git a/include/grub/partition.h b/include/grub/partition.h
index 6e74cd5..c3e4d15 100644
--- a/include/grub/partition.h
+++ b/include/grub/partition.h
@@ -62,6 +62,9 @@ struct grub_partition
 
   /* The index of this partition in the partition table.  */
   int index;
+
+  /* Reference count.  */
+  int refcount;
   
   /* Partition map type specific data.  */
   void *data;
@@ -110,4 +113,23 @@ grub_partition_get_len (const grub_partition_t p)
   return p->len;
 }
 
+/* Return a copy by reference of the original partition.  */
+static inline grub_partition_t
+grub_partition_ref (grub_partition_t p)
+{
+  if (p)
+    p->refcount++;
+  return p;
+}
+
+/* Return partition to be freed if it can be freed.  */
+static inline grub_partition_t
+grub_partition_unref (const grub_partition_t p)
+{
+  if (p && p->refcount-- <= 0)
+    return p;
+  else
+    return 0;
+}
+
 #endif /* ! GRUB_PART_HEADER */
diff --git a/kern/disk.c b/kern/disk.c
index 8a92989..070cab3 100644
--- a/kern/disk.c
+++ b/kern/disk.c
@@ -324,7 +324,7 @@ grub_disk_close (grub_disk_t disk)
   /* Reset the timer.  */
   grub_last_time = grub_get_time_ms ();
 
-  grub_free (disk->partition);
+  grub_free (grub_partition_unref (disk->partition));
   grub_free ((void *) disk->name);
   grub_free (disk);
 }

-- 
Regards,
Pavel Roskin



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Re: Revision 2136 breaks two-disk configuarion
  2009-04-28  3:50         ` Pavel Roskin
@ 2009-04-28  8:59           ` David Miller
  2009-04-28 13:26             ` Pavel Roskin
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2009-04-28  8:59 UTC (permalink / raw)
  To: grub-devel, proski

From: Pavel Roskin <proski@gnu.org>
Date: Mon, 27 Apr 2009 23:50:45 -0400

> On Mon, 2009-04-27 at 18:45 -0700, David Miller wrote:
> 
>> Thanks for fixing this Pavel.  I suspect this bug is why the close was
>> left as a NOP function all of this time.
> 
> Maybe.
> 
>> Please commit this as it seems you haven't already :-)
> 
> I was about to commit it when I realized that the same could be done
> using a reference counter in struct grub_partition.  It may be an
> overkill for this purpose, but maybe there will be more users of the
> partition table in the future.  Besides, this patch could set an
> example.  Please have a cursory look to make sure it's a good example.

I think it's overkill at the moment.

> +/* Return partition to be freed if it can be freed.  */
> +static inline grub_partition_t
> +grub_partition_unref (const grub_partition_t p)
> +{
> +  if (p && p->refcount-- <= 0)
> +    return p;

If the reference count dips below zero, that should trigger
an assertion rather than giving the pointer to the caller
again which will potentially double-free memory or whatever.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Revision 2136 breaks two-disk configuarion
  2009-04-27 22:26   ` Pavel Roskin
  2009-04-28  1:37     ` [PATCH] " Pavel Roskin
@ 2009-04-28  8:59     ` David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2009-04-28  8:59 UTC (permalink / raw)
  To: grub-devel, proski

From: Pavel Roskin <proski@gnu.org>
Date: Mon, 27 Apr 2009 18:26:11 -0400

> On Sun, 2009-04-26 at 22:57 -0700, David Miller wrote:
> 
>> grub_disk_open() isn't used, but a grub_device_open() does occur
>> during the iterator that has us find the FS_UUID device.  This
>> happens search_fs_uuid().
>> 
>> If we don't do this, we leave a device reference open and dangling.
> 
> Then we should be using grub_device_close().  I've made a patch that
> keeps dev, not disk in the data field.  Unfortunately, the problem
> persists.  I could easily reproduce it on another machine by using a
> flash drive and qemu.
> 
> It's entirely possible that the problem is elsewhere.  But I have no
> experience debugging memory problems in GRUB, so it will take time
> before I find out.
> 
> Here's the patch.  It doesn't make the memory problem go away, but it
> makes the code nicer by using the same abstraction to open and close the
> disk.
> 

I like this patch, please commit it.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Re: Revision 2136 breaks two-disk configuarion
  2009-04-28  8:59           ` David Miller
@ 2009-04-28 13:26             ` Pavel Roskin
  0 siblings, 0 replies; 9+ messages in thread
From: Pavel Roskin @ 2009-04-28 13:26 UTC (permalink / raw)
  To: The development of GRUB 2

On Tue, 2009-04-28 at 01:59 -0700, David Miller wrote:

> I think it's overkill at the moment.

I've committed the patch with grub_malloc() and the patch making
grub_fs_uuid_close() use grub_device_close().

> > +/* Return partition to be freed if it can be freed.  */
> > +static inline grub_partition_t
> > +grub_partition_unref (const grub_partition_t p)
> > +{
> > +  if (p && p->refcount-- <= 0)
> > +    return p;
> 
> If the reference count dips below zero, that should trigger
> an assertion rather than giving the pointer to the caller
> again which will potentially double-free memory or whatever.

Correct.  If we have a better case for using refcounts, we'll have that
check.

-- 
Regards,
Pavel Roskin



^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2009-04-28 13:26 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-27  5:25 Revision 2136 breaks two-disk configuarion Pavel Roskin
2009-04-27  5:57 ` David Miller
2009-04-27 22:26   ` Pavel Roskin
2009-04-28  1:37     ` [PATCH] " Pavel Roskin
2009-04-28  1:45       ` David Miller
2009-04-28  3:50         ` Pavel Roskin
2009-04-28  8:59           ` David Miller
2009-04-28 13:26             ` Pavel Roskin
2009-04-28  8:59     ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.