linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
@ 2007-06-07  3:27 Albert Cahalan
  2007-06-07  3:44 ` Andrew Morton
  0 siblings, 1 reply; 34+ messages in thread
From: Albert Cahalan @ 2007-06-07  3:27 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm, ebiederm, pbadari, torvalds

Eric W. Biederman writes:
> Badari Pulavarty <pbadari@us.ibm.com> writes:

>> Your recent cleanup to shm code, namely
>>
>> [PATCH] shm: make sysv ipc shared memory use stacked files
>>
>> took away one of the debugging feature for shm segments.
>> Originally, shmid were forced to be the inode numbers and
>> they show up in /proc/pid/maps for the process which mapped
>> this shared memory segments (vma listing). That way, its easy
>> to find out who all mapped this shared memory segment. Your
>> patchset, took away the inode# setting. So, we can't easily
>> match the shmem segments to /proc/pid/maps easily. (It was
>> really useful in tracking down a customer problem recently).
>> Is this done deliberately ? Anything wrong in setting this back ?
>
> Theoretically it makes the stacked file concept more brittle,
> because it means the lower layers can't care about their inode
> number.
>
> We do need something to tie these things together.
>
> So I suspect what makes most sense is to simply rename the
> dentry SYSVID<segmentid>

Please stop breaking things in /proc. The pmap command relys
on the old behavior. It's time to revert. Put back the segment ID
where it belongs, and leave the key where it belongs too.

Containers are NOT worth breaking our ABIs left and right.
We don't need to leap off that bridge just because Solaris did,
unless you can explain why complexity and bloat are desirable.
We already have SE Linux, chroot, KVM, and several more!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07  3:27 [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Albert Cahalan
@ 2007-06-07  3:44 ` Andrew Morton
  2007-06-07  4:53   ` Albert Cahalan
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Morton @ 2007-06-07  3:44 UTC (permalink / raw)
  To: Albert Cahalan; +Cc: linux-kernel, linux-mm, ebiederm, pbadari, torvalds

On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <acahalan@gmail.com> wrote:

> Eric W. Biederman writes:
> > Badari Pulavarty <pbadari@us.ibm.com> writes:
> 
> >> Your recent cleanup to shm code, namely
> >>
> >> [PATCH] shm: make sysv ipc shared memory use stacked files
> >>
> >> took away one of the debugging feature for shm segments.
> >> Originally, shmid were forced to be the inode numbers and
> >> they show up in /proc/pid/maps for the process which mapped
> >> this shared memory segments (vma listing). That way, its easy
> >> to find out who all mapped this shared memory segment. Your
> >> patchset, took away the inode# setting. So, we can't easily
> >> match the shmem segments to /proc/pid/maps easily. (It was
> >> really useful in tracking down a customer problem recently).
> >> Is this done deliberately ? Anything wrong in setting this back ?
> >
> > Theoretically it makes the stacked file concept more brittle,
> > because it means the lower layers can't care about their inode
> > number.
> >
> > We do need something to tie these things together.
> >
> > So I suspect what makes most sense is to simply rename the
> > dentry SYSVID<segmentid>
> 
> Please stop breaking things in /proc. The pmap command relys
> on the old behavior.

What effect did this change have upon the pmap command?  Details, please.

> It's time to revert.

Probably true, but we'd need to understand what the impact was.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07  3:44 ` Andrew Morton
@ 2007-06-07  4:53   ` Albert Cahalan
  2007-06-07 16:20     ` Serge E. Hallyn
  2007-06-07 16:23     ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
  0 siblings, 2 replies; 34+ messages in thread
From: Albert Cahalan @ 2007-06-07  4:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, linux-mm, ebiederm, pbadari, torvalds

On 6/6/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <acahalan@gmail.com> wrote:
> > Eric W. Biederman writes:
> > > Badari Pulavarty <pbadari@us.ibm.com> writes:
> >
> > >> Your recent cleanup to shm code, namely
> > >>
> > >> [PATCH] shm: make sysv ipc shared memory use stacked files
> > >>
> > >> took away one of the debugging feature for shm segments.
> > >> Originally, shmid were forced to be the inode numbers and
> > >> they show up in /proc/pid/maps for the process which mapped
> > >> this shared memory segments (vma listing). That way, its easy
> > >> to find out who all mapped this shared memory segment. Your
> > >> patchset, took away the inode# setting. So, we can't easily
> > >> match the shmem segments to /proc/pid/maps easily. (It was
> > >> really useful in tracking down a customer problem recently).
> > >> Is this done deliberately ? Anything wrong in setting this back ?
> > >
> > > Theoretically it makes the stacked file concept more brittle,
> > > because it means the lower layers can't care about their inode
> > > number.
> > >
> > > We do need something to tie these things together.
> > >
> > > So I suspect what makes most sense is to simply rename the
> > > dentry SYSVID<segmentid>
> >
> > Please stop breaking things in /proc. The pmap command relys
> > on the old behavior.
>
> What effect did this change have upon the pmap command?  Details, please.
>
> > It's time to revert.
>
> Probably true, but we'd need to understand what the impact was.

Very simply, pmap reports the shmid.

albert 0 ~$ pmap `pidof X` | egrep -2 shmid
30050000  16384K rw-s-  /dev/fb0
31050000    152K rw---    [ anon ]
31076000    384K rw-s-    [ shmid=0x3f428000 ]
310d6000    384K rw-s-    [ shmid=0x3f430001 ]
31136000    384K rw-s-    [ shmid=0x3f438002 ]
31196000    384K rw-s-    [ shmid=0x3f440003 ]
311f6000    384K rw-s-    [ shmid=0x3f448004 ]
31256000    384K rw-s-    [ shmid=0x3f450005 ]
312b6000    384K rw-s-    [ shmid=0x3f460006 ]
31316000    384K rw-s-    [ shmid=0x3f870007 ]
31491000    140K r----  /usr/share/fonts/type1/gsfonts/n021003l.pfb
3150e000   9496K rw---    [ anon ]

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07  4:53   ` Albert Cahalan
@ 2007-06-07 16:20     ` Serge E. Hallyn
  2007-06-08  3:45       ` Eric W. Biederman
  2007-06-07 16:23     ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
  1 sibling, 1 reply; 34+ messages in thread
From: Serge E. Hallyn @ 2007-06-07 16:20 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: Andrew Morton, linux-kernel, linux-mm, ebiederm, pbadari, torvalds

Quoting Albert Cahalan (acahalan@gmail.com):
> On 6/6/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> >On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <acahalan@gmail.com> 
> >wrote:
> >> Eric W. Biederman writes:
> >> > Badari Pulavarty <pbadari@us.ibm.com> writes:
> >>
> >> >> Your recent cleanup to shm code, namely
> >> >>
> >> >> [PATCH] shm: make sysv ipc shared memory use stacked files
> >> >>
> >> >> took away one of the debugging feature for shm segments.
> >> >> Originally, shmid were forced to be the inode numbers and
> >> >> they show up in /proc/pid/maps for the process which mapped
> >> >> this shared memory segments (vma listing). That way, its easy
> >> >> to find out who all mapped this shared memory segment. Your
> >> >> patchset, took away the inode# setting. So, we can't easily
> >> >> match the shmem segments to /proc/pid/maps easily. (It was
> >> >> really useful in tracking down a customer problem recently).
> >> >> Is this done deliberately ? Anything wrong in setting this back ?
> >> >
> >> > Theoretically it makes the stacked file concept more brittle,
> >> > because it means the lower layers can't care about their inode
> >> > number.
> >> >
> >> > We do need something to tie these things together.
> >> >
> >> > So I suspect what makes most sense is to simply rename the
> >> > dentry SYSVID<segmentid>
> >>
> >> Please stop breaking things in /proc. The pmap command relys
> >> on the old behavior.
> >
> >What effect did this change have upon the pmap command?  Details, please.
> >
> >> It's time to revert.
> >
> >Probably true, but we'd need to understand what the impact was.
> 
> Very simply, pmap reports the shmid.
> 
> albert 0 ~$ pmap `pidof X` | egrep -2 shmid
> 30050000  16384K rw-s-  /dev/fb0
> 31050000    152K rw---    [ anon ]
> 31076000    384K rw-s-    [ shmid=0x3f428000 ]
> 310d6000    384K rw-s-    [ shmid=0x3f430001 ]
> 31136000    384K rw-s-    [ shmid=0x3f438002 ]
> 31196000    384K rw-s-    [ shmid=0x3f440003 ]
> 311f6000    384K rw-s-    [ shmid=0x3f448004 ]
> 31256000    384K rw-s-    [ shmid=0x3f450005 ]
> 312b6000    384K rw-s-    [ shmid=0x3f460006 ]
> 31316000    384K rw-s-    [ shmid=0x3f870007 ]
> 31491000    140K r----  /usr/share/fonts/type1/gsfonts/n021003l.pfb
> 3150e000   9496K rw---    [ anon ]

Ok, so IIUC the problem was that inode->i_ino was being set to the id,
and the id can be the same for different things in two namespaces.

So aside from not using the id as inode->i_ino, an alternative is to use
a separate superblock, spearate mqeueue fs, for each ipc ns.

I haven't looked at that enough to see whether it's feasible, i.e. I 
don't know what else mqueue fs is used for.  Eric, does that sound
reasonable to you?

thanks,
-serge

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07  4:53   ` Albert Cahalan
  2007-06-07 16:20     ` Serge E. Hallyn
@ 2007-06-07 16:23     ` Badari Pulavarty
  2007-06-07 16:43       ` Albert Cahalan
  1 sibling, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-07 16:23 UTC (permalink / raw)
  To: Albert Cahalan; +Cc: Andrew Morton, lkml, linux-mm, ebiederm, torvalds

On Thu, 2007-06-07 at 00:53 -0400, Albert Cahalan wrote:
> On 6/6/07, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <acahalan@gmail.com> wrote:
> > > Eric W. Biederman writes:
> > > > Badari Pulavarty <pbadari@us.ibm.com> writes:
> > >
> > > >> Your recent cleanup to shm code, namely
> > > >>
> > > >> [PATCH] shm: make sysv ipc shared memory use stacked files
> > > >>
> > > >> took away one of the debugging feature for shm segments.
> > > >> Originally, shmid were forced to be the inode numbers and
> > > >> they show up in /proc/pid/maps for the process which mapped
> > > >> this shared memory segments (vma listing). That way, its easy
> > > >> to find out who all mapped this shared memory segment. Your
> > > >> patchset, took away the inode# setting. So, we can't easily
> > > >> match the shmem segments to /proc/pid/maps easily. (It was
> > > >> really useful in tracking down a customer problem recently).
> > > >> Is this done deliberately ? Anything wrong in setting this back ?
> > > >
> > > > Theoretically it makes the stacked file concept more brittle,
> > > > because it means the lower layers can't care about their inode
> > > > number.
> > > >
> > > > We do need something to tie these things together.
> > > >
> > > > So I suspect what makes most sense is to simply rename the
> > > > dentry SYSVID<segmentid>
> > >
> > > Please stop breaking things in /proc. The pmap command relys
> > > on the old behavior.
> >
> > What effect did this change have upon the pmap command?  Details, please.
> >
> > > It's time to revert.
> >
> > Probably true, but we'd need to understand what the impact was.
> 
> Very simply, pmap reports the shmid.
> 
> albert 0 ~$ pmap `pidof X` | egrep -2 shmid
> 30050000  16384K rw-s-  /dev/fb0
> 31050000    152K rw---    [ anon ]
> 31076000    384K rw-s-    [ shmid=0x3f428000 ]
> 310d6000    384K rw-s-    [ shmid=0x3f430001 ]
> 31136000    384K rw-s-    [ shmid=0x3f438002 ]
> 31196000    384K rw-s-    [ shmid=0x3f440003 ]
> 311f6000    384K rw-s-    [ shmid=0x3f448004 ]
> 31256000    384K rw-s-    [ shmid=0x3f450005 ]
> 312b6000    384K rw-s-    [ shmid=0x3f460006 ]
> 31316000    384K rw-s-    [ shmid=0x3f870007 ]
> 31491000    140K r----  /usr/share/fonts/type1/gsfonts/n021003l.pfb
> 3150e000   9496K rw---    [ anon ]

pmap seems to get shmid from "ino#" field of /proc/pid/map.
Its already broken in current mainline.

But, the breakage is not due to namespaces or container effort :(
Its due to noble effort from Eric to clean up the shm code,
take out the hacks to handle hugetlbfs and make the code
more streamlined and readable.

If we really really want old behaviour, we need my one line
patch to force shmid as inode# :(

BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from "key" or name. Isn't it ?

Andrew/Linus, its up to you to figure out if its worth breaking.
Here is the patch to base dentry-name on shmid - so we don't
need to use ino# to identify shmid.

Thanks,
Badari

Instead of basing dentry name on the shm "key", base it on
"shmid" - so it shows up clearly in /proc/pid/maps. Earlier
we were forcing ino# to match shmid.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Index: linux-2.6.22-rc4/ipc/shm.c
===================================================================
--- linux-2.6.22-rc4.orig/ipc/shm.c	2007-06-04 17:57:25.000000000 -0700
+++ linux-2.6.22-rc4/ipc/shm.c	2007-06-06 13:43:36.000000000 -0700
@@ -364,6 +364,14 @@ static int newseg (struct ipc_namespace 
 		return error;
 	}
 
+	error = -ENOSPC;
+	id = shm_addid(ns, shp);
+	if(id == -1)
+		goto no_id;
+
+	/* Build an id, so we can use it for filename */
+	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
+
 	if (shmflg & SHM_HUGETLB) {
 		/* hugetlb_zero_setup takes care of mlock user accounting */
 		file = hugetlb_zero_setup(size);
@@ -377,34 +385,28 @@ static int newseg (struct ipc_namespace 
 		if  ((shmflg & SHM_NORESERVE) &&
 				sysctl_overcommit_memory != OVERCOMMIT_NEVER)
 			acctflag = 0;
-		sprintf (name, "SYSV%08x", key);
+		sprintf (name, "SYSVID%d", shp->id);
 		file = shmem_file_setup(name, size, acctflag);
 	}
 	error = PTR_ERR(file);
 	if (IS_ERR(file))
 		goto no_file;
 
-	error = -ENOSPC;
-	id = shm_addid(ns, shp);
-	if(id == -1) 
-		goto no_id;
-
 	shp->shm_cprid = current->tgid;
 	shp->shm_lprid = 0;
 	shp->shm_atim = shp->shm_dtim = 0;
 	shp->shm_ctim = get_seconds();
 	shp->shm_segsz = size;
 	shp->shm_nattch = 0;
-	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
 	shp->shm_file = file;
 
 	ns->shm_tot += numpages;
 	shm_unlock(shp);
 	return shp->id;
 
-no_id:
-	fput(file);
 no_file:
+	shm_rmid(ns, shp->id);
+no_id:
 	security_shm_free(shp);
 	ipc_rcu_putref(shp);
 	return error;



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 16:23     ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
@ 2007-06-07 16:43       ` Albert Cahalan
  2007-06-07 17:06         ` Badari Pulavarty
  0 siblings, 1 reply; 34+ messages in thread
From: Albert Cahalan @ 2007-06-07 16:43 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Andrew Morton, lkml, linux-mm, ebiederm, torvalds

On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:

> BTW, I agree with Eric that its would be nice to use shmid as part
> of name instead of forcing to be as inode number. It should be
> possible for pmap to workout shmid from "key" or name. Isn't it ?

It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)

Changing to "SYSVID%d" is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV00000000 is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
"LET'S BREAK AN ABI!" in the subject of your email?

BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 16:43       ` Albert Cahalan
@ 2007-06-07 17:06         ` Badari Pulavarty
  2007-06-07 19:48           ` Andrew Morton
  0 siblings, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-07 17:06 UTC (permalink / raw)
  To: Albert Cahalan; +Cc: Andrew Morton, lkml, linux-mm, ebiederm, torvalds

On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> 
> > BTW, I agree with Eric that its would be nice to use shmid as part
> > of name instead of forcing to be as inode number. It should be
> > possible for pmap to workout shmid from "key" or name. Isn't it ?
> 
> It is not at all nice.
> 
> 1. it's incompatible ABI breakage
> 2. where will you put the key then, in the inode? :-)

Nope. Currently "key" is part of the name (but its not unique).

> 
> Changing to "SYSVID%d" is no good either. Look, people
> are ***parsing*** this stuff in /proc. The /proc filesystem
> is not some random sandbox to be playing in.
> 
> Before you go messing with it, note that the device number
> also matters. (it's per-boot dynamic, but that's OK)
> That's how one knows that /SYSV00000000 is not just
> a regular file; sadly these didn't get a non-/ prefix.
> (and no you can't fix that now; it's way too late)
> 
> Next time you feel like breaking an ABI, mind putting
> "LET'S BREAK AN ABI!" in the subject of your email?

I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of "key" which is currently not unique).

> BTW, I suspect this kind of thing also breaks:
> a. fuser, lsof, and other resource usage display tools
> b. various obscure emulators (similar to valgrind)

If you strongly feel that "old" behaviour needs to be retained, 
here is the patch I originally suggested.

Thanks,
Badari

"ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
memory (shmid). It was useful in debugging, but its changed recently. 
This patch sets inode number to shared memory id to match /proc/pid/maps.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>

Index: linux-2.6.22-rc4/ipc/shm.c
===================================================================
--- linux-2.6.22-rc4.orig/ipc/shm.c	2007-06-04 17:57:25.000000000 -0700
+++ linux-2.6.22-rc4/ipc/shm.c	2007-06-06 08:23:57.000000000 -0700
@@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
 	shp->shm_nattch = 0;
 	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
 	shp->shm_file = file;
+	file->f_dentry->d_inode->i_ino = shp->id;
 
 	ns->shm_tot += numpages;
 	shm_unlock(shp);




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 17:06         ` Badari Pulavarty
@ 2007-06-07 19:48           ` Andrew Morton
  2007-06-07 19:59             ` Badari Pulavarty
  0 siblings, 1 reply; 34+ messages in thread
From: Andrew Morton @ 2007-06-07 19:48 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Albert Cahalan, lkml, linux-mm, ebiederm, torvalds

On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty <pbadari@us.ibm.com> wrote:

> On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > 
> > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > of name instead of forcing to be as inode number. It should be
> > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > 
> > It is not at all nice.
> > 
> > 1. it's incompatible ABI breakage
> > 2. where will you put the key then, in the inode? :-)
> 
> Nope. Currently "key" is part of the name (but its not unique).
> 
> > 
> > Changing to "SYSVID%d" is no good either. Look, people
> > are ***parsing*** this stuff in /proc. The /proc filesystem
> > is not some random sandbox to be playing in.
> > 
> > Before you go messing with it, note that the device number
> > also matters. (it's per-boot dynamic, but that's OK)
> > That's how one knows that /SYSV00000000 is not just
> > a regular file; sadly these didn't get a non-/ prefix.
> > (and no you can't fix that now; it's way too late)
> > 
> > Next time you feel like breaking an ABI, mind putting
> > "LET'S BREAK AN ABI!" in the subject of your email?
> 
> I am not breaking ABI. Its already broken in the current
> mainline. I am trying to fix it by putting back the ino#
> as shmid. Eric had a suggestion that, instead of depending
> on the inode# to be shmid, we could embed shmid into name
> (instead of "key" which is currently not unique).
> 
> > BTW, I suspect this kind of thing also breaks:
> > a. fuser, lsof, and other resource usage display tools
> > b. various obscure emulators (similar to valgrind)
> 
> If you strongly feel that "old" behaviour needs to be retained, 

yup, we should put it back.  The change was, afaik, accidental.

> here is the patch I originally suggested.

Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?

> Thanks,
> Badari
> 
> "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> memory (shmid). It was useful in debugging, but its changed recently. 
> This patch sets inode number to shared memory id to match /proc/pid/maps.
> 
> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
> 
> Index: linux-2.6.22-rc4/ipc/shm.c
> ===================================================================
> --- linux-2.6.22-rc4.orig/ipc/shm.c	2007-06-04 17:57:25.000000000 -0700
> +++ linux-2.6.22-rc4/ipc/shm.c	2007-06-06 08:23:57.000000000 -0700
> @@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
>  	shp->shm_nattch = 0;
>  	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
>  	shp->shm_file = file;
> +	file->f_dentry->d_inode->i_ino = shp->id;
>  
>  	ns->shm_tot += numpages;
>  	shm_unlock(shp);
> 
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 19:48           ` Andrew Morton
@ 2007-06-07 19:59             ` Badari Pulavarty
  2007-06-07 20:37               ` Serge E. Hallyn
  0 siblings, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-07 19:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Albert Cahalan, lkml, linux-mm, ebiederm, torvalds

On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> On Thu, 07 Jun 2007 10:06:37 -0700
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
> 
> > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > 
> > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > of name instead of forcing to be as inode number. It should be
> > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > 
> > > It is not at all nice.
> > > 
> > > 1. it's incompatible ABI breakage
> > > 2. where will you put the key then, in the inode? :-)
> > 
> > Nope. Currently "key" is part of the name (but its not unique).
> > 
> > > 
> > > Changing to "SYSVID%d" is no good either. Look, people
> > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > is not some random sandbox to be playing in.
> > > 
> > > Before you go messing with it, note that the device number
> > > also matters. (it's per-boot dynamic, but that's OK)
> > > That's how one knows that /SYSV00000000 is not just
> > > a regular file; sadly these didn't get a non-/ prefix.
> > > (and no you can't fix that now; it's way too late)
> > > 
> > > Next time you feel like breaking an ABI, mind putting
> > > "LET'S BREAK AN ABI!" in the subject of your email?
> > 
> > I am not breaking ABI. Its already broken in the current
> > mainline. I am trying to fix it by putting back the ino#
> > as shmid. Eric had a suggestion that, instead of depending
> > on the inode# to be shmid, we could embed shmid into name
> > (instead of "key" which is currently not unique).
> > 
> > > BTW, I suspect this kind of thing also breaks:
> > > a. fuser, lsof, and other resource usage display tools
> > > b. various obscure emulators (similar to valgrind)
> > 
> > If you strongly feel that "old" behaviour needs to be retained, 
> 
> yup, we should put it back.  The change was, afaik, accidental.
> 
> > here is the patch I originally suggested.
> 
> Confused.  Will this one-liner fix all the userspace breakage to which
> Albert refers?

Yes. Albert, please correct me if I am wrong.

Thanks,
Badari


> > "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> > memory (shmid). It was useful in debugging, but its changed recently. 
> > This patch sets inode number to shared memory id to match /proc/pid/maps.
> > 
> > Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
> > 
> > Index: linux-2.6.22-rc4/ipc/shm.c
> > ===================================================================
> > --- linux-2.6.22-rc4.orig/ipc/shm.c	2007-06-04 17:57:25.000000000 -0700
> > +++ linux-2.6.22-rc4/ipc/shm.c	2007-06-06 08:23:57.000000000 -0700
> > @@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
> >  	shp->shm_nattch = 0;
> >  	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
> >  	shp->shm_file = file;
> > +	file->f_dentry->d_inode->i_ino = shp->id;
> >  
> >  	ns->shm_tot += numpages;
> >  	shm_unlock(shp);
> > 
> > 


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 19:59             ` Badari Pulavarty
@ 2007-06-07 20:37               ` Serge E. Hallyn
  2007-06-07 20:51                 ` Serge E. Hallyn
  2007-06-07 21:16                 ` Badari Pulavarty
  0 siblings, 2 replies; 34+ messages in thread
From: Serge E. Hallyn @ 2007-06-07 20:37 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: Andrew Morton, Albert Cahalan, lkml, linux-mm, ebiederm, torvalds

Quoting Badari Pulavarty (pbadari@us.ibm.com):
> On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > On Thu, 07 Jun 2007 10:06:37 -0700
> > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > 
> > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > 
> > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > of name instead of forcing to be as inode number. It should be
> > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > 
> > > > It is not at all nice.
> > > > 
> > > > 1. it's incompatible ABI breakage
> > > > 2. where will you put the key then, in the inode? :-)
> > > 
> > > Nope. Currently "key" is part of the name (but its not unique).
> > > 
> > > > 
> > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > is not some random sandbox to be playing in.
> > > > 
> > > > Before you go messing with it, note that the device number
> > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > That's how one knows that /SYSV00000000 is not just
> > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > (and no you can't fix that now; it's way too late)
> > > > 
> > > > Next time you feel like breaking an ABI, mind putting
> > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > 
> > > I am not breaking ABI. Its already broken in the current
> > > mainline. I am trying to fix it by putting back the ino#
> > > as shmid. Eric had a suggestion that, instead of depending
> > > on the inode# to be shmid, we could embed shmid into name
> > > (instead of "key" which is currently not unique).
> > > 
> > > > BTW, I suspect this kind of thing also breaks:
> > > > a. fuser, lsof, and other resource usage display tools
> > > > b. various obscure emulators (similar to valgrind)
> > > 
> > > If you strongly feel that "old" behaviour needs to be retained, 
> > 
> > yup, we should put it back.  The change was, afaik, accidental.
> > 
> > > here is the patch I originally suggested.
> > 
> > Confused.  Will this one-liner fix all the userspace breakage to which
> > Albert refers?
> 
> Yes. Albert, please correct me if I am wrong.

It will, but could lead to two different inodes with the same i_ino,
right?

thanks,
-serge

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 20:37               ` Serge E. Hallyn
@ 2007-06-07 20:51                 ` Serge E. Hallyn
  2007-06-07 21:16                 ` Badari Pulavarty
  1 sibling, 0 replies; 34+ messages in thread
From: Serge E. Hallyn @ 2007-06-07 20:51 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Badari Pulavarty, Andrew Morton, Albert Cahalan, lkml, linux-mm,
	ebiederm, torvalds

Quoting Serge E. Hallyn (serge@hallyn.com):
> Quoting Badari Pulavarty (pbadari@us.ibm.com):
> > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > 
> > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > > 
> > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > 
> > > > > It is not at all nice.
> > > > > 
> > > > > 1. it's incompatible ABI breakage
> > > > > 2. where will you put the key then, in the inode? :-)
> > > > 
> > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > 
> > > > > 
> > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > is not some random sandbox to be playing in.
> > > > > 
> > > > > Before you go messing with it, note that the device number
> > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > That's how one knows that /SYSV00000000 is not just
> > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > (and no you can't fix that now; it's way too late)
> > > > > 
> > > > > Next time you feel like breaking an ABI, mind putting
> > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > 
> > > > I am not breaking ABI. Its already broken in the current
> > > > mainline. I am trying to fix it by putting back the ino#
> > > > as shmid. Eric had a suggestion that, instead of depending
> > > > on the inode# to be shmid, we could embed shmid into name
> > > > (instead of "key" which is currently not unique).
> > > > 
> > > > > BTW, I suspect this kind of thing also breaks:
> > > > > a. fuser, lsof, and other resource usage display tools
> > > > > b. various obscure emulators (similar to valgrind)
> > > > 
> > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > 
> > > yup, we should put it back.  The change was, afaik, accidental.
> > > 
> > > > here is the patch I originally suggested.
> > > 
> > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > Albert refers?
> > 
> > Yes. Albert, please correct me if I am wrong.
> 
> It will, but could lead to two different inodes with the same i_ino,
> right?

Well I guess it's not *technically* a problem since these inodes are
never hashed.

-serge

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 20:37               ` Serge E. Hallyn
  2007-06-07 20:51                 ` Serge E. Hallyn
@ 2007-06-07 21:16                 ` Badari Pulavarty
  2007-06-07 22:08                   ` Serge E. Hallyn
  1 sibling, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-07 21:16 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Albert Cahalan, lkml, linux-mm, ebiederm, torvalds

On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
> Quoting Badari Pulavarty (pbadari@us.ibm.com):
> > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > 
> > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > > 
> > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > 
> > > > > It is not at all nice.
> > > > > 
> > > > > 1. it's incompatible ABI breakage
> > > > > 2. where will you put the key then, in the inode? :-)
> > > > 
> > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > 
> > > > > 
> > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > is not some random sandbox to be playing in.
> > > > > 
> > > > > Before you go messing with it, note that the device number
> > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > That's how one knows that /SYSV00000000 is not just
> > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > (and no you can't fix that now; it's way too late)
> > > > > 
> > > > > Next time you feel like breaking an ABI, mind putting
> > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > 
> > > > I am not breaking ABI. Its already broken in the current
> > > > mainline. I am trying to fix it by putting back the ino#
> > > > as shmid. Eric had a suggestion that, instead of depending
> > > > on the inode# to be shmid, we could embed shmid into name
> > > > (instead of "key" which is currently not unique).
> > > > 
> > > > > BTW, I suspect this kind of thing also breaks:
> > > > > a. fuser, lsof, and other resource usage display tools
> > > > > b. various obscure emulators (similar to valgrind)
> > > > 
> > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > 
> > > yup, we should put it back.  The change was, afaik, accidental.
> > > 
> > > > here is the patch I originally suggested.
> > > 
> > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > Albert refers?
> > 
> > Yes. Albert, please correct me if I am wrong.
> 
> It will, but could lead to two different inodes with the same i_ino,
> right?

Only if we generate same ID in two different namespaces. Is it currently
possible ? 

Thanks,
Badari



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 21:16                 ` Badari Pulavarty
@ 2007-06-07 22:08                   ` Serge E. Hallyn
  2007-06-07 22:21                     ` Badari Pulavarty
  2007-06-07 22:22                     ` Serge E. Hallyn
  0 siblings, 2 replies; 34+ messages in thread
From: Serge E. Hallyn @ 2007-06-07 22:08 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: Serge E. Hallyn, Andrew Morton, Albert Cahalan, lkml, linux-mm,
	ebiederm, torvalds

Quoting Badari Pulavarty (pbadari@us.ibm.com):
> On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
> > Quoting Badari Pulavarty (pbadari@us.ibm.com):
> > > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > 
> > > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > > > 
> > > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > > 
> > > > > > It is not at all nice.
> > > > > > 
> > > > > > 1. it's incompatible ABI breakage
> > > > > > 2. where will you put the key then, in the inode? :-)
> > > > > 
> > > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > > 
> > > > > > 
> > > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > > is not some random sandbox to be playing in.
> > > > > > 
> > > > > > Before you go messing with it, note that the device number
> > > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > > That's how one knows that /SYSV00000000 is not just
> > > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > > (and no you can't fix that now; it's way too late)
> > > > > > 
> > > > > > Next time you feel like breaking an ABI, mind putting
> > > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > > 
> > > > > I am not breaking ABI. Its already broken in the current
> > > > > mainline. I am trying to fix it by putting back the ino#
> > > > > as shmid. Eric had a suggestion that, instead of depending
> > > > > on the inode# to be shmid, we could embed shmid into name
> > > > > (instead of "key" which is currently not unique).
> > > > > 
> > > > > > BTW, I suspect this kind of thing also breaks:
> > > > > > a. fuser, lsof, and other resource usage display tools
> > > > > > b. various obscure emulators (similar to valgrind)
> > > > > 
> > > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > > 
> > > > yup, we should put it back.  The change was, afaik, accidental.
> > > > 
> > > > > here is the patch I originally suggested.
> > > > 
> > > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > > Albert refers?
> > > 
> > > Yes. Albert, please correct me if I am wrong.
> > 
> > It will, but could lead to two different inodes with the same i_ino,
> > right?
> 
> Only if we generate same ID in two different namespaces. Is it currently
> possible ? 

Should be nothing stopping it.

But like I say we never find the inode based on i_ino, and don't hash
the inode, so it might be ok.

-serge

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 22:08                   ` Serge E. Hallyn
@ 2007-06-07 22:21                     ` Badari Pulavarty
  2007-06-07 22:22                     ` Serge E. Hallyn
  1 sibling, 0 replies; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-07 22:21 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Albert Cahalan, lkml, linux-mm, ebiederm, torvalds



Serge E. Hallyn wrote:

>Quoting Badari Pulavarty (pbadari@us.ibm.com):
>
>>On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
>>
>>>Quoting Badari Pulavarty (pbadari@us.ibm.com):
>>>
>>>>On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
>>>>
>>>>>On Thu, 07 Jun 2007 10:06:37 -0700
>>>>>Badari Pulavarty <pbadari@us.ibm.com> wrote:
>>>>>
>>>>>>On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
>>>>>>
>>>>>>>On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
>>>>>>>
>>>>>>>>BTW, I agree with Eric that its would be nice to use shmid as part
>>>>>>>>of name instead of forcing to be as inode number. It should be
>>>>>>>>possible for pmap to workout shmid from "key" or name. Isn't it ?
>>>>>>>>
>>>>>>>It is not at all nice.
>>>>>>>
>>>>>>>1. it's incompatible ABI breakage
>>>>>>>2. where will you put the key then, in the inode? :-)
>>>>>>>
>>>>>>Nope. Currently "key" is part of the name (but its not unique).
>>>>>>
>>>>>>>Changing to "SYSVID%d" is no good either. Look, people
>>>>>>>are ***parsing*** this stuff in /proc. The /proc filesystem
>>>>>>>is not some random sandbox to be playing in.
>>>>>>>
>>>>>>>Before you go messing with it, note that the device number
>>>>>>>also matters. (it's per-boot dynamic, but that's OK)
>>>>>>>That's how one knows that /SYSV00000000 is not just
>>>>>>>a regular file; sadly these didn't get a non-/ prefix.
>>>>>>>(and no you can't fix that now; it's way too late)
>>>>>>>
>>>>>>>Next time you feel like breaking an ABI, mind putting
>>>>>>>"LET'S BREAK AN ABI!" in the subject of your email?
>>>>>>>
>>>>>>I am not breaking ABI. Its already broken in the current
>>>>>>mainline. I am trying to fix it by putting back the ino#
>>>>>>as shmid. Eric had a suggestion that, instead of depending
>>>>>>on the inode# to be shmid, we could embed shmid into name
>>>>>>(instead of "key" which is currently not unique).
>>>>>>
>>>>>>>BTW, I suspect this kind of thing also breaks:
>>>>>>>a. fuser, lsof, and other resource usage display tools
>>>>>>>b. various obscure emulators (similar to valgrind)
>>>>>>>
>>>>>>If you strongly feel that "old" behaviour needs to be retained, 
>>>>>>
>>>>>yup, we should put it back.  The change was, afaik, accidental.
>>>>>
>>>>>>here is the patch I originally suggested.
>>>>>>
>>>>>Confused.  Will this one-liner fix all the userspace breakage to which
>>>>>Albert refers?
>>>>>
>>>>Yes. Albert, please correct me if I am wrong.
>>>>
>>>It will, but could lead to two different inodes with the same i_ino,
>>>right?
>>>
>>Only if we generate same ID in two different namespaces. Is it currently
>>possible ? 
>>
>
>Should be nothing stopping it.
>
>But like I say we never find the inode based on i_ino, and don't hash
>the inode, so it might be ok.
>
Correct. We might end up with same shmid - which mean same inode# shows 
up in /proc/pid/maps.
If we don't unshare pid namespace or look from parent namespace - we 
will end up seeing same
shmid/inode# in different /proc/pid/maps, even though they are 
different. But I guess its okay..

Thanks,
Badari




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 22:08                   ` Serge E. Hallyn
  2007-06-07 22:21                     ` Badari Pulavarty
@ 2007-06-07 22:22                     ` Serge E. Hallyn
  2007-06-07 23:57                       ` Badari Pulavarty
  1 sibling, 1 reply; 34+ messages in thread
From: Serge E. Hallyn @ 2007-06-07 22:22 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Badari Pulavarty, Andrew Morton, Albert Cahalan, lkml, linux-mm,
	ebiederm, torvalds

Quoting Serge E. Hallyn (serge@hallyn.com):
> Quoting Badari Pulavarty (pbadari@us.ibm.com):
> > On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
> > > Quoting Badari Pulavarty (pbadari@us.ibm.com):
> > > > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > > > Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > > 
> > > > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > > > On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > > > > > 
> > > > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > > > 
> > > > > > > It is not at all nice.
> > > > > > > 
> > > > > > > 1. it's incompatible ABI breakage
> > > > > > > 2. where will you put the key then, in the inode? :-)
> > > > > > 
> > > > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > > > 
> > > > > > > 
> > > > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > > > is not some random sandbox to be playing in.
> > > > > > > 
> > > > > > > Before you go messing with it, note that the device number
> > > > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > > > That's how one knows that /SYSV00000000 is not just
> > > > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > > > (and no you can't fix that now; it's way too late)
> > > > > > > 
> > > > > > > Next time you feel like breaking an ABI, mind putting
> > > > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > > > 
> > > > > > I am not breaking ABI. Its already broken in the current
> > > > > > mainline. I am trying to fix it by putting back the ino#
> > > > > > as shmid. Eric had a suggestion that, instead of depending
> > > > > > on the inode# to be shmid, we could embed shmid into name
> > > > > > (instead of "key" which is currently not unique).
> > > > > > 
> > > > > > > BTW, I suspect this kind of thing also breaks:
> > > > > > > a. fuser, lsof, and other resource usage display tools
> > > > > > > b. various obscure emulators (similar to valgrind)
> > > > > > 
> > > > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > > > 
> > > > > yup, we should put it back.  The change was, afaik, accidental.
> > > > > 
> > > > > > here is the patch I originally suggested.
> > > > > 
> > > > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > > > Albert refers?
> > > > 
> > > > Yes. Albert, please correct me if I am wrong.
> > > 
> > > It will, but could lead to two different inodes with the same i_ino,
> > > right?
> > 
> > Only if we generate same ID in two different namespaces. Is it currently
> > possible ? 
> 
> Should be nothing stopping it.

(just to be more certain, a quick test showed I can get id 0 for
different keys, and different ids for the same key 0xff, in different
ipc namespaces)

-serge

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 22:22                     ` Serge E. Hallyn
@ 2007-06-07 23:57                       ` Badari Pulavarty
  0 siblings, 0 replies; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-07 23:57 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Albert Cahalan, lkml, linux-mm, ebiederm, torvalds



Serge E. Hallyn wrote:

>Quoting Serge E. Hallyn (serge@hallyn.com):
>
>>Quoting Badari Pulavarty (pbadari@us.ibm.com):
>>
>>>On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
>>>
>>>>Quoting Badari Pulavarty (pbadari@us.ibm.com):
>>>>
>>>>>On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
>>>>>
>>>>>>On Thu, 07 Jun 2007 10:06:37 -0700
>>>>>>Badari Pulavarty <pbadari@us.ibm.com> wrote:
>>>>>>
>>>>>>>On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
>>>>>>>
>>>>>>>>On 6/7/07, Badari Pulavarty <pbadari@us.ibm.com> wrote:
>>>>>>>>
>>>>>>>>>BTW, I agree with Eric that its would be nice to use shmid as part
>>>>>>>>>of name instead of forcing to be as inode number. It should be
>>>>>>>>>possible for pmap to workout shmid from "key" or name. Isn't it ?
>>>>>>>>>
>>>>>>>>It is not at all nice.
>>>>>>>>
>>>>>>>>1. it's incompatible ABI breakage
>>>>>>>>2. where will you put the key then, in the inode? :-)
>>>>>>>>
>>>>>>>Nope. Currently "key" is part of the name (but its not unique).
>>>>>>>
>>>>>>>>Changing to "SYSVID%d" is no good either. Look, people
>>>>>>>>are ***parsing*** this stuff in /proc. The /proc filesystem
>>>>>>>>is not some random sandbox to be playing in.
>>>>>>>>
>>>>>>>>Before you go messing with it, note that the device number
>>>>>>>>also matters. (it's per-boot dynamic, but that's OK)
>>>>>>>>That's how one knows that /SYSV00000000 is not just
>>>>>>>>a regular file; sadly these didn't get a non-/ prefix.
>>>>>>>>(and no you can't fix that now; it's way too late)
>>>>>>>>
>>>>>>>>Next time you feel like breaking an ABI, mind putting
>>>>>>>>"LET'S BREAK AN ABI!" in the subject of your email?
>>>>>>>>
>>>>>>>I am not breaking ABI. Its already broken in the current
>>>>>>>mainline. I am trying to fix it by putting back the ino#
>>>>>>>as shmid. Eric had a suggestion that, instead of depending
>>>>>>>on the inode# to be shmid, we could embed shmid into name
>>>>>>>(instead of "key" which is currently not unique).
>>>>>>>
>>>>>>>>BTW, I suspect this kind of thing also breaks:
>>>>>>>>a. fuser, lsof, and other resource usage display tools
>>>>>>>>b. various obscure emulators (similar to valgrind)
>>>>>>>>
>>>>>>>If you strongly feel that "old" behaviour needs to be retained, 
>>>>>>>
>>>>>>yup, we should put it back.  The change was, afaik, accidental.
>>>>>>
>>>>>>>here is the patch I originally suggested.
>>>>>>>
>>>>>>Confused.  Will this one-liner fix all the userspace breakage to which
>>>>>>Albert refers?
>>>>>>
>>>>>Yes. Albert, please correct me if I am wrong.
>>>>>
>>>>It will, but could lead to two different inodes with the same i_ino,
>>>>right?
>>>>
>>>Only if we generate same ID in two different namespaces. Is it currently
>>>possible ? 
>>>
>>Should be nothing stopping it.
>>
>
>(just to be more certain, a quick test showed I can get id 0 for
>different keys, and different ids for the same key 0xff, in different
>ipc namespaces)
>
Funny. I played with it and decided that it can happen :)

Thanks,
Badari




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-07 16:20     ` Serge E. Hallyn
@ 2007-06-08  3:45       ` Eric W. Biederman
  2007-06-08  4:41         ` Albert Cahalan
  2007-06-08 16:07         ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
  0 siblings, 2 replies; 34+ messages in thread
From: Eric W. Biederman @ 2007-06-08  3:45 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Albert Cahalan, Andrew Morton, linux-kernel, linux-mm, pbadari, torvalds

"Serge E. Hallyn" <serge@hallyn.com> writes:

> Ok, so IIUC the problem was that inode->i_ino was being set to the id,
> and the id can be the same for different things in two namespaces.

There is nothing preventing inode number collisions in this code even
without multiple namespaces, and even when it was functioning
correctly.  However as it does not seem possible to find these files
through normal filesystem operations that does not seem to be a problem.

> So aside from not using the id as inode->i_ino, an alternative is to use
> a separate superblock, spearate mqeueue fs, for each ipc ns.
>
> I haven't looked at that enough to see whether it's feasible, i.e. I 
> don't know what else mqueue fs is used for.  Eric, does that sound
> reasonable to you?

At this point given that we actually have a small user space dependency
and the fact that after I have reviewed the code it looks harmless to
change the inode number of those inodes, in both cases they are just
anonymous inodes generated with new_inode, and anything that we wrap
is likely to be equally so.

So it looks to me like we need to do three things:
- Fix the inode number
- Fix the name on the hugetlbfs dentry to hold the key
- Add a big fat comment that user space programs depend on this
  behavior of both the dentry name and the inode number.

So Badari it looks like your original patch plus a little bit is
what we need.

Eric

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-08  3:45       ` Eric W. Biederman
@ 2007-06-08  4:41         ` Albert Cahalan
  2007-06-08  5:55           ` Eric W. Biederman
  2007-06-08 16:07         ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
  1 sibling, 1 reply; 34+ messages in thread
From: Albert Cahalan @ 2007-06-08  4:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Andrew Morton, linux-kernel, linux-mm, pbadari,
	torvalds

On 6/7/07, Eric W. Biederman <ebiederm@xmission.com> wrote:

> So it looks to me like we need to do three things:
> - Fix the inode number
> - Fix the name on the hugetlbfs dentry to hold the key
> - Add a big fat comment that user space programs depend on this
>   behavior of both the dentry name and the inode number.

Assuming that this proposed fix goes in:

Since the inode number is the shmid, and this is a number
that the kernel randomly chooses AFAIK, there should be
no need to have different shm segments sharing the same
inode number.

The situation with the key is a bit more disturbing, though
we already hit that anyway when IPC_PRIVATE is used.
(why anybody would NOT use IPC_PRIVATE is a mystery)
So having the key in the name doesn't make things worse.

I have some concern about the device minor number.
This should be the same for all shm mappings; I do not
know if the behavior changed.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-08  4:41         ` Albert Cahalan
@ 2007-06-08  5:55           ` Eric W. Biederman
  2007-06-08  6:51             ` Albert Cahalan
  0 siblings, 1 reply; 34+ messages in thread
From: Eric W. Biederman @ 2007-06-08  5:55 UTC (permalink / raw)
  To: Albert Cahalan
  Cc: Serge E. Hallyn, Andrew Morton, linux-kernel, linux-mm, pbadari,
	torvalds

"Albert Cahalan" <acahalan@gmail.com> writes:

> On 6/7/07, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
>> So it looks to me like we need to do three things:
>> - Fix the inode number
>> - Fix the name on the hugetlbfs dentry to hold the key
>> - Add a big fat comment that user space programs depend on this
>>   behavior of both the dentry name and the inode number.
>
> Assuming that this proposed fix goes in:
>
> Since the inode number is the shmid, and this is a number
> that the kernel randomly chooses AFAIK, there should be
> no need to have different shm segments sharing the same
> inode number.

Where we run into inode number confusion is that all of
these shm segments are actually files on a tmpfs filesystem
somewhere, and by making the inode number the shmid we loose
the tmpfs inode number.  So it is possible we get tmpfs inode
number conflicts.  However the inode number is not used for
anything, and the files are not visible in any other way except
as shm segments so it doesn't matter.

There is another case with ipc namespaces where we ultimately need
to support duplicate shmids on the same machine (so migration
is a possibility).  However by and large the user space
processes with duplicate ids should be invisible to each other.

> The situation with the key is a bit more disturbing, though
> we already hit that anyway when IPC_PRIVATE is used.
> (why anybody would NOT use IPC_PRIVATE is a mystery)
> So having the key in the name doesn't make things worse.

Having "SYSV" in the name appears mandatory.  Otherwise you
don't even know it is a shm file. Although I may be confused.

> I have some concern about the device minor number.
> This should be the same for all shm mappings; I do not
> know if the behavior changed.

So I haven't changed anything here.  But I haven't really
looked either.

I don't have a clue if hugetlbfs files use the same device minor
number as tmpfs files.

Hmm.  Thinking about this I have just realized that we may want
to approach this a little differently.  Currently I am reusing
the dentry and inode structure that hugetlbfs and tmpfs return
me, and simply have a distinct struct file for each shm mapping.

There is a little more cost but it may actually make sense to have
a dentry and inode that is specific to shm.c so we can do whatever
we need to without adding requirements to the normal tmpfs or hugtlb
code.

Eric

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-08  5:55           ` Eric W. Biederman
@ 2007-06-08  6:51             ` Albert Cahalan
  2007-06-08 22:31               ` [PATCH] Restore shmid as inode# to fix /proc/pid/maps ABI breakage Badari Pulavarty
  0 siblings, 1 reply; 34+ messages in thread
From: Albert Cahalan @ 2007-06-08  6:51 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Andrew Morton, linux-kernel, linux-mm, pbadari,
	torvalds

On 6/8/07, Eric W. Biederman <ebiederm@xmission.com> wrote:
> "Albert Cahalan" <acahalan@gmail.com> writes:
> > On 6/7/07, Eric W. Biederman <ebiederm@xmission.com> wrote:

> >> So it looks to me like we need to do three things:
> >> - Fix the inode number
> >> - Fix the name on the hugetlbfs dentry to hold the key
> >> - Add a big fat comment that user space programs depend on this
> >>   behavior of both the dentry name and the inode number.
> >
> > Assuming that this proposed fix goes in:
> >
> > Since the inode number is the shmid, and this is a number
> > that the kernel randomly chooses AFAIK, there should be
> > no need to have different shm segments sharing the same
> > inode number.
>
> Where we run into inode number confusion is that all of
> these shm segments are actually files on a tmpfs filesystem
> somewhere, and by making the inode number the shmid we loose
> the tmpfs inode number.  So it is possible we get tmpfs inode
> number conflicts.  However the inode number is not used for
> anything, and the files are not visible in any other way except
> as shm segments so it doesn't matter.

Eh, the kernel choses both shmid and tmpfs inode number.
You could set a high bit in one or the other.

> There is another case with ipc namespaces where we ultimately need
> to support duplicate shmids on the same machine (so migration
> is a possibility).  However by and large the user space
> processes with duplicate ids should be invisible to each other.

On the bright side, this only screws up people who get the
crazy idea that processes can be migrated.

> > The situation with the key is a bit more disturbing, though
> > we already hit that anyway when IPC_PRIVATE is used.
> > (why anybody would NOT use IPC_PRIVATE is a mystery)
> > So having the key in the name doesn't make things worse.
>
> Having "SYSV" in the name appears mandatory.  Otherwise you
> don't even know it is a shm file. Although I may be confused.

It's mandatory for a different reason: to satisfy parsers.

It is nearly useless for identifying shm files. Look what I can do:
    touch /SYSV00000000
    touch '/SYSV00000000 (deleted)'

(so pmap creates a shm, looks for the address in /proc/self/maps,
determines the device major/minor in use, and then uses that)

> Hmm.  Thinking about this I have just realized that we may want
> to approach this a little differently.  Currently I am reusing
> the dentry and inode structure that hugetlbfs and tmpfs return
> me, and simply have a distinct struct file for each shm mapping.
>
> There is a little more cost but it may actually make sense to have
> a dentry and inode that is specific to shm.c so we can do whatever
> we need to without adding requirements to the normal tmpfs or hugtlb
> code.

Piggybacking on tmpfs has always seemed a bit dirty to me.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-08  3:45       ` Eric W. Biederman
  2007-06-08  4:41         ` Albert Cahalan
@ 2007-06-08 16:07         ` Badari Pulavarty
  2007-06-08 23:43           ` [PATCH] shm: Fix the filename of hugetlb sysv shared memory Eric W. Biederman
  1 sibling, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-08 16:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Albert Cahalan, Andrew Morton, linux-kernel,
	linux-mm, torvalds



Eric W. Biederman wrote:

>
>At this point given that we actually have a small user space dependency
>and the fact that after I have reviewed the code it looks harmless to
>change the inode number of those inodes, in both cases they are just
>anonymous inodes generated with new_inode, and anything that we wrap
>is likely to be equally so.
>
>So it looks to me like we need to do three things:
>- Fix the inode number
>
Okay. its already done.

>
>- Fix the name on the hugetlbfs dentry to hold the key
>
I don't see need for doing this for hugetlbfs inodes. Currently, they 
don't base their
name on "key" + basing on the "key" is kind of useless anyway (its not 
unique).

>
>- Add a big fat comment that user space programs depend on this
>  behavior of both the dentry name and the inode number.
>
I don't think, the user-space can depend on the dentry-name. It can only 
depend
on inode# to match shmid. (since key is not unique esp. for key=0x00000000).

BTW, I agree that shmid is not unique even without namespaces as its 
based on
seq# and we wrap seq#.

Thanks,
Badari





^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] Restore shmid as inode# to fix /proc/pid/maps ABI breakage
  2007-06-08  6:51             ` Albert Cahalan
@ 2007-06-08 22:31               ` Badari Pulavarty
  0 siblings, 0 replies; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-08 22:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Albert Cahalan, Eric W. Biederman, lkml, linux-mm, torvalds

Andrew,

Can you include this in -mm ?

Thanks,
Badari

shmid used to be stored as inode# for shared memory segments. Some of
the proc-ps tools use this from /proc/pid/maps.  Recent cleanups
to newseg() changed it.  This patch sets inode number back to shared 
memory id to fix breakage.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>

Index: linux-2.6.22-rc4/ipc/shm.c
===================================================================
--- linux-2.6.22-rc4.orig/ipc/shm.c	2007-06-08 15:17:20.000000000 -0700
+++ linux-2.6.22-rc4/ipc/shm.c	2007-06-08 15:19:38.000000000 -0700
@@ -397,6 +397,11 @@ static int newseg (struct ipc_namespace 
 	shp->shm_nattch = 0;
 	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
 	shp->shm_file = file;
+	/*
+	 * shmid gets reported as "inode#" in /proc/pid/maps.
+	 * proc-ps tools use this. Changing this will break them.
+	 */
+	file->f_dentry->d_inode->i_ino = shp->id;
 
 	ns->shm_tot += numpages;
 	shm_unlock(shp);



^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-08 16:07         ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
@ 2007-06-08 23:43           ` Eric W. Biederman
  2007-06-08 23:55             ` Andrew Morton
                               ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Eric W. Biederman @ 2007-06-08 23:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Serge E. Hallyn, Albert Cahalan, linux-kernel, linux-mm,
	torvalds, Badari Pulavarty


Some user space tools need to identify SYSV shared memory when
examining /proc/<pid>/maps.  To do so they look for a block device
with major zero, a dentry named SYSV<sysv key>, and having the minor of
the internal sysv shared memory kernel mount.

To help these tools and to make it easier for people just browsing
/proc/<pid>/maps this patch modifies hugetlb sysv shared memory to
use the SYSV<key> dentry naming convention.

User space tools will still have to be aware that hugetlb sysv
shared memory lives on a different internal kernel mount and so
has a different block device minor number from the rest of sysv
shared memory.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/hugetlbfs/inode.c    |    7 ++-----
 include/linux/hugetlb.h |    4 ++--
 ipc/shm.c               |    6 +++---
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index aa083dd..e6b46b3 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -736,15 +736,13 @@ static int can_do_hugetlb_shm(void)
 			can_do_mlock());
 }
 
-struct file *hugetlb_zero_setup(size_t size)
+struct file *hugetlb_file_setup(const char *name, size_t size)
 {
 	int error = -ENOMEM;
 	struct file *file;
 	struct inode *inode;
 	struct dentry *dentry, *root;
 	struct qstr quick_string;
-	char buf[16];
-	static atomic_t counter;
 
 	if (!hugetlbfs_vfsmount)
 		return ERR_PTR(-ENOENT);
@@ -756,8 +754,7 @@ struct file *hugetlb_zero_setup(size_t size)
 		return ERR_PTR(-ENOMEM);
 
 	root = hugetlbfs_vfsmount->mnt_root;
-	snprintf(buf, 16, "%u", atomic_inc_return(&counter));
-	quick_string.name = buf;
+	quick_string.name = name;
 	quick_string.len = strlen(quick_string.name);
 	quick_string.hash = 0;
 	dentry = d_alloc(root, &quick_string);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index b4570b6..2c13715 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -163,7 +163,7 @@ static inline struct hugetlbfs_sb_info *HUGETLBFS_SB(struct super_block *sb)
 
 extern const struct file_operations hugetlbfs_file_operations;
 extern struct vm_operations_struct hugetlb_vm_ops;
-struct file *hugetlb_zero_setup(size_t);
+struct file *hugetlb_file_setup(const char *name, size_t);
 int hugetlb_get_quota(struct address_space *mapping);
 void hugetlb_put_quota(struct address_space *mapping);
 
@@ -185,7 +185,7 @@ static inline void set_file_hugepages(struct file *file)
 
 #define is_file_hugepages(file)		0
 #define set_file_hugepages(file)	BUG()
-#define hugetlb_zero_setup(size)	ERR_PTR(-ENOSYS)
+#define hugetlb_file_setup(name,size)	ERR_PTR(-ENOSYS)
 
 #endif /* !CONFIG_HUGETLBFS */
 
diff --git a/ipc/shm.c b/ipc/shm.c
index 4fefbad..c31f743 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -364,9 +364,10 @@ static int newseg (struct ipc_namespace *ns, key_t key, int shmflg, size_t size)
 		return error;
 	}
 
+	sprintf (name, "SYSV%08x", key);
 	if (shmflg & SHM_HUGETLB) {
-		/* hugetlb_zero_setup takes care of mlock user accounting */
-		file = hugetlb_zero_setup(size);
+		/* hugetlb_file_setup takes care of mlock user accounting */
+		file = hugetlb_file_setup(name, size);
 		shp->mlock_user = current->user;
 	} else {
 		int acctflag = VM_ACCOUNT;
@@ -377,7 +378,6 @@ static int newseg (struct ipc_namespace *ns, key_t key, int shmflg, size_t size)
 		if  ((shmflg & SHM_NORESERVE) &&
 				sysctl_overcommit_memory != OVERCOMMIT_NEVER)
 			acctflag = 0;
-		sprintf (name, "SYSV%08x", key);
 		file = shmem_file_setup(name, size, acctflag);
 	}
 	error = PTR_ERR(file);
-- 
1.5.1.1.181.g2de0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-08 23:43           ` [PATCH] shm: Fix the filename of hugetlb sysv shared memory Eric W. Biederman
@ 2007-06-08 23:55             ` Andrew Morton
  2007-06-09  4:32               ` Badari Pulavarty
  2007-06-11 18:11             ` Andrew Morton
  2007-06-11 19:00             ` Adam Litke
  2 siblings, 1 reply; 34+ messages in thread
From: Andrew Morton @ 2007-06-08 23:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Albert Cahalan, linux-kernel, linux-mm,
	torvalds, Badari Pulavarty

On Fri, 08 Jun 2007 17:43:34 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Some user space tools need to identify SYSV shared memory when
> examining /proc/<pid>/maps.  To do so they look for a block device
> with major zero, a dentry named SYSV<sysv key>, and having the minor of
> the internal sysv shared memory kernel mount.
> 
> To help these tools and to make it easier for people just browsing
> /proc/<pid>/maps this patch modifies hugetlb sysv shared memory to
> use the SYSV<key> dentry naming convention.
> 
> User space tools will still have to be aware that hugetlb sysv
> shared memory lives on a different internal kernel mount and so
> has a different block device minor number from the rest of sysv
> shared memory.

I assume this fix is preferred over Badari's?  If so, why?



From: Badari Pulavarty <pbadari@us.ibm.com>

shmid used to be stored as inode# for shared memory segments. Some of
the proc-ps tools use this from /proc/pid/maps.  Recent cleanups
to newseg() changed it.  This patch sets inode number back to shared
memory id to fix breakage.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: "Albert Cahalan" <acahalan@gmail.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 ipc/shm.c |    5 +++++
 1 files changed, 5 insertions(+)

diff -puN ipc/shm.c~restore-shmid-as-inode-to-fix-proc-pid-maps-abi-breakage ipc/shm.c
--- a/ipc/shm.c~restore-shmid-as-inode-to-fix-proc-pid-maps-abi-breakage
+++ a/ipc/shm.c
@@ -397,6 +397,11 @@ static int newseg (struct ipc_namespace 
 	shp->shm_nattch = 0;
 	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
 	shp->shm_file = file;
+	/*
+	 * shmid gets reported as "inode#" in /proc/pid/maps.
+	 * proc-ps tools use this. Changing this will break them.
+	 */
+	file->f_dentry->d_inode->i_ino = shp->id;
 
 	ns->shm_tot += numpages;
 	shm_unlock(shp);
_


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-08 23:55             ` Andrew Morton
@ 2007-06-09  4:32               ` Badari Pulavarty
  2007-06-09  8:01                 ` Eric W. Biederman
  0 siblings, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-09  4:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Serge E. Hallyn, Albert Cahalan, linux-kernel,
	linux-mm, torvalds



Andrew Morton wrote:

>On Fri, 08 Jun 2007 17:43:34 -0600
>ebiederm@xmission.com (Eric W. Biederman) wrote:
>
>>Some user space tools need to identify SYSV shared memory when
>>examining /proc/<pid>/maps.  To do so they look for a block device
>>with major zero, a dentry named SYSV<sysv key>, and having the minor of
>>the internal sysv shared memory kernel mount.
>>
>>To help these tools and to make it easier for people just browsing
>>/proc/<pid>/maps this patch modifies hugetlb sysv shared memory to
>>use the SYSV<key> dentry naming convention.
>>
>>User space tools will still have to be aware that hugetlb sysv
>>shared memory lives on a different internal kernel mount and so
>>has a different block device minor number from the rest of sysv
>>shared memory.
>>
>
>I assume this fix is preferred over Badari's?  If so, why?
>
No. You still need my patch to fix the current breakage.

This patch makes hugetlbfs also use same naming convention as regular 
shmem for its
name. This is not absolutely needed, its a nice to have. Currently, user 
space tools
can't depend on the filename alone, since its not unique (based on kry).

Thanks,
Badari

>



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-09  4:32               ` Badari Pulavarty
@ 2007-06-09  8:01                 ` Eric W. Biederman
  0 siblings, 0 replies; 34+ messages in thread
From: Eric W. Biederman @ 2007-06-09  8:01 UTC (permalink / raw)
  To: Badari Pulavarty
  Cc: Andrew Morton, Serge E. Hallyn, Albert Cahalan, linux-kernel,
	linux-mm, torvalds

Badari Pulavarty <pbadari@us.ibm.com> writes:

> No. You still need my patch to fix the current breakage.

Agreed.

> This patch makes hugetlbfs also use same naming convention as regular shmem for
> its
> name. This is not absolutely needed, its a nice to have. Currently, user space
> tools
> can't depend on the filename alone, since its not unique (based on kry).

Exactly.  My patch is an additional fix/cleanup to bring the hugetlbfs
shm segments as close to their normal counterparts as I can.

pmap still won't recognize them as shm segments (different block device
minor number) but otherwise they are now presented identically with
normal shm segments.

Eric

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-08 23:43           ` [PATCH] shm: Fix the filename of hugetlb sysv shared memory Eric W. Biederman
  2007-06-08 23:55             ` Andrew Morton
@ 2007-06-11 18:11             ` Andrew Morton
  2007-06-11 19:55               ` Badari Pulavarty
  2007-06-11 19:00             ` Adam Litke
  2 siblings, 1 reply; 34+ messages in thread
From: Andrew Morton @ 2007-06-11 18:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, Albert Cahalan, linux-kernel, linux-mm,
	torvalds, Badari Pulavarty

On Fri, 08 Jun 2007 17:43:34 -0600
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Some user space tools need to identify SYSV shared memory when
> examining /proc/<pid>/maps.  To do so they look for a block device
> with major zero, a dentry named SYSV<sysv key>, and having the minor of
> the internal sysv shared memory kernel mount.
> 
> To help these tools and to make it easier for people just browsing
> /proc/<pid>/maps this patch modifies hugetlb sysv shared memory to
> use the SYSV<key> dentry naming convention.
> 
> User space tools will still have to be aware that hugetlb sysv
> shared memory lives on a different internal kernel mount and so
> has a different block device minor number from the rest of sysv
> shared memory.

So..  I am sitting here believing that this patch and Badari's
restore-shmid-as-inode-to-fix-proc-pid-maps-abi-breakage.patch are both
needed in 2.6.22 and that they will fix all these issues up.

If that is untrue, someone please let us know..

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-08 23:43           ` [PATCH] shm: Fix the filename of hugetlb sysv shared memory Eric W. Biederman
  2007-06-08 23:55             ` Andrew Morton
  2007-06-11 18:11             ` Andrew Morton
@ 2007-06-11 19:00             ` Adam Litke
  2007-06-11 20:53               ` Ken Chen
  2 siblings, 1 reply; 34+ messages in thread
From: Adam Litke @ 2007-06-11 19:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Andrew Morton, Serge E. Hallyn, Albert Cahalan, linux-kernel,
	linux-mm, torvalds, Badari Pulavarty

On 6/8/07, Eric W. Biederman <ebiederm@xmission.com> wrote:
> -struct file *hugetlb_zero_setup(size_t size)
> +struct file *hugetlb_file_setup(const char *name, size_t size)

The bulk of this patch seems to handle renaming this function.  Is
that really necessary?

--
Adam Litke ( agl at us.ibm.com )
IBM Linux Technology Center

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-11 18:11             ` Andrew Morton
@ 2007-06-11 19:55               ` Badari Pulavarty
  0 siblings, 0 replies; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-11 19:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Serge E. Hallyn, Albert Cahalan, lkml,
	linux-mm, torvalds

On Mon, 2007-06-11 at 11:11 -0700, Andrew Morton wrote:
> On Fri, 08 Jun 2007 17:43:34 -0600
> ebiederm@xmission.com (Eric W. Biederman) wrote:
> 
> > Some user space tools need to identify SYSV shared memory when
> > examining /proc/<pid>/maps.  To do so they look for a block device
> > with major zero, a dentry named SYSV<sysv key>, and having the minor of
> > the internal sysv shared memory kernel mount.
> > 
> > To help these tools and to make it easier for people just browsing
> > /proc/<pid>/maps this patch modifies hugetlb sysv shared memory to
> > use the SYSV<key> dentry naming convention.
> > 
> > User space tools will still have to be aware that hugetlb sysv
> > shared memory lives on a different internal kernel mount and so
> > has a different block device minor number from the rest of sysv
> > shared memory.
> 
> So..  I am sitting here believing that this patch and Badari's
> restore-shmid-as-inode-to-fix-proc-pid-maps-abi-breakage.patch are both
> needed in 2.6.22 and that they will fix all these issues up.
> 
> If that is untrue, someone please let us know..

Andrew,

My restore-shmid-as-inode-to-fix-proc-pid-maps-abi-breakage.patch is
definitely needed for 2.6.22 to fix ABI issue.

Eric's patch goes beyond and provides same naming convention for
hugetlbfs backed shm segs (which we never did in the past). So,
its not absolutely need for 2.6.22. You can queue up for next 
release,  unless Albert really wants to extend proc-ps utils for
hugetlbfs segments too.

But, its very simple patch - you might as well push this too.

Thanks,
Badari


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH] shm: Fix the filename of hugetlb sysv shared memory
  2007-06-11 19:00             ` Adam Litke
@ 2007-06-11 20:53               ` Ken Chen
  0 siblings, 0 replies; 34+ messages in thread
From: Ken Chen @ 2007-06-11 20:53 UTC (permalink / raw)
  To: Adam Litke
  Cc: Eric W. Biederman, Andrew Morton, Serge E. Hallyn,
	Albert Cahalan, linux-kernel, linux-mm, torvalds,
	Badari Pulavarty

On 6/11/07, Adam Litke <aglitke@gmail.com> wrote:
> On 6/8/07, Eric W. Biederman <ebiederm@xmission.com> wrote:
> > -struct file *hugetlb_zero_setup(size_t size)
> > +struct file *hugetlb_file_setup(const char *name, size_t size)
>
> The bulk of this patch seems to handle renaming this function.  Is
> that really necessary?

It looks OK to me though, because the argument list to that function
is changed.  Avoid change the function name isn't going to reduce the
patch size either.  So we might just change the name as well to match
the function name shmem_file_setup().

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-06 17:37   ` Badari Pulavarty
@ 2007-06-06 18:24     ` Eric W. Biederman
  0 siblings, 0 replies; 34+ messages in thread
From: Eric W. Biederman @ 2007-06-06 18:24 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Andrew Morton, linux-mm, lkml

Badari Pulavarty <pbadari@us.ibm.com> writes:
>
> ------ Shared Memory Segments --------
> key        shmid      owner      perms      bytes      nattch     status
> 0x00000000 884737     db2inst1  767        33554432   13
> 0x00000000 950275     db2fenc1  701        23052288   13
>
> There is no unique way to identify them easily :(
>
> I guess, like you suggested, we can change the dentry name to use shmid
> instead of the portions of the "key" to make it unique. I think, I can 
> work out a patch for this.

Thanks.  That should be more robust.

Eric

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-06 17:02 ` Eric W. Biederman
@ 2007-06-06 17:37   ` Badari Pulavarty
  2007-06-06 18:24     ` Eric W. Biederman
  0 siblings, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-06 17:37 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andrew Morton, linux-mm, lkml

On Wed, 2007-06-06 at 11:02 -0600, Eric W. Biederman wrote:
> Badari Pulavarty <pbadari@us.ibm.com> writes:
> 
> > Hi Eric,
> >
> > Your recent cleanup to shm code, namely
> >
> > [PATCH] shm: make sysv ipc shared memory use stacked files
> >
> > took away one of the debugging feature for shm segments.
> > Originally, shmid were forced to be the inode numbers and
> > they show up in /proc/pid/maps for the process which mapped
> > this shared memory segments (vma listing). That way, its easy
> > to find out who all mapped this shared memory segment. Your
> > patchset, took away the inode# setting. So, we can't easily
> > match the shmem segments to /proc/pid/maps easily. (It was
> > really useful in tracking down a customer problem recently). 
> > Is this done deliberately ? Anything wrong in setting this back ?
> >
> > Comments ?
> >
> > Thanks,
> > Badari
> >
> > Without patch:
> > --------------
> >
> > # ipcs -m
> >
> > ------ Shared Memory Segments --------
> > key        shmid      owner      perms      bytes      nattch     status
> > 0x00000000 884737     db2inst1  767        33554432   13
> >
> > # grep 884737 /proc/*/maps
> > #
> >
> > With patch:
> > -----------
> >
> > # ipcs -m
> >
> > ------ Shared Memory Segments --------
> > key        shmid      owner      perms      bytes      nattch     status
> > 0x00000000 884737     db2inst1  767        33554432   13
> >
> > # grep 884737 /proc/*/maps
> > /proc/11110/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11111/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11112/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11113/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11114/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11115/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11116/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11117/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11118/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11121/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11122/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11124/maps:4000389c000-4000589c000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> > /proc/11575/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> > /SYSV00000000 (deleted)
> >
> >
> >
> > Here is the patch.
> >
> > "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> > memory (shmid). It was useful in debugging, but its changed recently. 
> > This patch sets inode number to shared memory id to match /proc/pid/maps.
> 
> Theoretically it makes the stacked file concept more brittle, because
> it means the lower layers can't care about their inode number.
> 
> We do need something to tie these things together.
> 
> So I suspect what makes most sense is to simply rename the dentry
> SYSVID<segmentid>

Yep. Currently, we use part of "key" as the dentry name. For example,

# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x083d0d74 851968     db2inst1  767        33554432   13

# grep 83d0d74 /proc/*/maps
/proc/11110/maps:40004724000-40006724000 rw-s 00000000 00:08 851968  /SYSV083d0d74 (deleted)
/proc/11111/maps:40004724000-40006724000 rw-s 00000000 00:08 851968  /SYSV083d0d74 (deleted)
/proc/11112/maps:40004724000-40006724000 rw-s 00000000 00:08 851968  /SYSV083d0d74 (deleted)
/proc/11113/maps:40004724000-40006724000 rw-s 00000000 00:08 851968  /SYSV083d0d74 (deleted)
..

The issue is with the ones with key = 0x0000000, like following:

# ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 884737     db2inst1  767        33554432   13
0x00000000 950275     db2fenc1  701        23052288   13

There is no unique way to identify them easily :(

I guess, like you suggested, we can change the dentry name to use shmid
instead of the portions of the "key" to make it unique. I think, I can 
work out a patch for this.


> 
> That should give you the necessary information while not doing something
> that is a long term maintenance problem.
> 
> Do you think you can cook up a patch to that effect?
> Otherwise I will see if I can.
> 
> In practice I'm not really against your change, but I would prefer
> to leave the code in a state where someone can reimplement hugetlbfs
> or shmfs and we simply don't care.

Thanks for your suggestion.

Thanks,
Badari



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
  2007-06-06 16:07 Badari Pulavarty
@ 2007-06-06 17:02 ` Eric W. Biederman
  2007-06-06 17:37   ` Badari Pulavarty
  0 siblings, 1 reply; 34+ messages in thread
From: Eric W. Biederman @ 2007-06-06 17:02 UTC (permalink / raw)
  To: Badari Pulavarty; +Cc: Andrew Morton, linux-mm, lkml

Badari Pulavarty <pbadari@us.ibm.com> writes:

> Hi Eric,
>
> Your recent cleanup to shm code, namely
>
> [PATCH] shm: make sysv ipc shared memory use stacked files
>
> took away one of the debugging feature for shm segments.
> Originally, shmid were forced to be the inode numbers and
> they show up in /proc/pid/maps for the process which mapped
> this shared memory segments (vma listing). That way, its easy
> to find out who all mapped this shared memory segment. Your
> patchset, took away the inode# setting. So, we can't easily
> match the shmem segments to /proc/pid/maps easily. (It was
> really useful in tracking down a customer problem recently). 
> Is this done deliberately ? Anything wrong in setting this back ?
>
> Comments ?
>
> Thanks,
> Badari
>
> Without patch:
> --------------
>
> # ipcs -m
>
> ------ Shared Memory Segments --------
> key        shmid      owner      perms      bytes      nattch     status
> 0x00000000 884737     db2inst1  767        33554432   13
>
> # grep 884737 /proc/*/maps
> #
>
> With patch:
> -----------
>
> # ipcs -m
>
> ------ Shared Memory Segments --------
> key        shmid      owner      perms      bytes      nattch     status
> 0x00000000 884737     db2inst1  767        33554432   13
>
> # grep 884737 /proc/*/maps
> /proc/11110/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11111/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11112/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11113/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11114/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11115/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11116/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11117/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11118/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11121/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11122/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11124/maps:4000389c000-4000589c000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
> /proc/11575/maps:40006724000-40008724000 rw-s 00000000 00:08 884737
> /SYSV00000000 (deleted)
>
>
>
> Here is the patch.
>
> "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> memory (shmid). It was useful in debugging, but its changed recently. 
> This patch sets inode number to shared memory id to match /proc/pid/maps.

Theoretically it makes the stacked file concept more brittle, because
it means the lower layers can't care about their inode number.

We do need something to tie these things together.

So I suspect what makes most sense is to simply rename the dentry
SYSVID<segmentid>

That should give you the necessary information while not doing something
that is a long term maintenance problem.

Do you think you can cook up a patch to that effect?
Otherwise I will see if I can.

In practice I'm not really against your change, but I would prefer
to leave the code in a state where someone can reimplement hugetlbfs
or shmfs and we simply don't care.

Eric

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid
@ 2007-06-06 16:07 Badari Pulavarty
  2007-06-06 17:02 ` Eric W. Biederman
  0 siblings, 1 reply; 34+ messages in thread
From: Badari Pulavarty @ 2007-06-06 16:07 UTC (permalink / raw)
  To: Eric W. Biederman, Andrew Morton; +Cc: linux-mm, lkml

Hi Eric,

Your recent cleanup to shm code, namely

[PATCH] shm: make sysv ipc shared memory use stacked files

took away one of the debugging feature for shm segments.
Originally, shmid were forced to be the inode numbers and
they show up in /proc/pid/maps for the process which mapped
this shared memory segments (vma listing). That way, its easy
to find out who all mapped this shared memory segment. Your
patchset, took away the inode# setting. So, we can't easily
match the shmem segments to /proc/pid/maps easily. (It was
really useful in tracking down a customer problem recently). 
Is this done deliberately ? Anything wrong in setting this back ?

Comments ?

Thanks,
Badari

Without patch:
--------------

# ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 884737     db2inst1  767        33554432   13

# grep 884737 /proc/*/maps
#

With patch:
-----------

# ipcs -m

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x00000000 884737     db2inst1  767        33554432   13

# grep 884737 /proc/*/maps
/proc/11110/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11111/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11112/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11113/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11114/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11115/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11116/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11117/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11118/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11121/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11122/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11124/maps:4000389c000-4000589c000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)
/proc/11575/maps:40006724000-40008724000 rw-s 00000000 00:08 884737 /SYSV00000000 (deleted)



Here is the patch.

"ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
memory (shmid). It was useful in debugging, but its changed recently. 
This patch sets inode number to shared memory id to match /proc/pid/maps.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>

Index: linux-2.6.22-rc4/ipc/shm.c
===================================================================
--- linux-2.6.22-rc4.orig/ipc/shm.c	2007-06-04 17:57:25.000000000 -0700
+++ linux-2.6.22-rc4/ipc/shm.c	2007-06-06 08:23:57.000000000 -0700
@@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
 	shp->shm_nattch = 0;
 	shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
 	shp->shm_file = file;
+	file->f_dentry->d_inode->i_ino = shp->id;
 
 	ns->shm_tot += numpages;
 	shm_unlock(shp);



^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2007-06-11 20:54 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-06-07  3:27 [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Albert Cahalan
2007-06-07  3:44 ` Andrew Morton
2007-06-07  4:53   ` Albert Cahalan
2007-06-07 16:20     ` Serge E. Hallyn
2007-06-08  3:45       ` Eric W. Biederman
2007-06-08  4:41         ` Albert Cahalan
2007-06-08  5:55           ` Eric W. Biederman
2007-06-08  6:51             ` Albert Cahalan
2007-06-08 22:31               ` [PATCH] Restore shmid as inode# to fix /proc/pid/maps ABI breakage Badari Pulavarty
2007-06-08 16:07         ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
2007-06-08 23:43           ` [PATCH] shm: Fix the filename of hugetlb sysv shared memory Eric W. Biederman
2007-06-08 23:55             ` Andrew Morton
2007-06-09  4:32               ` Badari Pulavarty
2007-06-09  8:01                 ` Eric W. Biederman
2007-06-11 18:11             ` Andrew Morton
2007-06-11 19:55               ` Badari Pulavarty
2007-06-11 19:00             ` Adam Litke
2007-06-11 20:53               ` Ken Chen
2007-06-07 16:23     ` [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid Badari Pulavarty
2007-06-07 16:43       ` Albert Cahalan
2007-06-07 17:06         ` Badari Pulavarty
2007-06-07 19:48           ` Andrew Morton
2007-06-07 19:59             ` Badari Pulavarty
2007-06-07 20:37               ` Serge E. Hallyn
2007-06-07 20:51                 ` Serge E. Hallyn
2007-06-07 21:16                 ` Badari Pulavarty
2007-06-07 22:08                   ` Serge E. Hallyn
2007-06-07 22:21                     ` Badari Pulavarty
2007-06-07 22:22                     ` Serge E. Hallyn
2007-06-07 23:57                       ` Badari Pulavarty
  -- strict thread matches above, loose matches on Subject: below --
2007-06-06 16:07 Badari Pulavarty
2007-06-06 17:02 ` Eric W. Biederman
2007-06-06 17:37   ` Badari Pulavarty
2007-06-06 18:24     ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).