linux-kernel.vger.kernel.org archive mirror
* RFC [patch 00/34] PID Virtualization Overview
@ 2006-01-17 14:32 Serge Hallyn
  2006-01-17 14:32 ` RFC [patch 01/34] PID Virtualization Change pid accesses: drivers Serge Hallyn
                   ` (34 more replies)
  0 siblings, 35 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

--
PID Virtualization is based on the concept of a container.
Our ultimate goal is to checkpoint/restart containers.  The
containers should also be useful as a basis for the pid
virtualization required, for instance, by vserver.

A container is started with 'echo "container_name" > /proc/container',
which creates a new container and associates the calling process with
it.  All subsequently forked tasks then belong to that container.
Each container has its own pid space, and only tasks belonging to the
same container "see" each other.  The exception is an implied default
system container that has a global view.
The following patches accomplish 4 things:
1) identify the locations at the user/kernel boundary where pids and
   related ids (pgrp, session ids, ...) need to be (de-)virtualized, and
   call the appropriate (de-)virtualization functions.
2) provide the virtualization implementation in these functions.
3) implement a container object and a simple /proc interface to create one.
4) provide a per container /proc/fs.
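
As a rough illustration of items 1) and 2), the userspace sketch below
models what a pair of (de-)virtualization functions might look like.
Everything in it is an assumption made for illustration: the struct
layout, MAX_TASKS, and the names container_add_task(),
pid_kernel_to_virt() and pid_virt_to_kernel() are hypothetical and are
not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8	/* arbitrary limit for this sketch */

/* Hypothetical container: virtual pid (i + 1) maps to kernel pid kpid[i]. */
struct container {
	int nr;			/* number of tasks in the container */
	int kpid[MAX_TASKS];	/* kernel pids, indexed by virtual pid - 1 */
};

/* Add a task by kernel pid; returns its virtual pid, or -1 if full. */
static int container_add_task(struct container *c, int kpid)
{
	if (c->nr >= MAX_TASKS)
		return -1;
	c->kpid[c->nr] = kpid;
	return ++c->nr;
}

/*
 * Virtualize: kernel pid -> pid as seen from inside container c.
 * c == NULL stands in for the implied default system container, which
 * has a global view.  Returns -1 if the task is not visible from c.
 */
static int pid_kernel_to_virt(const struct container *c, int kpid)
{
	int i;

	if (c == NULL)
		return kpid;
	for (i = 0; i < c->nr; i++)
		if (c->kpid[i] == kpid)
			return i + 1;
	return -1;
}

/* De-virtualize: pid passed in from user space -> kernel pid, or -1. */
static int pid_virt_to_kernel(const struct container *c, int vpid)
{
	if (c == NULL)
		return vpid;
	if (vpid < 1 || vpid > c->nr)
		return -1;
	return c->kpid[vpid - 1];
}
```

In this model a task only resolves pids that were added to its own
container, which is one way to get the "see each other" behaviour
described above.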

-- Hubertus Franke    (frankeh@watson.ibm.com)
-- Cedric Le Goater   (clg@fr.ibm.com)
-- Serge E Hallyn     (serue@us.ibm.com)
-- Dave Hansen        (haveblue@us.ibm.com)



* RFC [patch 01/34] PID Virtualization Change pid accesses: drivers
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
@ 2006-01-17 14:32 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 02/34] PID Virtualization Change pid accesses: most archs Serge Hallyn
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B0-change-pid-tgid-references-drivers --]
[-- Type: text/plain, Size: 42550 bytes --]

Instead of letting all parts of the kernel have direct access to
tsk->pid and tsk->tgid, make them use accessor functions: task_{p,tg}id().
This will need to be done for a few other fields, but I decided to start
with those two.  Note that pid has been renamed __pid to make sure that
any uncaught users will error out.
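
For readers who want to see the shape of the accessors, here is a
minimal userspace sketch.  The struct is a two-field stand-in for the
real task_struct, and the name __tgid is an assumption (the text above
only states that pid was renamed to __pid).

```c
#include <assert.h>

/* Stand-in for the kernel's task_struct; only the two ids are modelled. */
struct task_struct {
	int __pid;	/* renamed from pid so stale tsk->pid users fail to build */
	int __tgid;	/* assumption: tgid is renamed the same way */
};

/* All readers go through these, so later patches can change what they return. */
static inline int task_pid(const struct task_struct *tsk)
{
	return tsk->__pid;
}

static inline int task_tgid(const struct task_struct *tsk)
{
	return tsk->__tgid;
}
```

A caller that used to print current->pid would print task_pid(current)
instead; any code still spelling tsk->pid no longer compiles.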

Note that this is very similar to what the vserver vx_map_pid() does,
and doing something like this should shrink their patch.

Our next patchset can find all the places where a pid crosses the
user<->kernel boundary, and do the correct conversions.  Perhaps some
sparse annotations will allow us to do this more automatically, instead
of auditing everything.  We could have a concept of __user for pids, not
just pointers.
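
One possible way to express this with sparse is a bitwise typedef, the
same mechanism the kernel uses for endian-annotated integers.  The
sketch below is only an assumption about how that could look: upid_t,
PID_BITWISE, PID_FORCE and the two conversion helpers are hypothetical
names, and the conversions are identity placeholders where a real
translation would go.

```c
#include <assert.h>

/*
 * sparse defines __CHECKER__ and understands the bitwise/force
 * attributes; a normal compiler sees plain ints.
 */
#ifdef __CHECKER__
#define PID_BITWISE	__attribute__((bitwise))
#define PID_FORCE	__attribute__((force))
#else
#define PID_BITWISE
#define PID_FORCE
#endif

/* Hypothetical type for a pid as seen by user space. */
typedef int PID_BITWISE upid_t;

/* The only sanctioned kernel -> user conversion (placeholder: identity). */
static inline upid_t pid_to_user(int kpid)
{
	return (PID_FORCE upid_t)kpid;
}

/* The only sanctioned user -> kernel conversion (placeholder: identity). */
static inline int pid_from_user(upid_t upid)
{
	return (PID_FORCE int)upid;
}
```

Under sparse, mixing a upid_t with a plain int without going through
one of the helpers should produce a warning, which is what would flag
an unconverted pid at the boundary.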

This first patch changes the pid accesses under drivers/.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 block/cfq-iosched.c                         |    4 ++--
 block/ll_rw_blk.c                           |    2 +-
 drivers/acorn/block/fd1772.c                |    2 +-
 drivers/acpi/osl.c                          |    2 +-
 drivers/block/ataflop.c                     |    2 +-
 drivers/block/nbd.c                         |    2 +-
 drivers/cdrom/cdrom.c                       |    2 +-
 drivers/cdrom/mcdx.c                        |    2 +-
 drivers/cdrom/sbpcd.c                       |    4 ++--
 drivers/char/agp/frontend.c                 |    6 +++---
 drivers/char/cyclades.c                     |   10 +++++-----
 drivers/char/drm/drm_bufs.c                 |    2 +-
 drivers/char/drm/drm_drv.c                  |    2 +-
 drivers/char/drm/drm_fops.c                 |    8 ++++----
 drivers/char/drm/drm_lock.c                 |    6 +++---
 drivers/char/drm/drm_os_linux.h             |    2 +-
 drivers/char/drm/drm_vm.c                   |    2 +-
 drivers/char/drm/i810_dma.c                 |    2 +-
 drivers/char/drm/i830_dma.c                 |    2 +-
 drivers/char/n_r3964.c                      |   12 ++++++------
 drivers/char/random.c                       |    2 +-
 drivers/char/rio/linux_compat.h             |    2 +-
 drivers/char/snsc_event.c                   |    2 +-
 drivers/char/sx.c                           |    2 +-
 drivers/char/sysrq.c                        |    2 +-
 drivers/char/tty_io.c                       |    6 +++---
 drivers/char/vt_ioctl.c                     |    4 ++--
 drivers/input/joystick/iforce/iforce-main.c |    8 ++++----
 drivers/input/joystick/iforce/iforce.h      |    4 ++--
 drivers/macintosh/adb.c                     |    8 ++++----
 drivers/md/bitmap.c                         |    4 ++--
 drivers/md/md.c                             |    6 +++---
 drivers/media/video/zoran_driver.c          |    4 ++--
 drivers/net/slip.c                          |    6 +++---
 drivers/net/tun.c                           |    2 +-
 drivers/net/wireless/hostap/hostap_ioctl.c  |    2 +-
 drivers/oprofile/buffer_sync.c              |    4 ++--
 drivers/s390/char/fs3270.c                  |    2 +-
 drivers/s390/crypto/z90main.c               |    4 ++--
 drivers/s390/s390mach.c                     |    2 +-
 drivers/scsi/53c7xx.c                       |    2 +-
 drivers/scsi/dc395x.c                       |    4 ++--
 drivers/scsi/eata_pio.c                     |    2 +-
 drivers/serial/crisv10.c                    |    4 ++--
 drivers/usb/core/devio.c                    |    6 +++---
 drivers/usb/input/hid-lgff.c                |    8 ++++----
 drivers/usb/input/hid-tmff.c                |    8 ++++----
 drivers/usb/input/pid.c                     |    8 ++++----
 48 files changed, 97 insertions(+), 97 deletions(-)

Index: linux-2.6.15/drivers/acorn/block/fd1772.c
===================================================================
--- linux-2.6.15.orig/drivers/acorn/block/fd1772.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/acorn/block/fd1772.c	2006-01-17 08:36:53.000000000 -0500
@@ -1275,7 +1275,7 @@
 {
 	unsigned long flags;
 
-	DPRINT(("do_fd_request for pid %d\n", current->pid));
+	DPRINT(("do_fd_request for pid %d\n", task_pid(current)));
 	if (fdc_busy) return;
 	save_flags(flags);
 	cli();
Index: linux-2.6.15/drivers/acpi/osl.c
===================================================================
--- linux-2.6.15.orig/drivers/acpi/osl.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/acpi/osl.c	2006-01-17 08:36:53.000000000 -0500
@@ -931,7 +931,7 @@
 u32 acpi_os_get_thread_id(void)
 {
 	if (!in_atomic())
-		return current->pid;
+		return task_pid(current);
 
 	return 0;
 }
Index: linux-2.6.15/drivers/block/ataflop.c
===================================================================
--- linux-2.6.15.orig/drivers/block/ataflop.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/block/ataflop.c	2006-01-17 08:36:53.000000000 -0500
@@ -1470,7 +1470,7 @@
 {
  	unsigned long flags;
 
-	DPRINT(("do_fd_request for pid %d\n",current->pid));
+	DPRINT(("do_fd_request for pid %d\n",task_pid(current)));
 	while( fdc_busy ) sleep_on( &fdc_wait );
 	fdc_busy = 1;
 	stdma_lock(floppy_irq, NULL);
Index: linux-2.6.15/drivers/block/nbd.c
===================================================================
--- linux-2.6.15.orig/drivers/block/nbd.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/block/nbd.c	2006-01-17 08:36:53.000000000 -0500
@@ -182,7 +182,7 @@
 			siginfo_t info;
 			spin_lock_irqsave(&current->sighand->siglock, flags);
 			printk(KERN_WARNING "nbd (pid %d: %s) got signal %d\n",
-				current->pid, current->comm, 
+				task_pid(current), current->comm,
 				dequeue_signal(current, &current->blocked, &info));
 			spin_unlock_irqrestore(&current->sighand->siglock, flags);
 			result = -EINTR;
Index: linux-2.6.15/drivers/cdrom/cdrom.c
===================================================================
--- linux-2.6.15.orig/drivers/cdrom/cdrom.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/cdrom/cdrom.c	2006-01-17 08:36:53.000000000 -0500
@@ -1097,7 +1097,7 @@
 		       is the default case! */
 		    cdinfo(CD_OPEN, "bummer. wrong media type.\n"); 
 		    cdinfo(CD_WARNING, "pid %d must open device O_NONBLOCK!\n",
-					(unsigned int)current->pid); 
+					(unsigned int)task_pid(current));
 		    ret=-EMEDIUMTYPE;
 		    goto clean_up_and_return;
 		}
Index: linux-2.6.15/drivers/cdrom/mcdx.c
===================================================================
--- linux-2.6.15.orig/drivers/cdrom/mcdx.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/cdrom/mcdx.c	2006-01-17 08:36:53.000000000 -0500
@@ -1351,7 +1351,7 @@
 						sector + nr_sectors)
 		    ? stuffp->high_border : border;
 
-		stuffp->lock = current->pid;
+		stuffp->lock = task_pid(current);
 
 		do {
 
Index: linux-2.6.15/drivers/cdrom/sbpcd.c
===================================================================
--- linux-2.6.15.orig/drivers/cdrom/sbpcd.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/cdrom/sbpcd.c	2006-01-17 08:36:53.000000000 -0500
@@ -4854,14 +4854,14 @@
 	if (!req)
 	{
 		printk( "do_sbpcd_request[%di](NULL), Pid:%d, Time:%li\n",
-			xnr, current->pid, jiffies);
+			xnr, task_pid(current), jiffies);
 		printk( "do_sbpcd_request[%do](NULL) end 0 (null), Time:%li\n",
 			xnr, jiffies);
 		return;
 	}
 
 	printk(" do_sbpcd_request[%di](%p:%ld+%ld), Pid:%d, Time:%li\n",
-		xnr, req, req->sector, req->nr_sectors, current->pid, jiffies);
+		xnr, req, req->sector, req->nr_sectors, task_pid(current), jiffies);
 #endif
 
 	req = elv_next_request(q);	/* take out our request so no other */
Index: linux-2.6.15/drivers/char/agp/frontend.c
===================================================================
--- linux-2.6.15.orig/drivers/char/agp/frontend.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/agp/frontend.c	2006-01-17 08:36:53.000000000 -0500
@@ -611,7 +611,7 @@
 		if ((size + offset) > current_size)
 			goto out_inval;
 
-		client = agp_find_client_by_pid(current->pid);
+		client = agp_find_client_by_pid(task_pid(current));
 
 		if (client == NULL)
 			goto out_eperm;
@@ -708,13 +708,13 @@
 		goto err_out_nomem;
 
 	set_bit(AGP_FF_ALLOW_CLIENT, &priv->access_flags);
-	priv->my_pid = current->pid;
+	priv->my_pid = task_pid(current);
 
 	if ((current->uid == 0) || (current->suid == 0)) {
 		/* Root priv, can be controller */
 		set_bit(AGP_FF_ALLOW_CONTROLLER, &priv->access_flags);
 	}
-	client = agp_find_client_by_pid(current->pid);
+	client = agp_find_client_by_pid(task_pid(current));
 
 	if (client != NULL) {
 		set_bit(AGP_FF_IS_CLIENT, &priv->access_flags);
Index: linux-2.6.15/drivers/char/cyclades.c
===================================================================
--- linux-2.6.15.orig/drivers/char/cyclades.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/cyclades.c	2006-01-17 08:36:54.000000000 -0500
@@ -2361,7 +2361,7 @@
     CY_UNLOCK(info, flags);
 #ifdef CY_DEBUG_COUNT
     printk("cyc block_til_ready: (%d): decrementing count to %d\n",
-        current->pid, info->count);
+        task_pid(current), info->count);
 #endif
     info->blocked_open++;
 
@@ -2478,7 +2478,7 @@
 	info->count++;
 #ifdef CY_DEBUG_COUNT
 	printk("cyc:block_til_ready (%d): incrementing count to %d\n",
-	    current->pid, info->count);
+	    task_pid(current), info->count);
 #endif
     }
     info->blocked_open--;
@@ -2579,7 +2579,7 @@
     info->count++;
 #ifdef CY_DEBUG_COUNT
     printk("cyc:cy_open (%d): incrementing count to %d\n",
-        current->pid, info->count);
+        task_pid(current), info->count);
 #endif
     if (!tmp_buf) {
 	page = get_zeroed_page(GFP_KERNEL);
@@ -2745,7 +2745,7 @@
     }
 #ifdef CY_DEBUG_COUNT
     printk("cyc:cy_close at (%d): decrementing count to %d\n",
-        current->pid, info->count - 1);
+        task_pid(current), info->count - 1);
 #endif
     if (--info->count < 0) {
 #ifdef CY_DEBUG_COUNT
@@ -4472,7 +4472,7 @@
     info->event = 0;
     info->count = 0;
 #ifdef CY_DEBUG_COUNT
-    printk("cyc:cy_hangup (%d): setting count to 0\n", current->pid);
+    printk("cyc:cy_hangup (%d): setting count to 0\n", task_pid(current));
 #endif
     info->tty = NULL;
     info->flags &= ~ASYNC_NORMAL_ACTIVE;
Index: linux-2.6.15/drivers/char/drm/drm_bufs.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/drm_bufs.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/drm_bufs.c	2006-01-17 08:36:54.000000000 -0500
@@ -1434,7 +1434,7 @@
 		buf = dma->buflist[idx];
 		if (buf->filp != filp) {
 			DRM_ERROR("Process %d freeing buffer not owned\n",
-				  current->pid);
+				  task_pid(current));
 			return -EINVAL;
 		}
 		drm_free_buffer(dev, buf);
Index: linux-2.6.15/drivers/char/drm/drm_drv.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/drm_drv.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/drm_drv.c	2006-01-17 08:36:54.000000000 -0500
@@ -473,7 +473,7 @@
 	++priv->ioctl_count;
 
 	DRM_DEBUG("pid=%d, cmd=0x%02x, nr=0x%02x, dev 0x%lx, auth=%d\n",
-		  current->pid, cmd, nr,
+		  task_pid(current), cmd, nr,
 		  (long)old_encode_dev(priv->head->device),
 		  priv->authenticated);
 
Index: linux-2.6.15/drivers/char/drm/drm_fops.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/drm_fops.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/drm_fops.c	2006-01-17 08:36:54.000000000 -0500
@@ -188,7 +188,7 @@
 	 */
 
 	DRM_DEBUG("pid = %d, device = 0x%lx, open_count = %d\n",
-		  current->pid, (long)old_encode_dev(priv->head->device),
+		  task_pid(current), (long)old_encode_dev(priv->head->device),
 		  dev->open_count);
 
 	if (priv->lock_count && dev->lock.hw_lock &&
@@ -347,7 +347,7 @@
 	if (!drm_cpu_valid())
 		return -EINVAL;
 
-	DRM_DEBUG("pid = %d, minor = %d\n", current->pid, minor);
+	DRM_DEBUG("pid = %d, minor = %d\n", task_pid(current), minor);
 
 	priv = drm_alloc(sizeof(*priv), DRM_MEM_FILES);
 	if (!priv)
@@ -356,7 +356,7 @@
 	memset(priv, 0, sizeof(*priv));
 	filp->private_data = priv;
 	priv->uid = current->euid;
-	priv->pid = current->pid;
+	priv->pid = task_pid(current);
 	priv->minor = minor;
 	priv->head = drm_heads[minor];
 	priv->ioctl_count = 0;
@@ -416,7 +416,7 @@
 	drm_device_t *dev = priv->head->dev;
 
 	DRM_DEBUG("pid = %d, device = 0x%lx, open_count = %d\n",
-		  current->pid, (long)old_encode_dev(priv->head->device),
+		  task_pid(current), (long)old_encode_dev(priv->head->device),
 		  dev->open_count);
 	return 0;
 }
Index: linux-2.6.15/drivers/char/drm/drm_lock.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/drm_lock.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/drm_lock.c	2006-01-17 08:36:54.000000000 -0500
@@ -67,12 +67,12 @@
 
 	if (lock.context == DRM_KERNEL_CONTEXT) {
 		DRM_ERROR("Process %d using kernel context %d\n",
-			  current->pid, lock.context);
+			  task_pid(current), lock.context);
 		return -EINVAL;
 	}
 
 	DRM_DEBUG("%d (pid %d) requests lock (0x%08x), flags = 0x%08x\n",
-		  lock.context, current->pid,
+		  lock.context, task_pid(current),
 		  dev->lock.hw_lock->lock, lock.flags);
 
 	if (drm_core_check_feature(dev, DRIVER_DMA_QUEUE))
@@ -162,7 +162,7 @@
 
 	if (lock.context == DRM_KERNEL_CONTEXT) {
 		DRM_ERROR("Process %d using kernel context %d\n",
-			  current->pid, lock.context);
+			  task_pid(current), lock.context);
 		return -EINVAL;
 	}
 
Index: linux-2.6.15/drivers/char/drm/drm_os_linux.h
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/drm_os_linux.h	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/drm_os_linux.h	2006-01-17 08:36:54.000000000 -0500
@@ -12,7 +12,7 @@
 #define DRM_IOCTL_ARGS			struct inode *inode, struct file *filp, unsigned int cmd, unsigned long data
 #define DRM_ERR(d)			-(d)
 /** Current process ID */
-#define DRM_CURRENTPID			current->pid
+#define DRM_CURRENTPID			task_pid(current)
 #define DRM_UDELAY(d)			udelay(d)
 /** Read a byte from a MMIO region */
 #define DRM_READ8(map, offset)		readb(((void __iomem *)(map)->handle) + (offset))
Index: linux-2.6.15/drivers/char/drm/drm_vm.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/drm_vm.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/drm_vm.c	2006-01-17 08:36:54.000000000 -0500
@@ -407,7 +407,7 @@
 		down(&dev->struct_sem);
 		vma_entry->vma = vma;
 		vma_entry->next = dev->vmalist;
-		vma_entry->pid = current->pid;
+		vma_entry->pid = task_pid(current);
 		dev->vmalist = vma_entry;
 		up(&dev->struct_sem);
 	}
Index: linux-2.6.15/drivers/char/drm/i810_dma.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/i810_dma.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/i810_dma.c	2006-01-17 08:36:54.000000000 -0500
@@ -1110,7 +1110,7 @@
 	retcode = i810_dma_get_buffer(dev, &d, filp);
 
 	DRM_DEBUG("i810_dma: %d returning %d, granted = %d\n",
-		  current->pid, retcode, d.granted);
+		  task_pid(current), retcode, d.granted);
 
 	if (copy_to_user((drm_dma_t __user *) arg, &d, sizeof(d)))
 		return -EFAULT;
Index: linux-2.6.15/drivers/char/drm/i830_dma.c
===================================================================
--- linux-2.6.15.orig/drivers/char/drm/i830_dma.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/drm/i830_dma.c	2006-01-17 08:36:54.000000000 -0500
@@ -1433,7 +1433,7 @@
 	retcode = i830_dma_get_buffer(dev, &d, filp);
 
 	DRM_DEBUG("i830_dma: %d returning %d, granted = %d\n",
-		  current->pid, retcode, d.granted);
+		  task_pid(current), retcode, d.granted);
 
 	if (copy_to_user((drm_dma_t __user *) arg, &d, sizeof(d)))
 		return -EFAULT;
Index: linux-2.6.15/drivers/char/n_r3964.c
===================================================================
--- linux-2.6.15.orig/drivers/char/n_r3964.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/n_r3964.c	2006-01-17 08:36:54.000000000 -0500
@@ -1068,7 +1068,7 @@
    
    TRACE_L("open");
    TRACE_L("tty=%p, PID=%d, disc_data=%p", 
-          tty, current->pid, tty->disc_data);
+          tty, task_pid(current), tty->disc_data);
    
    pInfo=kmalloc(sizeof(struct r3964_info), GFP_KERNEL); 
    TRACE_M("r3964_open - info kmalloc %p",pInfo);
@@ -1193,7 +1193,7 @@
    struct r3964_client_message theMsg;
    DECLARE_WAITQUEUE (wait, current);
    
-   int pid = current->pid;
+   int pid = task_pid(current);
    int count;
    
    TRACE_L("read()");
@@ -1295,7 +1295,7 @@
    pHeader->locks = 0;
    pHeader->owner = NULL;
    
-   pid=current->pid;
+   pid=task_pid(current);
    
    pClient=findClient(pInfo, pid);
    if(pClient)
@@ -1328,7 +1328,7 @@
    switch(cmd)
    {
       case R3964_ENABLE_SIGNALS:
-         return enable_signals(pInfo, current->pid, arg);
+         return enable_signals(pInfo, task_pid(current), arg);
       case R3964_SETPRIORITY:
          if(arg<R3964_MASTER || arg>R3964_SLAVE)
             return -EINVAL;
@@ -1341,7 +1341,7 @@
             pInfo->flags &= ~R3964_BCC;
          return 0;
       case R3964_READ_TELEGRAM:
-         return read_telegram(pInfo, current->pid, (unsigned char __user *)arg);
+         return read_telegram(pInfo, task_pid(current), (unsigned char __user *)arg);
       default:
          return -ENOIOCTLCMD;
    }
@@ -1357,7 +1357,7 @@
 		      struct poll_table_struct *wait)
 {
    struct r3964_info *pInfo=(struct r3964_info*)tty->disc_data;
-   int pid=current->pid;
+   int pid=task_pid(current);
    struct r3964_client_info *pClient;
    struct r3964_message *pMsg=NULL;
    unsigned long flags;
Index: linux-2.6.15/drivers/char/random.c
===================================================================
--- linux-2.6.15.orig/drivers/char/random.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/random.c	2006-01-17 08:36:54.000000000 -0500
@@ -1640,7 +1640,7 @@
 	 * drain on it), and uses halfMD4Transform within the second. We
 	 * also mix it with jiffies and the PID:
 	 */
-	return secure_ip_id(current->pid + jiffies);
+	return secure_ip_id(task_pid(current) + jiffies);
 }
 
 /*
Index: linux-2.6.15/drivers/char/rio/linux_compat.h
===================================================================
--- linux-2.6.15.orig/drivers/char/rio/linux_compat.h	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/rio/linux_compat.h	2006-01-17 08:36:54.000000000 -0500
@@ -58,7 +58,7 @@
 #endif
 
 
-#define getpid()    (current->pid)
+#define getpid()    (task_pid(current))
 
 #define QSIZE SERIAL_XMIT_SIZE
 
Index: linux-2.6.15/drivers/char/snsc_event.c
===================================================================
--- linux-2.6.15.orig/drivers/char/snsc_event.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/snsc_event.c	2006-01-17 08:36:54.000000000 -0500
@@ -207,7 +207,7 @@
 		/* first find init's task */
 		read_lock(&tasklist_lock);
 		for_each_process(p) {
-			if (p->pid == 1)
+			if (task_pid(p) == 1)
 				break;
 		}
 		if (p) { /* we found init's task */
Index: linux-2.6.15/drivers/char/sx.c
===================================================================
--- linux-2.6.15.orig/drivers/char/sx.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/sx.c	2006-01-17 08:36:54.000000000 -0500
@@ -1437,7 +1437,7 @@
 
 	line = tty->index;
 	sx_dprintk (SX_DEBUG_OPEN, "%d: opening line %d. tty=%p ctty=%p, np=%d)\n", 
-	            current->pid, line, tty, current->signal->tty, sx_nports);
+	            task_pid(current), line, tty, current->signal->tty, sx_nports);
 
 	if ((line < 0) || (line >= SX_NPORTS) || (line >= sx_nports))
 		return -ENODEV;
Index: linux-2.6.15/drivers/char/sysrq.c
===================================================================
--- linux-2.6.15.orig/drivers/char/sysrq.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/sysrq.c	2006-01-17 08:36:54.000000000 -0500
@@ -207,7 +207,7 @@
 	struct task_struct *p;
 
 	for_each_process(p) {
-		if (p->mm && p->pid != 1)
+		if (p->mm && task_pid(p) != 1)
 			/* Not swapper, init nor kernel thread */
 			force_sig(sig, p);
 	}
Index: linux-2.6.15/drivers/char/tty_io.c
===================================================================
--- linux-2.6.15.orig/drivers/char/tty_io.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/tty_io.c	2006-01-17 08:36:54.000000000 -0500
@@ -2009,7 +2009,7 @@
 	if (on) {
 		if (!waitqueue_active(&tty->read_wait))
 			tty->minimum_to_wake = 1;
-		retval = f_setown(filp, (-tty->pgrp) ? : current->pid, 0);
+		retval = f_setown(filp, (-tty->pgrp) ? : task_pid(current), 0);
 		if (retval)
 			return retval;
 	} else {
@@ -2471,7 +2471,7 @@
 		if (p->signal->tty == tty || session > 0) {
 			printk(KERN_NOTICE "SAK: killed process %d"
 			    " (%s): p->signal->session==tty->session\n",
-			    p->pid, p->comm);
+			    task_pid(p), p->comm);
 			send_sig(SIGKILL, p, 1);
 			continue;
 		}
@@ -2487,7 +2487,7 @@
 				    filp->private_data == tty) {
 					printk(KERN_NOTICE "SAK: killed process %d"
 					    " (%s): fd#%d opened to the tty\n",
-					    p->pid, p->comm, i);
+					    task_pid(p), p->comm, i);
 					send_sig(SIGKILL, p, 1);
 					break;
 				}
Index: linux-2.6.15/drivers/char/vt_ioctl.c
===================================================================
--- linux-2.6.15.orig/drivers/char/vt_ioctl.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/char/vt_ioctl.c	2006-01-17 08:36:54.000000000 -0500
@@ -651,7 +651,7 @@
 		  return -EPERM;
 		if (!valid_signal(arg) || arg < 1 || arg == SIGKILL)
 		  return -EINVAL;
-		spawnpid = current->pid;
+		spawnpid = task_pid(current);
 		spawnsig = arg;
 		return 0;
 	}
@@ -670,7 +670,7 @@
 		vc->vt_mode = tmp;
 		/* the frsig is ignored, so we set it to 0 */
 		vc->vt_mode.frsig = 0;
-		vc->vt_pid = current->pid;
+		vc->vt_pid = task_pid(current);
 		/* no switch is required -- saw@shade.msu.ru */
 		vc->vt_newvt = -1;
 		release_console_sem();
Index: linux-2.6.15/drivers/input/joystick/iforce/iforce-main.c
===================================================================
--- linux-2.6.15.orig/drivers/input/joystick/iforce/iforce-main.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/input/joystick/iforce/iforce-main.c	2006-01-17 08:36:54.000000000 -0500
@@ -160,7 +160,7 @@
 			return -ENOMEM;
 
 		effect->id = id;
-		iforce->core_effects[id].owner = current->pid;
+		iforce->core_effects[id].owner = task_pid(current);
 		iforce->core_effects[id].flags[0] = (1 << FF_CORE_IS_USED);	/* Only IS_USED bit must be set */
 
 		is_update = FALSE;
@@ -223,8 +223,8 @@
 	struct iforce_core_effect* core_effect;
 
 	/* Check who is trying to erase this effect */
-	if (iforce->core_effects[effect_id].owner != current->pid) {
-		printk(KERN_WARNING "iforce-main.c: %d tried to erase an effect belonging to %d\n", current->pid, iforce->core_effects[effect_id].owner);
+	if (iforce->core_effects[effect_id].owner != task_pid(current)) {
+		printk(KERN_WARNING "iforce-main.c: %d tried to erase an effect belonging to %d\n", task_pid(current), iforce->core_effects[effect_id].owner);
 		return -EACCES;
 	}
 
@@ -274,7 +274,7 @@
 	for (i=0; i<dev->ff_effects_max; ++i) {
 
 		if (test_bit(FF_CORE_IS_USED, iforce->core_effects[i].flags) &&
-			current->pid == iforce->core_effects[i].owner) {
+			task_pid(current) == iforce->core_effects[i].owner) {
 
 			/* Stop effect */
 			input_report_ff(dev, i, 0);
Index: linux-2.6.15/drivers/input/joystick/iforce/iforce.h
===================================================================
--- linux-2.6.15.orig/drivers/input/joystick/iforce/iforce.h	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/input/joystick/iforce/iforce.h	2006-01-17 08:36:54.000000000 -0500
@@ -70,8 +70,8 @@
 #define CHECK_OWNERSHIP(i, iforce)	\
 	((i) < FF_EFFECTS_MAX && i >= 0 && \
 	test_bit(FF_CORE_IS_USED, (iforce)->core_effects[(i)].flags) && \
-	(current->pid == 0 || \
-	(iforce)->core_effects[(i)].owner == current->pid))
+	(task_pid(current) == 0 || \
+	(iforce)->core_effects[(i)].owner == task_pid(current)))
 
 struct iforce_core_effect {
 	/* Information about where modifiers are stored in the device's memory */
Index: linux-2.6.15/drivers/macintosh/adb.c
===================================================================
--- linux-2.6.15.orig/drivers/macintosh/adb.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/macintosh/adb.c	2006-01-17 08:36:54.000000000 -0500
@@ -138,8 +138,8 @@
 
 static __inline__ void adb_wait_ms(unsigned int ms)
 {
-	if (current->pid && adb_probe_task_pid &&
-	  adb_probe_task_pid == current->pid)
+	if (task_pid(current) && adb_probe_task_pid &&
+	  adb_probe_task_pid == task_pid(current))
 		msleep(ms);
 	else
 		mdelay(ms);
@@ -492,8 +492,8 @@
 	 * block. Beware that the "done" callback will be overriden !
 	 */
 	if ((flags & ADBREQ_SYNC) &&
-	    (current->pid && adb_probe_task_pid &&
-	    adb_probe_task_pid == current->pid)) {
+	    (task_pid(current) && adb_probe_task_pid &&
+	    adb_probe_task_pid == task_pid(current))) {
 		req->done = adb_probe_wakeup;
 		rc = adb_controller->send_request(req, 0);
 		if (rc || req->complete)
Index: linux-2.6.15/drivers/md/bitmap.c
===================================================================
--- linux-2.6.15.orig/drivers/md/bitmap.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/md/bitmap.c	2006-01-17 08:36:54.000000000 -0500
@@ -1223,7 +1223,7 @@
 	md_wakeup_thread(daemon); /* start it running */
 
 	PRINTK("%s: %s daemon (pid %d) started...\n",
-		bmname(bitmap), name, daemon->tsk->pid);
+		bmname(bitmap), name, task_pid(daemon->tsk));
 
 	return daemon;
 }
@@ -1232,7 +1232,7 @@
 {
 	/* the daemon can't stop itself... it'll just exit instead... */
 	if (bitmap->writeback_daemon && ! IS_ERR(bitmap->writeback_daemon) &&
-	    current->pid != bitmap->writeback_daemon->tsk->pid) {
+	    task_pid(current) != task_pid(bitmap->writeback_daemon->tsk)) {
 		mdk_thread_t *daemon;
 		unsigned long flags;
 
Index: linux-2.6.15/drivers/md/md.c
===================================================================
--- linux-2.6.15.orig/drivers/md/md.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/md/md.c	2006-01-17 08:36:54.000000000 -0500
@@ -3159,7 +3159,7 @@
 			printk(KERN_WARNING
 			       "md: %s(pid %d) used deprecated START_ARRAY ioctl. "
 			       "This will not be supported beyond July 2006\n",
-			       current->comm, current->pid);
+			       current->comm, task_pid(current));
 			cnt--;
 		}
 		err = autostart_array(new_decode_dev(arg));
@@ -3347,7 +3347,7 @@
 				printk(KERN_WARNING "md: %s(pid %d) used"
 					" obsolete MD ioctl, upgrade your"
 					" software to use new ictls.\n",
-					current->comm, current->pid);
+					current->comm, task_pid(current));
 			err = -EINVAL;
 			goto abort_unlock;
 	}
@@ -3496,7 +3496,7 @@
 
 void md_unregister_thread(mdk_thread_t *thread)
 {
-	dprintk("interrupting MD-thread pid %d\n", thread->tsk->pid);
+	dprintk("interrupting MD-thread pid %d\n", task_pid(thread->tsk));
 
 	kthread_stop(thread->tsk);
 	kfree(thread);
Index: linux-2.6.15/drivers/media/video/zoran_driver.c
===================================================================
--- linux-2.6.15.orig/drivers/media/video/zoran_driver.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/media/video/zoran_driver.c	2006-01-17 08:36:54.000000000 -0500
@@ -1342,7 +1342,7 @@
 	}
 
 	dprintk(1, KERN_INFO "%s: zoran_open(%s, pid=[%d]), users(-)=%d\n",
-		ZR_DEVNAME(zr), current->comm, current->pid, zr->user);
+		ZR_DEVNAME(zr), current->comm, task_pid(current), zr->user);
 
 	/* now, create the open()-specific file_ops struct */
 	fh = kmalloc(sizeof(struct zoran_fh), GFP_KERNEL);
@@ -1416,7 +1416,7 @@
 	struct zoran *zr = fh->zr;
 
 	dprintk(1, KERN_INFO "%s: zoran_close(%s, pid=[%d]), users(+)=%d\n",
-		ZR_DEVNAME(zr), current->comm, current->pid, zr->user);
+		ZR_DEVNAME(zr), current->comm, task_pid(current), zr->user);
 
 	/* kernel locks (fs/device.c), so don't do that ourselves
 	 * (prevents deadlocks) */
Index: linux-2.6.15/drivers/net/slip.c
===================================================================
--- linux-2.6.15.orig/drivers/net/slip.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/net/slip.c	2006-01-17 08:36:54.000000000 -0500
@@ -749,7 +749,7 @@
 		if (sl->tty)
 			continue;
 
-		if (current->pid == sl->pid) {
+		if (task_pid(current) == sl->pid) {
 			if (sl->line == line && score < 3) {
 				sel = i;
 				score = 3;
@@ -867,7 +867,7 @@
 	sl->tty = tty;
 	tty->disc_data = sl;
 	sl->line = tty_devnum(tty);
-	sl->pid = current->pid;
+	sl->pid = task_pid(current);
 	
 	/* FIXME: already done before we were called - seems this can go */
 	if (tty->driver->flush_buffer)
@@ -1303,7 +1303,7 @@
 		/* Resolve race condition, when ioctl'ing hanged up 
 		   and opened by another process device.
 		 */
-		if (sl->tty != current->signal->tty && sl->pid != current->pid) {
+		if (sl->tty != current->signal->tty && sl->pid != task_pid(current)) {
 			spin_unlock_bh(&sl->lock);
 			return -EPERM;
 		}
Index: linux-2.6.15/drivers/net/tun.c
===================================================================
--- linux-2.6.15.orig/drivers/net/tun.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/net/tun.c	2006-01-17 08:36:54.000000000 -0500
@@ -709,7 +709,7 @@
 		return ret; 
  
 	if (on) {
-		ret = f_setown(file, current->pid, 0);
+		ret = f_setown(file, task_pid(current), 0);
 		if (ret)
 			return ret;
 		tun->flags |= TUN_FASYNC;
Index: linux-2.6.15/drivers/net/wireless/hostap/hostap_ioctl.c
===================================================================
--- linux-2.6.15.orig/drivers/net/wireless/hostap/hostap_ioctl.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/net/wireless/hostap/hostap_ioctl.c	2006-01-17 08:36:54.000000000 -0500
@@ -2923,7 +2923,7 @@
 
 	printk(KERN_DEBUG "%s: process %d (%s) used deprecated iwpriv monitor "
 	       "- update software to use iwconfig mode monitor\n",
-	       dev->name, current->pid, current->comm);
+	       dev->name, task_pid(current), current->comm);
 
 	/* Backward compatibility code - this can be removed at some point */
 
Index: linux-2.6.15/drivers/oprofile/buffer_sync.c
===================================================================
--- linux-2.6.15.orig/drivers/oprofile/buffer_sync.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/oprofile/buffer_sync.c	2006-01-17 08:36:54.000000000 -0500
@@ -286,12 +286,12 @@
 {
 	add_event_entry(ESCAPE_CODE);
 	add_event_entry(CTX_SWITCH_CODE); 
-	add_event_entry(task->pid);
+	add_event_entry(task_pid(task));
 	add_event_entry(cookie);
 	/* Another code for daemon back-compat */
 	add_event_entry(ESCAPE_CODE);
 	add_event_entry(CTX_TGID_CODE);
-	add_event_entry(task->tgid);
+	add_event_entry(task_tgid(task));
 }
 
  
Index: linux-2.6.15/drivers/s390/char/fs3270.c
===================================================================
--- linux-2.6.15.orig/drivers/s390/char/fs3270.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/s390/char/fs3270.c	2006-01-17 08:36:54.000000000 -0500
@@ -444,7 +444,7 @@
 		return PTR_ERR(fp);
 
 	init_waitqueue_head(&fp->wait);
-	fp->fs_pid = current->pid;
+	fp->fs_pid = task_pid(current);
 	rc = raw3270_add_view(&fp->view, &fs3270_fn, minor);
 	if (rc) {
 		fs3270_free_view(&fp->view);
Index: linux-2.6.15/drivers/s390/crypto/z90main.c
===================================================================
--- linux-2.6.15.orig/drivers/s390/crypto/z90main.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/s390/crypto/z90main.c	2006-01-17 08:36:54.000000000 -0500
@@ -185,7 +185,7 @@
 /**
  * PID() expands to the process ID of the current process
  */
-#define PID() (current->pid)
+#define PID() (task_pid(current))
 
 /**
  * Selected Constants.	The number of APs and the number of devices
@@ -942,7 +942,7 @@
 	step = atomic_inc_return(&z90crypt_step);
 	memcpy(we_p->caller_id+0, (void *) &pid, sizeof(pid));
 	memcpy(we_p->caller_id+4, (void *) &step, sizeof(step));
 	we_p->pid = pid;
 	we_p->priv_data = priv_data;
 	we_p->status[0] = STAT_DEFAULT;
 	we_p->audit[0] = 0x00;
Index: linux-2.6.15/drivers/s390/s390mach.c
===================================================================
--- linux-2.6.15.orig/drivers/s390/s390mach.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/s390/s390mach.c	2006-01-17 08:36:54.000000000 -0500
@@ -183,7 +183,7 @@
 		printk(KERN_EMERG "mcck: Terminating task because of machine "
 		       "malfunction (code 0x%016llx).\n", mcck.mcck_code);
 		printk(KERN_EMERG "mcck: task: %s, pid: %d.\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		do_exit(SIGSEGV);
 	}
 }
Index: linux-2.6.15/drivers/scsi/53c7xx.c
===================================================================
--- linux-2.6.15.orig/drivers/scsi/53c7xx.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/scsi/53c7xx.c	2006-01-17 08:36:54.000000000 -0500
@@ -4208,7 +4208,7 @@
 
 	if (hostdata->options & OPTION_DEBUG_INTR) {
 	    printk ("scsi%d : command complete : pid %lu, id %d,lun %d result 0x%x ", 
 		  host->host_no, tmp->pid, tmp->device->id, tmp->device->lun, tmp->result);
 	    __scsi_print_command (tmp->cmnd);
 	}
 
Index: linux-2.6.15/drivers/scsi/dc395x.c
===================================================================
--- linux-2.6.15.orig/drivers/scsi/dc395x.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/scsi/dc395x.c	2006-01-17 08:36:54.000000000 -0500
@@ -3554,7 +3554,7 @@
 			p = srb->cmd;
 			dir = p->sc_data_direction;
 			result = MK_RES(0, did_flag, 0, 0);
 			printk("G:%li(%02i-%i) ", p->pid,
 			       p->device->id, p->device->lun);
 			srb_going_remove(dcb, srb);
 			free_tag(dcb, srb);
@@ -3584,7 +3584,7 @@
 			p = srb->cmd;
 
 			result = MK_RES(0, did_flag, 0, 0);
 			printk("W:%li<%02i-%i>", p->pid, p->device->id,
 			       p->device->lun);
 			srb_waiting_remove(dcb, srb);
 			srb_free_insert(acb, srb);
Index: linux-2.6.15/drivers/scsi/eata_pio.c
===================================================================
--- linux-2.6.15.orig/drivers/scsi/eata_pio.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/scsi/eata_pio.c	2006-01-17 08:36:54.000000000 -0500
@@ -512,7 +512,7 @@
 
 		sp = HD(cmd)->ccb[x].cmd;
 		HD(cmd)->ccb[x].status = RESET;
 		printk(KERN_WARNING "eata_pio_reset: slot %d in reset, pid %ld.\n", x, sp->pid);
 
 		if (sp == NULL)
 			panic("eata_pio_reset: slot %d, sp==NULL.\n", x);
Index: linux-2.6.15/drivers/serial/crisv10.c
===================================================================
--- linux-2.6.15.orig/drivers/serial/crisv10.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/serial/crisv10.c	2006-01-17 08:36:54.000000000 -0500
@@ -4346,7 +4346,7 @@
 	}
 
 #ifdef SERIAL_DEBUG_OPEN
-	printk("[%d] rs_close ttyS%d, count = %d\n", current->pid,
+	printk("[%d] rs_close ttyS%d, count = %d\n", task_pid(current),
 	       info->line, info->count);
 #endif
 	if ((tty->count == 1) && (info->count != 1)) {
@@ -4639,7 +4639,7 @@
 		return -ENODEV;
 
 #ifdef SERIAL_DEBUG_OPEN
-        printk("[%d] rs_open %s, count = %d\n", current->pid, tty->name,
+        printk("[%d] rs_open %s, count = %d\n", task_pid(current), tty->name,
  	       info->count);
 #endif
 
Index: linux-2.6.15/drivers/usb/input/hid-lgff.c
===================================================================
--- linux-2.6.15.orig/drivers/usb/input/hid-lgff.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/usb/input/hid-lgff.c	2006-01-17 08:36:54.000000000 -0500
@@ -55,8 +55,8 @@
 #define DEVICE_CLOSING 0     /* The driver is being unitialised */
 
 /* Check that the current process can access an effect */
-#define CHECK_OWNERSHIP(effect) (current->pid == 0 \
-        || effect.owner == current->pid)
+#define CHECK_OWNERSHIP(effect) (task_pid(current) == 0 \
+        || effect.owner == task_pid(current))
 
 #define LGFF_CHECK_OWNERSHIP(i, l) \
         (i>=0 && i<LGFF_EFFECTS \
@@ -340,7 +340,7 @@
 		  modified is when effects are uploaded or when an effect is
 		  erased. But a process cannot close its dev/input/eventX fd
 		  and perform ioctls on the same fd all at the same time */
-		if ( current->pid == lgff->effects[i].owner
+		if ( task_pid(current) == lgff->effects[i].owner
 		     && test_bit(EFFECT_USED, lgff->effects[i].flags)) {
 
 			if (hid_lgff_erase(dev, i))
@@ -392,7 +392,7 @@
 		}
 
 		effect->id = i;
-		lgff->effects[i].owner = current->pid;
+		lgff->effects[i].owner = task_pid(current);
 		lgff->effects[i].flags[0] = 0;
 		set_bit(EFFECT_USED, lgff->effects[i].flags);
 	}
Index: linux-2.6.15/drivers/usb/input/hid-tmff.c
===================================================================
--- linux-2.6.15.orig/drivers/usb/input/hid-tmff.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/usb/input/hid-tmff.c	2006-01-17 08:36:54.000000000 -0500
@@ -51,8 +51,8 @@
 #define DEVICE_CLOSING 0	/* The driver is being unitialised */
 
 /* Check that the current process can access an effect */
-#define CHECK_OWNERSHIP(effect) (current->pid == 0 \
-        || effect.owner == current->pid)
+#define CHECK_OWNERSHIP(effect) (task_pid(current) == 0 \
+        || effect.owner == task_pid(current))
 
 #define TMFF_CHECK_ID(id)	((id) >= 0 && (id) < TMFF_EFFECTS)
 
@@ -255,7 +255,7 @@
 		erased. But a process cannot close its dev/input/eventX fd
 		and perform ioctls on the same fd all at the same time */
 
-		if (current->pid == tmff->effects[i].owner
+		if (task_pid(current) == tmff->effects[i].owner
 		     && test_bit(EFFECT_USED, tmff->effects[i].flags))
 			if (hid_tmff_erase(dev, i))
 				warn("erase effect %d failed", i);
@@ -310,7 +310,7 @@
 		}
 
 		effect->id = id;
-		tmff->effects[id].owner = current->pid;
+		tmff->effects[id].owner = task_pid(current);
 		tmff->effects[id].flags[0] = 0;
 		set_bit(EFFECT_USED, tmff->effects[id].flags);
 
Index: linux-2.6.15/drivers/usb/input/pid.c
===================================================================
--- linux-2.6.15.orig/drivers/usb/input/pid.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/usb/input/pid.c	2006-01-17 08:36:54.000000000 -0500
@@ -40,8 +40,8 @@
 #define CHECK_OWNERSHIP(i, hid_pid)	\
 	((i) < FF_EFFECTS_MAX && i >= 0 && \
 	test_bit(FF_PID_FLAGS_USED, &hid_pid->effects[(i)].flags) && \
-	(current->pid == 0 || \
-	(hid_pid)->effects[(i)].owner == current->pid))
+	(task_pid(current) == 0 || \
+	(hid_pid)->effects[(i)].owner == task_pid(current)))
 
 /* Called when a transfer is completed */
 static void hid_pid_ctrl_out(struct urb *u, struct pt_regs *regs)
@@ -151,7 +151,7 @@
 	   and perform ioctls on the same fd all at the same time */
 	/*FIXME: multiple threads, anyone? */
 	for (i = 0; i < dev->ff_effects_max; ++i)
-		if (current->pid == pid->effects[i].owner
+		if (task_pid(current) == pid->effects[i].owner
 		    && test_bit(FF_PID_FLAGS_USED, &pid->effects[i].flags))
 			if (hid_pid_erase(dev, i))
 				dev_warn(&hid->dev->dev, "erase effect %d failed", i);
@@ -197,7 +197,7 @@
 
 		effect->id = id;
 		dev_dbg(&pid_private->hid->dev->dev, "effect ID is %d.\n", id);
-		pid_private->effects[id].owner = current->pid;
+		pid_private->effects[id].owner = task_pid(current);
 		pid_private->effects[id].flags = (1 << FF_PID_FLAGS_USED);
 		spin_unlock_irqrestore(&pid_private->lock, flags);
 
Index: linux-2.6.15/drivers/usb/core/devio.c
===================================================================
--- linux-2.6.15.orig/drivers/usb/core/devio.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/drivers/usb/core/devio.c	2006-01-17 08:36:54.000000000 -0500
@@ -466,7 +466,7 @@
 		return 0;
 	/* if not yet claimed, claim it for the driver */
 	dev_warn(&ps->dev->dev, "usbfs: process %d (%s) did not claim interface %u before use\n",
-	       current->pid, current->comm, ifnum);
+	       task_pid(current), current->comm, ifnum);
 	return claimintf(ps, ifnum);
 }
 
@@ -572,7 +572,7 @@
 	INIT_LIST_HEAD(&ps->async_completed);
 	init_waitqueue_head(&ps->wait);
 	ps->discsignr = 0;
-	ps->disc_pid = current->pid;
+	ps->disc_pid = task_pid(current);
 	ps->disc_uid = current->uid;
 	ps->disc_euid = current->euid;
 	ps->disccontext = NULL;
@@ -1055,7 +1055,7 @@
 		as->userbuffer = NULL;
 	as->signr = uurb->signr;
 	as->ifnum = ifnum;
-	as->pid = current->pid;
+	as->pid = task_pid(current);
 	as->uid = current->uid;
 	as->euid = current->euid;
 	if (!(uurb->endpoint & USB_DIR_IN)) {
Index: linux-2.6.15/block/ll_rw_blk.c
===================================================================
--- linux-2.6.15.orig/block/ll_rw_blk.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/block/ll_rw_blk.c	2006-01-17 08:36:54.000000000 -0500
@@ -2923,7 +2923,7 @@
 	if (unlikely(block_dump)) {
 		char b[BDEVNAME_SIZE];
 		printk(KERN_DEBUG "%s(%d): %s block %Lu on %s\n",
-			current->comm, current->pid,
+			current->comm, task_pid(current),
 			(rw & WRITE) ? "WRITE" : "READ",
 			(unsigned long long)bio->bi_sector,
 			bdevname(bio->bi_bdev,b));
Index: linux-2.6.15/block/cfq-iosched.c
===================================================================
--- linux-2.6.15.orig/block/cfq-iosched.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/block/cfq-iosched.c	2006-01-17 08:36:54.000000000 -0500
@@ -619,7 +619,7 @@
 static struct request *cfq_find_rq_rb(struct cfq_data *cfqd, sector_t sector)
 
 {
-	struct cfq_queue *cfqq = cfq_find_cfq_hash(cfqd, current->pid, CFQ_KEY_ANY);
+	struct cfq_queue *cfqq = cfq_find_cfq_hash(cfqd, task_pid(current), CFQ_KEY_ANY);
 	struct rb_node *n;
 
 	if (!cfqq)
@@ -1788,7 +1788,7 @@
 static inline pid_t cfq_queue_pid(struct task_struct *task, int rw)
 {
 	if (rw == READ || process_sync(task))
-		return task->pid;
+		return task_pid(task);
 
 	return CFQ_KEY_ASYNC;
 }

--



* RFC [patch 02/34] PID Virtualization Change pid accesses: most archs
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
  2006-01-17 14:32 ` RFC [patch 01/34] PID Virtualization Change pid accesses: drivers Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 03/34] PID Virtualization Change pid accesses: filesystems Serge Hallyn
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B1-change-pid-tgid-references-arches --]
[-- Type: text/plain, Size: 86463 bytes --]

Change direct pid accesses (e.g. current->pid, tsk->pid) to go through
the task_pid() accessor for most architectures.
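
For readers without the rest of the series at hand, here is a minimal
userspace sketch of the conversion pattern. It assumes task_pid() and
task_tgid() are thin inline wrappers around the task_struct fields; their
real definitions are not part of this patch, so the struct, the accessor
bodies, and format_task() below are illustrative stand-ins only.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical userspace stand-in for the kernel's struct task_struct. */
struct task_struct {
	pid_t pid;
	pid_t tgid;
	char comm[16];
};

/* Assumed accessor shape: plain wrappers, no virtualization yet. */
static inline pid_t task_pid(const struct task_struct *tsk)
{
	return tsk->pid;
}

static inline pid_t task_tgid(const struct task_struct *tsk)
{
	return tsk->tgid;
}

/*
 * Illustrative caller in the style of the converted printk() sites:
 * it reads the pid through task_pid() instead of touching tsk->pid.
 * Returns the number of characters snprintf() would have written.
 */
static int format_task(const struct task_struct *tsk, char *buf, size_t len)
{
	assert(tsk != NULL);
	return snprintf(buf, len, "%s(%d)", tsk->comm, (int)task_pid(tsk));
}
```

Routing every read through one accessor means a later patch can change
what the accessor returns without revisiting each call site.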

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 arch/alpha/kernel/semaphore.c         |   16 ++++++++--------
 arch/alpha/kernel/signal.c            |    4 ++--
 arch/alpha/kernel/traps.c             |    6 +++---
 arch/alpha/mm/fault.c                 |    4 ++--
 arch/arm/kernel/process.c             |    2 +-
 arch/arm/kernel/ptrace.c              |    4 ++--
 arch/arm/kernel/traps.c               |   10 +++++-----
 arch/arm/mm/alignment.c               |    2 +-
 arch/arm/mm/fault.c                   |    2 +-
 arch/arm/nwfpe/fpmodule.c             |    2 +-
 arch/arm26/kernel/ptrace.c            |    2 +-
 arch/arm26/kernel/traps.c             |   12 ++++++------
 arch/arm26/mm/fault.c                 |    2 +-
 arch/arm26/nwfpe/fpmodule.c           |    2 +-
 arch/cris/arch-v10/kernel/traps.c     |    2 +-
 arch/cris/arch-v32/kernel/process.c   |    2 +-
 arch/cris/arch-v32/kernel/ptrace.c    |    4 ++--
 arch/cris/arch-v32/kernel/signal.c    |    2 +-
 arch/cris/arch-v32/kernel/traps.c     |    2 +-
 arch/cris/kernel/profile.c            |    2 +-
 arch/frv/kernel/gdb-stub.c            |    2 +-
 arch/frv/kernel/ptrace.c              |   10 +++++-----
 arch/frv/kernel/semaphore.c           |    2 +-
 arch/frv/kernel/signal.c              |    4 ++--
 arch/frv/kernel/traps.c               |    4 ++--
 arch/frv/mm/fault.c                   |    4 ++--
 arch/h8300/kernel/traps.c             |    2 +-
 arch/i386/kernel/crash.c              |    2 +-
 arch/i386/kernel/process.c            |    2 +-
 arch/i386/kernel/signal.c             |    4 ++--
 arch/i386/kernel/traps.c              |    2 +-
 arch/i386/lib/usercopy.c              |    2 +-
 arch/i386/mm/fault.c                  |    2 +-
 arch/m32r/kernel/process.c            |    8 ++++----
 arch/m32r/kernel/signal.c             |    2 +-
 arch/m32r/kernel/traps.c              |    2 +-
 arch/m32r/mm/fault.c                  |    2 +-
 arch/m68k/kernel/traps.c              |    6 +++---
 arch/m68k/mac/macints.c               |    2 +-
 arch/m68k/mm/fault.c                  |    2 +-
 arch/m68knommu/kernel/process.c       |    2 +-
 arch/m68knommu/kernel/time.c          |    2 +-
 arch/m68knommu/kernel/traps.c         |    4 ++--
 arch/m68knommu/platform/5307/timers.c |    2 +-
 arch/parisc/kernel/signal.c           |    4 ++--
 arch/parisc/kernel/smp.c              |    5 +++--
 arch/parisc/kernel/sys_parisc32.c     |    2 +-
 arch/parisc/kernel/traps.c            |   12 ++++++------
 arch/parisc/kernel/unaligned.c        |    2 +-
 arch/parisc/mm/fault.c                |    2 +-
 arch/powerpc/kernel/process.c         |    2 +-
 arch/powerpc/kernel/traps.c           |    4 ++--
 arch/powerpc/mm/fault.c               |    2 +-
 arch/powerpc/platforms/pseries/ras.c  |    4 ++--
 arch/powerpc/xmon/xmon.c              |    2 +-
 arch/ppc/kernel/process.c             |    6 +++---
 arch/ppc/kernel/softemu8xx.c          |    2 +-
 arch/ppc/kernel/traps.c               |    4 ++--
 arch/ppc/lib/locks.c                  |    6 +++---
 arch/ppc/mm/fault.c                   |    2 +-
 arch/ppc/xmon/xmon.c                  |    2 +-
 arch/s390/kernel/asm-offsets.c        |    2 +-
 arch/s390/kernel/process.c            |    2 +-
 arch/s390/math-emu/math.c             |    2 +-
 arch/s390/mm/fault.c                  |    2 +-
 arch/sh/kernel/process.c              |    2 +-
 arch/sh/kernel/signal.c               |    4 ++--
 arch/sh/kernel/traps.c                |    2 +-
 arch/sh/mm/fault.c                    |    2 +-
 arch/sh64/kernel/process.c            |    2 +-
 arch/sh64/kernel/signal.c             |    4 ++--
 arch/sh64/kernel/traps.c              |    4 ++--
 arch/sh64/lib/dbg.c                   |   12 ++++++------
 arch/sh64/mm/fault.c                  |   10 +++++-----
 arch/sparc/kernel/process.c           |    2 +-
 arch/sparc/kernel/ptrace.c            |    8 ++++----
 arch/sparc/kernel/setup.c             |    2 +-
 arch/sparc/kernel/sys_sparc.c         |    2 +-
 arch/sparc/kernel/sys_sunos.c         |    2 +-
 arch/sparc/kernel/traps.c             |    4 ++--
 arch/sparc/mm/fault.c                 |    6 +++---
 arch/sparc64/kernel/process.c         |    2 +-
 arch/sparc64/kernel/setup.c           |    2 +-
 arch/sparc64/kernel/sys_sunos32.c     |    2 +-
 arch/sparc64/kernel/traps.c           |    2 +-
 arch/sparc64/solaris/ioctl.c          |    4 ++--
 arch/um/kernel/process_kern.c         |    4 ++--
 arch/um/kernel/skas/process_kern.c    |    4 ++--
 arch/um/kernel/trap_kern.c            |    2 +-
 arch/um/sys-x86_64/sysrq.c            |    2 +-
 arch/v850/kernel/bug.c                |    4 ++--
 arch/v850/kernel/signal.c             |    4 ++--
 arch/x86_64/ia32/ia32_signal.c        |    4 ++--
 arch/x86_64/ia32/ptrace32.c           |    2 +-
 arch/x86_64/kernel/asm-offsets.c      |    2 +-
 arch/x86_64/kernel/mce.c              |    2 +-
 arch/x86_64/kernel/process.c          |    2 +-
 arch/x86_64/kernel/signal.c           |   10 +++++-----
 arch/x86_64/kernel/traps.c            |    6 +++---
 arch/x86_64/mm/fault.c                |    8 ++++----
 arch/xtensa/kernel/signal.c           |    4 ++--
 arch/xtensa/kernel/syscalls.c         |    4 ++--
 arch/xtensa/kernel/traps.c            |    6 +++---
 arch/xtensa/mm/fault.c                |    4 ++--
 drivers/s390/crypto/z90main.c         |    2 +-
 105 files changed, 194 insertions(+), 193 deletions(-)

Index: linux-2.6.15/arch/i386/kernel/crash.c
===================================================================
--- linux-2.6.15.orig/arch/i386/kernel/crash.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/i386/kernel/crash.c	2006-01-17 08:36:55.000000000 -0500
@@ -74,7 +74,7 @@
 	 */
 	buf = &crash_notes[cpu][0];
 	memset(&prstatus, 0, sizeof(prstatus));
-	prstatus.pr_pid = current->pid;
+	prstatus.pr_pid = task_pid(current);
 	elf_core_copy_regs(&prstatus.pr_reg, regs);
 	buf = append_elf_note(buf, "CORE", NT_PRSTATUS, &prstatus,
 				sizeof(prstatus));
Index: linux-2.6.15/arch/i386/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/i386/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/i386/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -290,7 +290,7 @@
 	unsigned long cr0 = 0L, cr2 = 0L, cr3 = 0L, cr4 = 0L;
 
 	printk("\n");
-	printk("Pid: %d, comm: %20s\n", current->pid, current->comm);
+	printk("Pid: %d, comm: %20s\n", task_pid(current), current->comm);
 	printk("EIP: %04x:[<%08lx>] CPU: %d\n",0xffff & regs->xcs,regs->eip, smp_processor_id());
 	print_symbol("EIP is at %s\n", regs->eip);
 
Index: linux-2.6.15/arch/i386/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/i386/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/i386/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -430,7 +430,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->eip, frame->pretcode);
+		current->comm, task_pid(current), frame, regs->eip, frame->pretcode);
 #endif
 
 	return 1;
@@ -524,7 +524,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->eip, frame->pretcode);
+		current->comm, task_pid(current), frame, regs->eip, frame->pretcode);
 #endif
 
 	return 1;
Index: linux-2.6.15/arch/i386/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/i386/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/i386/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -228,7 +228,7 @@
 	printk("ds: %04x   es: %04x   ss: %04x\n",
 		regs->xds & 0xffff, regs->xes & 0xffff, ss);
 	printk("Process %s (pid: %d, threadinfo=%p task=%p)",
-		current->comm, current->pid, current_thread_info(), current);
+		current->comm, task_pid(current), current_thread_info(), current);
 	/*
 	 * When in-kernel, we also print out the stack and code at the
 	 * time of the fault..
Index: linux-2.6.15/arch/i386/lib/usercopy.c
===================================================================
--- linux-2.6.15.orig/arch/i386/lib/usercopy.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/i386/lib/usercopy.c	2006-01-17 08:36:55.000000000 -0500
@@ -543,7 +543,7 @@
 			retval = get_user_pages(current, current->mm,
 					(unsigned long )to, 1, 1, 0, &pg, NULL);
 
-			if (retval == -ENOMEM && current->pid == 1) {
+			if (retval == -ENOMEM && task_pid(current) == 1) {
 				up_read(&current->mm->mmap_sem);
 				blk_congestion_wait(WRITE, HZ/50);
 				goto survive;
Index: linux-2.6.15/arch/i386/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/i386/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/i386/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -485,7 +485,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (tsk->pid == 1) {
+	if (task_pid(tsk) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/alpha/kernel/semaphore.c
===================================================================
--- linux-2.6.15.orig/arch/alpha/kernel/semaphore.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/alpha/kernel/semaphore.c	2006-01-17 08:36:55.000000000 -0500
@@ -69,7 +69,7 @@
 
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down failed(%p)\n",
-	       tsk->comm, tsk->pid, sem);
+	       tsk->comm, task_pid(tsk), sem);
 #endif
 
 	tsk->state = TASK_UNINTERRUPTIBLE;
@@ -98,7 +98,7 @@
 
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down acquired(%p)\n",
-	       tsk->comm, tsk->pid, sem);
+	       tsk->comm, task_pid(tsk), sem);
 #endif
 }
 
@@ -111,7 +111,7 @@
 
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down failed(%p)\n",
-	       tsk->comm, tsk->pid, sem);
+	       tsk->comm, task_pid(tsk), sem);
 #endif
 
 	tsk->state = TASK_INTERRUPTIBLE;
@@ -139,7 +139,7 @@
 
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down %s(%p)\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       (ret < 0 ? "interrupted" : "acquired"), sem);
 #endif
 	return ret;
@@ -168,7 +168,7 @@
 #endif
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down(%p) <count=%d> from %p\n",
-	       current->comm, current->pid, sem,
+	       current->comm, task_pid(current), sem,
 	       atomic_read(&sem->count), __builtin_return_address(0));
 #endif
 	__down(sem);
@@ -182,7 +182,7 @@
 #endif
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down(%p) <count=%d> from %p\n",
-	       current->comm, current->pid, sem,
+	       current->comm, task_pid(current), sem,
 	       atomic_read(&sem->count), __builtin_return_address(0));
 #endif
 	return __down_interruptible(sem);
@@ -201,7 +201,7 @@
 
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): down_trylock %s from %p\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       ret ? "failed" : "acquired",
 	       __builtin_return_address(0));
 #endif
@@ -217,7 +217,7 @@
 #endif
 #ifdef CONFIG_DEBUG_SEMAPHORE
 	printk("%s(%d): up(%p) <count=%d> from %p\n",
-	       current->comm, current->pid, sem,
+	       current->comm, task_pid(current), sem,
 	       atomic_read(&sem->count), __builtin_return_address(0));
 #endif
 	__up(sem);
Index: linux-2.6.15/arch/alpha/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/alpha/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/alpha/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -479,7 +479,7 @@
 	
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->pc, regs->r26);
+		current->comm, task_pid(current), frame, regs->pc, regs->r26);
 #endif
 
 	return;
@@ -541,7 +541,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->pc, regs->r26);
+		current->comm, task_pid(current), frame, regs->pc, regs->r26);
 #endif
 
 	return;
Index: linux-2.6.15/arch/alpha/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/alpha/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/alpha/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -183,7 +183,7 @@
 #ifdef CONFIG_SMP
 	printk("CPU %d ", hard_smp_processor_id());
 #endif
-	printk("%s(%d): %s %ld\n", current->comm, current->pid, str, err);
+	printk("%s(%d): %s %ld\n", current->comm, task_pid(current), str, err);
 	dik_show_regs(regs, r9_15);
 	dik_show_trace((unsigned long *)(regs+1));
 	dik_show_code((unsigned int *)regs->pc);
@@ -646,7 +646,7 @@
 	lock_kernel();
 
 	printk("%s(%d): unhandled unaligned exception\n",
-	       current->comm, current->pid);
+	       current->comm, task_pid(current));
 
 	printk("pc = [<%016lx>]  ra = [<%016lx>]  ps = %04lx\n",
 	       pc, una_reg(26), regs->ps);
@@ -786,7 +786,7 @@
 		}
 		if (++cnt < 5) {
 			printk("%s(%d): unaligned trap at %016lx: %p %lx %ld\n",
-			       current->comm, current->pid,
+			       current->comm, task_pid(current),
 			       regs->pc - 4, va, opcode, reg);
 		}
 		last_time = jiffies;
Index: linux-2.6.15/arch/alpha/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/alpha/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/alpha/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -194,13 +194,13 @@
 	/* We ran out of memory, or some other thing happened to us that
 	   made us unable to handle the page fault gracefully.  */
  out_of_memory:
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
 	}
 	printk(KERN_ALERT "VM: killing process %s(%d)\n",
-	       current->comm, current->pid);
+	       current->comm, task_pid(current));
 	if (!user_mode(regs))
 		goto no_context;
 	do_exit(SIGKILL);
Index: linux-2.6.15/arch/arm/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/arm/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -227,7 +227,7 @@
 void show_regs(struct pt_regs * regs)
 {
 	printk("\n");
-	printk("Pid: %d, comm: %20s\n", current->pid, current->comm);
+	printk("Pid: %d, comm: %20s\n", task_pid(current), current->comm);
 	__show_regs(regs);
 	__backtrace();
 }
Index: linux-2.6.15/arch/arm/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/arm/kernel/ptrace.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm/kernel/ptrace.c	2006-01-17 08:36:55.000000000 -0500
@@ -401,7 +401,7 @@
 
 		if (ret != 2 || old_insn.thumb != BREAKINST_THUMB)
 			printk(KERN_ERR "%s:%d: corrupted Thumb breakpoint at "
-				"0x%08lx (0x%04x)\n", task->comm, task->pid,
+				"0x%08lx (0x%04x)\n", task->comm, task_pid(task),
 				addr, old_insn.thumb);
 	} else {
 		ret = swap_insn(task, addr & ~3, &old_insn.arm,
@@ -409,7 +409,7 @@
 
 		if (ret != 4 || old_insn.arm != BREAKINST_ARM)
 			printk(KERN_ERR "%s:%d: corrupted ARM breakpoint at "
-				"0x%08lx (0x%08x)\n", task->comm, task->pid,
+				"0x%08lx (0x%08x)\n", task->comm, task_pid(task),
 				addr, old_insn.arm);
 	}
 }
Index: linux-2.6.15/arch/arm/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/arm/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -207,7 +207,7 @@
 	print_modules();
 	__show_regs(regs);
 	printk("Process %s (pid: %d, stack limit = 0x%p)\n",
-		tsk->comm, tsk->pid, thread + 1);
+		tsk->comm, task_pid(tsk), thread + 1);
 
 	if (!user_mode(regs) || in_interrupt()) {
 		dump_mem("Stack: ", regs->ARM_sp,
@@ -306,7 +306,7 @@
 #ifdef CONFIG_DEBUG_USER
 	if (user_debug & UDBG_UNDEFINED) {
 		printk(KERN_INFO "%s (%d): undefined instruction: pc=%p\n",
-			current->comm, current->pid, pc);
+			current->comm, task_pid(current), pc);
 		dump_instr(regs);
 	}
 #endif
@@ -360,7 +360,7 @@
 #ifdef CONFIG_DEBUG_USER
 	if (user_debug & UDBG_SYSCALL) {
 		printk(KERN_ERR "[%d] %s: obsolete system call %08x.\n",
-			current->pid, current->comm, n);
+			task_pid(current), current->comm, n);
 		dump_instr(regs);
 	}
 #endif
@@ -537,7 +537,7 @@
 	 */
 	if (user_debug & UDBG_SYSCALL) {
 		printk("[%d] %s: arm syscall %d\n",
-		       current->pid, current->comm, no);
+		       task_pid(current), current->comm, no);
 		dump_instr(regs);
 		if (user_mode(regs)) {
 			__show_regs(regs);
@@ -614,7 +614,7 @@
 #ifdef CONFIG_DEBUG_USER
 	if (user_debug & UDBG_BADABORT) {
 		printk(KERN_ERR "[%d] %s: bad data abort: code %d instr 0x%08lx\n",
-			current->pid, current->comm, code, instr);
+			task_pid(current), current->comm, code, instr);
 		dump_instr(regs);
 		show_pte(current->mm, addr);
 	}
Index: linux-2.6.15/arch/arm/mm/alignment.c
===================================================================
--- linux-2.6.15.orig/arch/arm/mm/alignment.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm/mm/alignment.c	2006-01-17 08:36:55.000000000 -0500
@@ -759,7 +759,7 @@
 	if (ai_usermode & 1)
 		printk("Alignment trap: %s (%d) PC=0x%08lx Instr=0x%0*lx "
 		       "Address=0x%08lx FSR 0x%03x\n", current->comm,
-			current->pid, instrptr,
+			task_pid(current), instrptr,
 		        thumb_mode(regs) ? 4 : 8,
 		        thumb_mode(regs) ? tinstr : instr,
 		        addr, fsr);
Index: linux-2.6.15/arch/arm/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/arm/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -198,7 +198,7 @@
 		return fault;
 	}
 
-	if (tsk->pid != 1)
+	if (task_pid(tsk) != 1)
 		goto out;
 
 	/*
Index: linux-2.6.15/arch/arm/nwfpe/fpmodule.c
===================================================================
--- linux-2.6.15.orig/arch/arm/nwfpe/fpmodule.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm/nwfpe/fpmodule.c	2006-01-17 08:36:55.000000000 -0500
@@ -131,7 +131,7 @@
  	if (flags & ~BIT_IXC)
  		printk(KERN_DEBUG
 		       "NWFPE: %s[%d] takes exception %08x at %p from %08lx\n",
-		       current->comm, current->pid, flags,
+		       current->comm, task_pid(current), flags,
 		       __builtin_return_address(0), GET_USERREG()->ARM_pc);
 #endif
 
Index: linux-2.6.15/arch/arm26/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/arm26/kernel/ptrace.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm26/kernel/ptrace.c	2006-01-17 08:36:55.000000000 -0500
@@ -366,7 +366,7 @@
 
 	if (ret != 4 || old_insn != BREAKINST_ARM)
 		printk(KERN_ERR "%s:%d: corrupted ARM breakpoint at "
-			"0x%08lx (0x%08x)\n", task->comm, task->pid,
+			"0x%08lx (0x%08x)\n", task->comm, task_pid(task),
 			addr, old_insn);
 }
 
Index: linux-2.6.15/arch/arm26/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/arm26/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm26/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -187,7 +187,7 @@
 	printk("CPU: %d\n", smp_processor_id());
 	show_regs(regs);
 	printk("Process %s (pid: %d, stack limit = 0x%p)\n",
-		current->comm, current->pid, tsk->thread_info + 1);
+		current->comm, task_pid(current), tsk->thread_info + 1);
 
 	if (!user_mode(regs) || in_interrupt()) {
 		__dump_stack(tsk, (unsigned long)(regs + 1));
@@ -276,7 +276,7 @@
 
 #ifdef CONFIG_DEBUG_USER
 	printk(KERN_INFO "%s (%d): undefined instruction: pc=%p\n",
-		current->comm, current->pid, pc);
+		current->comm, task_pid(current), pc);
 	dump_instr(regs);
 #endif
 
@@ -299,7 +299,7 @@
 
 #ifdef CONFIG_DEBUG_USER
 	printk(KERN_INFO "%s (%d): address exception: pc=%08lx\n",
-		current->comm, current->pid, instruction_pointer(regs));
+		current->comm, task_pid(current), instruction_pointer(regs));
 	dump_instr(regs);
 #endif
 
@@ -363,7 +363,7 @@
 
 #ifdef CONFIG_DEBUG_USER
 	printk(KERN_ERR "[%d] %s: obsolete system call %08x.\n",
-		current->pid, current->comm, n);
+		task_pid(current), current->comm, n);
 	dump_instr(regs);
 #endif
 
@@ -442,7 +442,7 @@
 	 * experience shows that these seem to indicate that
 	 * something catastrophic has happened
 	 */
-	printk("[%d] %s: arm syscall %d\n", current->pid, current->comm, no);
+	printk("[%d] %s: arm syscall %d\n", task_pid(current), current->comm, no);
 	dump_instr(regs);
 	if (user_mode(regs)) {
 		show_regs(regs);
@@ -478,7 +478,7 @@
 
 #ifdef CONFIG_DEBUG_USER
 	printk(KERN_ERR "[%d] %s: bad data abort: code %d instr 0x%08lx\n",
-		current->pid, current->comm, code, instr);
+		task_pid(current), current->comm, code, instr);
 	dump_instr(regs);
 	show_pte(current->mm, addr);
 #endif
Index: linux-2.6.15/arch/arm26/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/arm26/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm26/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -186,7 +186,7 @@
 	}
 
 	fault = -3; /* out of memory */
-	if (tsk->pid != 1)
+	if (task_pid(tsk) != 1)
 		goto out;
 
 	/*
Index: linux-2.6.15/arch/arm26/nwfpe/fpmodule.c
===================================================================
--- linux-2.6.15.orig/arch/arm26/nwfpe/fpmodule.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/arm26/nwfpe/fpmodule.c	2006-01-17 08:36:55.000000000 -0500
@@ -145,7 +145,7 @@
   
 #ifdef CONFIG_DEBUG_USER
   printk(KERN_DEBUG "NWFPE: %s[%d] takes exception %08x at %p from %08x\n",
-	 current->comm, current->pid, flags,
+	 current->comm, task_pid(current), flags,
 	 __builtin_return_address(0), GET_USERREG()[15]);
 #endif
 
Index: linux-2.6.15/arch/cris/arch-v10/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/cris/arch-v10/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/cris/arch-v10/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -40,7 +40,7 @@
 	       regs->r12, regs->r13, regs->orig_r10, regs);
 	raw_printk("R_MMU_CAUSE: %08lx\n", (unsigned long)*R_MMU_CAUSE);
 	raw_printk("Process %s (pid: %d, stackpage=%08lx)\n",
-	       current->comm, current->pid, (unsigned long)current);
+	       current->comm, task_pid(current), (unsigned long)current);
 
 	/*
          * When in-kernel, we also print out the stack and code at the
Index: linux-2.6.15/arch/cris/arch-v32/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/cris/arch-v32/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/cris/arch-v32/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -45,7 +45,7 @@
 extern void deconfigure_bp(long pid);
 void exit_thread(void)
 {
-	deconfigure_bp(current->pid);
+	deconfigure_bp(task_pid(current));
 }
 
 /*
Index: linux-2.6.15/arch/cris/arch-v32/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/cris/arch-v32/kernel/ptrace.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/cris/arch-v32/kernel/ptrace.c	2006-01-17 08:36:55.000000000 -0500
@@ -52,7 +52,7 @@
 	else if (regno == PT_PPC)
 		ret = get_pseudo_pc(task);
 	else if (regno <= PT_MAX)
-		ret = get_debugreg(task->pid, regno);
+		ret = get_debugreg(task_pid(task), regno);
 	else
 		ret = 0;
 
@@ -73,7 +73,7 @@
 		if (data != get_pseudo_pc(task))
 			((unsigned long *)user_regs(task->thread_info))[PT_ERP] = data;
 	} else if (regno <= PT_MAX)
-		return put_debugreg(task->pid, regno, data);
+		return put_debugreg(task_pid(task), regno, data);
 	else
 		return -1;
 	return 0;
Index: linux-2.6.15/arch/cris/arch-v32/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/cris/arch-v32/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/cris/arch-v32/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -648,7 +648,7 @@
 		if (!(user_regs(ti)->erp & 0x1))
 			user_regs(ti)->erp -= 2;
 	}
-	sys_kill(ti->task->pid, sig);
+	sys_kill(task_pid(ti->task), sig);
 }
 
 void
Index: linux-2.6.15/arch/cris/arch-v32/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/cris/arch-v32/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/cris/arch-v32/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -57,7 +57,7 @@
 	raw_printk("Instruction MMU Cause: %08lx\n", i_mmu_cause);
 
 	raw_printk("Process %s (pid: %d, stackpage: %08lx)\n",
-		current->comm, current->pid, (unsigned long) current);
+		current->comm, task_pid(current), (unsigned long) current);
 
 	/* Show additional info if in kernel-mode. */
 	if (!user_mode(regs)) {
Index: linux-2.6.15/arch/cris/kernel/profile.c
===================================================================
--- linux-2.6.15.orig/arch/cris/kernel/profile.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/cris/kernel/profile.c	2006-01-17 08:36:55.000000000 -0500
@@ -18,7 +18,7 @@
   if (!prof_running)
     return;
   if (user_mode(regs))
-    *(unsigned int*)sample_buffer_pos = current->pid;
+    *(unsigned int*)sample_buffer_pos = task_pid(current);
   else
     *(unsigned int*)sample_buffer_pos = 0;
   *(unsigned int*)(sample_buffer_pos + 4) = instruction_pointer(regs);
Index: linux-2.6.15/arch/frv/kernel/gdb-stub.c
===================================================================
--- linux-2.6.15.orig/arch/frv/kernel/gdb-stub.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/frv/kernel/gdb-stub.c	2006-01-17 08:36:55.000000000 -0500
@@ -1182,7 +1182,7 @@
 			printk(" | ");
 	}
 
-	gdbstub_printk("Process %s (pid: %d)\n", current->comm, current->pid);
+	gdbstub_printk("Process %s (pid: %d)\n", current->comm, task_pid(current));
 } /* end gdbstub_show_regs() */
 
 /*****************************************************************************/
Index: linux-2.6.15/arch/frv/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/frv/kernel/ptrace.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/frv/kernel/ptrace.c	2006-01-17 08:36:55.000000000 -0500
@@ -639,7 +639,7 @@
 	if (!leaving) {
 		if (!argmask) {
 			printk(KERN_CRIT "[%d] %s(%lx,%lx,%lx,%lx,%lx,%lx)\n",
-			       current->pid,
+			       task_pid(current),
 			       name,
 			       __frame->gr8,
 			       __frame->gr9,
@@ -650,12 +650,12 @@
 		}
 		else if (argmask == 0xffffff) {
 			printk(KERN_CRIT "[%d] %s()\n",
-			       current->pid,
+			       task_pid(current),
 			       name);
 		}
 		else {
 			printk(KERN_CRIT "[%d] %s(",
-			       current->pid,
+			       task_pid(current),
 			       name);
 
 			argp = &__frame->gr8;
@@ -691,9 +691,9 @@
 	}
 	else {
 		if ((int)__frame->gr8 > -4096 && (int)__frame->gr8 < 4096)
-			printk(KERN_CRIT "[%d] %s() = %ld\n", current->pid, name, __frame->gr8);
+			printk(KERN_CRIT "[%d] %s() = %ld\n", task_pid(current), name, __frame->gr8);
 		else
-			printk(KERN_CRIT "[%d] %s() = %lx\n", current->pid, name, __frame->gr8);
+			printk(KERN_CRIT "[%d] %s() = %lx\n", task_pid(current), name, __frame->gr8);
 	}
 	return;
 #endif
Index: linux-2.6.15/arch/frv/kernel/semaphore.c
===================================================================
--- linux-2.6.15.orig/arch/frv/kernel/semaphore.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/frv/kernel/semaphore.c	2006-01-17 08:36:55.000000000 -0500
@@ -25,7 +25,7 @@
 {
 	if (sem->debug)
 		printk("[%d] %s({%d,%d})\n",
-		       current->pid,
+		       task_pid(current),
 		       str,
 		       sem->counter,
 		       list_empty(&sem->wait_list) ? 0 : 1);
Index: linux-2.6.15/arch/frv/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/frv/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/frv/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -364,7 +364,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver %d (%s:%d): sp=%p pc=%lx ra=%p\n",
-		sig, current->comm, current->pid, frame, regs->pc, frame->pretcode);
+		sig, current->comm, task_pid(current), frame, regs->pc, frame->pretcode);
 #endif
 
 	return;
@@ -459,7 +459,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver %d (%s:%d): sp=%p pc=%lx ra=%p\n",
-		sig, current->comm, current->pid, frame, regs->pc, frame->pretcode);
+		sig, current->comm, task_pid(current), frame, regs->pc, frame->pretcode);
 #endif
 
 	return;
Index: linux-2.6.15/arch/frv/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/frv/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/frv/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -296,7 +296,7 @@
 			printk(" | ");
 	}
 
-	printk("Process %s (pid: %d)\n", current->comm, current->pid);
+	printk("Process %s (pid: %d)\n", current->comm, task_pid(current));
 }
 
 void die_if_kernel(const char *str, ...)
@@ -365,7 +365,7 @@
 		stop = (unsigned long) frame;
 	}
 
-	printk("\nProcess %s (pid: %d)\n\n", current->comm, current->pid);
+	printk("\nProcess %s (pid: %d)\n\n", current->comm, task_pid(current));
 
 	for (;;) {
 		/* dump stack segment between frames */
Index: linux-2.6.15/arch/frv/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/frv/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/frv/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -101,10 +101,10 @@
 		if ((ear0 & PAGE_MASK) + 2 * PAGE_SIZE < __frame->sp) {
 #if 0
 			printk("[%d] ### Access below stack @%lx (sp=%lx)\n",
-			       current->pid, ear0, __frame->sp);
+			       task_pid(current), ear0, __frame->sp);
 			show_registers(__frame);
 			printk("[%d] ### Code: [%08lx] %02x %02x %02x %02x %02x %02x %02x %02x\n",
-			       current->pid,
+			       task_pid(current),
 			       __frame->pc,
 			       ((u8*)__frame->pc)[0],
 			       ((u8*)__frame->pc)[1],
Index: linux-2.6.15/arch/h8300/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/h8300/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/h8300/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -56,7 +56,7 @@
 	int		i;
 
 	printk("\nCURRENT PROCESS:\n\n");
-	printk("COMM=%s PID=%d\n", current->comm, current->pid);
+	printk("COMM=%s PID=%d\n", current->comm, task_pid(current));
 	if (current->mm) {
 		printk("TEXT=%08x-%08x DATA=%08x-%08x BSS=%08x-%08x\n",
 			(int) current->mm->start_code,
Index: linux-2.6.15/arch/m32r/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/m32r/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m32r/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -214,19 +214,19 @@
 void exit_thread(void)
 {
 	/* Nothing to do. */
-	DPRINTK("pid = %d\n", current->pid);
+	DPRINTK("pid = %d\n", task_pid(current));
 }
 
 void flush_thread(void)
 {
-	DPRINTK("pid = %d\n", current->pid);
+	DPRINTK("pid = %d\n", task_pid(current));
 	memset(&current->thread.debug_trap, 0, sizeof(struct debug_trap));
 }
 
 void release_thread(struct task_struct *dead_task)
 {
 	/* do nothing */
-	DPRINTK("pid = %d\n", dead_task->pid);
+	DPRINTK("pid = %d\n", task_pid(dead_task));
 }
 
 /* Fill in the fpu structure for a core dump.. */
@@ -249,7 +249,7 @@
 
 	childregs->spu = spu;
 	childregs->r0 = 0;	/* Child gets zero as return value */
-	regs->r0 = tsk->pid;
+	regs->r0 = task_pid(tsk);
 	tsk->thread.sp = (unsigned long)childregs;
 	tsk->thread.lr = (unsigned long)ret_from_fork;
 
Index: linux-2.6.15/arch/m32r/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/m32r/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m32r/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -294,7 +294,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p\n",
-		current->comm, current->pid, frame, regs->pc);
+		current->comm, task_pid(current), frame, regs->pc);
 #endif
 
 	return;
Index: linux-2.6.15/arch/m32r/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/m32r/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m32r/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -197,7 +197,7 @@
 		printk("SPI: %08lx\n", sp);
 	}
 	printk("Process %s (pid: %d, process nr: %d, stackpage=%08lx)",
-		current->comm, current->pid, 0xffff & i, 4096+(unsigned long)current);
+		current->comm, task_pid(current), 0xffff & i, 4096+(unsigned long)current);
 
 	/*
 	 * When in-kernel, we also print out the stack and code at the
Index: linux-2.6.15/arch/m32r/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/m32r/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m32r/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -300,7 +300,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (tsk->pid == 1) {
+	if (task_pid(tsk) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/m68k/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/m68k/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68k/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -673,7 +673,7 @@
 #ifdef DEBUG
 	unsigned long desc;
 
-	printk ("pid = %x  ", current->pid);
+	printk ("pid = %x  ", task_pid(current));
 	printk ("SSW=%#06x  ", ssw);
 
 	if (ssw & (FC | FB))
@@ -1057,7 +1057,7 @@
 				fp->un.fmtb.daddr, space_names[ssw & DFC],
 				fp->ptregs.pc);
 	}
-	printk ("Current process id is %d\n", current->pid);
+	printk ("Current process id is %d\n", task_pid(current));
 	die_if_kernel("BAD KERNEL TRAP", &fp->ptregs, 0);
 }
 
@@ -1199,7 +1199,7 @@
 	       fp->d4, fp->d5, fp->a0, fp->a1);
 
 	printk("Process %s (pid: %d, stackpage=%08lx)\n",
-		current->comm, current->pid, PAGE_SIZE+(unsigned long)current);
+		current->comm, task_pid(current), PAGE_SIZE+(unsigned long)current);
 	show_stack(NULL, (unsigned long *)fp);
 	do_exit(SIGSEGV);
 }
Index: linux-2.6.15/arch/m68k/mac/macints.c
===================================================================
--- linux-2.6.15.orig/arch/m68k/mac/macints.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68k/mac/macints.c	2006-01-17 08:36:55.000000000 -0500
@@ -696,7 +696,7 @@
 		if (STACK_MAGIC != *(unsigned long *)current->kernel_stack_page)
 			printk("Corrupted stack page\n");
 		printk("Process %s (pid: %d, stackpage=%08lx)\n",
-			current->comm, current->pid, current->kernel_stack_page);
+			current->comm, task_pid(current), current->kernel_stack_page);
 		if (intr_count == 1)
 			dump_stack((struct frame *)fp);
 #else
Index: linux-2.6.15/arch/m68k/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/m68k/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68k/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -181,7 +181,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/m68knommu/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/m68knommu/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68knommu/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -331,7 +331,7 @@
 	int		i;
 
 	printk(KERN_EMERG "\nCURRENT PROCESS:\n\n");
-	printk(KERN_EMERG "COMM=%s PID=%d\n", current->comm, current->pid);
+	printk(KERN_EMERG "COMM=%s PID=%d\n", current->comm, task_pid(current));
 
 	if (current->mm) {
 		printk(KERN_EMERG "TEXT=%08x-%08x DATA=%08x-%08x BSS=%08x-%08x\n",
Index: linux-2.6.15/arch/m68knommu/kernel/time.c
===================================================================
--- linux-2.6.15.orig/arch/m68knommu/kernel/time.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68knommu/kernel/time.c	2006-01-17 08:36:55.000000000 -0500
@@ -56,7 +56,7 @@
 #ifndef CONFIG_SMP
 	update_process_times(user_mode(regs));
 #endif
-	if (current->pid)
+	if (task_pid(current))
 		profile_tick(CPU_PROFILING, regs);
 
 	/*
Index: linux-2.6.15/arch/m68knommu/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/m68knommu/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68knommu/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -82,7 +82,7 @@
 	       fp->d4, fp->d5, fp->a0, fp->a1);
 
 	printk(KERN_EMERG "Process %s (pid: %d, stackpage=%08lx)\n",
-		current->comm, current->pid, PAGE_SIZE+(unsigned long)current);
+		current->comm, task_pid(current), PAGE_SIZE+(unsigned long)current);
 	show_stack(NULL, (unsigned long *)fp);
 	do_exit(SIGSEGV);
 }
@@ -166,7 +166,7 @@
 		printk (KERN_WARNING "*** Exception %d ***   FORMAT=%X\n",
 			(fp->ptregs.vector) >> 2, 
 			fp->ptregs.format);
-	printk (KERN_WARNING "Current process id is %d\n", current->pid);
+	printk (KERN_WARNING "Current process id is %d\n", task_pid(current));
 	die_if_kernel("BAD KERNEL TRAP", &fp->ptregs, 0);
 }
 
Index: linux-2.6.15/arch/m68knommu/platform/5307/timers.c
===================================================================
--- linux-2.6.15.orig/arch/m68knommu/platform/5307/timers.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/m68knommu/platform/5307/timers.c	2006-01-17 08:36:55.000000000 -0500
@@ -110,7 +110,7 @@
 {
 	/* Reset ColdFire timer2 */
 	mcf_proftp->ter = MCFTIMER_TER_CAP | MCFTIMER_TER_REF;
-	if (current->pid)
+	if (task_pid(current))
 		profile_tick(CPU_PROFILING, regs);
 }
 
Index: linux-2.6.15/arch/parisc/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/parisc/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/parisc/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -228,7 +228,7 @@
 	si.si_signo = SIGSEGV;
 	si.si_errno = 0;
 	si.si_code = SI_KERNEL;
-	si.si_pid = current->pid;
+	si.si_pid = task_pid(current);
 	si.si_uid = current->uid;
 	si.si_addr = &frame->uc;
 	force_sig_info(SIGSEGV, &si, current);
@@ -482,7 +482,7 @@
 
 
 	DBG(1,"setup_rt_frame: sig deliver (%s,%d) frame=0x%p sp=%#lx iaoq=%#lx/%#lx rp=%#lx\n",
-	       current->comm, current->pid, frame, regs->gr[30],
+	       current->comm, task_pid(current), frame, regs->gr[30],
 	       regs->iaoq[0], regs->iaoq[1], rp);
 
 	return 1;
Index: linux-2.6.15/arch/parisc/kernel/smp.c
===================================================================
--- linux-2.6.15.orig/arch/parisc/kernel/smp.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/parisc/kernel/smp.c	2006-01-17 08:36:55.000000000 -0500
@@ -692,7 +692,8 @@
 		}
 #else
 		printk("\n%s  %4d      0     0 --------",
-			(current->pid)?"RUNNING ": "IDLING  ",current->pid); 
+			(task_pid(current))?"RUNNING ": "IDLING  ",
+						task_pid(current));
 #endif
 	} else if ((argc==2) && !(strcmp(argv[1],"-s"))) { 
 #ifdef DUMP_MORE_STATE
@@ -717,7 +718,7 @@
 			}	
 		}
 #else
-		printk("\n%s    CPU0",(current->pid==0)?"RUNNING ":"IDLING  "); 
+		printk("\n%s    CPU0",(task_pid(current)==0)?"RUNNING ":"IDLING  ");
 #endif
 	} else {
 		printk("sys_cpus:Unknown request\n");
Index: linux-2.6.15/arch/parisc/kernel/sys_parisc32.c
===================================================================
--- linux-2.6.15.orig/arch/parisc/kernel/sys_parisc32.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/parisc/kernel/sys_parisc32.c	2006-01-17 08:36:55.000000000 -0500
@@ -95,7 +95,7 @@
 	int r22, int r21, int r20)
 {
     printk(KERN_ERR "%s(%d): Unimplemented 32 on 64 syscall #%d!\n", 
-    	current->comm, current->pid, r20);
+    	current->comm, task_pid(current), r20);
     return -ENOSYS;
 }
 
Index: linux-2.6.15/arch/parisc/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/parisc/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/parisc/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -216,7 +216,7 @@
 			return; /* STFU */
 
 		printk(KERN_CRIT "%s (pid %d): %s (code %ld) at " RFMT "\n",
-			current->comm, current->pid, str, err, regs->iaoq[0]);
+			current->comm, task_pid(current), str, err, regs->iaoq[0]);
 #ifdef PRINT_USER_FAULTS
 		/* XXX for debugging only */
 		show_regs(regs);
@@ -248,7 +248,7 @@
 		pdc_console_restart();
 	
 	printk(KERN_CRIT "%s (pid %d): %s (code %ld)\n",
-		current->comm, current->pid, str, err);
+		current->comm, task_pid(current), str, err);
 	show_regs(regs);
 
 	/* Wot's wrong wif bein' racy? */
@@ -288,7 +288,7 @@
 	case 0x00:
 #ifdef PRINT_USER_FAULTS
 		printk(KERN_DEBUG "break 0,0: pid=%d command='%s'\n",
-		       current->pid, current->comm);
+		       task_pid(current), current->comm);
 #endif
 		die_if_kernel("Breakpoint", regs, 0);
 #ifdef PRINT_USER_FAULTS
@@ -308,7 +308,7 @@
 	default:
 #ifdef PRINT_USER_FAULTS
 		printk(KERN_DEBUG "break %#08x: pid=%d command='%s'\n",
-		       iir, current->pid, current->comm);
+		       iir, task_pid(current), current->comm);
 		show_regs(regs);
 #endif
 		si.si_signo = SIGTRAP;
@@ -746,7 +746,7 @@
 		if (user_mode(regs)) {
 #ifdef PRINT_USER_FAULTS
 			printk(KERN_DEBUG "\nhandle_interruption() pid=%d command='%s'\n",
-			    current->pid, current->comm);
+			    task_pid(current), current->comm);
 			show_regs(regs);
 #endif
 			/* SIGBUS, for lack of a better one. */
@@ -771,7 +771,7 @@
 		else
 			printk(KERN_DEBUG "User Fault (long pointer) (fault %d) ",
 			       code);
-		printk("pid=%d command='%s'\n", current->pid, current->comm);
+		printk("pid=%d command='%s'\n", task_pid(current), current->comm);
 		show_regs(regs);
 #endif
 		si.si_signo = SIGSEGV;
Index: linux-2.6.15/arch/parisc/kernel/unaligned.c
===================================================================
--- linux-2.6.15.orig/arch/parisc/kernel/unaligned.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/parisc/kernel/unaligned.c	2006-01-17 08:36:55.000000000 -0500
@@ -527,7 +527,7 @@
 		    && ++unaligned_count < 5) {
 			char buf[256];
 			sprintf(buf, "%s(%d): unaligned access to 0x" RFMT " at ip=0x" RFMT "\n",
-				current->comm, current->pid, regs->ior, regs->iaoq[0]);
+				current->comm, task_pid(current), regs->ior, regs->iaoq[0]);
 			printk(KERN_WARNING "%s", buf);
 #ifdef DEBUG_UNALIGNED
 			show_regs(regs);
Index: linux-2.6.15/arch/parisc/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/parisc/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/parisc/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -214,7 +214,7 @@
 #ifdef PRINT_USER_FAULTS
 		printk(KERN_DEBUG "\n");
 		printk(KERN_DEBUG "do_page_fault() pid=%d command='%s' type=%lu address=0x%08lx\n",
-		    tsk->pid, tsk->comm, code, address);
+		    task_pid(tsk), tsk->comm, code, address);
 		if (vma) {
 			printk(KERN_DEBUG "vm_start = 0x%08lx, vm_end = 0x%08lx\n",
 					vma->vm_start, vma->vm_end);
Index: linux-2.6.15/arch/ppc/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/ppc/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/ppc/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -112,7 +112,7 @@
 	{
 		printk("stack out of bounds: %s/%d\n"
 		       " tsk_top %08lx ksp %08lx stack_top %08lx\n",
-		       tsk->comm,tsk->pid,
+		       tsk->comm,task_pid(tsk),
 		       tsk_top, tsk->thread.ksp, stack_top);
 		ret |= 2;
 	}
@@ -122,7 +122,7 @@
 	{
 		printk("current stack ptr out of bounds: %s/%d\n"
 		       " tsk_top %08lx sp %08lx stack_top %08lx\n",
-		       current->comm,current->pid,
+		       current->comm,task_pid(current),
 		       tsk_top, _get_SP(), stack_top);
 		ret |= 4;
 	}
@@ -384,7 +384,7 @@
 	if (trap == 0x300 || trap == 0x600)
 		printk("DAR: %08lX, DSISR: %08lX\n", regs->dar, regs->dsisr);
 	printk("TASK = %p[%d] '%s' THREAD: %p\n",
-	       current, current->pid, current->comm, current->thread_info);
+	       current, task_pid(current), current->comm, current->thread_info);
 	printk("Last syscall: %ld ", current->thread.last_syscall);
 
 #ifdef CONFIG_SMP
Index: linux-2.6.15/arch/ppc/kernel/softemu8xx.c
===================================================================
--- linux-2.6.15.orig/arch/ppc/kernel/softemu8xx.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/ppc/kernel/softemu8xx.c	2006-01-17 08:36:55.000000000 -0500
@@ -120,7 +120,7 @@
 		printk("Bad emulation %s/%d\n"
 		       " NIP: %08lx instruction: %08x opcode: %x "
 		       "A: %x B: %x C: %x code: %x rc: %x\n",
-		       current->comm,current->pid,
+		       current->comm,task_pid(current),
 		       regs->nip,
 		       instword,inst,
 		       (instword>>16)&0x1f,
Index: linux-2.6.15/arch/ppc/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/ppc/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/ppc/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -131,7 +131,7 @@
 	 * generate the same exception over and over again and we get
 	 * nowhere.  Better to kill it and let the kernel panic.
 	 */
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		__sighandler_t handler;
 
 		spin_lock_irq(&current->sighand->siglock);
@@ -759,7 +759,7 @@
 void trace_syscall(struct pt_regs *regs)
 {
 	printk("Task: %p(%d), PC: %08lX/%08lX, Syscall: %3ld, Result: %s%ld    %s\n",
-	       current, current->pid, regs->nip, regs->link, regs->gpr[0],
+	       current, task_pid(current), regs->nip, regs->link, regs->gpr[0],
 	       regs->ccr&0x10000000?"Error=":"", regs->gpr[3], print_tainted());
 }
 
Index: linux-2.6.15/arch/ppc/lib/locks.c
===================================================================
--- linux-2.6.15.orig/arch/ppc/lib/locks.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/ppc/lib/locks.c	2006-01-17 08:36:55.000000000 -0500
@@ -80,7 +80,7 @@
   	if ( !lp->lock )
 		printk("_spin_unlock(%p): no lock cpu %d curr PC %p %s/%d\n",
 		       lp, smp_processor_id(), __builtin_return_address(0),
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 	if ( lp->owner_cpu != smp_processor_id() )
 		printk("_spin_unlock(%p): cpu %d trying clear of cpu %d pc %lx val %lx\n",
 		      lp, smp_processor_id(), (int)lp->owner_cpu,
@@ -142,7 +142,7 @@
 {
 	if ( rw->lock == 0 )
 		printk("_read_unlock(): %s/%d (nip %08lX) lock %d\n",
-		       current->comm,current->pid,current->thread.regs->nip,
+		       current->comm,task_pid(current),current->thread.regs->nip,
 		      rw->lock);
 	wmb();
 	atomic_dec((atomic_t *) &(rw)->lock);
@@ -180,7 +180,7 @@
 {
 	if (rw->lock >= 0)
 		printk("_write_lock(): %s/%d (nip %08lX) lock %d\n",
-		      current->comm,current->pid,current->thread.regs->nip,
+		      current->comm,task_pid(current),current->thread.regs->nip,
 		      rw->lock);
 	wmb();
 	rw->lock = 0;
Index: linux-2.6.15/arch/ppc/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/ppc/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/ppc/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -290,7 +290,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/ppc/xmon/xmon.c
===================================================================
--- linux-2.6.15.orig/arch/ppc/xmon/xmon.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/ppc/xmon/xmon.c	2006-01-17 08:36:55.000000000 -0500
@@ -895,7 +895,7 @@
 		printf("dar = %x, dsisr = %x\n", fp->dar, fp->dsisr);
 	if (current)
 		printf("current = %x, pid = %d, comm = %s\n",
-		       current, current->pid, current->comm);
+		       current, task_pid(current), current->comm);
 }
 
 void
Index: linux-2.6.15/arch/s390/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/s390/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/s390/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -155,7 +155,7 @@
 
         printk("CPU:    %d    %s\n", tsk->thread_info->cpu, print_tainted());
         printk("Process %s (pid: %d, task: %p, ksp: %p)\n",
-	       current->comm, current->pid, (void *) tsk,
+	       current->comm, task_pid(current), (void *) tsk,
 	       (void *) tsk->thread.ksp);
 
 	show_registers(regs);
Index: linux-2.6.15/arch/s390/math-emu/math.c
===================================================================
--- linux-2.6.15.orig/arch/s390/math-emu/math.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/s390/math-emu/math.c	2006-01-17 08:36:55.000000000 -0500
@@ -108,7 +108,7 @@
                 location = (__u16 *)(regs->psw.addr-S390_lowcore.pgm_ilc);
                 printk("%s ieee fpu instruction not emulated "
                        "process name: %s pid: %d \n",
-                       instr, current->comm, current->pid);
+                       instr, current->comm, task_pid(current));
                 printk("%s's PSW:    %08lx %08lx\n", instr,
                        (unsigned long) regs->psw.mask,
                        (unsigned long) location);
Index: linux-2.6.15/arch/s390/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/s390/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/s390/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -316,7 +316,7 @@
 */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (tsk->pid == 1) {
+	if (task_pid(tsk) == 1) {
 		yield();
 		goto survive;
 	}
Index: linux-2.6.15/arch/sh/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/sh/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -105,7 +105,7 @@
 void show_regs(struct pt_regs * regs)
 {
 	printk("\n");
-	printk("Pid : %d, Comm: %20s\n", current->pid, current->comm);
+	printk("Pid : %d, Comm: %20s\n", task_pid(current), current->comm);
 	print_symbol("PC is at %s\n", regs->pc);
 	printk("PC  : %08lx SP  : %08lx SR  : %08lx ",
 	       regs->pc, regs->regs[15], regs->sr);
Index: linux-2.6.15/arch/sh/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/sh/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -404,7 +404,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%08lx pr=%08lx\n",
-		current->comm, current->pid, frame, regs->pc, regs->pr);
+		current->comm, task_pid(current), frame, regs->pc, regs->pr);
 #endif
 
 	flush_cache_sigtramp(regs->pr);
@@ -479,7 +479,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%08lx pr=%08lx\n",
-		current->comm, current->pid, frame, regs->pc, regs->pr);
+		current->comm, task_pid(current), frame, regs->pc, regs->pr);
 #endif
 
 	flush_cache_sigtramp(regs->pr);
Index: linux-2.6.15/arch/sh/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/sh/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -362,7 +362,7 @@
 		handle_unaligned_notify_count--;
 
 		printk("Fixing up unaligned userspace access in \"%s\" pid=%d pc=0x%p ins=0x%04hx\n",
-		       current->comm,current->pid,(u16*)regs->pc,instruction);
+		       current->comm,task_pid(current),(u16*)regs->pc,instruction);
 	}
 
 	ret = -EFAULT;
Index: linux-2.6.15/arch/sh/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/sh/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -160,7 +160,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/sh64/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/sh64/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh64/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -927,7 +927,7 @@
 	struct task_struct *p;
 	read_lock(&tasklist_lock);
 	for_each_process(p) {
-		int pid = p->pid;
+		int pid = task_pid(p);
 		struct mm_struct *mm;
 		if (!pid) continue;
 		mm = p->mm;
Index: linux-2.6.15/arch/sh64/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/sh64/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh64/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -512,7 +512,7 @@
 	/* Broken %016Lx */
 	printk("SIG deliver (#%d,%s:%d): sp=%p pc=%08Lx%08Lx link=%08Lx%08Lx\n",
 		signal,
-		current->comm, current->pid, frame,
+		current->comm, task_pid(current), frame,
 		regs->pc >> 32, regs->pc & 0xffffffff,
 		DEREF_REG_PR >> 32, DEREF_REG_PR & 0xffffffff);
 #endif
@@ -618,7 +618,7 @@
 	/* Broken %016Lx */
 	printk("SIG deliver (#%d,%s:%d): sp=%p pc=%08Lx%08Lx link=%08Lx%08Lx\n",
 		signal,
-		current->comm, current->pid, frame,
+		current->comm, task_pid(current), frame,
 		regs->pc >> 32, regs->pc & 0xffffffff,
 		DEREF_REG_PR >> 32, DEREF_REG_PR & 0xffffffff);
 #endif
Index: linux-2.6.15/arch/sh64/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/sh64/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh64/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -765,7 +765,7 @@
 		--user_mode_unaligned_fixup_count;
 		/* Only do 'count' worth of these reports, to remove a potential DoS against syslog */
 		printk("Fixing up unaligned userspace access in \"%s\" pid=%d pc=0x%08x ins=0x%08lx\n",
-		       current->comm, current->pid, (__u32)regs->pc, opcode);
+		       current->comm, task_pid(current), (__u32)regs->pc, opcode);
 	} else
 #endif
 	if (!user_mode(regs) && (kernel_mode_unaligned_fixup_count > 0)) {
@@ -775,7 +775,7 @@
 			       (__u32)regs->pc, opcode);
 		} else {
 			printk("Fixing up unaligned kernelspace access in \"%s\" pid=%d pc=0x%08x ins=0x%08lx\n",
-			       current->comm, current->pid, (__u32)regs->pc, opcode);
+			       current->comm, task_pid(current), (__u32)regs->pc, opcode);
 		}
 	}
 
Index: linux-2.6.15/arch/sh64/lib/dbg.c
===================================================================
--- linux-2.6.15.orig/arch/sh64/lib/dbg.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh64/lib/dbg.c	2006-01-17 08:36:55.000000000 -0500
@@ -173,7 +173,7 @@
 	int pid;
 	struct ring_node *rr;
 
-	pid = current->pid;
+	pid = task_pid(current);
 	stack_bottom = (unsigned long) current->thread_info;
 	asm volatile("ori r15, 0, %0" : "=r" (sp));
 	rr = event_ring + event_ptr;
@@ -209,7 +209,7 @@
 			 *
 			 * Just overwrite old entries on ring overflow - this
 			 * is only for last-hope debugging. */
-			stored_syscalls[syscall_next].pid = current->pid;
+			stored_syscalls[syscall_next].pid = task_pid(current);
 			stored_syscalls[syscall_next].syscall_number = syscallno;
 			syscall_next++;
 			syscall_next &= (N_STORED_SYSCALLS - 1);
@@ -230,7 +230,7 @@
 void evt_debug2(unsigned int ret)
 {
 	drain_syscalls();
-	printk("Task %d: syscall returns %08x\n", current->pid, ret);
+	printk("Task %d: syscall returns %08x\n", task_pid(current), ret);
 }
 
 void evt_debug_ret_from_irq(struct pt_regs *regs)
@@ -238,7 +238,7 @@
 	int pid;
 	struct ring_node *rr;
 
-	pid = current->pid;
+	pid = task_pid(current);
 	rr = event_ring + event_ptr;
 	rr->evt = 0xffff;
 	rr->ret_addr = 0;
@@ -254,7 +254,7 @@
 	int pid;
 	struct ring_node *rr;
 
-	pid = current->pid;
+	pid = task_pid(current);
 	rr = event_ring + event_ptr;
 	rr->evt = 0xfffe;
 	rr->ret_addr = 0;
@@ -276,7 +276,7 @@
 
 	printk("\n");
 	printk("EXCEPTION - %s: task %d; Linux trap # %d; signal = %d\n",
-	       ((from) ? from : "???"), current->pid, trapnr, signr);
+	       ((from) ? from : "???"), task_pid(current), trapnr, signr);
 
 	asm volatile ("getcon   " __EXPEVT ", %0":"=r"(ah));
 	asm volatile ("getcon   " __EXPEVT ", %0":"=r"(al));
Index: linux-2.6.15/arch/sh64/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/sh64/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sh64/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -82,7 +82,7 @@
 
 static inline void print_task(struct task_struct *tsk)
 {
-	printk("Task pid %d\n", tsk->pid);
+	printk("Task pid %d\n", task_pid(tsk));
 }
 
 static pte_t *lookup_pte(struct mm_struct *mm, unsigned long address)
@@ -271,13 +271,13 @@
 			 * usermode, so only need a few */
 			count++;
 			printk("user mode bad_area address=%08lx pid=%d (%s) pc=%08lx\n",
-				address, current->pid, current->comm,
+				address, task_pid(current), current->comm,
 				(unsigned long) regs->pc);
 #if 0
 			show_regs(regs);
 #endif
 		}
-		if (tsk->pid == 1) {
+		if (task_pid(tsk) == 1) {
 			panic("INIT had user mode bad_area\n");
 		}
 		tsk->thread.address = address;
@@ -319,14 +319,14 @@
  * us unable to handle the page fault gracefully.
  */
 out_of_memory:
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		panic("INIT out of memory\n");
 		yield();
 		goto survive;
 	}
 	printk("fault:Out of memory\n");
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/sparc/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -554,7 +554,7 @@
 #endif
 
 	/* Set the return value for the child. */
-	childregs->u_regs[UREG_I0] = current->pid;
+	childregs->u_regs[UREG_I0] = task_pid(current);
 	childregs->u_regs[UREG_I1] = 1;
 
 	/* Set the return value for the parent. */
Index: linux-2.6.15/arch/sparc/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/kernel/ptrace.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/kernel/ptrace.c	2006-01-17 08:36:55.000000000 -0500
@@ -155,7 +155,7 @@
 		/* Rest of them are completely unsupported. */
 	default:
 		printk("%s [%d]: Wants to read user offset %ld\n",
-		       current->comm, current->pid, offset);
+		       current->comm, task_pid(current), offset);
 		pt_error_return(regs, EIO);
 		return;
 	}
@@ -222,7 +222,7 @@
 		/* Rest of them are completely unsupported or "no-touch". */
 	default:
 		printk("%s [%d]: Wants to write user offset %ld\n",
-		       current->comm, current->pid, offset);
+		       current->comm, task_pid(current), offset);
 		goto failure;
 	}
 success:
@@ -596,7 +596,7 @@
 asmlinkage void syscall_trace(void)
 {
 #ifdef DEBUG_PTRACE
-	printk("%s [%d]: syscall_trace\n", current->comm, current->pid);
+	printk("%s [%d]: syscall_trace\n", current->comm, task_pid(current));
 #endif
 	if (!test_thread_flag(TIF_SYSCALL_TRACE))
 		return;
@@ -612,7 +612,7 @@
 	 */
 #ifdef DEBUG_PTRACE
 	printk("%s [%d]: syscall_trace exit= %x\n", current->comm,
-		current->pid, current->exit_code);
+		task_pid(current), current->exit_code);
 #endif
 	if (current->exit_code) {
 		send_sig (current->exit_code, current, 1);
Index: linux-2.6.15/arch/sparc/kernel/setup.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/kernel/setup.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/kernel/setup.c	2006-01-17 08:36:55.000000000 -0500
@@ -85,7 +85,7 @@
 		prom_palette(1);
 	prom_printf("PROM SYNC COMMAND...\n");
 	show_free_areas();
-	if(current->pid != 0) {
+	if(task_pid(current) != 0) {
 		local_irq_enable();
 		sys_sync();
 		local_irq_disable();
Index: linux-2.6.15/arch/sparc/kernel/sys_sparc.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/kernel/sys_sparc.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/kernel/sys_sparc.c	2006-01-17 08:36:55.000000000 -0500
@@ -353,7 +353,7 @@
 	if (count++ > 5)
 		return -ENOSYS;
 	printk ("%s[%d]: Unimplemented SPARC system call %d\n",
-		current->comm, current->pid, (int)regs->u_regs[1]);
+		current->comm, task_pid(current), (int)regs->u_regs[1]);
 #ifdef DEBUG_UNIMP_SYSCALL	
 	show_regs (regs);
 #endif
Index: linux-2.6.15/arch/sparc/kernel/sys_sunos.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/kernel/sys_sunos.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/kernel/sys_sunos.c	2006-01-17 08:36:55.000000000 -0500
@@ -825,7 +825,7 @@
 	int ret;
 
 	/* So stupid... */
-	if ((!pid || pid == current->pid) &&
+	if ((!pid || pid == task_pid(current)) &&
 	    !pgid) {
 		sys_setsid();
 		ret = 0;
Index: linux-2.6.15/arch/sparc/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -39,7 +39,7 @@
 
 void syscall_trace_entry(struct pt_regs *regs)
 {
-	printk("%s[%d]: ", current->comm, current->pid);
+	printk("%s[%d]: ", current->comm, task_pid(current));
 	printk("scall<%d> (could be %d)\n", (int) regs->u_regs[UREG_G1],
 	       (int) regs->u_regs[UREG_I0]);
 }
@@ -100,7 +100,7 @@
 "              /_| \\__/ |_\\\n"
 "                 \\__U_/\n");
 
-	printk("%s(%d): %s [#%d]\n", current->comm, current->pid, str, ++die_counter);
+	printk("%s(%d): %s [#%d]\n", current->comm, task_pid(current), str, ++die_counter);
 	show_regs(regs);
 
 	__SAVE; __SAVE; __SAVE; __SAVE;
Index: linux-2.6.15/arch/sparc/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/sparc/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -318,7 +318,7 @@
 	if(from_user) {
 #if 0
 		printk("Fault whee %s [%d]: segfaults at %08lx pc=%08lx\n",
-		       tsk->comm, tsk->pid, address, regs->pc);
+		       tsk->comm, task_pid(tsk), address, regs->pc);
 #endif
 		info.si_signo = SIGSEGV;
 		info.si_errno = 0;
@@ -512,7 +512,7 @@
 
 #if 0
 	printk("wf<pid=%d,wr=%d,addr=%08lx>\n",
-	       tsk->pid, write, address);
+	       task_pid(tsk), write, address);
 #endif
 	down_read(&mm->mmap_sem);
 	vma = find_vma(mm, address);
@@ -544,7 +544,7 @@
 	up_read(&mm->mmap_sem);
 #if 0
 	printk("Window whee %s [%d]: segfaults at %08lx\n",
-	       tsk->comm, tsk->pid, address);
+	       tsk->comm, task_pid(tsk), address);
 #endif
 	info.si_signo = SIGSEGV;
 	info.si_errno = 0;
Index: linux-2.6.15/arch/sparc64/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/sparc64/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc64/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -669,7 +669,7 @@
 	}
 
 	/* Set the return value for the child. */
-	t->kregs->u_regs[UREG_I0] = current->pid;
+	t->kregs->u_regs[UREG_I0] = task_pid(current);
 	t->kregs->u_regs[UREG_I1] = 1;
 
 	/* Set the second return value for the parent. */
Index: linux-2.6.15/arch/sparc64/kernel/setup.c
===================================================================
--- linux-2.6.15.orig/arch/sparc64/kernel/setup.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc64/kernel/setup.c	2006-01-17 08:36:55.000000000 -0500
@@ -129,7 +129,7 @@
 	if (!strcmp(cmd, "sync")) {
 		prom_printf("PROM `%s' command...\n", cmd);
 		show_free_areas();
-		if (current->pid != 0) {
+		if (task_pid(current) != 0) {
 			local_irq_enable();
 			sys_sync();
 			local_irq_disable();
Index: linux-2.6.15/arch/sparc64/kernel/sys_sunos32.c
===================================================================
--- linux-2.6.15.orig/arch/sparc64/kernel/sys_sunos32.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc64/kernel/sys_sunos32.c	2006-01-17 08:36:55.000000000 -0500
@@ -791,7 +791,7 @@
 	int ret;
 
 	/* So stupid... */
-	if ((!pid || pid == current->pid) &&
+	if ((!pid || pid == task_pid(current)) &&
 	    !pgid) {
 		sys_setsid();
 		ret = 0;
Index: linux-2.6.15/arch/sparc64/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/sparc64/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc64/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -1896,7 +1896,7 @@
 "              /_| \\__/ |_\\\n"
 "                 \\__U_/\n");
 
-	printk("%s(%d): %s [#%d]\n", current->comm, current->pid, str, ++die_counter);
+	printk("%s(%d): %s [#%d]\n", current->comm, task_pid(current), str, ++die_counter);
 	notify_die(DIE_OOPS, str, regs, 0, 255, SIGSEGV);
 	__asm__ __volatile__("flushw");
 	__show_regs(regs);
Index: linux-2.6.15/arch/sparc64/solaris/ioctl.c
===================================================================
--- linux-2.6.15.orig/arch/sparc64/solaris/ioctl.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/sparc64/solaris/ioctl.c	2006-01-17 08:36:55.000000000 -0500
@@ -548,13 +548,13 @@
 			return solaris_ioctl(fd, si.cmd, si.data);
 		}
 	case 9: /* I_SETSIG */
-		return sys_ioctl(fd, FIOSETOWN, current->pid);
+		return sys_ioctl(fd, FIOSETOWN, task_pid(current));
 	case 10: /* I_GETSIG */
 		old_fs = get_fs();
 		set_fs(KERNEL_DS);
 		sys_ioctl(fd, FIOGETOWN, (unsigned long)&ret);
 		set_fs(old_fs);
-		if (ret == current->pid) return 0x3ff;
+		if (ret == task_pid(current)) return 0x3ff;
 		else return -EINVAL;
 	case 11: /* I_FIND */
         {
Index: linux-2.6.15/arch/um/kernel/process_kern.c
===================================================================
--- linux-2.6.15.orig/arch/um/kernel/process_kern.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/um/kernel/process_kern.c	2006-01-17 08:36:55.000000000 -0500
@@ -180,7 +180,7 @@
 
 int current_pid(void)
 {
-	return(current->pid);
+	return(task_pid(current));
 }
 
 void default_idle(void)
@@ -261,7 +261,7 @@
 void force_sigbus(void)
 {
 	printk(KERN_ERR "Killing pid %d because of a lack of memory\n", 
-	       current->pid);
+	       task_pid(current));
 	lock_kernel();
 	sigaddset(&current->pending.signal, SIGBUS);
 	recalc_sigpending();
Index: linux-2.6.15/arch/um/kernel/skas/process_kern.c
===================================================================
--- linux-2.6.15.orig/arch/um/kernel/skas/process_kern.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/um/kernel/skas/process_kern.c	2006-01-17 08:36:55.000000000 -0500
@@ -32,13 +32,13 @@
 	to = next;
 
 	/* XXX need to check runqueues[cpu].idle */
-	if(current->pid == 0)
+	if(task_pid(current) == 0)
 		switch_timers(0);
 
 	switch_threads(&from->thread.mode.skas.switch_buf, 
 		       to->thread.mode.skas.switch_buf);
 
-	if(current->pid == 0)
+	if(task_pid(current) == 0)
 		switch_timers(1);
 }
 
Index: linux-2.6.15/arch/um/kernel/trap_kern.c
===================================================================
--- linux-2.6.15.orig/arch/um/kernel/trap_kern.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/um/kernel/trap_kern.c	2006-01-17 08:36:55.000000000 -0500
@@ -116,7 +116,7 @@
  * us unable to handle the page fault gracefully.
  */
 out_of_memory:
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		up_read(&mm->mmap_sem);
 		yield();
 		down_read(&mm->mmap_sem);
Index: linux-2.6.15/arch/um/sys-x86_64/sysrq.c
===================================================================
--- linux-2.6.15.orig/arch/um/sys-x86_64/sysrq.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/um/sys-x86_64/sysrq.c	2006-01-17 08:36:55.000000000 -0500
@@ -16,7 +16,7 @@
 	printk("\n");
 	print_modules();
 	printk("Pid: %d, comm: %.20s %s %s\n",
-	       current->pid, current->comm, print_tainted(), system_utsname.release);
+	       task_pid(current), current->comm, print_tainted(), system_utsname.release);
 	printk("RIP: %04lx:[<%016lx>] ", PT_REGS_CS(regs) & 0xffff,
 	       PT_REGS_RIP(regs));
 	printk("\nRSP: %016lx  EFLAGS: %08lx\n", PT_REGS_RSP(regs),
Index: linux-2.6.15/arch/v850/kernel/bug.c
===================================================================
--- linux-2.6.15.orig/arch/v850/kernel/bug.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/v850/kernel/bug.c	2006-01-17 08:36:55.000000000 -0500
@@ -38,7 +38,7 @@
 {
 	printk (KERN_CRIT
 		"unimplemented trap %d called at 0x%08lx, pid %d!\n",
-		trap_num, regs->pc, current->pid);
+		trap_num, regs->pc, task_pid(current));
 	return -ENOSYS;
 }
 
@@ -50,7 +50,7 @@
 		"unexpected reset in %s mode, pid %d"
 		" (ret_addr = 0x%lx, sp = 0x%lx)\n",
 		kmode ? "kernel" : "user",
-		task ? task->pid : -1,
+		task ? task_pid(task) : -1,
 		ret_addr, sp);
 
 	machine_halt ();
Index: linux-2.6.15/arch/v850/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/v850/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/v850/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -338,7 +338,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%08lx ra=%08lx\n",
-		current->comm, current->pid, frame, regs->pc, );
+		current->comm, task_pid(current), frame, regs->pc, );
 #endif
 
 	return;
@@ -413,7 +413,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%08lx pr=%08lx\n",
-		current->comm, current->pid, frame, regs->pc, regs->pr);
+		current->comm, task_pid(current), frame, regs->pc, regs->pr);
 #endif
 
 	return;
Index: linux-2.6.15/arch/x86_64/ia32/ia32_signal.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/ia32/ia32_signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/ia32/ia32_signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -515,7 +515,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->rip, frame->pretcode);
+		current->comm, task_pid(current), frame, regs->rip, frame->pretcode);
 #endif
 
 	return 1;
@@ -615,7 +615,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->rip, frame->pretcode);
+		current->comm, task_pid(current), frame, regs->rip, frame->pretcode);
 #endif
 
 	return 1;
Index: linux-2.6.15/arch/x86_64/kernel/mce.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/kernel/mce.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/kernel/mce.c	2006-01-17 08:36:55.000000000 -0500
@@ -248,7 +248,7 @@
 		   but most likely they occur at boot anyways, where
 		   it is best to just halt the machine. */
 		if ((!user_space && (panic_on_oops || tolerant < 2)) ||
-		    (unsigned)current->pid <= 1)
+		    (unsigned)task_pid(current) <= 1)
 			mce_panic("Uncorrected machine check", &panicm, mcestart);
 
 		/* do_exit takes an awful lot of locks and has as
Index: linux-2.6.15/arch/x86_64/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -270,7 +270,7 @@
 	printk("\n");
 	print_modules();
 	printk("Pid: %d, comm: %.20s %s %s %.*s\n",
-		current->pid, current->comm, print_tainted(),
+		task_pid(current), current->comm, print_tainted(),
 		system_utsname.release,
 		(int)strcspn(system_utsname.version, " "),
 		system_utsname.version);
Index: linux-2.6.15/arch/x86_64/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -174,7 +174,7 @@
 		goto badframe;
 
 #ifdef DEBUG_SIG
-	printk("%d sigreturn rip:%lx rsp:%lx frame:%p rax:%lx\n",current->pid,regs.rip,regs.rsp,frame,eax);
+	printk("%d sigreturn rip:%lx rsp:%lx frame:%p rax:%lx\n",task_pid(current),regs.rip,regs.rsp,frame,eax);
 #endif
 
 	if (do_sigaltstack(&frame->uc.uc_stack, NULL, regs->rsp) == -EFAULT)
@@ -307,7 +307,7 @@
 		goto give_sigsegv;
 
 #ifdef DEBUG_SIG
-	printk("%d old rip %lx old rsp %lx old rax %lx\n", current->pid,regs->rip,regs->rsp,regs->rax);
+	printk("%d old rip %lx old rsp %lx old rax %lx\n", task_pid(current),regs->rip,regs->rsp,regs->rax);
 #endif
 
 	/* Set up registers for signal handler */
@@ -341,7 +341,7 @@
 		ptrace_notify(SIGTRAP);
 #ifdef DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=%p pc=%p ra=%p\n",
-		current->comm, current->pid, frame, regs->rip, frame->pretcode);
+		current->comm, task_pid(current), frame, regs->rip, frame->pretcode);
 #endif
 
 	return 1;
@@ -363,7 +363,7 @@
 
 #ifdef DEBUG_SIG
 	printk("handle_signal pid:%d sig:%lu rip:%lx rsp:%lx regs=%p\n",
-		current->pid, sig,
+		task_pid(current), sig,
 		regs->rip, regs->rsp, regs);
 #endif
 
@@ -507,7 +507,7 @@
 	struct task_struct *me = current; 
 	if (exception_trace)
 		printk("%s[%d] bad frame in %s frame:%p rip:%lx rsp:%lx orax:%lx\n",
-	       me->comm,me->pid,where,frame,regs->rip,regs->rsp,regs->orig_rax); 
+	       me->comm,task_pid(me),where,frame,regs->rip,regs->rsp,regs->orig_rax);
 
 	force_sig(SIGSEGV, me); 
 } 
Index: linux-2.6.15/arch/x86_64/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -282,7 +282,7 @@
 	printk("CPU %d ", cpu);
 	__show_regs(regs);
 	printk("Process %s (pid: %d, threadinfo %p, task %p)\n",
-		cur->comm, cur->pid, cur->thread_info, cur);
+		cur->comm, task_pid(cur), cur->thread_info, cur);
 
 	/*
 	 * When in-kernel, we also print out the stack and code at the
@@ -434,7 +434,7 @@
 		if (exception_trace && unhandled_signal(tsk, signr))
 			printk(KERN_INFO
 			       "%s[%d] trap %s rip:%lx rsp:%lx error:%lx\n",
-			       tsk->comm, tsk->pid, str,
+			       tsk->comm, task_pid(tsk), str,
 			       regs->rip,regs->rsp,error_code); 
 
 		tsk->thread.error_code = error_code;
@@ -506,7 +506,7 @@
 		if (exception_trace && unhandled_signal(tsk, SIGSEGV))
 			printk(KERN_INFO
 		       "%s[%d] general protection rip:%lx rsp:%lx error:%lx\n",
-			       tsk->comm, tsk->pid,
+			       tsk->comm, task_pid(tsk),
 			       regs->rip,regs->rsp,error_code); 
 
 		tsk->thread.error_code = error_code;
Index: linux-2.6.15/arch/x86_64/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -210,7 +210,7 @@
 
 int unhandled_signal(struct task_struct *tsk, int sig)
 {
-	if (tsk->pid == 1)
+	if (task_pid(tsk) == 1)
 		return 1;
 	if (tsk->ptrace & PT_PTRACED)
 		return 0;
@@ -470,8 +470,8 @@
 		if (exception_trace && unhandled_signal(tsk, SIGSEGV)) {
 			printk(
 		       "%s%s[%d]: segfault at %016lx rip %016lx rsp %016lx error %lx\n",
-					tsk->pid > 1 ? KERN_INFO : KERN_EMERG,
-					tsk->comm, tsk->pid, address, regs->rip,
+					task_pid(tsk) > 1 ? KERN_INFO : KERN_EMERG,
+					tsk->comm, task_pid(tsk), address, regs->rip,
 					regs->rsp, error_code);
 		}
        
@@ -533,7 +533,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) { 
+	if (task_pid(current) == 1) {
 		yield();
 		goto again;
 	}
Index: linux-2.6.15/arch/xtensa/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/xtensa/kernel/signal.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/xtensa/kernel/signal.c	2006-01-17 08:36:55.000000000 -0500
@@ -570,7 +570,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): signal=%d sp=%p pc=%08x\n",
-		current->comm, current->pid, signal, frame, regs->pc);
+		current->comm, task_pid(current), signal, frame, regs->pc);
 #endif
 
 	return;
@@ -634,7 +634,7 @@
 
 #if DEBUG_SIG
 	printk("SIG rt deliver (%s:%d): signal=%d sp=%p pc=%08x\n",
-		current->comm, current->pid, signal, frame, regs->pc);
+		current->comm, task_pid(current), signal, frame, regs->pc);
 #endif
 
 	return;
Index: linux-2.6.15/arch/xtensa/kernel/syscalls.c
===================================================================
--- linux-2.6.15.orig/arch/xtensa/kernel/syscalls.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/xtensa/kernel/syscalls.c	2006-01-17 08:36:55.000000000 -0500
@@ -245,7 +245,7 @@
 	if (strncmp(sysname, "sys_", 4) == 0)
 		sysname = sysname + 4;
 
-	printk("\017SYSCALL:I:%x:%d:%s  %s(", regs->pc, current->pid,
+	printk("\017SYSCALL:I:%x:%d:%s  %s(", regs->pc, task_pid(current),
 	       current->comm, sysname);
 	for (i = 0; i < nargs; i++)
 		printk((i>0) ? ", %#lx" : "%#lx", parms[i]);
@@ -255,7 +255,7 @@
 	res = syscall((void *)parm0, parm1, parm2, parm3, parm4, parm5);
 
 #if DEBUG
-	printk("\017SYSCALL:O:%d:%s  %s(",current->pid, current->comm, sysname);
+	printk("\017SYSCALL:O:%d:%s  %s(",task_pid(current), current->comm, sysname);
 	for (i = 0; i < nargs; i++)
 		printk((i>0) ? ", %#lx" : "%#lx", parms[i]);
 	if (res < 4096)
Index: linux-2.6.15/arch/xtensa/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/xtensa/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/xtensa/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -176,7 +176,7 @@
 	printk("Caught unhandled exception in '%s' "
 	       "(pid = %d, pc = %#010lx) - should not happen\n"
 	       "\tEXCCAUSE is %ld\n",
-	       current->comm, current->pid, regs->pc, exccause);
+	       current->comm, task_pid(current), regs->pc, exccause);
 	force_sig(SIGILL, current);
 }
 
@@ -228,7 +228,7 @@
 	/* If in user mode, send SIGILL signal to current process. */
 
 	printk("Illegal Instruction in '%s' (pid = %d, pc = %#010lx)\n",
-	    current->comm, current->pid, regs->pc);
+	    current->comm, task_pid(current), regs->pc);
 	force_sig(SIGILL, current);
 }
 
@@ -254,7 +254,7 @@
 	current->thread.error_code = -3;
 	printk("Unaligned memory access to %08lx in '%s' "
 	       "(pid = %d, pc = %#010lx)\n",
-	       regs->excvaddr, current->comm, current->pid, regs->pc);
+	       regs->excvaddr, current->comm, task_pid(current), regs->pc);
 	info.si_signo = SIGBUS;
 	info.si_errno = 0;
 	info.si_code = BUS_ADRALN;
Index: linux-2.6.15/arch/xtensa/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/xtensa/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/xtensa/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -64,7 +64,7 @@
 		    exccause == XCHAL_EXCCAUSE_FETCH_CACHE_ATTRIBUTE) ? 1 : 0;
 
 #if 0
-	printk("[%s:%d:%08x:%d:%08x:%s%s]\n", current->comm, current->pid,
+	printk("[%s:%d:%08x:%d:%08x:%s%s]\n", current->comm, task_pid(current),
 	       address, exccause, regs->pc, is_write? "w":"", is_exec? "x":"");
 #endif
 
@@ -144,7 +144,7 @@
 	 */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/s390/kernel/asm-offsets.c
===================================================================
--- linux-2.6.15.orig/arch/s390/kernel/asm-offsets.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/s390/kernel/asm-offsets.c	2006-01-17 08:36:55.000000000 -0500
@@ -22,7 +22,7 @@
 	DEFINE(__THREAD_mm_segment,
 	       offsetof(struct task_struct, thread.mm_segment),);
 	BLANK();
-	DEFINE(__TASK_pid, offsetof(struct task_struct, pid),);
+	DEFINE(__TASK_pid, offsetof(struct task_struct, __pid),);
 	BLANK();
 	DEFINE(__PER_atmid, offsetof(per_struct, lowcore.words.perc_atmid),);
 	DEFINE(__PER_address, offsetof(per_struct, lowcore.words.address),);
Index: linux-2.6.15/drivers/s390/crypto/z90main.c
===================================================================
--- linux-2.6.15.orig/drivers/s390/crypto/z90main.c	2006-01-17 08:36:54.000000000 -0500
+++ linux-2.6.15/drivers/s390/crypto/z90main.c	2006-01-17 08:36:55.000000000 -0500
@@ -942,7 +942,7 @@
 	step = atomic_inc_return(&z90crypt_step);
 	memcpy(we_p->caller_id+0, (void *) &pid, sizeof(pid));
 	memcpy(we_p->caller_id+4, (void *) &step, sizeof(step));
-	we_task_pid(p) = pid;
+	we_p->pid = pid;
 	we_p->priv_data = priv_data;
 	we_p->status[0] = STAT_DEFAULT;
 	we_p->audit[0] = 0x00;
Index: linux-2.6.15/arch/x86_64/ia32/ptrace32.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/ia32/ptrace32.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/ia32/ptrace32.c	2006-01-17 08:36:55.000000000 -0500
@@ -212,7 +212,7 @@
 	read_unlock(&tasklist_lock);
 	if (child) { 
 		*err = -EPERM;
-		if (child->pid == 1) 
+		if (task_pid(child) == 1)
 			goto out;
 		*err = ptrace_check_attach(child, request == PTRACE_KILL); 
 		if (*err < 0) 
Index: linux-2.6.15/arch/x86_64/kernel/asm-offsets.c
===================================================================
--- linux-2.6.15.orig/arch/x86_64/kernel/asm-offsets.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/x86_64/kernel/asm-offsets.c	2006-01-17 08:36:55.000000000 -0500
@@ -26,7 +26,7 @@
 	ENTRY(state);
 	ENTRY(flags); 
 	ENTRY(thread); 
-	ENTRY(pid);
+	ENTRY(__pid);
 	BLANK();
 #undef ENTRY
 #define ENTRY(entry) DEFINE(threadinfo_ ## entry, offsetof(struct thread_info, entry))
Index: linux-2.6.15/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/powerpc/mm/fault.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/powerpc/mm/fault.c	2006-01-17 08:36:55.000000000 -0500
@@ -350,7 +350,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/powerpc/xmon/xmon.c
===================================================================
--- linux-2.6.15.orig/arch/powerpc/xmon/xmon.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/powerpc/xmon/xmon.c	2006-01-17 08:36:55.000000000 -0500
@@ -1369,7 +1369,7 @@
 #endif
 	if (current) {
 		printf("    pid   = %ld, comm = %s\n",
-		       current->pid, current->comm);
+		       task_pid(current), current->comm);
 	}
 
 	if (trap == 0x700)
Index: linux-2.6.15/arch/powerpc/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/powerpc/kernel/traps.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/powerpc/kernel/traps.c	2006-01-17 08:36:55.000000000 -0500
@@ -195,7 +195,7 @@
 	 * generate the same exception over and over again and we get
 	 * nowhere.  Better to kill it and let the kernel panic.
 	 */
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		__sighandler_t handler;
 
 		spin_lock_irq(&current->sighand->siglock);
@@ -860,7 +860,7 @@
 void trace_syscall(struct pt_regs *regs)
 {
 	printk("Task: %p(%d), PC: %08lX/%08lX, Syscall: %3ld, Result: %s%ld    %s\n",
-	       current, current->pid, regs->nip, regs->link, regs->gpr[0],
+	       current, task_pid(current), regs->nip, regs->link, regs->gpr[0],
 	       regs->ccr&0x10000000?"Error=":"", regs->gpr[3], print_tainted());
 }
 
Index: linux-2.6.15/arch/powerpc/platforms/pseries/ras.c
===================================================================
--- linux-2.6.15.orig/arch/powerpc/platforms/pseries/ras.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/powerpc/platforms/pseries/ras.c	2006-01-17 08:36:55.000000000 -0500
@@ -313,10 +313,10 @@
 		   err->disposition == RTAS_DISP_NOT_RECOVERED &&
 		   err->target == RTAS_TARGET_MEMORY &&
 		   err->type == RTAS_TYPE_ECC_UNCORR &&
-		   !(current->pid == 0 || current->pid == 1)) {
+		   !(task_pid(current) == 0 || task_pid(current) == 1)) {
 		/* Kill off a user process with an ECC error */
 		printk(KERN_ERR "MCE: uncorrectable ecc error for pid %d\n",
-		       current->pid);
+		       task_pid(current));
 		/* XXX something better for ECC error? */
 		_exception(SIGBUS, regs, BUS_ADRERR, regs->nip);
 		nonfatal = 1;
Index: linux-2.6.15/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/powerpc/kernel/process.c	2006-01-17 08:36:29.000000000 -0500
+++ linux-2.6.15/arch/powerpc/kernel/process.c	2006-01-17 08:36:55.000000000 -0500
@@ -424,7 +424,7 @@
 	if (trap == 0x300 || trap == 0x600)
 		printk("DAR: "REG", DSISR: "REG"\n", regs->dar, regs->dsisr);
 	printk("TASK = %p[%d] '%s' THREAD: %p",
-	       current, current->pid, current->comm, current->thread_info);
+	       current, task_pid(current), current->comm, current->thread_info);
 
 #ifdef CONFIG_SMP
 	printk(" CPU: %d", smp_processor_id());

--



* RFC [patch 03/34] PID Virtualization Change pid accesses: filesystems.
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
  2006-01-17 14:32 ` RFC [patch 01/34] PID Virtualization Change pid accesses: drivers Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 02/34] PID Virtualization Change pid accesses: most archs Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 04/34] PID Virtualization Change pid accesses: include/ Serge Hallyn
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B2-change-pid-tgid-references-fs --]
[-- Type: text/plain, Size: 33051 bytes --]

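Change pid accesses for filesystems: replace direct reads of
task->pid and task->tgid under fs/ with the task_pid() and
task_tgid() accessors.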

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 9p/debug.h                |    4 ++--
 9p/fid.c                  |    5 +++--
 afs/cmservice.c           |    2 +-
 afs/kafsasyncd.c          |    2 +-
 afs/kafstimod.c           |    2 +-
 autofs/root.c             |    2 +-
 autofs4/autofs_i.h        |    2 +-
 autofs4/root.c            |    2 +-
 binfmt_elf.c              |    8 ++++----
 binfmt_elf_fdpic.c        |    2 +-
 binfmt_flat.c             |    2 +-
 cifs/cifssmb.c            |    2 +-
 cifs/connect.c            |    2 +-
 cifs/dir.c                |    2 +-
 cifs/file.c               |    4 ++--
 cifs/misc.c               |    4 ++--
 cifs/transport.c          |    2 +-
 coda/upcall.c             |    2 +-
 compat.c                  |    2 +-
 devfs/base.c              |    2 +-
 dnotify.c                 |    2 +-
 exec.c                    |    8 ++++----
 ext2/inode.c              |    2 +-
 ext3/inode.c              |    2 +-
 fs-writeback.c            |    2 +-
 fuse/dev.c                |    2 +-
 jffs2/background.c        |    2 +-
 jffs2/debug.h             |    8 ++++----
 lockd/clntproc.c          |    2 +-
 lockd/svc.c               |    6 +++---
 locks.c                   |   16 ++++++++--------
 nfs/callback.c            |    2 +-
 nfs/nfs3proc.c            |    2 +-
 nfs/nfs4proc.c            |    2 +-
 nfsd/nfs4state.c          |    8 ++++----
 nfsd/vfs.c                |    6 +++---
 proc/array.c              |   12 +++++++-----
 proc/base.c               |   16 ++++++++--------
 smbfs/proc.c              |    4 ++--
 smbfs/smbiod.c            |    4 ++--
 xfs/linux-2.6/xfs_buf.c   |    2 +-
 xfs/linux-2.6/xfs_linux.h |    2 +-
 xfs/support/debug.c       |    2 +-
 43 files changed, 86 insertions(+), 83 deletions(-)
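
The hunks below all apply the same mechanical conversion, so a minimal
user-space sketch of the accessor pattern may help when reading them. The
accessor definitions are not part of this patch, so everything here is an
assumption made for illustration: the cut-down struct task_struct, the
__tgid field name, the accessor bodies, and the is_group_leader() and
task_sketch_selftest() helpers. Only the __pid field name is visible in
this thread (in the asm-offsets.c hunks of patch 02).

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical, cut-down stand-in for the kernel's struct task_struct. */
struct task_struct {
	pid_t __pid;	/* field name as seen in the asm-offsets.c hunks */
	pid_t __tgid;	/* assumed name, by analogy with __pid */
};

/* Assumed accessor body: return the raw, kernel-wide pid. */
static inline pid_t task_pid(const struct task_struct *p)
{
	return p->__pid;
}

/* Assumed accessor body: return the raw, kernel-wide thread group id. */
static inline pid_t task_tgid(const struct task_struct *p)
{
	return p->__tgid;
}

/* Illustrative helper: a thread-group leader has pid == tgid. */
static inline int is_group_leader(const struct task_struct *p)
{
	return task_pid(p) == task_tgid(p);
}

/* Small self-check of the sketch; call it from main() or a test harness. */
static inline void task_sketch_selftest(void)
{
	struct task_struct leader = { .__pid = 100, .__tgid = 100 };
	struct task_struct thread = { .__pid = 101, .__tgid = 100 };

	assert(is_group_leader(&leader));
	assert(!is_group_leader(&thread));
	assert(task_tgid(&leader) == task_tgid(&thread));
}
```

With accessors of this shape, a converted call site such as
"fl->fl_pid = task_tgid(current);" behaves exactly like the line it
replaces, and any later change to how the ids are stored or translated
would stay confined to the accessor definitions.
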

Index: linux-2.6.15/fs/9p/debug.h
===================================================================
--- linux-2.6.15.orig/fs/9p/debug.h	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/9p/debug.h	2006-01-17 08:36:57.000000000 -0500
@@ -39,13 +39,13 @@
 do {  \
 	if((v9fs_debug_level & level)==level) \
 		printk(KERN_NOTICE "-- %s (%d): " \
-		format , __FUNCTION__, current->pid , ## arg); \
+		format , __FUNCTION__, task_pid(current) , ## arg); \
 } while(0)
 
 #define eprintk(level, format, arg...) \
 do { \
 	printk(level "v9fs: %s (%d): " \
-		format , __FUNCTION__, current->pid , ## arg); \
+		format , __FUNCTION__, task_pid(current) , ## arg); \
 } while(0)
 
 #if DEBUG_DUMP_PKT
Index: linux-2.6.15/fs/9p/fid.c
===================================================================
--- linux-2.6.15.orig/fs/9p/fid.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/9p/fid.c	2006-01-17 08:36:57.000000000 -0500
@@ -60,7 +60,7 @@
 	}
 
 	fid->uid = current->uid;
-	fid->pid = current->pid;
+	fid->pid = task_pid(current);
 	list_add(&fid->list, fid_list);
 	return 0;
 }
@@ -242,7 +242,8 @@
 	ret = NULL;
 	if (fid_list) {
 		list_for_each_entry_safe(fid, ftmp, fid_list, list) {
-			if (fid->fidcreate && fid->pid == current->pid) {
+			if (fid->fidcreate && fid->pid ==
+			    task_pid(current)) {
 				list_del(&fid->list);
 				ret = fid;
 				break;
Index: linux-2.6.15/fs/afs/cmservice.c
===================================================================
--- linux-2.6.15.orig/fs/afs/cmservice.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/afs/cmservice.c	2006-01-17 08:36:57.000000000 -0500
@@ -118,7 +118,7 @@
 	_SRXAFSCM_xxxx_t func;
 	int die;
 
-	printk("kAFS: Started kafscmd %d\n", current->pid);
+	printk("kAFS: Started kafscmd %d\n", task_pid(current));
 
 	daemonize("kafscmd");
 
Index: linux-2.6.15/fs/afs/kafsasyncd.c
===================================================================
--- linux-2.6.15.orig/fs/afs/kafsasyncd.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/afs/kafsasyncd.c	2006-01-17 08:36:57.000000000 -0500
@@ -92,7 +92,7 @@
 
 	kafsasyncd_task = current;
 
-	printk("kAFS: Started kafsasyncd %d\n", current->pid);
+	printk("kAFS: Started kafsasyncd %d\n", task_pid(current));
 
 	daemonize("kafsasyncd");
 
Index: linux-2.6.15/fs/afs/kafstimod.c
===================================================================
--- linux-2.6.15.orig/fs/afs/kafstimod.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/afs/kafstimod.c	2006-01-17 08:36:57.000000000 -0500
@@ -69,7 +69,7 @@
 
 	DECLARE_WAITQUEUE(myself, current);
 
-	printk("kAFS: Started kafstimod %d\n", current->pid);
+	printk("kAFS: Started kafstimod %d\n", task_pid(current));
 
 	daemonize("kafstimod");
 
Index: linux-2.6.15/fs/autofs/root.c
===================================================================
--- linux-2.6.15.orig/fs/autofs/root.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/autofs/root.c	2006-01-17 08:36:57.000000000 -0500
@@ -213,7 +213,7 @@
 
 	oz_mode = autofs_oz_mode(sbi);
 	DPRINTK(("autofs_lookup: pid = %u, pgrp = %u, catatonic = %d, oz_mode = %d\n",
-		 current->pid, process_group(current), sbi->catatonic, oz_mode));
+		 task_pid(current), process_group(current), sbi->catatonic, oz_mode));
 
 	/*
 	 * Mark the dentry incomplete, but add it. This is needed so
Index: linux-2.6.15/fs/autofs4/autofs_i.h
===================================================================
--- linux-2.6.15.orig/fs/autofs4/autofs_i.h	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/autofs4/autofs_i.h	2006-01-17 08:36:57.000000000 -0500
@@ -33,7 +33,7 @@
 /* #define DEBUG */
 
 #ifdef DEBUG
-#define DPRINTK(fmt,args...) do { printk(KERN_DEBUG "pid %d: %s: " fmt "\n" , current->pid , __FUNCTION__ , ##args); } while(0)
+#define DPRINTK(fmt,args...) do { printk(KERN_DEBUG "pid %d: %s: " fmt "\n" , task_pid(current) , __FUNCTION__ , ##args); } while(0)
 #else
 #define DPRINTK(fmt,args...) do {} while(0)
 #endif
Index: linux-2.6.15/fs/autofs4/root.c
===================================================================
--- linux-2.6.15.orig/fs/autofs4/root.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/autofs4/root.c	2006-01-17 08:36:57.000000000 -0500
@@ -465,7 +465,7 @@
 
 	oz_mode = autofs4_oz_mode(sbi);
 	DPRINTK("pid = %u, pgrp = %u, catatonic = %d, oz_mode = %d",
-		 current->pid, process_group(current), sbi->catatonic, oz_mode);
+		 task_pid(current), process_group(current), sbi->catatonic, oz_mode);
 
 	/*
 	 * Mark the dentry incomplete, but add it. This is needed so
Index: linux-2.6.15/fs/binfmt_elf.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_elf.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/binfmt_elf.c	2006-01-17 08:36:57.000000000 -0500
@@ -1270,8 +1270,8 @@
 	prstatus->pr_info.si_signo = prstatus->pr_cursig = signr;
 	prstatus->pr_sigpend = p->pending.signal.sig[0];
 	prstatus->pr_sighold = p->blocked.sig[0];
-	prstatus->pr_pid = p->pid;
-	prstatus->pr_ppid = p->parent->pid;
+	prstatus->pr_pid = task_pid(p);
+	prstatus->pr_ppid = task_pid(p->parent);
 	prstatus->pr_pgrp = process_group(p);
 	prstatus->pr_sid = p->signal->session;
 	if (thread_group_leader(p)) {
@@ -1316,8 +1316,8 @@
 			psinfo->pr_psargs[i] = ' ';
 	psinfo->pr_psargs[len] = 0;
 
-	psinfo->pr_pid = p->pid;
-	psinfo->pr_ppid = p->parent->pid;
+	psinfo->pr_pid = task_pid(p);
+	psinfo->pr_ppid = task_pid(p->parent);
 	psinfo->pr_pgrp = process_group(p);
 	psinfo->pr_sid = p->signal->session;
 
Index: linux-2.6.15/fs/binfmt_elf_fdpic.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_elf_fdpic.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/binfmt_elf_fdpic.c	2006-01-17 08:36:57.000000000 -0500
@@ -479,7 +479,7 @@
 	 * removed for 2.5
 	 */
 	if (smp_num_siblings > 1)
-		sp = sp - ((current->pid % 64) << 7);
+		sp = sp - ((task_pid(current) % 64) << 7);
 #endif
 
 	sp &= ~7UL;
Index: linux-2.6.15/fs/binfmt_flat.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_flat.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/binfmt_flat.c	2006-01-17 08:36:57.000000000 -0500
@@ -95,7 +95,7 @@
 static int flat_core_dump(long signr, struct pt_regs * regs, struct file *file)
 {
 	printk("Process %s:%d received signr %d and should have core dumped\n",
-			current->comm, current->pid, (int) signr);
+			current->comm, task_pid(current), (int) signr);
 	return(1);
 }
 
Index: linux-2.6.15/fs/cifs/cifssmb.c
===================================================================
--- linux-2.6.15.orig/fs/cifs/cifssmb.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/cifs/cifssmb.c	2006-01-17 08:36:57.000000000 -0500
@@ -1274,7 +1274,7 @@
 	pSMB->Fid = smb_file_id; /* netfid stays le */
 
 	if((numLock != 0) || (numUnlock != 0)) {
-		pSMB->Locks[0].Pid = cpu_to_le16(current->tgid);
+		pSMB->Locks[0].Pid = cpu_to_le16(task_tgid(current));
 		/* BB where to store pid high? */
 		pSMB->Locks[0].LengthLow = cpu_to_le32((u32)len);
 		pSMB->Locks[0].LengthHigh = cpu_to_le32((u32)(len>>32));
Index: linux-2.6.15/fs/cifs/connect.c
===================================================================
--- linux-2.6.15.orig/fs/cifs/connect.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/cifs/connect.c	2006-01-17 08:36:57.000000000 -0500
@@ -345,7 +345,7 @@
 	allow_signal(SIGKILL);
 	current->flags |= PF_MEMALLOC;
 	server->tsk = current;	/* save process info to wake at shutdown */
-	cFYI(1, ("Demultiplex PID: %d", current->pid));
+	cFYI(1, ("Demultiplex PID: %d", task_pid(current)));
 	write_lock(&GlobalSMBSeslock); 
 	atomic_inc(&tcpSesAllocCount);
 	length = tcpSesAllocCount.counter;
Index: linux-2.6.15/fs/cifs/dir.c
===================================================================
--- linux-2.6.15.orig/fs/cifs/dir.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/cifs/dir.c	2006-01-17 08:36:57.000000000 -0500
@@ -262,7 +262,7 @@
 			memset((char *)pCifsFile, 0,
 			       sizeof (struct cifsFileInfo));
 			pCifsFile->netfid = fileHandle;
-			pCifsFile->pid = current->tgid;
+			pCifsFile->pid = task_tgid(current);
 			pCifsFile->pInode = newinode;
 			pCifsFile->invalidHandle = FALSE;
 			pCifsFile->closePend     = FALSE;
Index: linux-2.6.15/fs/cifs/file.c
===================================================================
--- linux-2.6.15.orig/fs/cifs/file.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/cifs/file.c	2006-01-17 08:36:57.000000000 -0500
@@ -45,7 +45,7 @@
 {
 	memset(private_data, 0, sizeof(struct cifsFileInfo));
 	private_data->netfid = netfid;
-	private_data->pid = current->tgid;	
+	private_data->pid = task_tgid(current);
 	init_MUTEX(&private_data->fh_sem);
 	private_data->pfile = file; /* needed for writepage */
 	private_data->pInode = inode;
@@ -182,7 +182,7 @@
 			pCifsFile = list_entry(tmp, struct cifsFileInfo,
 					       flist);
 			if ((pCifsFile->pfile == NULL) &&
-			    (pCifsFile->pid == current->tgid)) {
+			    (pCifsFile->pid == task_tgid(current))) {
 				/* mode set in cifs_create */
 
 				/* needed for writepage */
Index: linux-2.6.15/fs/cifs/misc.c
===================================================================
--- linux-2.6.15.orig/fs/cifs/misc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/cifs/misc.c	2006-01-17 08:36:57.000000000 -0500
@@ -308,8 +308,8 @@
 	buffer->Command = smb_command;
 	buffer->Flags = 0x00;	/* case sensitive */
 	buffer->Flags2 = SMBFLG2_KNOWS_LONG_NAMES;
-	buffer->Pid = cpu_to_le16((__u16)current->tgid);
-	buffer->PidHigh = cpu_to_le16((__u16)(current->tgid >> 16));
+	buffer->Pid = cpu_to_le16((__u16)task_tgid(current));
+	buffer->PidHigh = cpu_to_le16((__u16)(task_tgid(current) >> 16));
 	spin_lock(&GlobalMid_Lock);
 	spin_unlock(&GlobalMid_Lock);
 	if (treeCon) {
Index: linux-2.6.15/fs/cifs/transport.c
===================================================================
--- linux-2.6.15.orig/fs/cifs/transport.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/cifs/transport.c	2006-01-17 08:36:57.000000000 -0500
@@ -56,7 +56,7 @@
 	else {
 		memset(temp, 0, sizeof (struct mid_q_entry));
 		temp->mid = smb_buffer->Mid;	/* always LE */
-		temp->pid = current->pid;
+		temp->pid = task_pid(current);
 		temp->command = smb_buffer->Command;
 		cFYI(1, ("For smb_command %d", temp->command));
 	/*	do_gettimeofday(&temp->when_sent);*/ /* easier to use jiffies */
Index: linux-2.6.15/fs/coda/upcall.c
===================================================================
--- linux-2.6.15.orig/fs/coda/upcall.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/coda/upcall.c	2006-01-17 08:36:57.000000000 -0500
@@ -52,7 +52,7 @@
 		return ERR_PTR(-ENOMEM);
 
         inp->ih.opcode = opcode;
-	inp->ih.pid = current->pid;
+	inp->ih.pid = task_pid(current);
 	inp->ih.pgid = process_group(current);
 #ifdef CONFIG_CODA_FS_OLD_API
 	memset(&inp->ih.cred, 0, sizeof(struct coda_cred));
Index: linux-2.6.15/fs/compat.c
===================================================================
--- linux-2.6.15.orig/fs/compat.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/compat.c	2006-01-17 08:36:57.000000000 -0500
@@ -331,7 +331,7 @@
 		sprintf(buf, "%02x", buf[1]);
 	printk("ioctl32(%s:%d): Unknown cmd fd(%d) "
 			"cmd(%08x){%s} arg(%08x) on %s\n",
-			current->comm, current->pid,
+			current->comm, task_pid(current),
 			(int)fd, (unsigned int)cmd, buf,
 			(unsigned int)arg, fn);
 
Index: linux-2.6.15/fs/devfs/base.c
===================================================================
--- linux-2.6.15.orig/fs/devfs/base.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/devfs/base.c	2006-01-17 08:36:57.000000000 -0500
@@ -2695,7 +2695,7 @@
 			spin_unlock(&lock);
 			fs_info->devfsd_pgrp =
 			    (process_group(current) ==
-			     current->pid) ? process_group(current) : 0;
+			     task_pid(current)) ? process_group(current) : 0;
 			fs_info->devfsd_file = file;
 			fs_info->devfsd_info =
 			    kmalloc(sizeof *fs_info->devfsd_info, GFP_KERNEL);
Index: linux-2.6.15/fs/dnotify.c
===================================================================
--- linux-2.6.15.orig/fs/dnotify.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/dnotify.c	2006-01-17 08:36:57.000000000 -0500
@@ -92,7 +92,7 @@
 		prev = &odn->dn_next;
 	}
 
-	error = f_setown(filp, current->pid, 0);
+	error = f_setown(filp, task_pid(current), 0);
 	if (error)
 		goto out_free;
 
Index: linux-2.6.15/fs/exec.c
===================================================================
--- linux-2.6.15.orig/fs/exec.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/exec.c	2006-01-17 08:36:57.000000000 -0500
@@ -675,8 +675,8 @@
 		proc_dentry2 = proc_pid_unhash(leader);
 		write_lock_irq(&tasklist_lock);
 
-		BUG_ON(leader->tgid != current->tgid);
-		BUG_ON(current->pid == current->tgid);
+		BUG_ON(task_tgid(leader) != task_tgid(current));
+		BUG_ON(task_pid(current) == task_tgid(current));
 		/*
 		 * An exec() starts a new thread group with the
 		 * TGID of the previous thread group. Rehash the
@@ -1282,7 +1282,7 @@
 			case 'p':
 				pid_in_pattern = 1;
 				rc = snprintf(out_ptr, out_end - out_ptr,
-					      "%d", current->tgid);
+					      "%d", task_tgid(current));
 				if (rc > out_end - out_ptr)
 					goto out;
 				out_ptr += rc;
@@ -1354,7 +1354,7 @@
 	if (!pid_in_pattern
             && (core_uses_pid || atomic_read(&current->mm->mm_users) != 1)) {
 		rc = snprintf(out_ptr, out_end - out_ptr,
-			      ".%d", current->tgid);
+			      ".%d", task_tgid(current));
 		if (rc > out_end - out_ptr)
 			goto out;
 		out_ptr += rc;
Index: linux-2.6.15/fs/ext2/inode.c
===================================================================
--- linux-2.6.15.orig/fs/ext2/inode.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/ext2/inode.c	2006-01-17 08:36:57.000000000 -0500
@@ -344,7 +344,7 @@
 	 */
 	bg_start = (ei->i_block_group * EXT2_BLOCKS_PER_GROUP(inode->i_sb)) +
 		le32_to_cpu(EXT2_SB(inode->i_sb)->s_es->s_first_data_block);
-	colour = (current->pid % 16) *
+	colour = (task_pid(current) % 16) *
 			(EXT2_BLOCKS_PER_GROUP(inode->i_sb) / 16);
 	return bg_start + colour;
 }
Index: linux-2.6.15/fs/ext3/inode.c
===================================================================
--- linux-2.6.15.orig/fs/ext3/inode.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/ext3/inode.c	2006-01-17 08:36:57.000000000 -0500
@@ -443,7 +443,7 @@
 	 */
 	bg_start = (ei->i_block_group * EXT3_BLOCKS_PER_GROUP(inode->i_sb)) +
 		le32_to_cpu(EXT3_SB(inode->i_sb)->s_es->s_first_data_block);
-	colour = (current->pid % 16) *
+	colour = (task_pid(current) % 16) *
 			(EXT3_BLOCKS_PER_GROUP(inode->i_sb) / 16);
 	return bg_start + colour;
 }
Index: linux-2.6.15/fs/fs-writeback.c
===================================================================
--- linux-2.6.15.orig/fs/fs-writeback.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/fs-writeback.c	2006-01-17 08:36:57.000000000 -0500
@@ -89,7 +89,7 @@
 		if (inode->i_ino || strcmp(inode->i_sb->s_id, "bdev"))
 			printk(KERN_DEBUG
 			       "%s(%d): dirtied inode %lu (%s) on %s\n",
-			       current->comm, current->pid, inode->i_ino,
+			       current->comm, task_pid(current), inode->i_ino,
 			       name, inode->i_sb->s_id);
 	}
 
Index: linux-2.6.15/fs/fuse/dev.c
===================================================================
--- linux-2.6.15.orig/fs/fuse/dev.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/fuse/dev.c	2006-01-17 08:36:57.000000000 -0500
@@ -99,7 +99,7 @@
 	req->preallocated = 1;
 	req->in.h.uid = current->fsuid;
 	req->in.h.gid = current->fsgid;
-	req->in.h.pid = current->pid;
+	req->in.h.pid = task_pid(current);
 	return req;
 }
 
Index: linux-2.6.15/fs/jffs2/background.c
===================================================================
--- linux-2.6.15.orig/fs/jffs2/background.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/jffs2/background.c	2006-01-17 08:36:57.000000000 -0500
@@ -60,7 +60,7 @@
 	int wait = 0;
 	spin_lock(&c->erase_completion_lock);
 	if (c->gc_task) {
-		D1(printk(KERN_DEBUG "jffs2: Killing GC task %d\n", c->gc_task->pid));
+		D1(printk(KERN_DEBUG "jffs2: Killing GC task %d\n", task_pid(c->gc_task)));
 		send_sig(SIGKILL, c->gc_task, 1);
 		wait = 1;
 	}
Index: linux-2.6.15/fs/lockd/clntproc.c
===================================================================
--- linux-2.6.15.orig/fs/lockd/clntproc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/lockd/clntproc.c	2006-01-17 08:36:57.000000000 -0500
@@ -130,7 +130,7 @@
 	lock->caller  = system_utsname.nodename;
 	lock->oh.data = req->a_owner;
 	lock->oh.len  = sprintf(req->a_owner, "%d@%s",
-				current->pid, system_utsname.nodename);
+				task_pid(current), system_utsname.nodename);
 	locks_copy_lock(&lock->fl, fl);
 }
 
Index: linux-2.6.15/fs/lockd/svc.c
===================================================================
--- linux-2.6.15.orig/fs/lockd/svc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/lockd/svc.c	2006-01-17 08:36:57.000000000 -0500
@@ -111,7 +111,7 @@
 	/*
 	 * Let our maker know we're running.
 	 */
-	nlmsvc_pid = current->pid;
+	nlmsvc_pid = task_pid(current);
 	up(&lockd_start);
 
 	daemonize("lockd");
@@ -135,7 +135,7 @@
 	 * NFS mount or NFS daemon has gone away, and we've been sent a
 	 * signal, or else another process has taken over our job.
 	 */
-	while ((nlmsvc_users || !signalled()) && nlmsvc_pid == current->pid) {
+	while ((nlmsvc_users || !signalled()) && nlmsvc_pid == task_pid(current)) {
 		long timeout = MAX_SCHEDULE_TIMEOUT;
 
 		if (signalled()) {
@@ -182,7 +182,7 @@
 	 * Check whether there's a new lockd process before
 	 * shutting down the hosts and clearing the slot.
 	 */
-	if (!nlmsvc_pid || current->pid == nlmsvc_pid) {
+	if (!nlmsvc_pid || task_pid(current) == nlmsvc_pid) {
 		if (nlmsvc_ops)
 			nlmsvc_invalidate_all();
 		nlm_shutdown_hosts();
Index: linux-2.6.15/fs/locks.c
===================================================================
--- linux-2.6.15.orig/fs/locks.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/locks.c	2006-01-17 08:36:57.000000000 -0500
@@ -268,7 +268,7 @@
 		return -ENOMEM;
 
 	fl->fl_file = filp;
-	fl->fl_pid = current->tgid;
+	fl->fl_pid = task_tgid(current);
 	fl->fl_flags = FL_FLOCK;
 	fl->fl_type = type;
 	fl->fl_end = OFFSET_MAX;
@@ -334,7 +334,7 @@
 		return -EOVERFLOW;
 	
 	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
+	fl->fl_pid = task_tgid(current);
 	fl->fl_file = filp;
 	fl->fl_flags = FL_POSIX;
 	fl->fl_ops = NULL;
@@ -380,7 +380,7 @@
 		return -EOVERFLOW;
 	
 	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
+	fl->fl_pid = task_tgid(current);
 	fl->fl_file = filp;
 	fl->fl_flags = FL_POSIX;
 	fl->fl_ops = NULL;
@@ -433,7 +433,7 @@
 static int lease_init(struct file *filp, int type, struct file_lock *fl)
  {
 	fl->fl_owner = current->files;
-	fl->fl_pid = current->tgid;
+	fl->fl_pid = task_tgid(current);
 
 	fl->fl_file = filp;
 	fl->fl_flags = FL_LEASE;
@@ -1043,7 +1043,7 @@
 
 	locks_init_lock(&fl);
 	fl.fl_owner = current->files;
-	fl.fl_pid = current->tgid;
+	fl.fl_pid = task_tgid(current);
 	fl.fl_file = filp;
 	fl.fl_flags = FL_POSIX | FL_ACCESS;
 	if (filp && !(filp->f_flags & O_NONBLOCK))
@@ -1441,7 +1441,7 @@
 		goto out_unlock;
 	}
 
-	error = f_setown(filp, current->pid, 0);
+	error = f_setown(filp, task_pid(current), 0);
 out_unlock:
 	unlock_kernel();
 	return error;
@@ -1860,7 +1860,7 @@
 	lock.fl_start = 0;
 	lock.fl_end = OFFSET_MAX;
 	lock.fl_owner = owner;
-	lock.fl_pid = current->tgid;
+	lock.fl_pid = task_tgid(current);
 	lock.fl_file = filp;
 	lock.fl_ops = NULL;
 	lock.fl_lmops = NULL;
@@ -1904,7 +1904,7 @@
 
 	if (filp->f_op && filp->f_op->flock) {
 		struct file_lock fl = {
-			.fl_pid = current->tgid,
+			.fl_pid = task_tgid(current),
 			.fl_file = filp,
 			.fl_flags = FL_FLOCK,
 			.fl_type = F_UNLCK,
Index: linux-2.6.15/fs/nfs/callback.c
===================================================================
--- linux-2.6.15.orig/fs/nfs/callback.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/nfs/callback.c	2006-01-17 08:36:57.000000000 -0500
@@ -44,7 +44,7 @@
 	__module_get(THIS_MODULE);
 	lock_kernel();
 
-	nfs_callback_info.pid = current->pid;
+	nfs_callback_info.pid = task_pid(current);
 	daemonize("nfsv4-svc");
 	/* Process request with signals blocked, but allow SIGKILL.  */
 	allow_signal(SIGKILL);
Index: linux-2.6.15/fs/nfs/nfs3proc.c
===================================================================
--- linux-2.6.15.orig/fs/nfs/nfs3proc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/nfs/nfs3proc.c	2006-01-17 08:36:57.000000000 -0500
@@ -323,7 +323,7 @@
 	if (flags & O_EXCL) {
 		arg.createmode  = NFS3_CREATE_EXCLUSIVE;
 		arg.verifier[0] = jiffies;
-		arg.verifier[1] = current->pid;
+		arg.verifier[1] = task_pid(current);
 	}
 
 	sattr->ia_mode &= ~current->fs->umask;
Index: linux-2.6.15/fs/nfs/nfs4proc.c
===================================================================
--- linux-2.6.15.orig/fs/nfs/nfs4proc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/nfs/nfs4proc.c	2006-01-17 08:36:57.000000000 -0500
@@ -724,7 +724,7 @@
 	if (flags & O_EXCL) {
 		u32 *p = (u32 *) o_arg.u.verifier.data;
 		p[0] = jiffies;
-		p[1] = current->pid;
+		p[1] = task_pid(current);
 	} else
 		o_arg.u.attrs = sattr;
 	/* Serialization for the sequence id */
Index: linux-2.6.15/fs/nfsd/nfs4state.c
===================================================================
--- linux-2.6.15.orig/fs/nfsd/nfs4state.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/nfsd/nfs4state.c	2006-01-17 08:36:57.000000000 -0500
@@ -1740,7 +1740,7 @@
 	fl.fl_end = OFFSET_MAX;
 	fl.fl_owner =  (fl_owner_t)dp;
 	fl.fl_file = stp->st_vfs_file;
-	fl.fl_pid = current->tgid;
+	fl.fl_pid = task_tgid(current);
 
 	/* setlease checks to see if delegation should be handed out.
 	 * the lock_manager callbacks fl_mylease and fl_change are used
@@ -2784,7 +2784,7 @@
 		goto out;
 	}
 	file_lock.fl_owner = (fl_owner_t)lock_sop;
-	file_lock.fl_pid = current->tgid;
+	file_lock.fl_pid = task_tgid(current);
 	file_lock.fl_file = filp;
 	file_lock.fl_flags = FL_POSIX;
 
@@ -2903,7 +2903,7 @@
 			&lockt->lt_clientid, &lockt->lt_owner);
 	if (lockt->lt_stateowner)
 		file_lock.fl_owner = (fl_owner_t)lockt->lt_stateowner;
-	file_lock.fl_pid = current->tgid;
+	file_lock.fl_pid = task_tgid(current);
 	file_lock.fl_flags = FL_POSIX;
 
 	file_lock.fl_start = lockt->lt_offset;
@@ -2962,7 +2962,7 @@
 	locks_init_lock(&file_lock);
 	file_lock.fl_type = F_UNLCK;
 	file_lock.fl_owner = (fl_owner_t) locku->lu_stateowner;
-	file_lock.fl_pid = current->tgid;
+	file_lock.fl_pid = task_tgid(current);
 	file_lock.fl_file = filp;
 	file_lock.fl_flags = FL_POSIX; 
 	file_lock.fl_start = locku->lu_offset;
Index: linux-2.6.15/fs/nfsd/vfs.c
===================================================================
--- linux-2.6.15.orig/fs/nfsd/vfs.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/nfsd/vfs.c	2006-01-17 08:36:57.000000000 -0500
@@ -955,13 +955,13 @@
 		if (EX_WGATHER(exp)) {
 			if (atomic_read(&inode->i_writecount) > 1
 			    || (last_ino == inode->i_ino && last_dev == inode->i_sb->s_dev)) {
-				dprintk("nfsd: write defer %d\n", current->pid);
+				dprintk("nfsd: write defer %d\n", task_pid(current));
 				msleep(10);
-				dprintk("nfsd: write resume %d\n", current->pid);
+				dprintk("nfsd: write resume %d\n", task_pid(current));
 			}
 
 			if (inode->i_state & I_DIRTY) {
-				dprintk("nfsd: write sync %d\n", current->pid);
+				dprintk("nfsd: write sync %d\n", task_pid(current));
 				nfsd_sync(file);
 			}
 #if 0
Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:36:57.000000000 -0500
@@ -174,9 +174,10 @@
 		"Gid:\t%d\t%d\t%d\t%d\n",
 		get_task_state(p),
 		(p->sleep_avg/1024)*100/(1020000000/1024),
-	       	p->tgid,
-		p->pid, pid_alive(p) ? p->group_leader->real_parent->tgid : 0,
-		pid_alive(p) && p->ptrace ? p->parent->pid : 0,
+	       	task_tgid(p),
+		task_pid(p), pid_alive(p) ?
+			task_tgid(p->group_leader->real_parent) : 0,
+		pid_alive(p) && p->ptrace ? task_pid(p->parent) : 0,
 		p->uid, p->euid, p->suid, p->fsuid,
 		p->gid, p->egid, p->sgid, p->fsgid);
 	read_unlock(&tasklist_lock);
@@ -388,7 +389,8 @@
 		}
 		it_real_value = task->signal->it_real_value;
 	}
-	ppid = pid_alive(task) ? task->group_leader->real_parent->tgid : 0;
+	ppid = pid_alive(task) ?
+		task_tgid(task->group_leader->real_parent) : 0;
 	read_unlock(&tasklist_lock);
 
 	if (!whole || num_threads<2)
@@ -415,7 +417,7 @@
 	res = sprintf(buffer,"%d (%s) %c %d %d %d %d %d %lu %lu \
 %lu %lu %lu %lu %lu %ld %ld %ld %ld %d %ld %llu %lu %ld %lu %lu %lu %lu %lu \
 %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu\n",
-		task->pid,
+		task_pid(task),
 		tcomm,
 		state,
 		ppid,
Index: linux-2.6.15/fs/proc/base.c
===================================================================
--- linux-2.6.15.orig/fs/proc/base.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/proc/base.c	2006-01-17 08:36:57.000000000 -0500
@@ -1160,7 +1160,7 @@
 	if (!pid_alive(p))
 		goto out;
 	retval = 0;
-	tid = p->pid;
+	tid = task_pid(p);
 
 	fd = filp->f_pos;
 	switch (fd) {
@@ -1227,7 +1227,7 @@
 		goto out;
 
 	ret = 0;
-	pid = proc_task(inode)->pid;
+	pid = task_pid(proc_task(inode));
 	i = filp->f_pos;
 	switch (i) {
 	case 0:
@@ -1312,7 +1312,7 @@
 	ei = PROC_I(inode);
 	ei->task = NULL;
 	inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME;
-	inode->i_ino = fake_ino(task->pid, ino);
+	inode->i_ino = fake_ino(task_pid(task), ino);
 
 	if (!pid_alive(task))
 		goto out_unlock;
@@ -1878,14 +1878,14 @@
 			      int buflen)
 {
 	char tmp[30];
-	sprintf(tmp, "%d", current->tgid);
+	sprintf(tmp, "%d", task_tgid(current));
 	return vfs_readlink(dentry,buffer,buflen,tmp);
 }
 
 static void *proc_self_follow_link(struct dentry *dentry, struct nameidata *nd)
 {
 	char tmp[30];
-	sprintf(tmp, "%d", current->tgid);
+	sprintf(tmp, "%d", task_tgid(current));
 	return ERR_PTR(vfs_follow_link(nd,tmp));
 }	
 
@@ -2042,7 +2042,7 @@
 	read_unlock(&tasklist_lock);
 	if (!task)
 		goto out;
-	if (leader->tgid != task->tgid)
+	if (task_tgid(leader) != task_tgid(task))
 		goto out_drop_task;
 
 	inode = proc_pid_make_inode(dir->i_sb, task, PROC_TID_INO);
@@ -2100,7 +2100,7 @@
 		p = next_task(&init_task);
 
 	for ( ; p != &init_task; p = next_task(p)) {
-		int tgid = p->pid;
+		int tgid = task_pid(p);
 		if (!pid_alive(p))
 			continue;
 		if (--index >= 0)
@@ -2133,7 +2133,7 @@
 	 * via next_thread().
 	 */
 	if (pid_alive(task)) do {
-		int tid = task->pid;
+		int tid = task_pid(task);
 
 		if (--index >= 0)
 			continue;
Index: linux-2.6.15/fs/smbfs/proc.c
===================================================================
--- linux-2.6.15.orig/fs/smbfs/proc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/smbfs/proc.c	2006-01-17 08:36:57.000000000 -0500
@@ -852,7 +852,7 @@
 	struct sock *sk;
 	int error;
 
-	VERBOSE("fd=%d, pid=%d\n", opt->fd, current->pid);
+	VERBOSE("fd=%d, pid=%d\n", opt->fd, task_pid(current));
 
 	smb_lock_server(server);
 
@@ -876,7 +876,7 @@
 		goto out_putf;
 
 	server->sock_file = filp;
-	server->conn_pid = current->pid;
+	server->conn_pid = task_pid(current);
 	server->opt = *opt;
 	server->generation += 1;
 	server->state = CONN_VALID;
Index: linux-2.6.15/fs/smbfs/smbiod.c
===================================================================
--- linux-2.6.15.orig/fs/smbfs/smbiod.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/smbfs/smbiod.c	2006-01-17 08:36:57.000000000 -0500
@@ -294,7 +294,7 @@
 
 	allow_signal(SIGKILL);
 
-	VERBOSE("SMB Kernel thread starting (%d) ...\n", current->pid);
+	VERBOSE("SMB Kernel thread starting (%d) ...\n", task_pid(current));
 
 	for (;;) {
 		struct smb_sb_info *server;
@@ -336,6 +336,6 @@
 		spin_unlock(&servers_lock);
 	}
 
-	VERBOSE("SMB Kernel thread exiting (%d) ...\n", current->pid);
+	VERBOSE("SMB Kernel thread exiting (%d) ...\n", task_pid(current));
 	module_put_and_exit(0);
 }
Index: linux-2.6.15/fs/xfs/linux-2.6/xfs_buf.c
===================================================================
--- linux-2.6.15.orig/fs/xfs/linux-2.6/xfs_buf.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/xfs/linux-2.6/xfs_buf.c	2006-01-17 08:36:57.000000000 -0500
@@ -68,7 +68,7 @@
 #endif
 
 #ifdef PAGEBUF_LOCK_TRACKING
-# define PB_SET_OWNER(pb)	((pb)->pb_last_holder = current->pid)
+# define PB_SET_OWNER(pb)	((pb)->pb_last_holder = task_pid(current))
 # define PB_CLEAR_OWNER(pb)	((pb)->pb_last_holder = -1)
 # define PB_GET_OWNER(pb)	((pb)->pb_last_holder)
 #else
Index: linux-2.6.15/fs/xfs/linux-2.6/xfs_linux.h
===================================================================
--- linux-2.6.15.orig/fs/xfs/linux-2.6/xfs_linux.h	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/xfs/linux-2.6/xfs_linux.h	2006-01-17 08:36:57.000000000 -0500
@@ -134,7 +134,7 @@
 #define raw_smp_processor_id()	smp_processor_id()
 #endif
 #define current_cpu()		raw_smp_processor_id()
-#define current_pid()		(current->pid)
+#define current_pid()		(task_pid(current))
 #define current_fsuid(cred)	(current->fsuid)
 #define current_fsgid(cred)	(current->fsgid)
 
Index: linux-2.6.15/fs/xfs/support/debug.c
===================================================================
--- linux-2.6.15.orig/fs/xfs/support/debug.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/xfs/support/debug.c	2006-01-17 08:36:57.000000000 -0500
@@ -60,7 +60,7 @@
 int
 get_thread_id(void)
 {
-	return current->pid;
+	return task_pid(current);
 }
 
 #endif /* DEBUG || INDUCE_IO_ERRROR || !NO_WANT_RANDOM */
Index: linux-2.6.15/fs/jffs2/debug.h
===================================================================
--- linux-2.6.15.orig/fs/jffs2/debug.h	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/fs/jffs2/debug.h	2006-01-17 08:36:57.000000000 -0500
@@ -81,28 +81,28 @@
 #define JFFS2_ERROR(fmt, ...)						\
 	do {								\
 		printk(JFFS2_ERR_MSG_PREFIX				\
-			" (%d) %s: " fmt, current->pid,			\
+			" (%d) %s: " fmt, task_pid(current),		\
 			__FUNCTION__ , ##__VA_ARGS__);			\
 	} while(0)
 
 #define JFFS2_WARNING(fmt, ...)						\
 	do {								\
 		printk(JFFS2_WARN_MSG_PREFIX				\
-			" (%d) %s: " fmt, current->pid,			\
+			" (%d) %s: " fmt, task_pid(current),		\
 			__FUNCTION__ , ##__VA_ARGS__);			\
 	} while(0)
 
 #define JFFS2_NOTICE(fmt, ...)						\
 	do {								\
 		printk(JFFS2_NOTICE_MSG_PREFIX				\
-			" (%d) %s: " fmt, current->pid,			\
+			" (%d) %s: " fmt, task_pid(current),		\
 			__FUNCTION__ , ##__VA_ARGS__);			\
 	} while(0)
 
 #define JFFS2_DEBUG(fmt, ...)						\
 	do {								\
 		printk(JFFS2_DBG_MSG_PREFIX				\
-			" (%d) %s: " fmt, current->pid,			\
+			" (%d) %s: " fmt, task_pid(current),		\
 			__FUNCTION__ , ##__VA_ARGS__);			\
 	} while(0)
 

--



* RFC [patch 04/34] PID Virtualization Change pid accesses: include/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (2 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 03/34] PID Virtualization Change pid accesses: filesystems Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 05/34] PID Virtualization Change pid accesses: ipc Serge Hallyn
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B3-change-pid-tgid-references-include --]
[-- Type: text/plain, Size: 1499 bytes --]

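Change pid accesses under include/: replace direct reads of
current->pid and current->tgid with the task_pid() and task_tgid()
accessors.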

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 linux/reiserfs_fs.h |    2 +-
 net/scm.h           |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Index: linux-2.6.15/include/linux/reiserfs_fs.h
===================================================================
--- linux-2.6.15.orig/include/linux/reiserfs_fs.h	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/include/linux/reiserfs_fs.h	2006-01-17 08:36:59.000000000 -0500
@@ -83,7 +83,7 @@
 if( !( cond ) ) 								\
   reiserfs_panic( NULL, "reiserfs[%i]: assertion " #cond " failed at "	\
 		  __FILE__ ":%i:%s: " format "\n",		\
-		  in_interrupt() ? -1 : current -> pid, __LINE__ , __FUNCTION__ , ##args )
+		  in_interrupt() ? -1 : task_pid(current), __LINE__ , __FUNCTION__ , ##args )
 
 #if defined( CONFIG_REISERFS_CHECK )
 #define RFALSE( cond, format, args... ) RASSERT( !( cond ), format, ##args )
Index: linux-2.6.15/include/net/scm.h
===================================================================
--- linux-2.6.15.orig/include/net/scm.h	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/include/net/scm.h	2006-01-17 08:36:59.000000000 -0500
@@ -40,7 +40,7 @@
 	memset(scm, 0, sizeof(*scm));
 	scm->creds.uid = current->uid;
 	scm->creds.gid = current->gid;
-	scm->creds.pid = current->tgid;
+	scm->creds.pid = task_tgid(current);
 	if (msg->msg_controllen <= 0)
 		return 0;
 	return __scm_send(sock, msg, scm);

--



* RFC [patch 05/34] PID Virtualization Change pid accesses: ipc
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (3 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 04/34] PID Virtualization Change pid accesses: include/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 06/34] PID Virtualization Change pid accesses: kernel/ Serge Hallyn
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B4-change-pid-tgid-references-ipc --]
[-- Type: text/plain, Size: 4678 bytes --]

Change pid accesses for ipc/: replace direct reads of ->pid and ->tgid
with the task_pid() and task_tgid() accessors.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 mqueue.c |    8 ++++----
 msg.c    |    6 +++---
 sem.c    |    8 ++++----
 shm.c    |    6 +++---
 4 files changed, 14 insertions(+), 14 deletions(-)

Index: linux-2.6.15/ipc/mqueue.c
===================================================================
--- linux-2.6.15.orig/ipc/mqueue.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/ipc/mqueue.c	2006-01-17 08:36:59.000000000 -0500
@@ -359,7 +359,7 @@
 	struct mqueue_inode_info *info = MQUEUE_I(filp->f_dentry->d_inode);
 
 	spin_lock(&info->lock);
-	if (current->tgid == info->notify_owner)
+	if (task_tgid(current) == info->notify_owner)
 		remove_notification(info);
 
 	spin_unlock(&info->lock);
@@ -511,7 +511,7 @@
 			sig_i.si_errno = 0;
 			sig_i.si_code = SI_MESGQ;
 			sig_i.si_value = info->notify.sigev_value;
-			sig_i.si_pid = current->tgid;
+			sig_i.si_pid = task_tgid(current);
 			sig_i.si_uid = current->uid;
 
 			kill_proc_info(info->notify.sigev_signo,
@@ -1034,7 +1034,7 @@
 	ret = 0;
 	spin_lock(&info->lock);
 	if (u_notification == NULL) {
-		if (info->notify_owner == current->tgid) {
+		if (info->notify_owner == task_tgid(current)) {
 			remove_notification(info);
 			inode->i_atime = inode->i_ctime = CURRENT_TIME;
 		}
@@ -1058,7 +1058,7 @@
 			info->notify.sigev_notify = SIGEV_SIGNAL;
 			break;
 		}
-		info->notify_owner = current->tgid;
+		info->notify_owner = task_tgid(current);
 		inode->i_atime = inode->i_ctime = CURRENT_TIME;
 	}
 	spin_unlock(&info->lock);
Index: linux-2.6.15/ipc/msg.c
===================================================================
--- linux-2.6.15.orig/ipc/msg.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/ipc/msg.c	2006-01-17 08:36:59.000000000 -0500
@@ -539,7 +539,7 @@
 				msr->r_msg = ERR_PTR(-E2BIG);
 			} else {
 				msr->r_msg = NULL;
-				msq->q_lrpid = msr->r_tsk->pid;
+				msq->q_lrpid = task_pid(msr->r_tsk);
 				msq->q_rtime = get_seconds();
 				wake_up_process(msr->r_tsk);
 				smp_mb();
@@ -621,7 +621,7 @@
 		}
 	}
 
-	msq->q_lspid = current->tgid;
+	msq->q_lspid = task_tgid(current);
 	msq->q_stime = get_seconds();
 
 	if(!pipelined_send(msq,msg)) {
@@ -717,7 +717,7 @@
 			list_del(&msg->m_list);
 			msq->q_qnum--;
 			msq->q_rtime = get_seconds();
-			msq->q_lrpid = current->tgid;
+			msq->q_lrpid = task_tgid(current);
 			msq->q_cbytes -= msg->m_ts;
 			atomic_sub(msg->m_ts,&msg_bytes);
 			atomic_dec(&msg_hdrs);
Index: linux-2.6.15/ipc/sem.c
===================================================================
--- linux-2.6.15.orig/ipc/sem.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/ipc/sem.c	2006-01-17 08:36:59.000000000 -0500
@@ -742,7 +742,7 @@
 		for (un = sma->undo; un; un = un->id_next)
 			un->semadj[semnum] = 0;
 		curr->semval = val;
-		curr->sempid = current->tgid;
+		curr->sempid = task_tgid(current);
 		sma->sem_ctime = get_seconds();
 		/* maybe some queued-up processes were waiting for this */
 		update_queue(sma);
@@ -1134,7 +1134,7 @@
 	if (error)
 		goto out_unlock_free;
 
-	error = try_atomic_semop (sma, sops, nsops, un, current->tgid);
+	error = try_atomic_semop (sma, sops, nsops, un, task_tgid(current));
 	if (error <= 0) {
 		if (alter && error == 0)
 			update_queue (sma);
@@ -1149,7 +1149,7 @@
 	queue.sops = sops;
 	queue.nsops = nsops;
 	queue.undo = un;
-	queue.pid = current->tgid;
+	queue.pid = task_tgid(current);
 	queue.id = semid;
 	queue.alter = alter;
 	if (alter)
@@ -1319,7 +1319,7 @@
 					sem->semval = 0;
 				if (sem->semval > SEMVMX)
 					sem->semval = SEMVMX;
-				sem->sempid = current->tgid;
+				sem->sempid = task_tgid(current);
 			}
 		}
 		sma->sem_otime = get_seconds();
Index: linux-2.6.15/ipc/shm.c
===================================================================
--- linux-2.6.15.orig/ipc/shm.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/ipc/shm.c	2006-01-17 08:36:59.000000000 -0500
@@ -94,7 +94,7 @@
 	if(!(shp = shm_lock(id)))
 		BUG();
 	shp->shm_atim = get_seconds();
-	shp->shm_lprid = current->tgid;
+	shp->shm_lprid = task_tgid(current);
 	shp->shm_nattch++;
 	shm_unlock(shp);
 }
@@ -144,7 +144,7 @@
 	/* remove from the list of attaches of the shm segment */
 	if(!(shp = shm_lock(id)))
 		BUG();
-	shp->shm_lprid = current->tgid;
+	shp->shm_lprid = task_tgid(current);
 	shp->shm_dtim = get_seconds();
 	shp->shm_nattch--;
 	if(shp->shm_nattch == 0 &&
@@ -232,7 +232,7 @@
 	if(id == -1) 
 		goto no_id;
 
-	shp->shm_cprid = current->tgid;
+	shp->shm_cprid = task_tgid(current);
 	shp->shm_lprid = 0;
 	shp->shm_atim = shp->shm_dtim = 0;
 	shp->shm_ctim = get_seconds();

--



* RFC [patch 06/34] PID Virtualization Change pid accesses: kernel/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (4 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 05/34] PID Virtualization Change pid accesses: ipc Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 07/34] PID Virtualization Change pid accesses: lib/ Serge Hallyn
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B5-change-pid-tgid-references-kernel --]
[-- Type: text/plain, Size: 24110 bytes --]

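Change pid accesses for kernel/: replace direct reads of ->pid and
->tgid with the task_pid() and task_tgid() accessors.  The few places
that assign the ids (copy_process() in fork.c and the exec-time id
swap in pid.c) now write the renamed __pid and __tgid fields directly.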

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 acct.c             |    4 ++--
 auditsc.c          |   14 +++++++-------
 capability.c       |    8 ++++----
 cpu.c              |    2 +-
 cpuset.c           |    2 +-
 exit.c             |   30 +++++++++++++++---------------
 fork.c             |   16 ++++++++--------
 futex.c            |    2 +-
 kexec.c            |    3 ++-
 pid.c              |   12 ++++++------
 posix-cpu-timers.c |   15 ++++++++-------
 posix-timers.c     |    5 +++--
 ptrace.c           |    4 ++--
 sched.c            |   15 ++++++++-------
 signal.c           |   22 +++++++++++-----------
 stop_machine.c     |    2 +-
 sys.c              |   10 +++++-----
 sysctl.c           |    2 +-
 timer.c            |    8 ++++----
 19 files changed, 90 insertions(+), 86 deletions(-)
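
Note for readers of this patch: the accessors themselves are not defined
in this patch.  The following is only a minimal userspace sketch of the
pattern the hunks below rely on, assuming task_pid() and task_tgid() are
plain reads of the renamed fields.  struct task_model, model_pid_t,
model_set_ids() and model_switch_exec_ids() are hypothetical names made
up for illustration; only the __pid/__tgid field names and the two
accessor names come from the patch.

```c
/* Illustrative userspace model only; NOT the kernel's definitions. */

typedef int model_pid_t;	/* hypothetical stand-in for pid_t */

/* Hypothetical stand-in for the two id fields of task_struct. */
struct task_model {
	model_pid_t __pid;	/* per-thread id */
	model_pid_t __tgid;	/* thread-group (process) id */
};

/* Assumed shape of the accessors: plain reads of the renamed fields. */
static inline model_pid_t task_pid(const struct task_model *p)
{
	return p->__pid;
}

static inline model_pid_t task_tgid(const struct task_model *p)
{
	return p->__tgid;
}

/*
 * Mirrors the fork.c hunks: the new task gets its own pid; its tgid is
 * its own pid unless it is created as a thread, in which case it
 * inherits the creator's tgid.
 */
static inline void model_set_ids(struct task_model *p,
				 const struct task_model *creator,
				 model_pid_t pid, int is_thread)
{
	p->__pid = pid;
	p->__tgid = task_pid(p);
	if (is_thread)
		p->__tgid = task_tgid(creator);
}

/*
 * Mirrors the pid.c hunk: the old leader takes the thread's former pid
 * for both fields, and the thread's pid becomes its tgid.
 */
static inline void model_switch_exec_ids(struct task_model *thread,
					 struct task_model *leader)
{
	leader->__pid = leader->__tgid = thread->__pid;
	thread->__pid = thread->__tgid;
}
```

In this model, readers go through task_pid()/task_tgid() while the only
writers are the two helpers, which is the split the kernel/ hunks appear
to aim for: a later patch can then change what the accessors return
without touching every caller.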

Index: linux-2.6.15/kernel/acct.c
===================================================================
--- linux-2.6.15.orig/kernel/acct.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/acct.c	2006-01-17 08:36:59.000000000 -0500
@@ -485,8 +485,8 @@
 	ac.ac_gid16 = current->gid;
 #endif
 #if ACCT_VERSION==3
-	ac.ac_pid = current->tgid;
-	ac.ac_ppid = current->parent->tgid;
+	ac.ac_pid = task_tgid(current);
+	ac.ac_ppid = task_tgid(current->parent);
 #endif
 
 	read_lock(&tasklist_lock);	/* pin current->signal */
Index: linux-2.6.15/kernel/auditsc.c
===================================================================
--- linux-2.6.15.orig/kernel/auditsc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/auditsc.c	2006-01-17 08:36:59.000000000 -0500
@@ -401,7 +401,7 @@
 
 		switch (field) {
 		case AUDIT_PID:
-			result = (tsk->pid == value);
+			result = (task_pid(tsk) == value);
 			break;
 		case AUDIT_UID:
 			result = (tsk->uid == value);
@@ -536,7 +536,7 @@
 	struct audit_entry *e;
 	enum audit_state state;
 
-	if (audit_pid && tsk->tgid == audit_pid)
+	if (audit_pid && task_tgid(tsk) == audit_pid)
 		return AUDIT_DISABLED;
 
 	rcu_read_lock();
@@ -633,7 +633,7 @@
 			context->auditable = 1;
 	}
 
-	context->pid = tsk->pid;
+	context->pid = task_pid(tsk);
 	context->uid = tsk->uid;
 	context->gid = tsk->gid;
 	context->euid = tsk->euid;
@@ -975,7 +975,7 @@
 		printk(KERN_ERR
 		       "audit(:%d) pid=%d in syscall=%d;"
 		       " entering syscall=%d\n",
-		       context->serial, tsk->pid, context->major, major);
+		       context->serial, task_pid(tsk), context->major, major);
 #endif
 		newctx = audit_alloc_context(context->state);
 		if (newctx) {
@@ -1179,7 +1179,7 @@
 		if (ab) {
 			audit_log_format(ab, "login pid=%d uid=%u "
 				"old auid=%u new auid=%u",
-				task->pid, task->uid, 
+				task_pid(task), task->uid,
 				task->audit_context->loginuid, loginuid);
 			audit_log_end(ab);
 		}
@@ -1284,10 +1284,10 @@
 	extern pid_t audit_sig_pid;
 	extern uid_t audit_sig_uid;
 
-	if (unlikely(audit_pid && t->tgid == audit_pid)) {
+	if (unlikely(audit_pid && task_tgid(t) == audit_pid)) {
 		if (sig == SIGTERM || sig == SIGHUP) {
 			struct audit_context *ctx = current->audit_context;
-			audit_sig_pid = current->pid;
+			audit_sig_pid = task_pid(current);
 			if (ctx)
 				audit_sig_uid = ctx->loginuid;
 			else
Index: linux-2.6.15/kernel/capability.c
===================================================================
--- linux-2.6.15.orig/kernel/capability.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/capability.c	2006-01-17 08:36:59.000000000 -0500
@@ -66,7 +66,7 @@
      spin_lock(&task_capability_lock);
      read_lock(&tasklist_lock); 
 
-     if (pid && pid != current->pid) {
+     if (pid && pid != task_pid(current)) {
 	     target = find_task_by_pid(pid);
 	     if (!target) {
 	          ret = -ESRCH;
@@ -132,7 +132,7 @@
      int found = 0;
 
      do_each_thread(g, target) {
-             if (target == current || target->pid == 1)
+             if (target == current || task_pid(target) == 1)
                      continue;
              found = 1;
 	     if (security_capset_check(target, effective, inheritable,
@@ -187,7 +187,7 @@
      if (get_user(pid, &header->pid))
 	     return -EFAULT; 
 
-     if (pid && pid != current->pid && !capable(CAP_SETPCAP))
+     if (pid && pid != task_pid(current) && !capable(CAP_SETPCAP))
              return -EPERM;
 
      if (copy_from_user(&effective, &data->effective, sizeof(effective)) ||
@@ -198,7 +198,7 @@
      spin_lock(&task_capability_lock);
      read_lock(&tasklist_lock);
 
-     if (pid > 0 && pid != current->pid) {
+     if (pid > 0 && pid != task_pid(current)) {
           target = find_task_by_pid(pid);
           if (!target) {
                ret = -ESRCH;
Index: linux-2.6.15/kernel/cpu.c
===================================================================
--- linux-2.6.15.orig/kernel/cpu.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/cpu.c	2006-01-17 08:36:59.000000000 -0500
@@ -101,7 +101,7 @@
 		     !cputime_eq(p->stime, cputime_zero)))
 			printk(KERN_WARNING "Task %s (pid = %d) is on cpu %d\
 				(state = %ld, flags = %lx) \n",
-				 p->comm, p->pid, cpu, p->state, p->flags);
+				 p->comm, task_pid(p), cpu, p->state, p->flags);
 	}
 	write_unlock_irq(&tasklist_lock);
 }
Index: linux-2.6.15/kernel/cpuset.c
===================================================================
--- linux-2.6.15.orig/kernel/cpuset.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/cpuset.c	2006-01-17 08:36:59.000000000 -0500
@@ -1261,7 +1261,7 @@
 
 	do_each_thread(g, p) {
 		if (p->cpuset == cs) {
-			pidarray[n++] = p->pid;
+			pidarray[n++] = task_pid(p);
 			if (unlikely(n == npids))
 				goto array_full;
 		}
Index: linux-2.6.15/kernel/exit.c
===================================================================
--- linux-2.6.15.orig/kernel/exit.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/exit.c	2006-01-17 08:36:59.000000000 -0500
@@ -50,7 +50,7 @@
 	if (thread_group_leader(p)) {
 		detach_pid(p, PIDTYPE_PGID);
 		detach_pid(p, PIDTYPE_SID);
-		if (p->pid)
+		if (task_pid(p))
 			__get_cpu_var(process_counts)--;
 	}
 
@@ -170,7 +170,7 @@
 	do_each_task_pid(pgrp, PIDTYPE_PGID, p) {
 		if (p == ignored_task
 				|| p->exit_state
-				|| p->real_parent->pid == 1)
+				|| task_pid(p->real_parent) == 1)
 			continue;
 		if (process_group(p->real_parent) != pgrp
 			    && p->real_parent->signal->session == p->signal->session) {
@@ -797,9 +797,9 @@
 
 	if (unlikely(in_interrupt()))
 		panic("Aiee, killing interrupt handler!");
-	if (unlikely(!tsk->pid))
+	if (unlikely(!task_pid(tsk)))
 		panic("Attempted to kill the idle task!");
-	if (unlikely(tsk->pid == 1))
+	if (unlikely(task_pid(tsk) == 1))
 		panic("Attempted to kill init!");
 	if (tsk->io_context)
 		exit_io_context();
@@ -832,7 +832,7 @@
 
 	if (unlikely(in_atomic()))
 		printk(KERN_INFO "note: %s[%d] exited with preempt_count %d\n",
-				current->comm, current->pid,
+				current->comm, task_pid(current),
 				preempt_count());
 
 	acct_update_integrals(tsk);
@@ -951,7 +951,7 @@
 static int eligible_child(pid_t pid, int options, task_t *p)
 {
 	if (pid > 0) {
-		if (p->pid != pid)
+		if (task_pid(p) != pid)
 			return 0;
 	} else if (!pid) {
 		if (process_group(p) != process_group(current))
@@ -980,7 +980,7 @@
 	 * Do not consider thread group leaders that are
 	 * in a non-empty thread group:
 	 */
-	if (current->tgid != p->tgid && delay_group_leader(p))
+	if (task_tgid(current) != task_tgid(p) && delay_group_leader(p))
 		return 2;
 
 	if (security_task_wait(p))
@@ -1028,7 +1028,7 @@
 	int status;
 
 	if (unlikely(noreap)) {
-		pid_t pid = p->pid;
+		pid_t pid = task_pid(p);
 		uid_t uid = p->uid;
 		int exit_code = p->exit_code;
 		int why, status;
@@ -1135,7 +1135,7 @@
 			retval = put_user(status, &infop->si_status);
 	}
 	if (!retval && infop)
-		retval = put_user(p->pid, &infop->si_pid);
+		retval = put_user(task_pid(p), &infop->si_pid);
 	if (!retval && infop)
 		retval = put_user(p->uid, &infop->si_uid);
 	if (retval) {
@@ -1143,7 +1143,7 @@
 		p->exit_state = EXIT_ZOMBIE;
 		return retval;
 	}
-	retval = p->pid;
+	retval = task_pid(p);
 	if (p->real_parent != p->parent) {
 		write_lock_irq(&tasklist_lock);
 		/* Double-check with lock held.  */
@@ -1203,7 +1203,7 @@
 	read_unlock(&tasklist_lock);
 
 	if (unlikely(noreap)) {
-		pid_t pid = p->pid;
+		pid_t pid = task_pid(p);
 		uid_t uid = p->uid;
 		int why = (p->ptrace & PT_PTRACED) ? CLD_TRAPPED : CLD_STOPPED;
 
@@ -1274,11 +1274,11 @@
 	if (!retval && infop)
 		retval = put_user(exit_code, &infop->si_status);
 	if (!retval && infop)
-		retval = put_user(p->pid, &infop->si_pid);
+		retval = put_user(task_pid(p), &infop->si_pid);
 	if (!retval && infop)
 		retval = put_user(p->uid, &infop->si_uid);
 	if (!retval)
-		retval = p->pid;
+		retval = task_pid(p);
 	put_task_struct(p);
 
 	BUG_ON(!retval);
@@ -1315,7 +1315,7 @@
 		p->signal->flags &= ~SIGNAL_STOP_CONTINUED;
 	spin_unlock_irq(&p->sighand->siglock);
 
-	pid = p->pid;
+	pid = task_pid(p);
 	uid = p->uid;
 	get_task_struct(p);
 	read_unlock(&tasklist_lock);
@@ -1326,7 +1326,7 @@
 		if (!retval && stat_addr)
 			retval = put_user(0xffff, stat_addr);
 		if (!retval)
-			retval = p->pid;
+			retval = task_pid(p);
 	} else {
 		retval = wait_noreap_copyout(p, pid, uid,
 					     CLD_CONTINUED, SIGCONT,
Index: linux-2.6.15/kernel/fork.c
===================================================================
--- linux-2.6.15.orig/kernel/fork.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/fork.c	2006-01-17 08:36:59.000000000 -0500
@@ -848,7 +848,7 @@
 {
 	current->clear_child_tid = tidptr;
 
-	return current->pid;
+	return task_pid(current);
 }
 
 /*
@@ -925,10 +925,10 @@
 
 	p->did_exec = 0;
 	copy_flags(clone_flags, p);
-	p->pid = pid;
+	p->__pid = pid;
 	retval = -EFAULT;
 	if (clone_flags & CLONE_PARENT_SETTID)
-		if (put_user(p->pid, parent_tidptr))
+		if (put_user(task_pid(p), parent_tidptr))
 			goto bad_fork_cleanup;
 
 	p->proc_dentry = NULL;
@@ -973,9 +973,9 @@
  	}
 #endif
 
-	p->tgid = p->pid;
+	p->__tgid = task_pid(p);
 	if (clone_flags & CLONE_THREAD)
-		p->tgid = current->tgid;
+		p->__tgid = task_tgid(current);
 
 	if ((retval = security_task_alloc(p)))
 		goto bad_fork_cleanup_policy;
@@ -1124,12 +1124,12 @@
 	if (unlikely(p->ptrace & PT_PTRACED))
 		__ptrace_link(p, current->parent);
 
-	attach_pid(p, PIDTYPE_PID, p->pid);
-	attach_pid(p, PIDTYPE_TGID, p->tgid);
+	attach_pid(p, PIDTYPE_PID, task_pid(p));
+	attach_pid(p, PIDTYPE_TGID, task_tgid(p));
 	if (thread_group_leader(p)) {
 		attach_pid(p, PIDTYPE_PGID, process_group(p));
 		attach_pid(p, PIDTYPE_SID, p->signal->session);
-		if (p->pid)
+		if (task_pid(p))
 			__get_cpu_var(process_counts)++;
 	}
 
Index: linux-2.6.15/kernel/futex.c
===================================================================
--- linux-2.6.15.orig/kernel/futex.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/futex.c	2006-01-17 08:36:59.000000000 -0500
@@ -780,7 +780,7 @@
 	filp->f_mapping = filp->f_dentry->d_inode->i_mapping;
 
 	if (signal) {
-		err = f_setown(filp, current->pid, 1);
+		err = f_setown(filp, task_pid(current), 1);
 		if (err < 0) {
 			goto error;
 		}
Index: linux-2.6.15/kernel/kexec.c
===================================================================
--- linux-2.6.15.orig/kernel/kexec.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/kexec.c	2006-01-17 08:36:59.000000000 -0500
@@ -36,7 +36,8 @@
 
 int kexec_should_crash(struct task_struct *p)
 {
-	if (in_interrupt() || !p->pid || p->pid == 1 || panic_on_oops)
+	if (in_interrupt() || !task_pid(p) ||
+	    task_pid(p) == 1 || panic_on_oops)
 		return 1;
 	return 0;
 }
Index: linux-2.6.15/kernel/pid.c
===================================================================
--- linux-2.6.15.orig/kernel/pid.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/pid.c	2006-01-17 08:36:59.000000000 -0500
@@ -232,17 +232,17 @@
 	__detach_pid(thread, PIDTYPE_PID);
 	__detach_pid(thread, PIDTYPE_TGID);
 
-	leader->pid = leader->tgid = thread->pid;
-	thread->pid = thread->tgid;
+	leader->__pid = leader->__tgid = thread->__pid;
+	thread->__pid = thread->__tgid;
 
-	attach_pid(thread, PIDTYPE_PID, thread->pid);
-	attach_pid(thread, PIDTYPE_TGID, thread->tgid);
+	attach_pid(thread, PIDTYPE_PID, thread->__pid);
+	attach_pid(thread, PIDTYPE_TGID, thread->__tgid);
 	attach_pid(thread, PIDTYPE_PGID, thread->signal->pgrp);
 	attach_pid(thread, PIDTYPE_SID, thread->signal->session);
 	list_add_tail(&thread->tasks, &init_task.tasks);
 
-	attach_pid(leader, PIDTYPE_PID, leader->pid);
-	attach_pid(leader, PIDTYPE_TGID, leader->tgid);
+	attach_pid(leader, PIDTYPE_PID, leader->__pid);
+	attach_pid(leader, PIDTYPE_TGID, leader->__tgid);
 	attach_pid(leader, PIDTYPE_PGID, leader->signal->pgrp);
 	attach_pid(leader, PIDTYPE_SID, leader->signal->session);
 }
Index: linux-2.6.15/kernel/posix-cpu-timers.c
===================================================================
--- linux-2.6.15.orig/kernel/posix-cpu-timers.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/posix-cpu-timers.c	2006-01-17 08:36:59.000000000 -0500
@@ -22,7 +22,8 @@
 	read_lock(&tasklist_lock);
 	p = find_task_by_pid(pid);
 	if (!p || (CPUCLOCK_PERTHREAD(which_clock) ?
-		   p->tgid != current->tgid : p->tgid != pid)) {
+		   task_tgid(p) != task_tgid(current)
+		   : task_tgid(p) != pid)) {
 		error = -EINVAL;
 	}
 	read_unlock(&tasklist_lock);
@@ -238,7 +239,7 @@
 		while ((t = next_thread(t)) != p) {
 			cpu->sched += t->sched_time;
 		}
-		if (p->tgid == current->tgid) {
+		if (task_tgid(p) == task_tgid(current)) {
 			/*
 			 * We're sampling ourselves, so include the
 			 * cycles not yet banked.  We still omit
@@ -306,11 +307,11 @@
 		p = find_task_by_pid(pid);
 		if (p) {
 			if (CPUCLOCK_PERTHREAD(which_clock)) {
-				if (p->tgid == current->tgid) {
+				if (task_tgid(p) == task_tgid(current)) {
 					error = cpu_clock_sample(which_clock,
 								 p, &rtn);
 				}
-			} else if (p->tgid == pid && p->signal) {
+			} else if (task_tgid(p) == pid && p->signal) {
 				error = cpu_clock_sample_group(which_clock,
 							       p, &rtn);
 			}
@@ -348,7 +349,7 @@
 			p = current;
 		} else {
 			p = find_task_by_pid(pid);
-			if (p && p->tgid != current->tgid)
+			if (p && task_tgid(p) != task_tgid(current))
 				p = NULL;
 		}
 	} else {
@@ -356,7 +357,7 @@
 			p = current->group_leader;
 		} else {
 			p = find_task_by_pid(pid);
-			if (p && p->tgid != pid)
+			if (p && task_tgid(p) != pid)
 				p = NULL;
 		}
 	}
@@ -1423,7 +1424,7 @@
 	 */
 	if (CPUCLOCK_PERTHREAD(which_clock) &&
 	    (CPUCLOCK_PID(which_clock) == 0 ||
-	     CPUCLOCK_PID(which_clock) == current->pid))
+	     CPUCLOCK_PID(which_clock) == task_pid(current)))
 		return -EINVAL;
 
 	/*
Index: linux-2.6.15/kernel/posix-timers.c
===================================================================
--- linux-2.6.15.orig/kernel/posix-timers.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/posix-timers.c	2006-01-17 08:36:59.000000000 -0500
@@ -519,7 +519,7 @@
 
 	if ((event->sigev_notify & SIGEV_THREAD_ID ) &&
 		(!(rtn = find_task_by_pid(event->sigev_notify_thread_id)) ||
-		 rtn->tgid != current->tgid ||
+		 task_tgid(rtn) != task_tgid(current) ||
 		 (event->sigev_notify & ~SIGEV_THREAD_ID) != SIGEV_SIGNAL))
 		return NULL;
 
@@ -747,7 +747,8 @@
 		spin_unlock(&idr_lock);
 
 		if ((timr->it_id != timer_id) || !(timr->it_process) ||
-				timr->it_process->tgid != current->tgid) {
+			task_tgid(timr->it_process) !=
+				task_tgid(current)) {
 			unlock_timer(timr, *flags);
 			timr = NULL;
 		}
Index: linux-2.6.15/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/kernel/ptrace.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/ptrace.c	2006-01-17 08:36:59.000000000 -0500
@@ -153,9 +153,9 @@
 	int retval;
 	task_lock(task);
 	retval = -EPERM;
-	if (task->pid <= 1)
+	if (task_pid(task) <= 1)
 		goto bad;
-	if (task->tgid == current->tgid)
+	if (task_tgid(task) == task_tgid(current))
 		goto bad;
 	/* the same process cannot be attached many times */
 	if (task->ptrace & PT_PTRACED)
Index: linux-2.6.15/kernel/sched.c
===================================================================
--- linux-2.6.15.orig/kernel/sched.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/sched.c	2006-01-17 08:36:59.000000000 -0500
@@ -1653,7 +1653,7 @@
 	preempt_enable();
 #endif
 	if (current->set_child_tid)
-		put_user(current->pid, current->set_child_tid);
+		put_user(task_pid(current), current->set_child_tid);
 }
 
 /*
@@ -2961,7 +2961,7 @@
 		if (unlikely(in_atomic())) {
 			printk(KERN_ERR "scheduling while atomic: "
 				"%s/0x%08x/%d\n",
-				current->comm, preempt_count(), current->pid);
+				current->comm, preempt_count(), task_pid(current));
 			dump_stack();
 		}
 	}
@@ -4333,17 +4333,18 @@
 		free = (unsigned long)n - (unsigned long)end_of_stack(p);
 	}
 #endif
-	printk("%5lu %5d %6d ", free, p->pid, p->parent->pid);
+	printk("%5lu %5d %6d ", free, task_pid(p),
+				      task_pid(p->parent));
 	if ((relative = eldest_child(p)))
-		printk("%5d ", relative->pid);
+		printk("%5d ", task_pid(relative));
 	else
 		printk("      ");
 	if ((relative = younger_sibling(p)))
-		printk("%7d", relative->pid);
+		printk("%7d", task_pid(relative));
 	else
 		printk("       ");
 	if ((relative = older_sibling(p)))
-		printk(" %5d", relative->pid);
+		printk(" %5d", task_pid(relative));
 	else
 		printk("      ");
 	if (!p->mm)
@@ -4623,7 +4624,7 @@
 		if (tsk->mm && printk_ratelimit())
 			printk(KERN_INFO "process %d (%s) no "
 			       "longer affine to cpu%d\n",
-			       tsk->pid, tsk->comm, dead_cpu);
+			       task_pid(tsk), tsk->comm, dead_cpu);
 	}
 	__migrate_task(tsk, dead_cpu, dest_cpu);
 }
Index: linux-2.6.15/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/kernel/signal.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/signal.c	2006-01-17 08:36:59.000000000 -0500
@@ -800,7 +800,7 @@
 			q->info.si_signo = sig;
 			q->info.si_errno = 0;
 			q->info.si_code = SI_USER;
-			q->info.si_pid = current->pid;
+			q->info.si_pid = task_pid(current);
 			q->info.si_uid = current->uid;
 			break;
 		case (unsigned long) SEND_SIG_PRIV:
@@ -937,7 +937,7 @@
 		if (t == NULL)
 			/* restart balancing at this thread */
 			t = p->signal->curr_target = p;
-		BUG_ON(t->tgid != p->tgid);
+		BUG_ON(task_tgid(t) != task_tgid(p));
 
 		while (!wants_signal(sig, t)) {
 			t = next_thread(t);
@@ -1199,7 +1199,7 @@
 
 		read_lock(&tasklist_lock);
 		for_each_process(p) {
-			if (p->pid > 1 && p->tgid != current->tgid) {
+			if (task_pid(p) > 1 && task_tgid(p) != task_tgid(current)) {
 				int err = group_send_sig_info(sig, info, p);
 				++count;
 				if (err != -EPERM)
@@ -1469,7 +1469,7 @@
 
 	info.si_signo = sig;
 	info.si_errno = 0;
-	info.si_pid = tsk->pid;
+	info.si_pid = task_pid(tsk);
 	info.si_uid = tsk->uid;
 
 	/* FIXME: find out whether or not this is supposed to be c*time. */
@@ -1534,7 +1534,7 @@
 
 	info.si_signo = SIGCHLD;
 	info.si_errno = 0;
-	info.si_pid = tsk->pid;
+	info.si_pid = task_pid(tsk);
 	info.si_uid = tsk->uid;
 
 	/* FIXME: find out whether or not this is supposed to be c*time. */
@@ -1637,7 +1637,7 @@
 	memset(&info, 0, sizeof info);
 	info.si_signo = SIGTRAP;
 	info.si_code = exit_code;
-	info.si_pid = current->pid;
+	info.si_pid = task_pid(current);
 	info.si_uid = current->uid;
 
 	/* Let the debugger run.  */
@@ -1862,7 +1862,7 @@
 				info->si_signo = signr;
 				info->si_errno = 0;
 				info->si_code = SI_USER;
-				info->si_pid = current->parent->pid;
+				info->si_pid = task_pid(current->parent);
 				info->si_uid = current->parent->uid;
 			}
 
@@ -1893,7 +1893,7 @@
 			continue;
 
 		/* Init gets no signals it doesn't want.  */
-		if (current->pid == 1)
+		if (task_pid(current) == 1)
 			continue;
 
 		if (sig_kernel_stop(signr)) {
@@ -2245,7 +2245,7 @@
 	info.si_signo = sig;
 	info.si_errno = 0;
 	info.si_code = SI_USER;
-	info.si_pid = current->tgid;
+	info.si_pid = task_tgid(current);
 	info.si_uid = current->uid;
 
 	return kill_something_info(sig, &info, pid);
@@ -2261,12 +2261,12 @@
 	info.si_signo = sig;
 	info.si_errno = 0;
 	info.si_code = SI_TKILL;
-	info.si_pid = current->tgid;
+	info.si_pid = task_tgid(current);
 	info.si_uid = current->uid;
 
 	read_lock(&tasklist_lock);
 	p = find_task_by_pid(pid);
-	if (p && (tgid <= 0 || p->tgid == tgid)) {
+	if (p && (tgid <= 0 || task_tgid(p) == tgid)) {
 		error = check_kill_permission(sig, &info, p);
 		/*
 		 * The null signal is a permissions and process existence
Index: linux-2.6.15/kernel/stop_machine.c
===================================================================
--- linux-2.6.15.orig/kernel/stop_machine.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/stop_machine.c	2006-01-17 08:36:59.000000000 -0500
@@ -91,7 +91,7 @@
 
 	/* One high-prio thread per cpu.  We'll do this one. */
 	set_fs(KERNEL_DS);
-	sys_sched_setscheduler(current->pid, SCHED_FIFO,
+	sys_sched_setscheduler(task_pid(current), SCHED_FIFO,
 				(struct sched_param __user *)&param);
 	set_fs(old_fs);
 
Index: linux-2.6.15/kernel/sys.c
===================================================================
--- linux-2.6.15.orig/kernel/sys.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/sys.c	2006-01-17 08:36:59.000000000 -0500
@@ -268,7 +268,7 @@
 	switch (which) {
 		case PRIO_PROCESS:
 			if (!who)
-				who = current->pid;
+				who = task_pid(current);
 			p = find_task_by_pid(who);
 			if (p)
 				error = set_one_prio(p, niceval, error);
@@ -321,7 +321,7 @@
 	switch (which) {
 		case PRIO_PROCESS:
 			if (!who)
-				who = current->pid;
+				who = task_pid(current);
 			p = find_task_by_pid(who);
 			if (p) {
 				niceval = 20 - task_nice(p);
@@ -1087,7 +1087,7 @@
 	int err = -EINVAL;
 
 	if (!pid)
-		pid = current->pid;
+		pid = task_pid(current);
 	if (!pgid)
 		pgid = pid;
 	if (pgid < 0)
@@ -1217,12 +1217,12 @@
 	down(&tty_sem);
 	write_lock_irq(&tasklist_lock);
 
-	pid = find_pid(PIDTYPE_PGID, current->pid);
+	pid = find_pid(PIDTYPE_PGID, task_pid(current));
 	if (pid)
 		goto out;
 
 	current->signal->leader = 1;
-	__set_special_pids(current->pid, current->pid);
+	__set_special_pids(task_pid(current), task_pid(current));
 	current->signal->tty = NULL;
 	current->signal->tty_old_pgrp = 0;
 	err = process_group(current);
Index: linux-2.6.15/kernel/sysctl.c
===================================================================
--- linux-2.6.15.orig/kernel/sysctl.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/sysctl.c	2006-01-17 08:36:59.000000000 -0500
@@ -1764,7 +1764,7 @@
 		return -EPERM;
 	}
 
-	op = (current->pid == 1) ? OP_SET : OP_AND;
+	op = (task_pid(current) == 1) ? OP_SET : OP_AND;
 	return do_proc_dointvec(table,write,filp,buffer,lenp,ppos,
 				do_proc_dointvec_bset_conv,&op);
 }
Index: linux-2.6.15/kernel/timer.c
===================================================================
--- linux-2.6.15.orig/kernel/timer.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/kernel/timer.c	2006-01-17 08:36:59.000000000 -0500
@@ -937,11 +937,11 @@
  * the pid are identical unless CLONE_THREAD was specified on clone() in
  * which case the tgid is the same in all threads of the same group.
  *
- * This is SMP safe as current->tgid does not change.
+ * This is SMP safe as task_tgid(current) does not change.
  */
 asmlinkage long sys_getpid(void)
 {
-	return current->tgid;
+	return task_tgid(current);
 }
 
 /*
@@ -968,7 +968,7 @@
 
 	parent = me->group_leader->real_parent;
 	for (;;) {
-		pid = parent->tgid;
+		pid = task_tgid(parent);
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
 {
 		struct task_struct *old = parent;
@@ -1115,7 +1115,7 @@
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {
-	return current->pid;
+	return task_pid(current);
 }
 
 static long __sched nanosleep_restart(struct restart_block *restart)
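
As a user-space illustration of the sys_getpid()/sys_gettid() hunks above (not part of the patch): getpid() reports the thread-group id and gettid reports the per-thread id, so the two agree only in the thread-group leader. The helper name is made up for this sketch, and gettid is reached through syscall(2) because older C libraries ship no wrapper for it.

```c
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 when the calling thread is the
 * thread-group leader, i.e. when its tid equals the process's tgid.
 */
static int caller_is_group_leader(void)
{
	pid_t tgid = getpid();                  /* what sys_getpid() returns */
	pid_t tid = (pid_t)syscall(SYS_gettid); /* what sys_gettid() returns */

	assert(tgid > 0 && tid > 0);            /* both ids are positive */
	return tid == tgid;
}
```

Called from main() of a single-threaded program this returns 1; called from a thread created with CLONE_THREAD (for example through pthread_create()) it returns 0.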

--



* RFC [patch 07/34] PID Virtualization Change pid accesses: lib/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (5 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 06/34] PID Virtualization Change pid accesses: kernel/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 08/34] PID Virtualization Change pid accesses: mm/ Serge Hallyn
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B6-change-pid-tgid-references-lib --]
[-- Type: text/plain, Size: 3619 bytes --]

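Replace direct ->pid dereferences under lib/ with the task_pid()
accessor.  All four files touched here only use the value in debug
printk() output.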
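
For readers who do not have the patch that introduces the accessors at hand, here is a minimal user-space sketch of the pattern this conversion relies on. It assumes task_pid() and task_tgid() are thin wrappers around the two fields; the real definitions live elsewhere in this series and may differ. struct task_struct below is a two-field stand-in, not the kernel structure.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-in for the kernel's struct task_struct (assumption: only
 * the two id fields matter for this sketch). */
struct task_struct {
	pid_t pid;   /* per-thread id */
	pid_t tgid;  /* thread-group id */
};

/* Assumed shape of the accessor: a plain read of ->pid. */
static inline pid_t task_pid(const struct task_struct *tsk)
{
	assert(tsk != NULL);
	return tsk->pid;
}

/* Assumed shape of the accessor: a plain read of ->tgid. */
static inline pid_t task_tgid(const struct task_struct *tsk)
{
	assert(tsk != NULL);
	return tsk->tgid;
}
```

With every reader going through the wrappers, what they return can later be changed in one place instead of at each call site.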

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 rwsem-spinlock.c   |    2 +-
 rwsem.c            |    2 +-
 smp_processor_id.c |    2 +-
 spinlock_debug.c   |   12 ++++++------
 4 files changed, 9 insertions(+), 9 deletions(-)

Index: linux-2.6.15/lib/rwsem-spinlock.c
===================================================================
--- linux-2.6.15.orig/lib/rwsem-spinlock.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/lib/rwsem-spinlock.c	2006-01-17 08:37:00.000000000 -0500
@@ -22,7 +22,7 @@
 {
 	if (sem->debug)
 		printk("[%d] %s({%d,%d})\n",
-		       current->pid, str, sem->activity,
+		       task_pid(current), str, sem->activity,
 		       list_empty(&sem->wait_list) ? 0 : 1);
 }
 #endif
Index: linux-2.6.15/lib/rwsem.c
===================================================================
--- linux-2.6.15.orig/lib/rwsem.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/lib/rwsem.c	2006-01-17 08:37:00.000000000 -0500
@@ -23,7 +23,7 @@
 	printk("sem=%p\n", sem);
 	printk("(sem)=%08lx\n", sem->count);
 	if (sem->debug)
-		printk("[%d] %s({%08lx})\n", current->pid, str, sem->count);
+		printk("[%d] %s({%08lx})\n", task_pid(current), str, sem->count);
 }
 #endif
 
Index: linux-2.6.15/lib/smp_processor_id.c
===================================================================
--- linux-2.6.15.orig/lib/smp_processor_id.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/lib/smp_processor_id.c	2006-01-17 08:37:00.000000000 -0500
@@ -42,7 +42,7 @@
 	if (!printk_ratelimit())
 		goto out_enable;
 
-	printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] code: %s/%d\n", preempt_count(), current->comm, current->pid);
+	printk(KERN_ERR "BUG: using smp_processor_id() in preemptible [%08x] code: %s/%d\n", preempt_count(), current->comm, task_pid(current));
 	print_symbol("caller is %s\n", (long)__builtin_return_address(0));
 	dump_stack();
 
Index: linux-2.6.15/lib/spinlock_debug.c
===================================================================
--- linux-2.6.15.orig/lib/spinlock_debug.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/lib/spinlock_debug.c	2006-01-17 08:37:00.000000000 -0500
@@ -21,11 +21,11 @@
 			owner = lock->owner;
 		printk("BUG: spinlock %s on CPU#%d, %s/%d\n",
 			msg, raw_smp_processor_id(),
-			current->comm, current->pid);
+			current->comm, task_pid(current));
 		printk(" lock: %p, .magic: %08x, .owner: %s/%d, .owner_cpu: %d\n",
 			lock, lock->magic,
 			owner ? owner->comm : "<none>",
-			owner ? owner->pid : -1,
+			owner ? task_pid(owner) : -1,
 			lock->owner_cpu);
 		dump_stack();
 #ifdef CONFIG_SMP
@@ -80,7 +80,7 @@
 			print_once = 0;
 			printk("BUG: spinlock lockup on CPU#%d, %s/%d, %p\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				task_pid(current), lock);
 			dump_stack();
 		}
 	}
@@ -122,7 +122,7 @@
 	if (xchg(&print_once, 0)) {
 		printk("BUG: rwlock %s on CPU#%d, %s/%d, %p\n", msg,
 			raw_smp_processor_id(), current->comm,
-			current->pid, lock);
+			task_pid(current), lock);
 		dump_stack();
 #ifdef CONFIG_SMP
 		/*
@@ -151,7 +151,7 @@
 			print_once = 0;
 			printk("BUG: read-lock lockup on CPU#%d, %s/%d, %p\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				task_pid(current), lock);
 			dump_stack();
 		}
 	}
@@ -223,7 +223,7 @@
 			print_once = 0;
 			printk("BUG: write-lock lockup on CPU#%d, %s/%d, %p\n",
 				raw_smp_processor_id(), current->comm,
-				current->pid, lock);
+				task_pid(current), lock);
 			dump_stack();
 		}
 	}

--



* RFC [patch 08/34] PID Virtualization Change pid accesses: mm/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (6 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 07/34] PID Virtualization Change pid accesses: lib/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 09/34] PID Virtualization Change pid accesses: net/ Serge Hallyn
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B7-change-pid-tgid-references-mm --]
[-- Type: text/plain, Size: 3443 bytes --]

Replace direct ->pid and ->tgid dereferences under mm/ with the
task_pid() and task_tgid() accessors.
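
The two oom_kill.c checks converted below can be restated as small predicates. This is a user-space sketch only: the struct definitions are stand-ins, the predicate names are invented for illustration, and the accessors are assumed to be plain field reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Stand-ins for the kernel structures (assumption of this sketch). */
struct mm_struct { int placeholder; };

struct task_struct {
	pid_t pid;
	pid_t tgid;
	struct mm_struct *mm;
};

static inline pid_t task_pid(const struct task_struct *tsk) { return tsk->pid; }
static inline pid_t task_tgid(const struct task_struct *tsk) { return tsk->tgid; }

/* Hypothetical name: the "skip the init task" test, pid == 1. */
static bool is_init_task(const struct task_struct *p)
{
	assert(p != NULL);
	return task_pid(p) == 1;
}

/* Hypothetical name: q uses the victim's mm but belongs to a
 * different thread group, so it has to be killed separately. */
static bool shares_mm_outside_group(const struct task_struct *q,
				    const struct task_struct *p)
{
	assert(q != NULL && p != NULL);
	return q->mm == p->mm && task_tgid(q) != task_tgid(p);
}
```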

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 nommu.c    |   14 +++++++-------
 oom_kill.c |   10 +++++-----
 2 files changed, 12 insertions(+), 12 deletions(-)

Index: linux-2.6.15/mm/nommu.c
===================================================================
--- linux-2.6.15.orig/mm/nommu.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/mm/nommu.c	2006-01-17 08:37:00.000000000 -0500
@@ -264,7 +264,7 @@
 {
 	struct vm_list_struct *vml;
 
-	printk("Process blocks %d:", current->pid);
+	printk("Process blocks %d:", task_pid(current));
 
 	for (vml = &current->mm->context.vmlist; vml; vml = vml->next) {
 		printk(" %p: %p", vml, vml->vma);
@@ -380,7 +380,7 @@
 	if (flags & MAP_FIXED || addr) {
 		printk(KERN_DEBUG
 		       "%d: Can't do fixed-address/overlay mmap of RAM\n",
-		       current->pid);
+		       task_pid(current));
 		return -EINVAL;
 	}
 
@@ -626,7 +626,7 @@
 #ifdef WARN_ON_SLACK
 	if (len + WARN_ON_SLACK <= kobjsize(result))
 		printk("Allocation of %lu bytes from process %d has %lu bytes of slack\n",
-		       len, current->pid, kobjsize(result) - len);
+		       len, task_pid(current), kobjsize(result) - len);
 #endif
 
 	if (vma->vm_file) {
@@ -663,7 +663,7 @@
 
 enomem:
 	printk("Allocation of length %lu from process %d failed\n",
-	       len, current->pid);
+	       len, task_pid(current));
 	show_free_areas();
 	return -ENOMEM;
 }
@@ -855,13 +855,13 @@
 	up_write(&nommu_vma_sem);
 	kfree(vml);
 	printk("Allocation of vma for %lu byte allocation from process %d failed\n",
-	       len, current->pid);
+	       len, task_pid(current));
 	show_free_areas();
 	return -ENOMEM;
 
  error_getting_vml:
 	printk("Allocation of vml for %lu byte allocation from process %d failed\n",
-	       len, current->pid);
+	       len, task_pid(current));
 	show_free_areas();
 	return -ENOMEM;
 }
@@ -915,7 +915,7 @@
 			goto found;
 
 	printk("munmap of non-mmaped memory by process %d (%s): %p\n",
-	       current->pid, current->comm, (void *) addr);
+	       task_pid(current), current->comm, (void *) addr);
 	return -EINVAL;
 
  found:
Index: linux-2.6.15/mm/oom_kill.c
===================================================================
--- linux-2.6.15.orig/mm/oom_kill.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/mm/oom_kill.c	2006-01-17 08:37:00.000000000 -0500
@@ -125,7 +125,7 @@
 
 #ifdef DEBUG
 	printk(KERN_DEBUG "OOMkill: task %d (%s) got %d points\n",
-	p->pid, p->comm, points);
+	task_pid(p), p->comm, points);
 #endif
 	return points;
 }
@@ -149,7 +149,7 @@
 		int releasing;
 
 		/* skip the init task with pid == 1 */
-		if (p->pid == 1)
+		if (task_pid(p) == 1)
 			continue;
 		if (p->oomkilladj == OOM_DISABLE)
 			continue;
@@ -184,7 +184,7 @@
  */
 static void __oom_kill_task(task_t *p)
 {
-	if (p->pid == 1) {
+	if (task_pid(p) == 1) {
 		WARN_ON(1);
 		printk(KERN_WARNING "tried to kill init!\n");
 		return;
@@ -199,7 +199,7 @@
 	}
 	task_unlock(p);
 	printk(KERN_ERR "Out of Memory: Killed process %d (%s).\n",
-							p->pid, p->comm);
+							task_pid(p), p->comm);
 
 	/*
 	 * We give our sacrificial lamb high priority and access to
@@ -230,7 +230,7 @@
 	 * but are in a different thread group
 	 */
 	do_each_thread(g, q)
-		if (q->mm == mm && q->tgid != p->tgid)
+		if (q->mm == mm && task_tgid(q) != task_tgid(p))
 			__oom_kill_task(q);
 	while_each_thread(g, q);
 

--



* RFC [patch 09/34] PID Virtualization Change pid accesses: net/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (7 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 08/34] PID Virtualization Change pid accesses: mm/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 10/34] PID Virtualization Change pid accesses: security/ Serge Hallyn
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B8-change-pid-tgid-references-net --]
[-- Type: text/plain, Size: 11553 bytes --]

Replace direct ->pid and ->tgid dereferences under net/ with the
task_pid() and task_tgid() accessors.

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 core/pktgen.c          |    4 ++--
 core/rtnetlink.c       |    3 ++-
 core/scm.c             |    2 +-
 ipv4/devinet.c         |    2 +-
 ipv4/fib_semantics.c   |    2 +-
 ipv4/ipvs/ip_vs_sync.c |    6 +++---
 ipv4/tcp.c             |    2 +-
 ipv6/addrconf.c        |    6 +++---
 ipv6/ip6_flowlabel.c   |    2 +-
 ipv6/route.c           |    2 +-
 llc/af_llc.c           |    2 +-
 netlink/af_netlink.c   |    2 +-
 rxrpc/krxiod.c         |    2 +-
 rxrpc/krxsecd.c        |    2 +-
 rxrpc/krxtimod.c       |    2 +-
 sunrpc/sched.c         |    2 +-
 unix/af_unix.c         |    6 +++---
 17 files changed, 25 insertions(+), 24 deletions(-)

Index: linux-2.6.15/net/core/pktgen.c
===================================================================
--- linux-2.6.15.orig/net/core/pktgen.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/core/pktgen.c	2006-01-17 08:37:01.000000000 -0500
@@ -2699,9 +2699,9 @@
 	t->control &= ~(T_STOP);
 	t->control &= ~(T_REMDEV);
 
-        t->pid = current->pid;        
+        t->pid = task_pid(current);
 
-        PG_DEBUG(printk("pktgen: starting pktgen/%d:  pid=%d\n", cpu, current->pid));
+        PG_DEBUG(printk("pktgen: starting pktgen/%d:  pid=%d\n", cpu, task_pid(current)));
 
 	max_before_softirq = t->max_before_softirq;
         
Index: linux-2.6.15/net/core/rtnetlink.c
===================================================================
--- linux-2.6.15.orig/net/core/rtnetlink.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/core/rtnetlink.c	2006-01-17 08:37:01.000000000 -0500
@@ -455,7 +455,8 @@
 	if (!skb)
 		return;
 
-	if (rtnetlink_fill_ifinfo(skb, dev, type, current->pid, 0, change, 0) < 0) {
+	if (rtnetlink_fill_ifinfo(skb, dev, type, task_pid(current),
+				  0, change, 0) < 0) {
 		kfree_skb(skb);
 		return;
 	}
Index: linux-2.6.15/net/core/scm.c
===================================================================
--- linux-2.6.15.orig/net/core/scm.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/core/scm.c	2006-01-17 08:37:01.000000000 -0500
@@ -41,7 +41,7 @@
 
 static __inline__ int scm_check_creds(struct ucred *creds)
 {
-	if ((creds->pid == current->tgid || capable(CAP_SYS_ADMIN)) &&
+	if ((creds->pid == task_tgid(current) || capable(CAP_SYS_ADMIN)) &&
 	    ((creds->uid == current->uid || creds->uid == current->euid ||
 	      creds->uid == current->suid) || capable(CAP_SETUID)) &&
 	    ((creds->gid == current->gid || creds->gid == current->egid ||
Index: linux-2.6.15/net/ipv4/devinet.c
===================================================================
--- linux-2.6.15.orig/net/ipv4/devinet.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv4/devinet.c	2006-01-17 08:37:01.000000000 -0500
@@ -1133,7 +1133,7 @@
 
 	if (!skb)
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, ENOBUFS);
-	else if (inet_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) {
+	else if (inet_fill_ifaddr(skb, ifa, task_pid(current), 0, event, 0) < 0) {
 		kfree_skb(skb);
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, EINVAL);
 	} else {
Index: linux-2.6.15/net/ipv4/fib_semantics.c
===================================================================
--- linux-2.6.15.orig/net/ipv4/fib_semantics.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv4/fib_semantics.c	2006-01-17 08:37:01.000000000 -0500
@@ -1043,7 +1043,7 @@
 	}
 
 	nl->nlmsg_flags = NLM_F_REQUEST;
-	nl->nlmsg_pid = current->pid;
+	nl->nlmsg_pid = task_pid(current);
 	nl->nlmsg_seq = 0;
 	nl->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm));
 	if (cmd == SIOCDELRT) {
Index: linux-2.6.15/net/ipv4/ipvs/ip_vs_sync.c
===================================================================
--- linux-2.6.15.orig/net/ipv4/ipvs/ip_vs_sync.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv4/ipvs/ip_vs_sync.c	2006-01-17 08:37:01.000000000 -0500
@@ -786,7 +786,7 @@
 
 	add_wait_queue(&sync_wait, &wait);
 
-	set_sync_pid(state, current->pid);
+	set_sync_pid(state, task_pid(current));
 	complete((struct completion *)startup);
 
 	/* processing master/backup loop here */
@@ -841,7 +841,7 @@
 	    (state == IP_VS_STATE_BACKUP && sync_backup_pid))
 		return -EEXIST;
 
-	IP_VS_DBG(7, "%s: pid %d\n", __FUNCTION__, current->pid);
+	IP_VS_DBG(7, "%s: pid %d\n", __FUNCTION__, task_pid(current));
 	IP_VS_DBG(7, "Each ip_vs_sync_conn entry need %Zd bytes\n",
 		  sizeof(struct ip_vs_sync_conn));
 
@@ -876,7 +876,7 @@
 	    (state == IP_VS_STATE_BACKUP && !sync_backup_pid))
 		return -ESRCH;
 
-	IP_VS_DBG(7, "%s: pid %d\n", __FUNCTION__, current->pid);
+	IP_VS_DBG(7, "%s: pid %d\n", __FUNCTION__, task_pid(current));
 	IP_VS_INFO("stopping sync thread %d ...\n",
 		   (state == IP_VS_STATE_MASTER) ? sync_master_pid : sync_backup_pid);
 
Index: linux-2.6.15/net/ipv4/tcp.c
===================================================================
--- linux-2.6.15.orig/net/ipv4/tcp.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv4/tcp.c	2006-01-17 08:37:01.000000000 -0500
@@ -1299,7 +1299,7 @@
 		if ((flags & MSG_PEEK) && peek_seq != tp->copied_seq) {
 			if (net_ratelimit())
 				printk(KERN_DEBUG "TCP(%s:%d): Application bug, race in MSG_PEEK.\n",
-				       current->comm, current->pid);
+				       current->comm, task_pid(current));
 			peek_seq = tp->copied_seq;
 		}
 		continue;
Index: linux-2.6.15/net/ipv6/addrconf.c
===================================================================
--- linux-2.6.15.orig/net/ipv6/addrconf.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv6/addrconf.c	2006-01-17 08:37:01.000000000 -0500
@@ -3097,7 +3097,7 @@
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFADDR, ENOBUFS);
 		return;
 	}
-	if (inet6_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) {
+	if (inet6_fill_ifaddr(skb, ifa, task_pid(current), 0, event, 0) < 0) {
 		kfree_skb(skb);
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFADDR, EINVAL);
 		return;
@@ -3232,7 +3232,7 @@
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFINFO, ENOBUFS);
 		return;
 	}
-	if (inet6_fill_ifinfo(skb, idev, current->pid, 0, event, 0) < 0) {
+	if (inet6_fill_ifinfo(skb, idev, task_pid(current), 0, event, 0) < 0) {
 		kfree_skb(skb);
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFINFO, EINVAL);
 		return;
@@ -3292,7 +3292,7 @@
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV6_PREFIX, ENOBUFS);
 		return;
 	}
-	if (inet6_fill_prefix(skb, idev, pinfo, current->pid, 0, event, 0) < 0) {
+	if (inet6_fill_prefix(skb, idev, pinfo, task_pid(current), 0, event, 0) < 0) {
 		kfree_skb(skb);
 		netlink_set_err(rtnl, 0, RTNLGRP_IPV6_PREFIX, EINVAL);
 		return;
Index: linux-2.6.15/net/ipv6/ip6_flowlabel.c
===================================================================
--- linux-2.6.15.orig/net/ipv6/ip6_flowlabel.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv6/ip6_flowlabel.c	2006-01-17 08:37:01.000000000 -0500
@@ -338,7 +338,7 @@
 	case IPV6_FL_S_ANY:
 		break;
 	case IPV6_FL_S_PROCESS:
-		fl->owner = current->pid;
+		fl->owner = task_pid(current);
 		break;
 	case IPV6_FL_S_USER:
 		fl->owner = current->euid;
Index: linux-2.6.15/net/ipv6/route.c
===================================================================
--- linux-2.6.15.orig/net/ipv6/route.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/ipv6/route.c	2006-01-17 08:37:01.000000000 -0500
@@ -1843,7 +1843,7 @@
 {
 	struct sk_buff *skb;
 	int size = NLMSG_SPACE(sizeof(struct rtmsg)+256);
-	u32 pid = current->pid;
+	u32 pid = task_pid(current);
 	u32 seq = 0;
 
 	if (req)
Index: linux-2.6.15/net/netlink/af_netlink.c
===================================================================
--- linux-2.6.15.orig/net/netlink/af_netlink.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/netlink/af_netlink.c	2006-01-17 08:37:01.000000000 -0500
@@ -476,7 +476,7 @@
 	struct hlist_head *head;
 	struct sock *osk;
 	struct hlist_node *node;
-	s32 pid = current->tgid;
+	s32 pid = task_tgid(current);
 	int err;
 	static s32 rover = -4097;
 
Index: linux-2.6.15/net/rxrpc/krxiod.c
===================================================================
--- linux-2.6.15.orig/net/rxrpc/krxiod.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/rxrpc/krxiod.c	2006-01-17 08:37:01.000000000 -0500
@@ -40,7 +40,7 @@
 {
 	DECLARE_WAITQUEUE(krxiod,current);
 
-	printk("Started krxiod %d\n",current->pid);
+	printk("Started krxiod %d\n",task_pid(current));
 
 	daemonize("krxiod");
 
Index: linux-2.6.15/net/rxrpc/krxsecd.c
===================================================================
--- linux-2.6.15.orig/net/rxrpc/krxsecd.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/rxrpc/krxsecd.c	2006-01-17 08:37:01.000000000 -0500
@@ -53,7 +53,7 @@
 
 	int die;
 
-	printk("Started krxsecd %d\n", current->pid);
+	printk("Started krxsecd %d\n", task_pid(current));
 
 	daemonize("krxsecd");
 
Index: linux-2.6.15/net/rxrpc/krxtimod.c
===================================================================
--- linux-2.6.15.orig/net/rxrpc/krxtimod.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/rxrpc/krxtimod.c	2006-01-17 08:37:01.000000000 -0500
@@ -68,7 +68,7 @@
 
 	rxrpc_timer_t *timer;
 
-	printk("Started krxtimod %d\n", current->pid);
+	printk("Started krxtimod %d\n", task_pid(current));
 
 	daemonize("krxtimod");
 
Index: linux-2.6.15/net/sunrpc/sched.c
===================================================================
--- linux-2.6.15.orig/net/sunrpc/sched.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/sunrpc/sched.c	2006-01-17 08:37:01.000000000 -0500
@@ -792,7 +792,7 @@
 	spin_unlock(&rpc_sched_lock);
 
 	dprintk("RPC: %4d new task procpid %d\n", task->tk_pid,
-				current->pid);
+				task_pid(current));
 }
 
 static struct rpc_task *
Index: linux-2.6.15/net/unix/af_unix.c
===================================================================
--- linux-2.6.15.orig/net/unix/af_unix.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/unix/af_unix.c	2006-01-17 08:37:01.000000000 -0500
@@ -439,7 +439,7 @@
 	sk->sk_max_ack_backlog	= backlog;
 	sk->sk_state		= TCP_LISTEN;
 	/* set credentials so connect can copy them */
-	sk->sk_peercred.pid	= current->tgid;
+	sk->sk_peercred.pid	= task_tgid(current);
 	sk->sk_peercred.uid	= current->euid;
 	sk->sk_peercred.gid	= current->egid;
 	err = 0;
@@ -1043,7 +1043,7 @@
 	unix_peer(newsk)	= sk;
 	newsk->sk_state		= TCP_ESTABLISHED;
 	newsk->sk_type		= sk->sk_type;
-	newsk->sk_peercred.pid	= current->tgid;
+	newsk->sk_peercred.pid	= task_tgid(current);
 	newsk->sk_peercred.uid	= current->euid;
 	newsk->sk_peercred.gid	= current->egid;
 	newu = unix_sk(newsk);
@@ -1105,7 +1105,7 @@
 	sock_hold(skb);
 	unix_peer(ska)=skb;
 	unix_peer(skb)=ska;
-	ska->sk_peercred.pid = skb->sk_peercred.pid = current->tgid;
+	ska->sk_peercred.pid = skb->sk_peercred.pid = task_tgid(current);
 	ska->sk_peercred.uid = skb->sk_peercred.uid = current->euid;
 	ska->sk_peercred.gid = skb->sk_peercred.gid = current->egid;
 
Index: linux-2.6.15/net/llc/af_llc.c
===================================================================
--- linux-2.6.15.orig/net/llc/af_llc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/net/llc/af_llc.c	2006-01-17 08:37:01.000000000 -0500
@@ -760,7 +760,7 @@
 			if (net_ratelimit())
 				printk(KERN_DEBUG "LLC(%s:%d): Application "
 						  "bug, race in MSG_PEEK.\n",
-				       current->comm, current->pid);
+				       current->comm, task_pid(current));
 			peek_seq = llc->copied_seq;
 		}
 		continue;

--



* RFC [patch 10/34] PID Virtualization Change pid accesses: security/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (8 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 09/34] PID Virtualization Change pid accesses: net/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 11/34] PID Virtualization Change pid accesses: sound/ Serge Hallyn
                   ` (24 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: B9-change-pid-tgid-references-security --]
[-- Type: text/plain, Size: 5095 bytes --]

Replace direct ->pid and ->tgid dereferences in the security modules
with the task_pid() and task_tgid() accessors.
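
The process_keys.c hunks below build keyring names from these ids: "_tid.%u" from the thread's pid, and "_pid.%u" and "_ses.%u" from the tgid. A user-space sketch of that naming follows. The enum, the function name and the stand-in struct are invented for illustration, and the accessors are assumed to be plain field reads.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Stand-in for the kernel's struct task_struct (assumption). */
struct task_struct {
	pid_t pid;
	pid_t tgid;
};

static inline pid_t task_pid(const struct task_struct *tsk) { return tsk->pid; }
static inline pid_t task_tgid(const struct task_struct *tsk) { return tsk->tgid; }

/* Hypothetical selector for the three keyring kinds named in the patch. */
enum keyring_kind { KEYRING_THREAD, KEYRING_PROCESS, KEYRING_SESSION };

/* Formats the keyring name into buf; returns snprintf()'s result. */
static int keyring_name(char *buf, size_t len, enum keyring_kind kind,
			const struct task_struct *tsk)
{
	assert(buf != NULL && len > 0);

	switch (kind) {
	case KEYRING_THREAD:
		return snprintf(buf, len, "_tid.%u", (unsigned int)task_pid(tsk));
	case KEYRING_PROCESS:
		return snprintf(buf, len, "_pid.%u", (unsigned int)task_tgid(tsk));
	case KEYRING_SESSION:
		return snprintf(buf, len, "_ses.%u", (unsigned int)task_tgid(tsk));
	}
	return -1;
}
```

A 20-byte buffer, as used in process_keys.c, is large enough for any 32-bit id with these prefixes.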

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 commoncap.c             |    2 +-
 keys/process_keys.c     |    6 +++---
 keys/request_key_auth.c |    2 +-
 seclvl.c                |   16 ++++++++--------
 selinux/avc.c           |    4 ++--
 5 files changed, 15 insertions(+), 15 deletions(-)

Index: linux-2.6.15/security/commoncap.c
===================================================================
--- linux-2.6.15.orig/security/commoncap.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/security/commoncap.c	2006-01-17 08:37:01.000000000 -0500
@@ -169,7 +169,7 @@
 	/* For init, we want to retain the capabilities set
 	 * in the init_task struct. Thus we skip the usual
 	 * capability rules */
-	if (current->pid != 1) {
+	if (task_pid(current) != 1) {
 		current->cap_permitted = new_permitted;
 		current->cap_effective =
 		    cap_intersect (new_permitted, bprm->cap_effective);
Index: linux-2.6.15/security/keys/process_keys.c
===================================================================
--- linux-2.6.15.orig/security/keys/process_keys.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/security/keys/process_keys.c	2006-01-17 08:37:01.000000000 -0500
@@ -140,7 +140,7 @@
 	char buf[20];
 	int ret;
 
-	sprintf(buf, "_tid.%u", tsk->pid);
+	sprintf(buf, "_tid.%u", task_pid(tsk));
 
 	keyring = keyring_alloc(buf, tsk->uid, tsk->gid, 1, NULL);
 	if (IS_ERR(keyring)) {
@@ -173,7 +173,7 @@
 	int ret;
 
 	if (!tsk->signal->process_keyring) {
-		sprintf(buf, "_pid.%u", tsk->tgid);
+		sprintf(buf, "_pid.%u", task_tgid(tsk));
 
 		keyring = keyring_alloc(buf, tsk->uid, tsk->gid, 1, NULL);
 		if (IS_ERR(keyring)) {
@@ -213,7 +213,7 @@
 
 	/* create an empty session keyring */
 	if (!keyring) {
-		sprintf(buf, "_ses.%u", tsk->tgid);
+		sprintf(buf, "_ses.%u", task_tgid(tsk));
 
 		keyring = keyring_alloc(buf, tsk->uid, tsk->gid, 1, NULL);
 		if (IS_ERR(keyring)) {
Index: linux-2.6.15/security/keys/request_key_auth.c
===================================================================
--- linux-2.6.15.orig/security/keys/request_key_auth.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/security/keys/request_key_auth.c	2006-01-17 08:37:01.000000000 -0500
@@ -60,7 +60,7 @@
 		else {
 			/* it isn't - use this process as the context */
 			rka->context = current;
-			rka->pid = current->pid;
+			rka->pid = task_pid(current);
 		}
 
 		rka->target_key = key_get((struct key *) data);
Index: linux-2.6.15/security/seclvl.c
===================================================================
--- linux-2.6.15.orig/security/seclvl.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/security/seclvl.c	2006-01-17 08:37:01.000000000 -0500
@@ -296,7 +296,7 @@
 static int seclvl_ptrace(struct task_struct *parent, struct task_struct *child)
 {
 	if (seclvl >= 0) {
-		if (child->pid == 1) {
+		if (task_pid(child) == 1) {
 			seclvl_printk(1, KERN_WARNING, "Attempt to ptrace "
 				      "the init process dissallowed in "
 				      "secure level %d\n", seclvl);
@@ -313,7 +313,7 @@
 static int seclvl_capable(struct task_struct *tsk, int cap)
 {
 	/* init can do anything it wants */
-	if (tsk->pid == 1)
+	if (task_pid(tsk) == 1)
 		return 0;
 
 	switch (seclvl) {
@@ -375,10 +375,10 @@
 		    (tv->tv_sec == now.tv_sec && tv->tv_nsec < now.tv_nsec)) {
 			seclvl_printk(1, KERN_WARNING, "Attempt to decrement "
 				      "time in secure level %d denied: "
-				      "current->pid = [%d], "
-				      "current->group_leader->pid = [%d]\n",
-				      seclvl, current->pid,
-				      current->group_leader->pid);
+				      "current pid = [%d], "
+				      "current->group_leader pid = [%d]\n",
+				      seclvl, task_pid(current),
+				      task_pid(current->group_leader));
 			return -EPERM;
 		}		/* if attempt to decrement time */
 	}			/* if seclvl > 1 */
@@ -424,7 +424,7 @@
 static int
 seclvl_inode_permission(struct inode *inode, int mask, struct nameidata *nd)
 {
-	if (current->pid != 1 && S_ISBLK(inode->i_mode) && (mask & MAY_WRITE)) {
+	if (task_pid(current) != 1 && S_ISBLK(inode->i_mode) && (mask & MAY_WRITE)) {
 		switch (seclvl) {
 		case 2:
 			seclvl_printk(1, KERN_WARNING, "Write to block device "
@@ -479,7 +479,7 @@
  */
 static int seclvl_umount(struct vfsmount *mnt, int flags)
 {
-	if (current->pid == 1)
+	if (task_pid(current) == 1)
 		return 0;
 	if (seclvl == 2) {
 		seclvl_printk(1, KERN_WARNING, "Attempt to unmount in secure "
Index: linux-2.6.15/security/selinux/avc.c
===================================================================
--- linux-2.6.15.orig/security/selinux/avc.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/security/selinux/avc.c	2006-01-17 08:37:01.000000000 -0500
@@ -558,8 +558,8 @@
 	audit_log_format(ab, " for ");
 	if (a && a->tsk)
 		tsk = a->tsk;
-	if (tsk && tsk->pid) {
-		audit_log_format(ab, " pid=%d comm=", tsk->pid);
+	if (tsk && task_pid(tsk)) {
+		audit_log_format(ab, " pid=%d comm=", task_pid(tsk));
 		audit_log_untrustedstring(ab, tsk->comm);
 	}
 	if (a) {

--



* RFC [patch 11/34] PID Virtualization Change pid accesses: sound/
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (9 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 10/34] PID Virtualization Change pid accesses: security/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 12/34] PID Virtualization Change pid accesses: ia64 and mips Serge Hallyn
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: BA-change-pid-tgid-references-sound --]
[-- Type: text/plain, Size: 6786 bytes --]

Convert direct pid accesses in the sound drivers (current->pid) to the
task_pid() accessor.
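
For readers following the conversion, here is a minimal userspace sketch of
the pattern. It is not the kernel definition: the real accessors are
introduced by patch 13/34 of this series, and the struct below is a
hypothetical stand-in that models only the pid and tgid fields. The helper
timer_owner_name() is likewise an illustrative name, loosely modelled on the
sprintf() call changed in sound/core/timer.c.

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical stand-in for the kernel's struct task_struct.
 * Only the two fields touched by this conversion are modelled.
 */
struct task_struct {
	pid_t pid;
	pid_t tgid;
};

/*
 * Assumed shape of the accessors: plain wrappers around the fields.
 * Routing every read through one function gives a single place where
 * a later patch could change how the value is obtained.
 */
static inline pid_t task_pid(const struct task_struct *tsk)
{
	return tsk->pid;
}

static inline pid_t task_tgid(const struct task_struct *tsk)
{
	return tsk->tgid;
}

/*
 * Illustrative caller: format an owner string from a task, reading
 * the pid through the accessor instead of tsk->pid.
 */
static int timer_owner_name(char *buf, size_t len,
			    const struct task_struct *tsk)
{
	return snprintf(buf, len, "application %i", (int)task_pid(tsk));
}
```

With wrappers of this shape the converted call sites behave exactly as
before, which is why the conversion can be applied mechanically.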

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 core/control.c          |    4 ++--
 core/pcm.c              |    2 +-
 core/rawmidi.c          |    2 +-
 core/timer.c            |    4 ++--
 oss/forte.c             |    2 +-
 pci/korg1212/korg1212.c |    4 ++--
 pci/rme9652/hdsp.c      |    4 ++--
 pci/rme9652/hdspm.c     |    4 ++--
 pci/rme9652/rme9652.c   |    4 ++--
 9 files changed, 15 insertions(+), 15 deletions(-)

Index: linux-2.6.15/sound/core/control.c
===================================================================
--- linux-2.6.15.orig/sound/core/control.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/core/control.c	2006-01-17 08:37:02.000000000 -0500
@@ -78,7 +78,7 @@
 	init_waitqueue_head(&ctl->change_sleep);
 	spin_lock_init(&ctl->read_lock);
 	ctl->card = card;
-	ctl->pid = current->pid;
+	ctl->pid = task_pid(current);
 	file->private_data = ctl;
 	write_lock_irqsave(&card->ctl_files_rwlock, flags);
 	list_add_tail(&ctl->list, &card->ctl_files);
@@ -781,7 +781,7 @@
 			result = -EBUSY;
 		else {
 			vd->owner = file;
-			vd->owner_pid = current->pid;
+			vd->owner_pid = task_pid(current);
 			result = 0;
 		}
 	}
Index: linux-2.6.15/sound/core/pcm.c
===================================================================
--- linux-2.6.15.orig/sound/core/pcm.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/core/pcm.c	2006-01-17 08:37:02.000000000 -0500
@@ -754,7 +754,7 @@
 	down_read(&card->controls_rwsem);
 	list_for_each(list, &card->ctl_files) {
 		kctl = snd_ctl_file(list);
-		if (kctl->pid == current->pid) {
+		if (kctl->pid == task_pid(current)) {
 			prefer_subdevice = kctl->prefer_pcm_subdevice;
 			break;
 		}
Index: linux-2.6.15/sound/core/rawmidi.c
===================================================================
--- linux-2.6.15.orig/sound/core/rawmidi.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/core/rawmidi.c	2006-01-17 08:37:02.000000000 -0500
@@ -423,7 +423,7 @@
 		down_read(&card->controls_rwsem);
 		list_for_each(list, &card->ctl_files) {
 			kctl = snd_ctl_file(list);
-			if (kctl->pid == current->pid) {
+			if (kctl->pid == task_pid(current)) {
 				subdevice = kctl->prefer_rawmidi_subdevice;
 				break;
 			}
Index: linux-2.6.15/sound/core/timer.c
===================================================================
--- linux-2.6.15.orig/sound/core/timer.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/core/timer.c	2006-01-17 08:37:02.000000000 -0500
@@ -1512,10 +1512,10 @@
 		err = -EFAULT;
 		goto __err;
 	}
-	sprintf(str, "application %i", current->pid);
+	sprintf(str, "application %i", task_pid(current));
 	if (tselect.id.dev_class != SNDRV_TIMER_CLASS_SLAVE)
 		tselect.id.dev_sclass = SNDRV_TIMER_SCLASS_APPLICATION;
-	err = snd_timer_open(&tu->timeri, str, &tselect.id, current->pid);
+	err = snd_timer_open(&tu->timeri, str, &tselect.id, task_pid(current));
 	if (err < 0)
 		goto __err;
 
Index: linux-2.6.15/sound/oss/forte.c
===================================================================
--- linux-2.6.15.orig/sound/oss/forte.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/oss/forte.c	2006-01-17 08:37:02.000000000 -0500
@@ -1256,7 +1256,7 @@
 
 	file->private_data = forte;
 
-	DPRINTK ("%s: dsp opened by %d\n", __FUNCTION__, current->pid);
+	DPRINTK ("%s: dsp opened by %d\n", __FUNCTION__, task_pid(current));
 
 	if (file->f_mode & FMODE_WRITE)
 		forte_channel_init (forte, &forte->play);
Index: linux-2.6.15/sound/pci/korg1212/korg1212.c
===================================================================
--- linux-2.6.15.orig/sound/pci/korg1212/korg1212.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/pci/korg1212/korg1212.c	2006-01-17 08:37:02.000000000 -0500
@@ -1443,7 +1443,7 @@
         spin_lock_irqsave(&korg1212->lock, flags);
 
         korg1212->playback_substream = substream;
-	korg1212->playback_pid = current->pid;
+	korg1212->playback_pid = task_pid(current);
         korg1212->periodsize = K1212_PERIODS;
 	korg1212->channels = K1212_CHANNELS;
 	korg1212->errorcnt = 0;
@@ -1475,7 +1475,7 @@
         spin_lock_irqsave(&korg1212->lock, flags);
 
         korg1212->capture_substream = substream;
-	korg1212->capture_pid = current->pid;
+	korg1212->capture_pid = task_pid(current);
         korg1212->periodsize = K1212_PERIODS;
 	korg1212->channels = K1212_CHANNELS;
 
Index: linux-2.6.15/sound/pci/rme9652/hdsp.c
===================================================================
--- linux-2.6.15.orig/sound/pci/rme9652/hdsp.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/pci/rme9652/hdsp.c	2006-01-17 08:37:02.000000000 -0500
@@ -4188,7 +4188,7 @@
 	runtime->dma_area = hdsp->playback_buffer;
 	runtime->dma_bytes = HDSP_DMA_AREA_BYTES;
 
-	hdsp->playback_pid = current->pid;
+	hdsp->playback_pid = task_pid(current);
 	hdsp->playback_substream = substream;
 
 	spin_unlock_irq(&hdsp->lock);
@@ -4261,7 +4261,7 @@
 	runtime->dma_area = hdsp->capture_buffer;
 	runtime->dma_bytes = HDSP_DMA_AREA_BYTES;
 
-	hdsp->capture_pid = current->pid;
+	hdsp->capture_pid = task_pid(current);
 	hdsp->capture_substream = substream;
 
 	spin_unlock_irq(&hdsp->lock);
Index: linux-2.6.15/sound/pci/rme9652/hdspm.c
===================================================================
--- linux-2.6.15.orig/sound/pci/rme9652/hdspm.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/pci/rme9652/hdspm.c	2006-01-17 08:37:02.000000000 -0500
@@ -3102,7 +3102,7 @@
 	if (hdspm->capture_substream == NULL)
 		hdspm_stop_audio(hdspm);
 
-	hdspm->playback_pid = current->pid;
+	hdspm->playback_pid = task_pid(current);
 	hdspm->playback_substream = substream;
 
 	spin_unlock_irq(&hdspm->lock);
@@ -3151,7 +3151,7 @@
 	if (hdspm->playback_substream == NULL)
 		hdspm_stop_audio(hdspm);
 
-	hdspm->capture_pid = current->pid;
+	hdspm->capture_pid = task_pid(current);
 	hdspm->capture_substream = substream;
 
 	spin_unlock_irq(&hdspm->lock);
Index: linux-2.6.15/sound/pci/rme9652/rme9652.c
===================================================================
--- linux-2.6.15.orig/sound/pci/rme9652/rme9652.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/sound/pci/rme9652/rme9652.c	2006-01-17 08:37:02.000000000 -0500
@@ -2321,7 +2321,7 @@
 		rme9652_set_thru(rme9652, -1, 0);
 	}
 
-	rme9652->playback_pid = current->pid;
+	rme9652->playback_pid = task_pid(current);
 	rme9652->playback_substream = substream;
 
 	spin_unlock_irq(&rme9652->lock);
@@ -2381,7 +2381,7 @@
 		rme9652_set_thru(rme9652, -1, 0);
 	}
 
-	rme9652->capture_pid = current->pid;
+	rme9652->capture_pid = task_pid(current);
 	rme9652->capture_substream = substream;
 
 	spin_unlock_irq(&rme9652->lock);

--



* RFC [patch 12/34] PID Virtualization Change pid accesses: ia64 and mips
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (10 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 11/34] PID Virtualization Change pid accesses: sound/ Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 13/34] PID Virtualization Define new task_pid api Serge Hallyn
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: BB-ia64-and-mips --]
[-- Type: text/plain, Size: 63829 bytes --]

Convert direct pid accesses (->pid) in the ia64 and mips architecture
code to accessor functions such as task_pid().

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 ia64/ia32/ia32_signal.c            |    4 
 ia64/ia32/sys_ia32.c               |    6 -
 ia64/kernel/asm-offsets.c          |    4 
 ia64/kernel/mca.c                  |   12 +-
 ia64/kernel/mca_drv.c              |    2 
 ia64/kernel/perfmon.c              |  166 ++++++++++++++++++-------------------
 ia64/kernel/perfmon_default_smpl.c |   16 +--
 ia64/kernel/process.c              |    2 
 ia64/kernel/signal.c               |    8 -
 ia64/kernel/traps.c                |    6 -
 ia64/kernel/unaligned.c            |    2 
 ia64/mm/fault.c                    |    2 
 ia64/sn/kernel/xpc_main.c          |    2 
 mips/kernel/irixelf.c              |    8 -
 mips/kernel/irixioctl.c            |    4 
 mips/kernel/irixsig.c              |   20 ++--
 mips/kernel/process.c              |    2 
 mips/kernel/signal.c               |    4 
 mips/kernel/signal32.c             |    4 
 mips/kernel/signal_n32.c           |    2 
 mips/kernel/sysirix.c              |  136 +++++++++++++++---------------
 mips/kernel/time.c                 |    2 
 mips/kernel/traps.c                |    2 
 mips/mm/fault.c                    |    4 
 24 files changed, 210 insertions(+), 210 deletions(-)

Index: linux-2.6.15/arch/ia64/ia32/ia32_signal.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/ia32/ia32_signal.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/ia32/ia32_signal.c	2006-01-17 08:37:02.000000000 -0500
@@ -868,7 +868,7 @@
 
 #if 0
 	printk("SIG deliver (%s:%d): sig=%d sp=%p pc=%lx ra=%x\n",
-               current->comm, current->pid, sig, (void *) frame, regs->cr_iip, frame->pretcode);
+               current->comm, task_pid(current), sig, (void *) frame, regs->cr_iip, frame->pretcode);
 #endif
 
 	return 1;
@@ -946,7 +946,7 @@
 
 #if 0
 	printk("SIG deliver (%s:%d): sp=%p pc=%lx ra=%x\n",
-               current->comm, current->pid, (void *) frame, regs->cr_iip, frame->pretcode);
+               current->comm, task_pid(current), (void *) frame, regs->cr_iip, frame->pretcode);
 #endif
 
 	return 1;
Index: linux-2.6.15/arch/ia64/ia32/sys_ia32.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/ia32/sys_ia32.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/ia32/sys_ia32.c	2006-01-17 08:37:02.000000000 -0500
@@ -769,7 +769,7 @@
 			if (flags & MAP_SHARED)
 				printk(KERN_INFO
 				       "%s(%d): emulate_mmap() can't share head (addr=0x%lx)\n",
-				       current->comm, current->pid, start);
+				       current->comm, task_pid(current), start);
 			ret = mmap_subpage(file, start, min(PAGE_ALIGN(start), end), prot, flags,
 					   off);
 			if (IS_ERR((void *) ret))
@@ -782,7 +782,7 @@
 			if (flags & MAP_SHARED)
 				printk(KERN_INFO
 				       "%s(%d): emulate_mmap() can't share tail (end=0x%lx)\n",
-				       current->comm, current->pid, end);
+				       current->comm, task_pid(current), end);
 			ret = mmap_subpage(file, max(start, PAGE_START(end)), end, prot, flags,
 					   (off + len) - offset_in_page(end));
 			if (IS_ERR((void *) ret))
@@ -812,7 +812,7 @@
 
 	if ((flags & MAP_SHARED) && !is_congruent)
 		printk(KERN_INFO "%s(%d): emulate_mmap() can't share contents of incongruent mmap "
-		       "(addr=0x%lx,off=0x%llx)\n", current->comm, current->pid, start, off);
+		       "(addr=0x%lx,off=0x%llx)\n", current->comm, task_pid(current), start, off);
 
 	DBG("mmap_body: mapping [0x%lx-0x%lx) %s with poff 0x%llx\n", pstart, pend,
 	    is_congruent ? "congruent" : "not congruent", poff);
Index: linux-2.6.15/arch/ia64/kernel/mca.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/mca.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/mca.c	2006-01-17 08:37:02.000000000 -0500
@@ -755,9 +755,9 @@
 	 * (swapper or nested MCA/INIT) then use the start of the previous comm
 	 * field suffixed with its cpu.
 	 */
-	if (previous_current->pid)
+	if (task_pid(previous_current))
 		snprintf(comm, sizeof(comm), "%s %d",
-			current->comm, previous_current->pid);
+			current->comm, task_pid(previous_current));
 	else {
 		int l;
 		if ((p = strchr(previous_current->comm, ' ')))
@@ -1233,16 +1233,16 @@
 		s = (struct ia64_sal_os_state *)((char *)t + MCA_SOS_OFFSET);
 		g = s->prev_task;
 		if (g) {
-			if (g->pid)
-				printk(" %d", g->pid);
+			if (task_pid(g))
+				printk(" %d", task_pid(g));
 			else
-				printk(" %d (cpu %d task 0x%p)", g->pid, task_cpu(g), g);
+				printk(" %d (cpu %d task 0x%p)", task_pid(g), task_cpu(g), g);
 		}
 	}
 	printk("\n\n");
 	if (read_trylock(&tasklist_lock)) {
 		do_each_thread (g, t) {
-			printk("\nBacktrace of pid %d (%s)\n", t->pid, t->comm);
+			printk("\nBacktrace of pid %d (%s)\n", task_pid(t), t->comm);
 			show_stack(t, NULL);
 		} while_each_thread (g, t);
 		read_unlock(&tasklist_lock);
Index: linux-2.6.15/arch/ia64/kernel/mca_drv.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/mca_drv.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/mca_drv.c	2006-01-17 08:37:02.000000000 -0500
@@ -124,7 +124,7 @@
 mca_handler_bh(unsigned long paddr)
 {
 	printk(KERN_DEBUG "OS_MCA: process [pid: %d](%s) encounters MCA.\n",
-		current->pid, current->comm);
+		task_pid(current), current->comm);
 
 	spin_lock(&mca_bh_lock);
 	switch (mca_page_isolate(paddr)) {
Index: linux-2.6.15/arch/ia64/kernel/perfmon.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/perfmon.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/perfmon.c	2006-01-17 08:37:02.000000000 -0500
@@ -154,14 +154,14 @@
  */
 #define PROTECT_CTX(c, f) \
 	do {  \
-		DPRINT(("spinlock_irq_save ctx %p by [%d]\n", c, current->pid)); \
+		DPRINT(("spinlock_irq_save ctx %p by [%d]\n", c, task_pid(current))); \
 		spin_lock_irqsave(&(c)->ctx_lock, f); \
-		DPRINT(("spinlocked ctx %p  by [%d]\n", c, current->pid)); \
+		DPRINT(("spinlocked ctx %p  by [%d]\n", c, task_pid(current))); \
 	} while(0)
 
 #define UNPROTECT_CTX(c, f) \
 	do { \
-		DPRINT(("spinlock_irq_restore ctx %p by [%d]\n", c, current->pid)); \
+		DPRINT(("spinlock_irq_restore ctx %p by [%d]\n", c, task_pid(current))); \
 		spin_unlock_irqrestore(&(c)->ctx_lock, f); \
 	} while(0)
 
@@ -223,12 +223,12 @@
 #ifdef PFM_DEBUGGING
 #define DPRINT(a) \
 	do { \
-		if (unlikely(pfm_sysctl.debug >0)) { printk("%s.%d: CPU%d [%d] ", __FUNCTION__, __LINE__, smp_processor_id(), current->pid); printk a; } \
+		if (unlikely(pfm_sysctl.debug >0)) { printk("%s.%d: CPU%d [%d] ", __FUNCTION__, __LINE__, smp_processor_id(), task_pid(current)); printk a; } \
 	} while (0)
 
 #define DPRINT_ovfl(a) \
 	do { \
-		if (unlikely(pfm_sysctl.debug > 0 && pfm_sysctl.debug_ovfl >0)) { printk("%s.%d: CPU%d [%d] ", __FUNCTION__, __LINE__, smp_processor_id(), current->pid); printk a; } \
+		if (unlikely(pfm_sysctl.debug > 0 && pfm_sysctl.debug_ovfl >0)) { printk("%s.%d: CPU%d [%d] ", __FUNCTION__, __LINE__, smp_processor_id(), task_pid(current)); printk a; } \
 	} while (0)
 #endif
 
@@ -868,7 +868,7 @@
 	unsigned long mask, val, ovfl_mask;
 	int i;
 
-	DPRINT_ovfl(("masking monitoring for [%d]\n", task->pid));
+	DPRINT_ovfl(("masking monitoring for [%d]\n", task_pid(task)));
 
 	ovfl_mask = pmu_conf->ovfl_val;
 	/*
@@ -948,12 +948,12 @@
 	ovfl_mask = pmu_conf->ovfl_val;
 
 	if (task != current) {
-		printk(KERN_ERR "perfmon.%d: invalid task[%d] current[%d]\n", __LINE__, task->pid, current->pid);
+		printk(KERN_ERR "perfmon.%d: invalid task[%d] current[%d]\n", __LINE__, task_pid(task), task_pid(current));
 		return;
 	}
 	if (ctx->ctx_state != PFM_CTX_MASKED) {
 		printk(KERN_ERR "perfmon.%d: task[%d] current[%d] invalid state=%d\n", __LINE__,
-			task->pid, current->pid, ctx->ctx_state);
+			task_pid(task), task_pid(current), ctx->ctx_state);
 		return;
 	}
 	psr = pfm_get_psr();
@@ -1007,7 +1007,7 @@
 		if ((mask & 0x1) == 0UL) continue;
 		th->pmcs[i] = ctx->ctx_pmcs[i];
 		ia64_set_pmc(i, th->pmcs[i]);
-		DPRINT(("[%d] pmc[%d]=0x%lx\n", task->pid, i, th->pmcs[i]));
+		DPRINT(("[%d] pmc[%d]=0x%lx\n", task_pid(task), i, th->pmcs[i]));
 	}
 	ia64_srlz_d();
 
@@ -1328,7 +1328,7 @@
 
 error_conflict:
 	DPRINT(("system wide not possible, conflicting session [%d] on CPU%d\n",
-  		pfm_sessions.pfs_sys_session[cpu]->pid,
+  		task_pid(pfm_sessions.pfs_sys_session[cpu]),
 		cpu));
 abort:
 	UNLOCK_PFS(flags);
@@ -1400,7 +1400,7 @@
 
 	/* sanity checks */
 	if (task->mm == NULL || size == 0UL || vaddr == NULL) {
-		printk(KERN_ERR "perfmon: pfm_remove_smpl_mapping [%d] invalid context mm=%p\n", task->pid, task->mm);
+		printk(KERN_ERR "perfmon: pfm_remove_smpl_mapping [%d] invalid context mm=%p\n", task_pid(task), task->mm);
 		return -EINVAL;
 	}
 
@@ -1417,7 +1417,7 @@
 
 	up_write(&task->mm->mmap_sem);
 	if (r !=0) {
-		printk(KERN_ERR "perfmon: [%d] unable to unmap sampling buffer @%p size=%lu\n", task->pid, vaddr, size);
+		printk(KERN_ERR "perfmon: [%d] unable to unmap sampling buffer @%p size=%lu\n", task_pid(task), vaddr, size);
 	}
 
 	DPRINT(("do_unmap(%p, %lu)=%d\n", vaddr, size, r));
@@ -1459,7 +1459,7 @@
 	return 0;
 
 invalid_free:
-	printk(KERN_ERR "perfmon: pfm_free_smpl_buffer [%d] no buffer\n", current->pid);
+	printk(KERN_ERR "perfmon: pfm_free_smpl_buffer [%d] no buffer\n", task_pid(current));
 	return -EINVAL;
 }
 #endif
@@ -1512,13 +1512,13 @@
 	unsigned long flags;
   	DECLARE_WAITQUEUE(wait, current);
 	if (PFM_IS_FILE(filp) == 0) {
-		printk(KERN_ERR "perfmon: pfm_poll: bad magic [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_poll: bad magic [%d]\n", task_pid(current));
 		return -EINVAL;
 	}
 
 	ctx = (pfm_context_t *)filp->private_data;
 	if (ctx == NULL) {
-		printk(KERN_ERR "perfmon: pfm_read: NULL ctx [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_read: NULL ctx [%d]\n", task_pid(current));
 		return -EINVAL;
 	}
 
@@ -1572,7 +1572,7 @@
 
 		PROTECT_CTX(ctx, flags);
 	}
-	DPRINT(("[%d] back to running ret=%ld\n", current->pid, ret));
+	DPRINT(("[%d] back to running ret=%ld\n", task_pid(current), ret));
   	set_current_state(TASK_RUNNING);
 	remove_wait_queue(&ctx->ctx_msgq_wait, &wait);
 
@@ -1581,7 +1581,7 @@
 	ret = -EINVAL;
 	msg = pfm_get_next_msg(ctx);
 	if (msg == NULL) {
-		printk(KERN_ERR "perfmon: pfm_read no msg for ctx=%p [%d]\n", ctx, current->pid);
+		printk(KERN_ERR "perfmon: pfm_read no msg for ctx=%p [%d]\n", ctx, task_pid(current));
 		goto abort_locked;
 	}
 
@@ -1612,13 +1612,13 @@
 	unsigned int mask = 0;
 
 	if (PFM_IS_FILE(filp) == 0) {
-		printk(KERN_ERR "perfmon: pfm_poll: bad magic [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_poll: bad magic [%d]\n", task_pid(current));
 		return 0;
 	}
 
 	ctx = (pfm_context_t *)filp->private_data;
 	if (ctx == NULL) {
-		printk(KERN_ERR "perfmon: pfm_poll: NULL ctx [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_poll: NULL ctx [%d]\n", task_pid(current));
 		return 0;
 	}
 
@@ -1657,7 +1657,7 @@
 	ret = fasync_helper (fd, filp, on, &ctx->ctx_async_queue);
 
 	DPRINT(("pfm_fasync called by [%d] on ctx_fd=%d on=%d async_queue=%p ret=%d\n",
-		current->pid,
+		task_pid(current),
 		fd,
 		on,
 		ctx->ctx_async_queue, ret));
@@ -1672,13 +1672,13 @@
 	int ret;
 
 	if (PFM_IS_FILE(filp) == 0) {
-		printk(KERN_ERR "perfmon: pfm_fasync bad magic [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_fasync bad magic [%d]\n", task_pid(current));
 		return -EBADF;
 	}
 
 	ctx = (pfm_context_t *)filp->private_data;
 	if (ctx == NULL) {
-		printk(KERN_ERR "perfmon: pfm_fasync NULL ctx [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_fasync NULL ctx [%d]\n", task_pid(current));
 		return -EBADF;
 	}
 	/*
@@ -1724,7 +1724,7 @@
 	if (owner != ctx->ctx_task) {
 		printk(KERN_ERR "perfmon: pfm_syswide_force_stop CPU%d unexpected owner [%d] instead of [%d]\n",
 			smp_processor_id(),
-			owner->pid, ctx->ctx_task->pid);
+			task_pid(owner), task_pid(ctx->ctx_task));
 		return;
 	}
 	if (GET_PMU_CTX() != ctx) {
@@ -1734,7 +1734,7 @@
 		return;
 	}
 
-	DPRINT(("on CPU%d forcing system wide stop for [%d]\n", smp_processor_id(), ctx->ctx_task->pid));	
+	DPRINT(("on CPU%d forcing system wide stop for [%d]\n", smp_processor_id(), task_pid(ctx->ctx_task)));
 	/*
 	 * the context is already protected in pfm_close(), we simply
 	 * need to mask interrupts to avoid a PMU interrupt race on
@@ -1786,7 +1786,7 @@
 
 	ctx = (pfm_context_t *)filp->private_data;
 	if (ctx == NULL) {
-		printk(KERN_ERR "perfmon: pfm_flush: NULL ctx [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_flush: NULL ctx [%d]\n", task_pid(current));
 		return -EBADF;
 	}
 
@@ -1934,7 +1934,7 @@
 	
 	ctx = (pfm_context_t *)filp->private_data;
 	if (ctx == NULL) {
-		printk(KERN_ERR "perfmon: pfm_close: NULL ctx [%d]\n", current->pid);
+		printk(KERN_ERR "perfmon: pfm_close: NULL ctx [%d]\n", task_pid(current));
 		return -EBADF;
 	}
 
@@ -2031,7 +2031,7 @@
 	 	 */
 		ctx->ctx_state = PFM_CTX_ZOMBIE;
 
-		DPRINT(("zombie ctx for [%d]\n", task->pid));
+		DPRINT(("zombie ctx for [%d]\n", task_pid(task)));
 		/*
 		 * cannot free the context on the spot. deferred until
 		 * the task notices the ZOMBIE state
@@ -2435,7 +2435,7 @@
 	/* invoke and lock buffer format, if found */
 	fmt = pfm_find_buffer_fmt(arg->ctx_smpl_buf_id);
 	if (fmt == NULL) {
-		DPRINT(("[%d] cannot find buffer format\n", task->pid));
+		DPRINT(("[%d] cannot find buffer format\n", task_pid(task)));
 		return -EINVAL;
 	}
 
@@ -2446,7 +2446,7 @@
 
 	ret = pfm_buf_fmt_validate(fmt, task, ctx_flags, cpu, fmt_arg);
 
-	DPRINT(("[%d] after validate(0x%x,%d,%p)=%d\n", task->pid, ctx_flags, cpu, fmt_arg, ret));
+	DPRINT(("[%d] after validate(0x%x,%d,%p)=%d\n", task_pid(task), ctx_flags, cpu, fmt_arg, ret));
 
 	if (ret) goto error;
 
@@ -2568,23 +2568,23 @@
 	 * no kernel task or task not owner by caller
 	 */
 	if (task->mm == NULL) {
-		DPRINT(("task [%d] has not memory context (kernel thread)\n", task->pid));
+		DPRINT(("task [%d] has not memory context (kernel thread)\n", task_pid(task)));
 		return -EPERM;
 	}
 	if (pfm_bad_permissions(task)) {
-		DPRINT(("no permission to attach to  [%d]\n", task->pid));
+		DPRINT(("no permission to attach to  [%d]\n", task_pid(task)));
 		return -EPERM;
 	}
 	/*
 	 * cannot block in self-monitoring mode
 	 */
 	if (CTX_OVFL_NOBLOCK(ctx) == 0 && task == current) {
-		DPRINT(("cannot load a blocking context on self for [%d]\n", task->pid));
+		DPRINT(("cannot load a blocking context on self for [%d]\n", task_pid(task)));
 		return -EINVAL;
 	}
 
 	if (task->exit_state == EXIT_ZOMBIE) {
-		DPRINT(("cannot attach to  zombie task [%d]\n", task->pid));
+		DPRINT(("cannot attach to  zombie task [%d]\n", task_pid(task)));
 		return -EBUSY;
 	}
 
@@ -2594,7 +2594,7 @@
 	if (task == current) return 0;
 
 	if ((task->state != TASK_STOPPED) && (task->state != TASK_TRACED)) {
-		DPRINT(("cannot attach to non-stopped task [%d] state=%ld\n", task->pid, task->state));
+		DPRINT(("cannot attach to non-stopped task [%d] state=%ld\n", task_pid(task), task->state));
 		return -EBUSY;
 	}
 	/*
@@ -2616,7 +2616,7 @@
 	/* XXX: need to add more checks here */
 	if (pid < 2) return -EPERM;
 
-	if (pid != current->pid) {
+	if (pid != task_pid(current)) {
 
 		read_lock(&tasklist_lock);
 
@@ -3481,7 +3481,7 @@
 
 	if (pmu_conf->use_rr_dbregs == 0) return 0;
 
-	DPRINT(("called for [%d]\n", task->pid));
+	DPRINT(("called for [%d]\n", task_pid(task)));
 
 	/*
 	 * do it only once
@@ -3512,7 +3512,7 @@
 	DPRINT(("ptrace_use_dbregs=%u  sys_use_dbregs=%u by [%d] ret = %d\n",
 		  pfm_sessions.pfs_ptrace_use_dbregs,
 		  pfm_sessions.pfs_sys_use_dbregs,
-		  task->pid, ret));
+		  task_pid(task), ret));
 
 	UNLOCK_PFS(flags);
 
@@ -3537,7 +3537,7 @@
 
 	LOCK_PFS(flags);
 	if (pfm_sessions.pfs_ptrace_use_dbregs == 0) {
-		printk(KERN_ERR "perfmon: invalid release for [%d] ptrace_use_dbregs=0\n", task->pid);
+		printk(KERN_ERR "perfmon: invalid release for [%d] ptrace_use_dbregs=0\n", task_pid(task));
 		ret = -1;
 	}  else {
 		pfm_sessions.pfs_ptrace_use_dbregs--;
@@ -3589,7 +3589,7 @@
 
 	/* sanity check */
 	if (unlikely(task == NULL)) {
-		printk(KERN_ERR "perfmon: [%d] pfm_restart no task\n", current->pid);
+		printk(KERN_ERR "perfmon: [%d] pfm_restart no task\n", task_pid(current));
 		return -EINVAL;
 	}
 
@@ -3598,7 +3598,7 @@
 		fmt = ctx->ctx_buf_fmt;
 
 		DPRINT(("restarting self %d ovfl=0x%lx\n",
-			task->pid,
+			task_pid(task),
 			ctx->ctx_ovfl_regs[0]));
 
 		if (CTX_HAS_SMPL(ctx)) {
@@ -3622,11 +3622,11 @@
 				pfm_reset_regs(ctx, ctx->ctx_ovfl_regs, PFM_PMD_LONG_RESET);
 
 			if (rst_ctrl.bits.mask_monitoring == 0) {
-				DPRINT(("resuming monitoring for [%d]\n", task->pid));
+				DPRINT(("resuming monitoring for [%d]\n", task_pid(task)));
 
 				if (state == PFM_CTX_MASKED) pfm_restore_monitoring(task);
 			} else {
-				DPRINT(("keeping monitoring stopped for [%d]\n", task->pid));
+				DPRINT(("keeping monitoring stopped for [%d]\n", task_pid(task)));
 
 				// cannot use pfm_stop_monitoring(task, regs);
 			}
@@ -3683,10 +3683,10 @@
 	 * "self-monitoring".
 	 */
 	if (CTX_OVFL_NOBLOCK(ctx) == 0 && state == PFM_CTX_MASKED) {
-		DPRINT(("unblocking [%d] \n", task->pid));
+		DPRINT(("unblocking [%d] \n", task_pid(task)));
 		up(&ctx->ctx_restart_sem);
 	} else {
-		DPRINT(("[%d] armed exit trap\n", task->pid));
+		DPRINT(("[%d] armed exit trap\n", task_pid(task)));
 
 		ctx->ctx_fl_trap_reason = PFM_TRAP_REASON_RESET;
 
@@ -3774,7 +3774,7 @@
 	 * don't bother if we are loaded and task is being debugged
 	 */
 	if (is_loaded && (thread->flags & IA64_THREAD_DBG_VALID) != 0) {
-		DPRINT(("debug registers already in use for [%d]\n", task->pid));
+		DPRINT(("debug registers already in use for [%d]\n", task_pid(task)));
 		return -EBUSY;
 	}
 
@@ -3815,7 +3815,7 @@
 	 * is shared by all processes running on it
  	 */
 	if (first_time && can_access_pmu) {
-		DPRINT(("[%d] clearing ibrs, dbrs\n", task->pid));
+		DPRINT(("[%d] clearing ibrs, dbrs\n", task_pid(task)));
 		for (i=0; i < pmu_conf->num_ibrs; i++) {
 			ia64_set_ibr(i, 0UL);
 			ia64_dv_serialize_instruction();
@@ -4004,7 +4004,7 @@
 		return -EBUSY;
 	}
 	DPRINT(("task [%d] ctx_state=%d is_system=%d\n",
-		PFM_CTX_TASK(ctx)->pid,
+		task_pid(PFM_CTX_TASK(ctx)),
 		state,
 		is_system));
 	/*
@@ -4062,7 +4062,7 @@
 		 * monitoring disabled in kernel at next reschedule
 		 */
 		ctx->ctx_saved_psr_up = 0;
-		DPRINT(("task=[%d]\n", task->pid));
+		DPRINT(("task=[%d]\n", task_pid(task)));
 	}
 	return 0;
 }
@@ -4228,7 +4228,7 @@
 
 	DPRINT(("load_pid [%d] using_dbreg=%d\n", req->load_pid, ctx->ctx_fl_using_dbreg));
 
-	if (CTX_OVFL_NOBLOCK(ctx) == 0 && req->load_pid == current->pid) {
+	if (CTX_OVFL_NOBLOCK(ctx) == 0 && req->load_pid == task_pid(current)) {
 		DPRINT(("cannot use blocking mode on self\n"));
 		return -EINVAL;
 	}
@@ -4267,11 +4267,11 @@
 
 		if (is_system) {
 			if (pfm_sessions.pfs_ptrace_use_dbregs) {
-				DPRINT(("cannot load [%d] dbregs in use\n", task->pid));
+				DPRINT(("cannot load [%d] dbregs in use\n", task_pid(task)));
 				ret = -EBUSY;
 			} else {
 				pfm_sessions.pfs_sys_use_dbregs++;
-				DPRINT(("load [%d] increased sys_use_dbreg=%u\n", task->pid, pfm_sessions.pfs_sys_use_dbregs));
+				DPRINT(("load [%d] increased sys_use_dbreg=%u\n", task_pid(task), pfm_sessions.pfs_sys_use_dbregs));
 				set_dbregs = 1;
 			}
 		}
@@ -4363,7 +4363,7 @@
 
 			/* allow user level control */
 			ia64_psr(regs)->sp = 0;
-			DPRINT(("clearing psr.sp for [%d]\n", task->pid));
+			DPRINT(("clearing psr.sp for [%d]\n", task_pid(task)));
 
 			SET_LAST_CPU(ctx, smp_processor_id());
 			INC_ACTIVATION();
@@ -4398,7 +4398,7 @@
 		 */
 		SET_PMU_OWNER(task, ctx);
 
-		DPRINT(("context loaded on PMU for [%d]\n", task->pid));
+		DPRINT(("context loaded on PMU for [%d]\n", task_pid(task)));
 	} else {
 		/*
 		 * when not current, task MUST be stopped, so this is safe
@@ -4462,7 +4462,7 @@
 	int prev_state, is_system;
 	int ret;
 
-	DPRINT(("ctx_state=%d task [%d]\n", ctx->ctx_state, task ? task->pid : -1));
+	DPRINT(("ctx_state=%d task [%d]\n", ctx->ctx_state, task ? task_pid(task) : -1));
 
 	prev_state = ctx->ctx_state;
 	is_system  = ctx->ctx_fl_system;
@@ -4537,7 +4537,7 @@
 		 */
 		ia64_psr(regs)->sp = 1;
 
-		DPRINT(("setting psr.sp for [%d]\n", task->pid));
+		DPRINT(("setting psr.sp for [%d]\n", task_pid(task)));
 	}
 	/*
 	 * save PMDs to context
@@ -4577,7 +4577,7 @@
 	ctx->ctx_fl_can_restart  = 0;
 	ctx->ctx_fl_going_zombie = 0;
 
-	DPRINT(("disconnected [%d] from context\n", task->pid));
+	DPRINT(("disconnected [%d] from context\n", task_pid(task)));
 
 	return 0;
 }
@@ -4600,7 +4600,7 @@
 
 	PROTECT_CTX(ctx, flags);
 
-	DPRINT(("state=%d task [%d]\n", ctx->ctx_state, task->pid));
+	DPRINT(("state=%d task [%d]\n", ctx->ctx_state, task_pid(task)));
 
 	state = ctx->ctx_state;
 	switch(state) {
@@ -4609,13 +4609,13 @@
 	 		 * only comes to thios function if pfm_context is not NULL, i.e., cannot
 			 * be in unloaded state
 	 		 */
-			printk(KERN_ERR "perfmon: pfm_exit_thread [%d] ctx unloaded\n", task->pid);
+			printk(KERN_ERR "perfmon: pfm_exit_thread [%d] ctx unloaded\n", task_pid(task));
 			break;
 		case PFM_CTX_LOADED:
 		case PFM_CTX_MASKED:
 			ret = pfm_context_unload(ctx, NULL, 0, regs);
 			if (ret) {
-				printk(KERN_ERR "perfmon: pfm_exit_thread [%d] state=%d unload failed %d\n", task->pid, state, ret);
+				printk(KERN_ERR "perfmon: pfm_exit_thread [%d] state=%d unload failed %d\n", task_pid(task), state, ret);
 			}
 			DPRINT(("ctx unloaded for current state was %d\n", state));
 
@@ -4624,12 +4624,12 @@
 		case PFM_CTX_ZOMBIE:
 			ret = pfm_context_unload(ctx, NULL, 0, regs);
 			if (ret) {
-				printk(KERN_ERR "perfmon: pfm_exit_thread [%d] state=%d unload failed %d\n", task->pid, state, ret);
+				printk(KERN_ERR "perfmon: pfm_exit_thread [%d] state=%d unload failed %d\n", task_pid(task), state, ret);
 			}
 			free_ok = 1;
 			break;
 		default:
-			printk(KERN_ERR "perfmon: pfm_exit_thread [%d] unexpected state=%d\n", task->pid, state);
+			printk(KERN_ERR "perfmon: pfm_exit_thread [%d] unexpected state=%d\n", task_pid(task), state);
 			break;
 	}
 	UNPROTECT_CTX(ctx, flags);
@@ -4713,7 +4713,7 @@
 	DPRINT(("context %d state=%d [%d] task_state=%ld must_stop=%d\n",
 		ctx->ctx_fd,
 		state,
-		task->pid,
+		task_pid(task),
 		task->state, PFM_CMD_STOPPED(cmd)));
 
 	/*
@@ -4760,7 +4760,7 @@
 	 */
 	if (PFM_CMD_STOPPED(cmd)) {
 		if ((task->state != TASK_STOPPED) && (task->state != TASK_TRACED)) {
-			DPRINT(("[%d] task not in stopped state\n", task->pid));
+			DPRINT(("[%d] task not in stopped state\n", task_pid(task)));
 			return -EBUSY;
 		}
 		/*
@@ -4853,7 +4853,7 @@
 	 * limit abuse to min page size
 	 */
 	if (unlikely(sz > PFM_MAX_ARGSIZE)) {
-		printk(KERN_ERR "perfmon: [%d] argument too big %lu\n", current->pid, sz);
+		printk(KERN_ERR "perfmon: [%d] argument too big %lu\n", task_pid(current), sz);
 		return -E2BIG;
 	}
 
@@ -4998,11 +4998,11 @@
 {
 	int ret;
 
-	DPRINT(("entering for [%d]\n", current->pid));
+	DPRINT(("entering for [%d]\n", task_pid(current)));
 
 	ret = pfm_context_unload(ctx, NULL, 0, regs);
 	if (ret) {
-		printk(KERN_ERR "pfm_context_force_terminate: [%d] unloaded failed with %d\n", current->pid, ret);
+		printk(KERN_ERR "pfm_context_force_terminate: [%d] unloaded failed with %d\n", task_pid(current), ret);
 	}
 
 	/*
@@ -5039,7 +5039,7 @@
 
 	ctx = PFM_GET_CTX(current);
 	if (ctx == NULL) {
-		printk(KERN_ERR "perfmon: [%d] has no PFM context\n", current->pid);
+		printk(KERN_ERR "perfmon: [%d] has no PFM context\n", task_pid(current));
 		return;
 	}
 
@@ -5236,7 +5236,7 @@
 	DPRINT_ovfl(("pmc0=0x%lx pid=%d iip=0x%lx, %s "
 		     "used_pmds=0x%lx\n",
 			pmc0,
-			task ? task->pid: -1,
+			task ? task_pid(task): -1,
 			(regs ? regs->cr_iip : 0),
 			CTX_OVFL_NOBLOCK(ctx) ? "nonblocking" : "blocking",
 			ctx->ctx_used_pmds[0]));
@@ -5425,7 +5425,7 @@
 	}
 
 	DPRINT_ovfl(("owner [%d] pending=%ld reason=%u ovfl_pmds=0x%lx ovfl_notify=0x%lx masked=%d\n",
-			GET_PMU_OWNER() ? GET_PMU_OWNER()->pid : -1,
+			GET_PMU_OWNER() ? task_pid(GET_PMU_OWNER()) : -1,
 			PFM_GET_WORK_PENDING(task),
 			ctx->ctx_fl_trap_reason,
 			ovfl_pmds,
@@ -5450,7 +5450,7 @@
 sanity_check:
 	printk(KERN_ERR "perfmon: CPU%d overflow handler [%d] pmc0=0x%lx\n",
 			smp_processor_id(),
-			task ? task->pid : -1,
+			task ? task_pid(task) : -1,
 			pmc0);
 	return;
 
@@ -5483,7 +5483,7 @@
 	 *
 	 * Overall pretty hairy stuff....
 	 */
-	DPRINT(("ctx is zombie for [%d], converted to spurious\n", task ? task->pid: -1));
+	DPRINT(("ctx is zombie for [%d], converted to spurious\n", task ? task_pid(task): -1));
 	pfm_clear_psr_up();
 	ia64_psr(regs)->up = 0;
 	ia64_psr(regs)->sp = 1;
@@ -5544,13 +5544,13 @@
 
 report_spurious1:
 	printk(KERN_INFO "perfmon: spurious overflow interrupt on CPU%d: process %d has no PFM context\n",
-		this_cpu, task->pid);
+		this_cpu, task_pid(task));
 	pfm_unfreeze_pmu();
 	return -1;
 report_spurious2:
 	printk(KERN_INFO "perfmon: spurious overflow interrupt on CPU%d: process %d, invalid flag\n", 
 		this_cpu, 
-		task->pid);
+		task_pid(task));
 	pfm_unfreeze_pmu();
 	return -1;
 }
@@ -5731,7 +5731,7 @@
 		cpu, pfm_get_cpu_data(pfm_syst_info, cpu) & PFM_CPUINFO_SYST_WIDE ? 1 : 0,
 		cpu, pfm_get_cpu_data(pfm_syst_info, cpu) & PFM_CPUINFO_DCR_PP ? 1 : 0,
 		cpu, pfm_get_cpu_data(pfm_syst_info, cpu) & PFM_CPUINFO_EXCL_IDLE ? 1 : 0,
-		cpu, pfm_get_cpu_data(pmu_owner, cpu) ? pfm_get_cpu_data(pmu_owner, cpu)->pid: -1,
+		cpu, pfm_get_cpu_data(pmu_owner, cpu) ? task_pid(pfm_get_cpu_data(pmu_owner, cpu)): -1,
 		cpu, pfm_get_cpu_data(pmu_ctx, cpu),
 		cpu, pfm_get_cpu_data(pmu_activation_number, cpu));
 
@@ -5792,7 +5792,7 @@
 	 * pid 0 is guaranteed to be the idle task. There is one such task with pid 0
 	 * on every CPU, so we can rely on the pid to identify the idle task.
 	 */
-	if ((info & PFM_CPUINFO_EXCL_IDLE) == 0 || task->pid) {
+	if ((info & PFM_CPUINFO_EXCL_IDLE) == 0 || task_pid(task)) {
 		regs = ia64_task_regs(task);
 		ia64_psr(regs)->pp = is_ctxswin ? dcr_pp : 0;
 		return;
@@ -5836,7 +5836,7 @@
 	ia64_psr(regs)->sp = 1;
 
 	if (GET_PMU_OWNER() == task) {
-		DPRINT(("cleared ownership for [%d]\n", ctx->ctx_task->pid));
+		DPRINT(("cleared ownership for [%d]\n", task_pid(ctx->ctx_task)));
 		SET_PMU_OWNER(NULL, NULL);
 	}
 
@@ -5848,7 +5848,7 @@
 	task->thread.pfm_context  = NULL;
 	task->thread.flags       &= ~IA64_THREAD_PM_VALID;
 
-	DPRINT(("force cleanup for [%d]\n",  task->pid));
+	DPRINT(("force cleanup for [%d]\n",  task_pid(task)));
 }
 
 
@@ -6400,7 +6400,7 @@
 
 		if (PMD_IS_COUNTING(i)) {
 			DPRINT(("[%d] pmd[%d] ctx_pmd=0x%lx hw_pmd=0x%lx\n",
-				task->pid,
+				task_pid(task),
 				i,
 				ctx->ctx_pmds[i].val,
 				val & ovfl_val));
@@ -6422,11 +6422,11 @@
 			 */
 			if (pmc0 & (1UL << i)) {
 				val += 1 + ovfl_val;
-				DPRINT(("[%d] pmd[%d] overflowed\n", task->pid, i));
+				DPRINT(("[%d] pmd[%d] overflowed\n", task_pid(task), i));
 			}
 		}
 
-		DPRINT(("[%d] ctx_pmd[%d]=0x%lx  pmd_val=0x%lx\n", task->pid, i, val, pmd_val));
+		DPRINT(("[%d] ctx_pmd[%d]=0x%lx  pmd_val=0x%lx\n", task_pid(task), i, val, pmd_val));
 
 		if (is_self) task->thread.pmds[i] = pmd_val;
 
@@ -6765,14 +6765,14 @@
 	printk("CPU%d from %s() current [%d] iip=0x%lx %s\n", 
 		this_cpu, 
 		from, 
-		current->pid, 
+		task_pid(current),
 		regs->cr_iip,
 		current->comm);
 
 	task = GET_PMU_OWNER();
 	ctx  = GET_PMU_CTX();
 
-	printk("->CPU%d owner [%d] ctx=%p\n", this_cpu, task ? task->pid : -1, ctx);
+	printk("->CPU%d owner [%d] ctx=%p\n", this_cpu, task ? task_pid(task) : -1, ctx);
 
 	psr = pfm_get_psr();
 
@@ -6822,7 +6822,7 @@
 {
 	struct thread_struct *thread;
 
-	DPRINT(("perfmon: pfm_inherit clearing state for [%d]\n", task->pid));
+	DPRINT(("perfmon: pfm_inherit clearing state for [%d]\n", task_pid(task)));
 
 	thread = &task->thread;
 
Index: linux-2.6.15/arch/ia64/kernel/perfmon_default_smpl.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/perfmon_default_smpl.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/perfmon_default_smpl.c	2006-01-17 08:37:02.000000000 -0500
@@ -45,11 +45,11 @@
 	int ret = 0;
 
 	if (data == NULL) {
-		DPRINT(("[%d] no argument passed\n", task->pid));
+		DPRINT(("[%d] no argument passed\n", task_pid(task)));
 		return -EINVAL;
 	}
 
-	DPRINT(("[%d] validate flags=0x%x CPU%d\n", task->pid, flags, cpu));
+	DPRINT(("[%d] validate flags=0x%x CPU%d\n", task_pid(task), flags, cpu));
 
 	/*
 	 * must hold at least the buffer header + one minimally sized entry
@@ -89,7 +89,7 @@
 	hdr->hdr_count        = 0UL;
 
 	DPRINT(("[%d] buffer=%p buf_size=%lu hdr_size=%lu hdr_version=%u cur_offs=%lu\n",
-		task->pid,
+		task_pid(task),
 		buf,
 		hdr->hdr_buf_size,
 		sizeof(*hdr),
@@ -111,7 +111,7 @@
 	unsigned char ovfl_notify;
 
 	if (unlikely(buf == NULL || arg == NULL|| regs == NULL || task == NULL)) {
-		DPRINT(("[%d] invalid arguments buf=%p arg=%p\n", task->pid, buf, arg));
+		DPRINT(("[%d] invalid arguments buf=%p arg=%p\n", task_pid(task), buf, arg));
 		return -EINVAL;
 	}
 
@@ -140,7 +140,7 @@
 	hdr->hdr_count++;
 
 	DPRINT_ovfl(("[%d] count=%lu cur=%p last=%p free_bytes=%lu ovfl_pmd=%d ovfl_notify=%d npmds=%u\n",
-			task->pid,
+			task_pid(task),
 			hdr->hdr_count,
 			cur, last,
 			last-cur,
@@ -157,7 +157,7 @@
 	 * system-wide:
 	 * 	- this is not necessarily the task controlling the session
 	 */
-	ent->pid            = current->pid;
+	ent->pid            = task_pid(current);
 	ent->ovfl_pmd  	    = ovfl_pmd;
 	ent->last_reset_val = arg->pmd_last_reset; //pmd[0].reg_last_reset_val;
 
@@ -169,7 +169,7 @@
 	ent->tstamp    = stamp;
 	ent->cpu       = smp_processor_id();
 	ent->set       = arg->active_set;
-	ent->tgid      = current->tgid;
+	ent->tgid      = task_tgid(current);
 
 	/*
 	 * selectively store PMDs in increasing index number
@@ -246,7 +246,7 @@
 static int
 default_exit(struct task_struct *task, void *buf, struct pt_regs *regs)
 {
-	DPRINT(("[%d] exit(%p)\n", task->pid, buf));
+	DPRINT(("[%d] exit(%p)\n", task_pid(task), buf));
 	return 0;
 }
 
Index: linux-2.6.15/arch/ia64/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/process.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/process.c	2006-01-17 08:37:02.000000000 -0500
@@ -109,7 +109,7 @@
 	unsigned long ip = regs->cr_iip + ia64_psr(regs)->ri;
 
 	print_modules();
-	printk("\nPid: %d, CPU %d, comm: %20s\n", current->pid, smp_processor_id(), current->comm);
+	printk("\nPid: %d, CPU %d, comm: %20s\n", task_pid(current), smp_processor_id(), current->comm);
 	printk("psr : %016lx ifs : %016lx ip  : [<%016lx>]    %s\n",
 	       regs->cr_ipsr, regs->cr_ifs, ip, print_tainted());
 	print_symbol("ip is at %s\n", ip);
Index: linux-2.6.15/arch/ia64/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/signal.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/signal.c	2006-01-17 08:37:02.000000000 -0500
@@ -257,7 +257,7 @@
 
 #if DEBUG_SIG
 	printk("SIG return (%s:%d): sp=%lx ip=%lx\n",
-	       current->comm, current->pid, scr->pt.r12, scr->pt.cr_iip);
+	       current->comm, task_pid(current), scr->pt.r12, scr->pt.cr_iip);
 #endif
 	/*
 	 * It is more difficult to avoid calling this function than to
@@ -270,7 +270,7 @@
 	si.si_signo = SIGSEGV;
 	si.si_errno = 0;
 	si.si_code = SI_KERNEL;
-	si.si_pid = current->pid;
+	si.si_pid = task_pid(current);
 	si.si_uid = current->uid;
 	si.si_addr = sc;
 	force_sig_info(SIGSEGV, &si, current);
@@ -375,7 +375,7 @@
 	si.si_signo = SIGSEGV;
 	si.si_errno = 0;
 	si.si_code = SI_KERNEL;
-	si.si_pid = current->pid;
+	si.si_pid = task_pid(current);
 	si.si_uid = current->uid;
 	si.si_addr = addr;
 	force_sig_info(SIGSEGV, &si, current);
@@ -448,7 +448,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sig=%d sp=%lx ip=%lx handler=%p\n",
-	       current->comm, current->pid, sig, scr->pt.r12, frame->sc.sc_ip, frame->handler);
+	       current->comm, task_pid(current), sig, scr->pt.r12, frame->sc.sc_ip, frame->handler);
 #endif
 	return 1;
 }
Index: linux-2.6.15/arch/ia64/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/traps.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/traps.c	2006-01-17 08:37:02.000000000 -0500
@@ -107,7 +107,7 @@
 
 	if (++die.lock_owner_depth < 3) {
 		printk("%s[%d]: %s %ld [%d]\n",
-			current->comm, current->pid, str, err, ++die_counter);
+			current->comm, task_pid(current), str, err, ++die_counter);
 		(void) notify_die(DIE_OOPS, (char *)str, regs, err, 255, SIGSEGV);
 		show_regs(regs);
   	} else
@@ -328,7 +328,7 @@
 		++fpu_swa_count;
 		printk(KERN_WARNING
 		       "%s(%d): floating-point assist fault at ip %016lx, isr %016lx\n",
-		       current->comm, current->pid, regs->cr_iip + ia64_psr(regs)->ri, isr);
+		       current->comm, task_pid(current), regs->cr_iip + ia64_psr(regs)->ri, isr);
 	}
 
 	exception = fp_emulate(fp_fault, bundle, &regs->cr_ipsr, &regs->ar_fpsr, &isr, &regs->pr,
@@ -464,7 +464,7 @@
 		if (code == 8) {
 # ifdef CONFIG_IA64_PRINT_HAZARDS
 			printk("%s[%d]: possible hazard @ ip=%016lx (pr = %016lx)\n",
-			       current->comm, current->pid,
+			       current->comm, task_pid(current),
 			       regs.cr_iip + ia64_psr(&regs)->ri, regs.pr);
 # endif
 			return;
Index: linux-2.6.15/arch/ia64/kernel/unaligned.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/unaligned.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/unaligned.c	2006-01-17 08:37:02.000000000 -0500
@@ -1330,7 +1330,7 @@
 			size_t len;
 
 			len = sprintf(buf, "%s(%d): unaligned access to 0x%016lx, "
-				      "ip=0x%016lx\n\r", current->comm, current->pid,
+				      "ip=0x%016lx\n\r", current->comm, task_pid(current),
 				      ifa, regs->cr_iip + ipsr->ri);
 			/*
 			 * Don't call tty_write_message() if we're in the kernel; we might
Index: linux-2.6.15/arch/ia64/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/mm/fault.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/mm/fault.c	2006-01-17 08:37:02.000000000 -0500
@@ -241,7 +241,7 @@
 
   out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (current->pid == 1) {
+	if (task_pid(current) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/ia64/sn/kernel/xpc_main.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/sn/kernel/xpc_main.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/sn/kernel/xpc_main.c	2006-01-17 08:37:02.000000000 -0500
@@ -508,7 +508,7 @@
 	ret = sched_setscheduler(current, SCHED_FIFO, &param);
 	if (ret != 0) {
 		dev_warn(xpc_part, "unable to set pid %d to a realtime "
-			"priority, ret=%d\n", current->pid, ret);
+			"priority, ret=%d\n", task_pid(current), ret);
 	}
 
 	/* allow this thread and its children to run on any CPU */
Index: linux-2.6.15/arch/mips/kernel/irixelf.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/irixelf.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/irixelf.c	2006-01-17 08:37:02.000000000 -0500
@@ -556,9 +556,9 @@
 		return;
 
 	pp = (struct prda *) v;
-	pp->prda_sys.t_pid  = current->pid;
+	pp->prda_sys.t_pid  = task_pid(current);
 	pp->prda_sys.t_prid = read_c0_prid();
-	pp->prda_sys.t_rpid = current->pid;
+	pp->prda_sys.t_rpid = task_pid(current);
 
 	/* We leave the rest set to zero */
 }
@@ -1148,11 +1148,11 @@
 	prstatus.pr_info.si_signo = prstatus.pr_cursig = signr;
 	prstatus.pr_sigpend = current->pending.signal.sig[0];
 	prstatus.pr_sighold = current->blocked.sig[0];
-	psinfo.pr_pid = prstatus.pr_pid = current->pid;
+	psinfo.pr_pid = prstatus.pr_pid = task_pid(current);
 	psinfo.pr_ppid = prstatus.pr_ppid = current->parent->pid;
 	psinfo.pr_pgrp = prstatus.pr_pgrp = process_group(current);
 	psinfo.pr_sid = prstatus.pr_sid = current->signal->session;
-	if (current->pid == current->tgid) {
+	if (task_pid(current) == task_tgid(current)) {
 		/*
 		 * This is the record for the group leader.  Add in the
 		 * cumulative times of previous dead threads.  This total
Index: linux-2.6.15/arch/mips/kernel/irixioctl.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/irixioctl.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/irixioctl.c	2006-01-17 08:37:02.000000000 -0500
@@ -62,7 +62,7 @@
 	int i, error = 0;
 
 #ifdef DEBUG_IOCTLS
-	printk("[%s:%d] irix_ioctl(%d, ", current->comm, current->pid, fd);
+	printk("[%s:%d] irix_ioctl(%d, ", current->comm, task_pid(current), fd);
 #endif
 	switch(cmd) {
 	case 0x00005401:
@@ -236,7 +236,7 @@
 		sys_write(2, msg, strlen(msg));
 		set_fs(old_fs);
 		printk("[%s:%d] Does unimplemented IRIX ioctl cmd %08lx\n",
-		       current->comm, current->pid, cmd);
+		       current->comm, task_pid(current), cmd);
 		do_exit(255);
 #else
 		error = sys_ioctl (fd, cmd, arg);
Index: linux-2.6.15/arch/mips/kernel/irixsig.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/irixsig.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/irixsig.c	2006-01-17 08:37:02.000000000 -0500
@@ -232,7 +232,7 @@
 	sig = (int) regs->regs[base + 6];
 #ifdef DEBUG_SIG
 	printk("[%s:%d] IRIX sigreturn(scp[%p],ucp[%p],sig[%d])\n",
-	       current->comm, current->pid, context, magic, sig);
+	       current->comm, task_pid(current), context, magic, sig);
 #endif
 	if (!context)
 		context = magic;
@@ -471,7 +471,7 @@
 
 #ifdef DEBUG_SIG
 	printk("[%s:%d] irix_sigpoll_sys(%p,%p,%p)\n",
-	       current->comm, current->pid, set, info, tp);
+	       current->comm, task_pid(current), set, info, tp);
 #endif
 
 	/* Must always specify the signal set. */
@@ -584,7 +584,7 @@
 	tsk = current;
 	list_for_each(_p,&tsk->children) {
 		p = list_entry(_p,struct task_struct,sibling);
-		if ((type == IRIX_P_PID) && p->pid != pid)
+		if ((type == IRIX_P_PID) && task_pid(p) != pid)
 			continue;
 		if ((type == IRIX_P_PGID) && process_group(p) != pid)
 			continue;
@@ -611,7 +611,7 @@
 
 			retval = __put_user(SIGCHLD, &info->sig);
 			retval |= __put_user(0, &info->code);
-			retval |= __put_user(p->pid, &info->stuff.procinfo.pid);
+			retval |= __put_user(task_pid(p), &info->stuff.procinfo.pid);
 			retval |= __put_user((p->exit_code >> 8) & 0xff,
 			           &info->stuff.procinfo.procdata.child.status);
 			retval |= __put_user(p->utime, &info->stuff.procinfo.procdata.child.utime);
@@ -629,7 +629,7 @@
 				getrusage(p, RUSAGE_BOTH, ru);
 			retval = __put_user(SIGCHLD, &info->sig);
 			retval |= __put_user(1, &info->code);      /* CLD_EXITED */
-			retval |= __put_user(p->pid, &info->stuff.procinfo.pid);
+			retval |= __put_user(task_pid(p), &info->stuff.procinfo.pid);
 			retval |= __put_user((p->exit_code >> 8) & 0xff,
 			           &info->stuff.procinfo.procdata.child.status);
 			retval |= __put_user(p->utime,
@@ -699,7 +699,7 @@
 
 #ifdef DEBUG_SIG
 	printk("[%s:%d] irix_getcontext(%p)\n",
-	       current->comm, current->pid, ctx);
+	       current->comm, task_pid(current), ctx);
 #endif
 
 	if (!access_ok(VERIFY_WRITE, ctx, sizeof(*ctx)));
@@ -747,7 +747,7 @@
 
 #ifdef DEBUG_SIG
 	printk("[%s:%d] irix_setcontext(%p)\n",
-	       current->comm, current->pid, ctx);
+	       current->comm, task_pid(current), ctx);
 #endif
 
 	if (!access_ok(VERIFY_READ, ctx, sizeof(*ctx)))
@@ -802,7 +802,7 @@
 {
 #ifdef DEBUG_SIG
 	printk("[%s:%d] irix_sigstack(%p,%p)\n",
-	       current->comm, current->pid, new, old);
+	       current->comm, task_pid(current), new, old);
 #endif
 	if (new) {
 		if (!access_ok(VERIFY_READ, new, sizeof(*new)))
@@ -824,7 +824,7 @@
 {
 #ifdef DEBUG_SIG
 	printk("[%s:%d] irix_sigaltstack(%p,%p)\n",
-	       current->comm, current->pid, new, old);
+	       current->comm, task_pid(current), new, old);
 #endif
 	if (new)
 		if (!access_ok(VERIFY_READ, new, sizeof(*new)))
@@ -848,7 +848,7 @@
 		return -EFAULT;
 #ifdef DEBUG_SIG
 	printk("[%s:%d] irix_sigsendset([%d,%d,%d,%d,%d],%d)\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       pset->cmd, pset->ltype, pset->lid, pset->rtype, pset->rid,
 	       sig);
 #endif
Index: linux-2.6.15/arch/mips/kernel/process.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/process.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/process.c	2006-01-17 08:37:02.000000000 -0500
@@ -170,7 +170,7 @@
 	}
 #endif
 	childregs->regs[2] = 0;	/* Child gets zero as return value */
-	regs->regs[2] = p->pid;
+	regs->regs[2] = task_pid(p);
 
 	if (childregs->cp0_status & ST0_CU0) {
 		childregs->regs[28] = (unsigned long) ti;
Index: linux-2.6.15/arch/mips/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/signal.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/signal.c	2006-01-17 08:37:02.000000000 -0500
@@ -312,7 +312,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=0x%p pc=0x%lx ra=0x%p\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       frame, regs->cp0_epc, frame->regs[31]);
 #endif
         return 1;
@@ -372,7 +372,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=0x%p pc=0x%lx ra=0x%p\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       frame, regs->cp0_epc, regs->regs[31]);
 #endif
 	return 1;
Index: linux-2.6.15/arch/mips/kernel/signal32.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/signal32.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/signal32.c	2006-01-17 08:37:02.000000000 -0500
@@ -689,7 +689,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=0x%p pc=0x%lx ra=0x%p\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       frame, regs->cp0_epc, frame->sf_code);
 #endif
 	return 1;
@@ -760,7 +760,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=0x%p pc=0x%lx ra=0x%p\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       frame, regs->cp0_epc, frame->rs_code);
 #endif
 	return 1;
Index: linux-2.6.15/arch/mips/kernel/signal_n32.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/signal_n32.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/signal_n32.c	2006-01-17 08:37:02.000000000 -0500
@@ -181,7 +181,7 @@
 
 #if DEBUG_SIG
 	printk("SIG deliver (%s:%d): sp=0x%p pc=0x%lx ra=0x%p\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       frame, regs->cp0_epc, regs->regs[31]);
 #endif
 	return 1;
Index: linux-2.6.15/arch/mips/kernel/sysirix.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/sysirix.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/sysirix.c	2006-01-17 08:37:02.000000000 -0500
@@ -64,7 +64,7 @@
 		break;
 	default:
 		printk("SYSMP[%s:%d]: Unsupported opcode %d\n",
-		       current->comm, current->pid, (int)cmd);
+		       current->comm, task_pid(current), (int)cmd);
 		error = -EINVAL;
 		break;
 	}
@@ -99,7 +99,7 @@
 	switch (option) {
 	case PR_MAXPROCS:
 		printk("irix_prctl[%s:%d]: Wants PR_MAXPROCS\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = max_threads;
 		break;
 
@@ -107,7 +107,7 @@
 		struct task_struct *task;
 
 		printk("irix_prctl[%s:%d]: Wants PR_ISBLOCKED\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		read_lock(&tasklist_lock);
 		task = find_task_by_pid(va_arg(args, pid_t));
 		error = -ESRCH;
@@ -122,7 +122,7 @@
 		long value = va_arg(args, long);
 
 		printk("irix_prctl[%s:%d]: Wants PR_SETSTACKSIZE<%08lx>\n",
-		       current->comm, current->pid, (unsigned long) value);
+		       current->comm, task_pid(current), (unsigned long) value);
 		if (value > RLIM_INFINITY)
 			value = RLIM_INFINITY;
 		if (capable(CAP_SYS_ADMIN)) {
@@ -147,25 +147,25 @@
 
 	case PR_GETSTACKSIZE:
 		printk("irix_prctl[%s:%d]: Wants PR_GETSTACKSIZE\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = current->signal->rlim[RLIMIT_STACK].rlim_cur;
 		break;
 
 	case PR_MAXPPROCS:
 		printk("irix_prctl[%s:%d]: Wants PR_MAXPROCS\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = 1;
 		break;
 
 	case PR_UNBLKONEXEC:
 		printk("irix_prctl[%s:%d]: Wants PR_UNBLKONEXEC\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL;
 		break;
 
 	case PR_SETEXITSIG:
 		printk("irix_prctl[%s:%d]: Wants PR_SETEXITSIG\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 
 		/* We can probably play some game where we set the task
 		 * exit_code to some non-zero value when this is requested,
@@ -176,31 +176,31 @@
 
 	case PR_RESIDENT:
 		printk("irix_prctl[%s:%d]: Wants PR_RESIDENT\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = 0; /* Compatibility indeed. */
 		break;
 
 	case PR_ATTACHADDR:
 		printk("irix_prctl[%s:%d]: Wants PR_ATTACHADDR\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL;
 		break;
 
 	case PR_DETACHADDR:
 		printk("irix_prctl[%s:%d]: Wants PR_DETACHADDR\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL;
 		break;
 
 	case PR_TERMCHILD:
 		printk("irix_prctl[%s:%d]: Wants PR_TERMCHILD\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL;
 		break;
 
 	case PR_GETSHMASK:
 		printk("irix_prctl[%s:%d]: Wants PR_GETSHMASK\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL; /* Until I have the sproc() stuff in. */
 		break;
 
@@ -210,19 +210,19 @@
 
 	case PR_COREPID:
 		printk("irix_prctl[%s:%d]: Wants PR_COREPID\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL;
 		break;
 
 	case PR_ATTACHADDRPERM:
 		printk("irix_prctl[%s:%d]: Wants PR_ATTACHADDRPERM\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		error = -EINVAL;
 		break;
 
 	default:
 		printk("irix_prctl[%s:%d]: Non-existant opcode %d\n",
-		       current->comm, current->pid, option);
+		       current->comm, task_pid(current), option);
 		error = -EINVAL;
 		break;
 	}
@@ -320,7 +320,7 @@
 		retval = prom_setenv(name, value);
 		/* XXX make sure retval conforms to syssgi(2) */
 		printk("[%s:%d] setnvram(\"%s\", \"%s\"): retval %d",
-		       current->comm, current->pid, name, value, retval);
+		       current->comm, task_pid(current), name, value, retval);
 /*		if (retval == PROM_ENOENT)
 		  	retval = -ENOENT; */
 		break;
@@ -330,7 +330,7 @@
 	case SGI_SETPGID: {
 #ifdef DEBUG_PROCGRPS
 		printk("[%s:%d] setpgid(%d, %d) ",
-		       current->comm, current->pid,
+		       current->comm, task_pid(current),
 		       (int) regs->regs[base + 5], (int)regs->regs[base + 6]);
 #endif
 		retval = sys_setpgid(regs->regs[base + 5], regs->regs[base + 6]);
@@ -426,7 +426,7 @@
 
 	case SGI_GETSID:
 #ifdef DEBUG_PROCGRPS
-		printk("[%s:%d] getsid(%d) ", current->comm, current->pid,
+		printk("[%s:%d] getsid(%d) ", current->comm, task_pid(current),
 		       (int) regs->regs[base + 5]);
 #endif
 		retval = sys_getsid(regs->regs[base + 5]);
@@ -582,7 +582,7 @@
 asmlinkage int irix_getpid(struct pt_regs *regs)
 {
 	regs->regs[3] = current->real_parent->pid;
-	return current->pid;
+	return task_pid(current);
 }
 
 asmlinkage int irix_getuid(struct pt_regs *regs)
@@ -680,7 +680,7 @@
 	unsigned long flags, char __user *type, void __user *data, int datalen)
 {
 	printk("[%s:%d] irix_mount(%p,%p,%08lx,%p,%p,%d)\n",
-	       current->comm, current->pid,
+	       current->comm, task_pid(current),
 	       dev_name, dir_name, flags, type, data, datalen);
 
 	return sys_mount(dev_name, dir_name, type, flags, data);
@@ -779,7 +779,7 @@
 	int error;
 
 #ifdef DEBUG_PROCGRPS
-	printk("[%s:%d] setpgrp(%d) ", current->comm, current->pid, flags);
+	printk("[%s:%d] setpgrp(%d) ", current->comm, task_pid(current), flags);
 #endif
 	if(!flags)
 		error = process_group(current);
@@ -849,7 +849,7 @@
 asmlinkage unsigned long irix_gethostid(void)
 {
 	printk("[%s:%d]: irix_gethostid() called...\n",
-	       current->comm, current->pid);
+	       current->comm, task_pid(current));
 
 	return -EINVAL;
 }
@@ -857,7 +857,7 @@
 asmlinkage unsigned long irix_sethostid(unsigned long val)
 {
 	printk("[%s:%d]: irix_sethostid(%08lx) called...\n",
-	       current->comm, current->pid, val);
+	       current->comm, task_pid(current), val);
 
 	return -EINVAL;
 }
@@ -1082,7 +1082,7 @@
 asmlinkage int irix_madvise(unsigned long addr, int len, int behavior)
 {
 	printk("[%s:%d] Wheee.. irix_madvise(%08lx,%d,%d)\n",
-	       current->comm, current->pid, addr, len, behavior);
+	       current->comm, task_pid(current), addr, len, behavior);
 
 	return -EINVAL;
 }
@@ -1090,7 +1090,7 @@
 asmlinkage int irix_pagelock(char __user *addr, int len, int op)
 {
 	printk("[%s:%d] Wheee.. irix_pagelock(%p,%d,%d)\n",
-	       current->comm, current->pid, addr, len, op);
+	       current->comm, task_pid(current), addr, len, op);
 
 	return -EINVAL;
 }
@@ -1098,7 +1098,7 @@
 asmlinkage int irix_quotactl(struct pt_regs *regs)
 {
 	printk("[%s:%d] Wheee.. irix_quotactl()\n",
-	       current->comm, current->pid);
+	       current->comm, task_pid(current));
 
 	return -EINVAL;
 }
@@ -1108,14 +1108,14 @@
 	int error;
 
 #ifdef DEBUG_PROCGRPS
-	printk("[%s:%d] BSDsetpgrp(%d, %d) ", current->comm, current->pid,
+	printk("[%s:%d] BSDsetpgrp(%d, %d) ", current->comm, task_pid(current),
 	       pid, pgrp);
 #endif
 	if(!pid)
-		pid = current->pid;
+		pid = task_pid(current);
 
 	/* Wheee, weird sysv thing... */
-	if ((pgrp == 0) && (pid == current->pid))
+	if ((pgrp == 0) && (pid == task_pid(current)))
 		error = sys_setsid();
 	else
 		error = sys_setpgid(pid, pgrp);
@@ -1130,7 +1130,7 @@
 asmlinkage int irix_systeminfo(int cmd, char __user *buf, int cnt)
 {
 	printk("[%s:%d] Wheee.. irix_systeminfo(%d,%p,%d)\n",
-	       current->comm, current->pid, cmd, buf, cnt);
+	       current->comm, task_pid(current), cmd, buf, cnt);
 
 	return -EINVAL;
 }
@@ -1257,7 +1257,7 @@
 
 #ifdef DEBUG_XSTAT
 	printk("[%s:%d] Wheee.. irix_xstat(%d,%s,%p) ",
-	       current->comm, current->pid, version, filename, statbuf);
+	       current->comm, task_pid(current), version, filename, statbuf);
 #endif
 
 	retval = vfs_stat(filename, &stat);
@@ -1283,7 +1283,7 @@
 
 #ifdef DEBUG_XSTAT
 	printk("[%s:%d] Wheee.. irix_lxstat(%d,%s,%p) ",
-	       current->comm, current->pid, version, filename, statbuf);
+	       current->comm, task_pid(current), version, filename, statbuf);
 #endif
 
 	error = vfs_lstat(filename, &stat);
@@ -1310,7 +1310,7 @@
 
 #ifdef DEBUG_XSTAT
 	printk("[%s:%d] Wheee.. irix_fxstat(%d,%d,%p) ",
-	       current->comm, current->pid, version, fd, statbuf);
+	       current->comm, task_pid(current), version, fd, statbuf);
 #endif
 
 	error = vfs_fstat(fd, &stat);
@@ -1333,7 +1333,7 @@
 {
 	int retval;
 	printk("[%s:%d] Wheee.. irix_xmknod(%d,%s,%x,%x)\n",
-	       current->comm, current->pid, ver, filename, mode, dev);
+	       current->comm, task_pid(current), ver, filename, mode, dev);
 
 	switch(ver) {
 	case 2:
@@ -1352,7 +1352,7 @@
 asmlinkage int irix_swapctl(int cmd, char __user *arg)
 {
 	printk("[%s:%d] Wheee.. irix_swapctl(%d,%p)\n",
-	       current->comm, current->pid, cmd, arg);
+	       current->comm, task_pid(current), cmd, arg);
 
 	return -EINVAL;
 }
@@ -1372,7 +1372,7 @@
 	int error, i;
 
 	printk("[%s:%d] Wheee.. irix_statvfs(%s,%p)\n",
-	       current->comm, current->pid, fname, buf);
+	       current->comm, task_pid(current), fname, buf);
 	if (!access_ok(VERIFY_WRITE, buf, sizeof(struct irix_statvfs)))
 		return -EFAULT;
 
@@ -1416,7 +1416,7 @@
 	int error, i;
 
 	printk("[%s:%d] Wheee.. irix_fstatvfs(%d,%p)\n",
-	       current->comm, current->pid, fd, buf);
+	       current->comm, task_pid(current), fd, buf);
 
 	if (!access_ok(VERIFY_WRITE, buf, sizeof(struct irix_statvfs)))
 		return -EFAULT;
@@ -1457,7 +1457,7 @@
 asmlinkage int irix_priocntl(struct pt_regs *regs)
 {
 	printk("[%s:%d] Wheee.. irix_priocntl()\n",
-	       current->comm, current->pid);
+	       current->comm, task_pid(current));
 
 	return -EINVAL;
 }
@@ -1465,7 +1465,7 @@
 asmlinkage int irix_sigqueue(int pid, int sig, int code, int val)
 {
 	printk("[%s:%d] Wheee.. irix_sigqueue(%d,%d,%d,%d)\n",
-	       current->comm, current->pid, pid, sig, code, val);
+	       current->comm, task_pid(current), pid, sig, code, val);
 
 	return -EINVAL;
 }
@@ -1568,7 +1568,7 @@
 asmlinkage int irix_dmi(struct pt_regs *regs)
 {
 	printk("[%s:%d] Wheee.. irix_dmi()\n",
-	       current->comm, current->pid);
+	       current->comm, task_pid(current));
 
 	return -EINVAL;
 }
@@ -1577,7 +1577,7 @@
 			  int off1, int off2)
 {
 	printk("[%s:%d] Wheee.. irix_pread(%d,%p,%d,%d,%d,%d)\n",
-	       current->comm, current->pid, fd, buf, cnt, off64, off1, off2);
+	       current->comm, task_pid(current), fd, buf, cnt, off64, off1, off2);
 
 	return -EINVAL;
 }
@@ -1586,7 +1586,7 @@
 			   int off1, int off2)
 {
 	printk("[%s:%d] Wheee.. irix_pwrite(%d,%p,%d,%d,%d,%d)\n",
-	       current->comm, current->pid, fd, buf, cnt, off64, off1, off2);
+	       current->comm, task_pid(current), fd, buf, cnt, off64, off1, off2);
 
 	return -EINVAL;
 }
@@ -1597,7 +1597,7 @@
 {
 	printk("[%s:%d] Wheee.. irix_fastpath(%d,%08lx,%08lx,%08lx,%08lx,"
 	       "%08lx,%08lx)\n",
-	       current->comm, current->pid, cmd, arg0, arg1, arg2,
+	       current->comm, task_pid(current), cmd, arg0, arg1, arg2,
 	       arg3, arg4, arg5);
 
 	return -EINVAL;
@@ -1621,7 +1621,7 @@
 	int error, i;
 
 	printk("[%s:%d] Wheee.. irix_statvfs64(%s,%p)\n",
-	       current->comm, current->pid, fname, buf);
+	       current->comm, task_pid(current), fname, buf);
 	if (!access_ok(VERIFY_WRITE, buf, sizeof(struct irix_statvfs64))) {
 		error = -EFAULT;
 		goto out;
@@ -1667,7 +1667,7 @@
 	int error, i;
 
 	printk("[%s:%d] Wheee.. irix_fstatvfs64(%d,%p)\n",
-	       current->comm, current->pid, fd, buf);
+	       current->comm, task_pid(current), fd, buf);
 
 	if (!access_ok(VERIFY_WRITE, buf, sizeof(struct irix_statvfs))) {
 		error = -EFAULT;
@@ -1711,7 +1711,7 @@
 	int err;
 
 	printk("[%s:%d] irix_getmountid(%s, %p)\n",
-	       current->comm, current->pid, fname, midbuf);
+	       current->comm, task_pid(current), fname, midbuf);
 	if (!access_ok(VERIFY_WRITE, midbuf, (sizeof(unsigned long) * 4)))
 		return -EFAULT;
 
@@ -1733,7 +1733,7 @@
 			   unsigned long arg, unsigned long sp, int slen)
 {
 	printk("[%s:%d] Wheee.. irix_nsproc(%08lx,%08lx,%08lx,%08lx,%d)\n",
-	       current->comm, current->pid, entry, mask, arg, sp, slen);
+	       current->comm, task_pid(current), entry, mask, arg, sp, slen);
 
 	return -EINVAL;
 }
@@ -1799,7 +1799,7 @@
 
 #ifdef DEBUG_GETDENTS
 	printk("[%s:%d] ngetdents(%d, %p, %d, %p) ", current->comm,
-	       current->pid, fd, dirent, count, eob);
+	       task_pid(current), fd, dirent, count, eob);
 #endif
 	error = -EBADF;
 	file = fget(fd);
@@ -1898,7 +1898,7 @@
 
 #ifdef DEBUG_GETDENTS
 	printk("[%s:%d] getdents64(%d, %p, %d) ", current->comm,
-	       current->pid, fd, dirent, cnt);
+	       task_pid(current), fd, dirent, cnt);
 #endif
 	error = -EBADF;
 	if (!(file = fget(fd)))
@@ -1946,7 +1946,7 @@
 
 #ifdef DEBUG_GETDENTS
 	printk("[%s:%d] ngetdents64(%d, %p, %d) ", current->comm,
-	       current->pid, fd, dirent, cnt);
+	       task_pid(current), fd, dirent, cnt);
 #endif
 	error = -EBADF;
 	if (!(file = fget(fd)))
@@ -1995,41 +1995,41 @@
 	case 1:
 		/* Reboot */
 		printk("[%s:%d] irix_uadmin: Wants to reboot...\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 2:
 		/* Shutdown */
 		printk("[%s:%d] irix_uadmin: Wants to shutdown...\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 4:
 		/* Remount-root */
 		printk("[%s:%d] irix_uadmin: Wants to remount root...\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 8:
 		/* Kill all tasks. */
 		printk("[%s:%d] irix_uadmin: Wants to kill all tasks...\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 256:
 		/* Set magic mushrooms... */
 		printk("[%s:%d] irix_uadmin: Wants to set magic mushroom[%d]...\n",
-		       current->comm, current->pid, (int) func);
+		       current->comm, task_pid(current), (int) func);
 		retval = -EINVAL;
 		goto out;
 
 	default:
 		printk("[%s:%d] irix_uadmin: Unknown operation [%d]...\n",
-		       current->comm, current->pid, (int) op);
+		       current->comm, task_pid(current), (int) op);
 		retval = -EINVAL;
 		goto out;
 	};
@@ -2051,20 +2051,20 @@
 	case 2:
 		/* ustat() */
 		printk("[%s:%d] irix_utssys: Wants to do ustat()\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 3:
 		/* fusers() */
 		printk("[%s:%d] irix_utssys: Wants to do fusers()\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	default:
 		printk("[%s:%d] irix_utssys: Wants to do unknown type[%d]\n",
-		       current->comm, current->pid, (int) type);
+		       current->comm, task_pid(current), (int) type);
 		retval = -EINVAL;
 		goto out;
 	}
@@ -2083,7 +2083,7 @@
 
 #ifdef DEBUG_FCNTL
 	printk("[%s:%d] irix_fcntl(%d, %d, %d) ", current->comm,
-	       current->pid, fd, cmd, arg);
+	       task_pid(current), fd, cmd, arg);
 #endif
 	if (cmd == IRIX_F_ALLOCSP){
 		return 0;
@@ -2102,26 +2102,26 @@
 	switch(cmd) {
 	case 1:
 		printk("[%s:%d] irix_ulimit: Wants to get file size limit.\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 2:
 		printk("[%s:%d] irix_ulimit: Wants to set file size limit.\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 3:
 		printk("[%s:%d] irix_ulimit: Wants to get brk limit.\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	case 4:
 #if 0
 		printk("[%s:%d] irix_ulimit: Wants to get fd limit.\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 #endif
@@ -2130,13 +2130,13 @@
 
 	case 5:
 		printk("[%s:%d] irix_ulimit: Wants to get txt offset.\n",
-		       current->comm, current->pid);
+		       current->comm, task_pid(current));
 		retval = -EINVAL;
 		goto out;
 
 	default:
 		printk("[%s:%d] irix_ulimit: Unknown command [%d].\n",
-		       current->comm, current->pid, cmd);
+		       current->comm, task_pid(current), cmd);
 		retval = -EINVAL;
 		goto out;
 	}
@@ -2147,7 +2147,7 @@
 asmlinkage int irix_unimp(struct pt_regs *regs)
 {
 	printk("irix_unimp [%s:%d] v0=%d v1=%d a0=%08lx a1=%08lx a2=%08lx "
-	       "a3=%08lx\n", current->comm, current->pid,
+	       "a3=%08lx\n", current->comm, task_pid(current),
 	       (int) regs->regs[2], (int) regs->regs[3],
 	       regs->regs[4], regs->regs[5], regs->regs[6], regs->regs[7]);
 
Index: linux-2.6.15/arch/mips/kernel/time.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/time.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/time.c	2006-01-17 08:37:02.000000000 -0500
@@ -409,7 +409,7 @@
  */
 void local_timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
 {
-	if (current->pid)
+	if (task_pid(current))
 		profile_tick(CPU_PROFILING, regs);
 	update_process_times(user_mode(regs));
 }
Index: linux-2.6.15/arch/mips/kernel/traps.c
===================================================================
--- linux-2.6.15.orig/arch/mips/kernel/traps.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/kernel/traps.c	2006-01-17 08:37:02.000000000 -0500
@@ -267,7 +267,7 @@
 	show_regs(regs);
 	print_modules();
 	printk("Process %s (pid: %d, threadinfo=%p, task=%p)\n",
-	        current->comm, current->pid, current_thread_info(), current);
+	        current->comm, task_pid(current), current_thread_info(), current);
 	show_stack(current, (long *) regs->regs[29]);
 	show_trace(current, (long *) regs->regs[29]);
 	show_code((unsigned int *) regs->cp0_epc);
Index: linux-2.6.15/arch/mips/mm/fault.c
===================================================================
--- linux-2.6.15.orig/arch/mips/mm/fault.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/mips/mm/fault.c	2006-01-17 08:37:02.000000000 -0500
@@ -43,7 +43,7 @@
 
 #if 0
 	printk("Cpu%d[%s:%d:%0*lx:%ld:%0*lx]\n", smp_processor_id(),
-	       current->comm, current->pid, field, address, write,
+	       current->comm, task_pid(current), field, address, write,
 	       field, regs->cp0_epc);
 #endif
 
@@ -172,7 +172,7 @@
  */
 out_of_memory:
 	up_read(&mm->mmap_sem);
-	if (tsk->pid == 1) {
+	if (task_pid(tsk) == 1) {
 		yield();
 		down_read(&mm->mmap_sem);
 		goto survive;
Index: linux-2.6.15/arch/ia64/kernel/asm-offsets.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/asm-offsets.c	2006-01-17 08:36:28.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/asm-offsets.c	2006-01-17 08:37:02.000000000 -0500
@@ -44,11 +44,11 @@
 	DEFINE(IA64_TASK_CLEAR_CHILD_TID_OFFSET,offsetof (struct task_struct, clear_child_tid));
 	DEFINE(IA64_TASK_GROUP_LEADER_OFFSET, offsetof (struct task_struct, group_leader));
 	DEFINE(IA64_TASK_PENDING_OFFSET,offsetof (struct task_struct, pending));
-	DEFINE(IA64_TASK_PID_OFFSET, offsetof (struct task_struct, pid));
+	DEFINE(IA64_TASK_PID_OFFSET, offsetof (struct task_struct, __pid));
 	DEFINE(IA64_TASK_REAL_PARENT_OFFSET, offsetof (struct task_struct, real_parent));
 	DEFINE(IA64_TASK_SIGHAND_OFFSET,offsetof (struct task_struct, sighand));
 	DEFINE(IA64_TASK_SIGNAL_OFFSET,offsetof (struct task_struct, signal));
-	DEFINE(IA64_TASK_TGID_OFFSET, offsetof (struct task_struct, tgid));
+	DEFINE(IA64_TASK_TGID_OFFSET, offsetof (struct task_struct, __tgid));
 	DEFINE(IA64_TASK_THREAD_KSP_OFFSET, offsetof (struct task_struct, thread.ksp));
 	DEFINE(IA64_TASK_THREAD_ON_USTACK_OFFSET, offsetof (struct task_struct, thread.on_ustack));
 

--



* RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (11 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 12/34] PID Virtualization Change pid accesses: ia64 and mips Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 15:32   ` Arjan van de Ven
  2006-01-17 14:33 ` RFC [patch 14/34] PID Virtualization const parameter for process group Serge Hallyn
                   ` (21 subsequent siblings)
  34 siblings, 1 reply; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: BC-define-pid-handlers --]
[-- Type: text/plain, Size: 1738 bytes --]

Actually define the task_pid() and task_tgid() functions.  Also
rename the pid and tgid fields to __pid and __tgid so that any direct
accesses that were missed fail to compile and are caught.

The resulting object code seems to be identical in most cases, and is
actually shorter in cases where current->pid is used twice in a row,
as it does not dereference task-> twice.
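
As a rough illustration of the rename trick (a user-space sketch, not the
kernel code; "struct task" and is_thread_group_leader() are hypothetical
stand-ins for struct task_struct and the thread_group_leader() macro):
once the fields are called __pid and __tgid, any code still writing
p->pid no longer builds, so every missed call site shows up as a
compile error.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* pid_t */

/* Hypothetical stand-in for struct task_struct; only the two ids matter. */
struct task {
	pid_t __pid;	/* was "pid": direct p->pid users no longer compile */
	pid_t __tgid;	/* was "tgid" */
};

/* Accessor in the style of the patch; the NULL check is only for the sketch. */
static inline pid_t task_pid(const struct task *p)
{
	assert(p != NULL);
	return p->__pid;
}

static inline pid_t task_tgid(const struct task *p)
{
	assert(p != NULL);
	return p->__tgid;
}

/* Same comparison as thread_group_leader(), written via the accessors. */
static inline int is_thread_group_leader(const struct task *p)
{
	return task_pid(p) == task_tgid(p);
}
```

Because the accessors are static inline, the compiler can usually reduce
them to the same load as the direct field access.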

Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 sched.h |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:18:22.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:03.000000000 -0500
@@ -732,8 +732,8 @@
 	/* ??? */
 	unsigned long personality;
 	unsigned did_exec:1;
-	pid_t pid;
-	pid_t tgid;
+	pid_t __pid;
+	pid_t __tgid;
 	/* 
 	 * pointers to (original) parent process, youngest child, younger sibling,
 	 * older sibling, respectively.  (p->father can be replaced with 
@@ -877,6 +877,16 @@
 	return p->pids[PIDTYPE_PID].nr != 0;
 }
 
+static inline pid_t task_pid(const struct task_struct *p)
+{
+	return p->__pid;
+}
+
+static inline pid_t task_tgid(const struct task_struct *p)
+{
+	return p->__tgid;
+}
+
 extern void free_task(struct task_struct *tsk);
 extern void __put_task_struct(struct task_struct *tsk);
 #define get_task_struct(tsk) do { atomic_inc(&(tsk)->usage); } while(0)
@@ -1200,7 +1210,7 @@
 
 extern task_t * FASTCALL(next_thread(const task_t *p));
 
-#define thread_group_leader(p)	(p->pid == p->tgid)
+#define thread_group_leader(p)	(task_pid(p) == task_tgid(p))
 
 static inline int thread_group_empty(task_t *p)
 {

--



* RFC [patch 14/34] PID Virtualization const parameter for process group
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (12 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 13/34] PID Virtualization Define new task_pid api Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 15/34] PID Virtualization task virtual pid access functions Serge Hallyn
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F1-const-task-parameter.patch --]
[-- Type: text/plain, Size: 835 bytes --]

Change the parameter of the process_group() access function to const.
We try to be more diligent with the "const" attribute in the new
access functions. As a result, not introducing const for this function
would result in many compiler warnings.
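
A small user-space sketch of why the qualifier has to propagate (the
struct definitions are hypothetical, reduced stand-ins; only the two
function names come from this series): a const-taking wrapper can only
call process_group() cleanly if process_group() itself takes a const
pointer. With a non-const parameter the compiler warns that the call
discards the const qualifier.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* pid_t */

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct signal_info {
	pid_t pgrp;
};

struct task {
	struct signal_info *signal;
};

/*
 * If "const" were dropped here, the call in virt_process_group() below
 * would pass a const pointer to a non-const parameter and draw a warning.
 */
static inline pid_t process_group(const struct task *tsk)
{
	assert(tsk != NULL && tsk->signal != NULL);
	return tsk->signal->pgrp;
}

/* A const-correct caller, in the style of the later virtual accessors. */
static inline pid_t virt_process_group(const struct task *p)
{
	return process_group(p);
}
```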

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 sched.h |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:03.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:03.000000000 -0500
@@ -859,7 +859,7 @@
 	atomic_t fs_excl;	/* holding fs exclusive resources */
 };
 
-static inline pid_t process_group(struct task_struct *tsk)
+static inline pid_t process_group(const struct task_struct *tsk)
 {
 	return tsk->signal->pgrp;
 }

--



* RFC [patch 15/34] PID Virtualization task virtual pid access functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (13 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 14/34] PID Virtualization const parameter for process group Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 16/34] PID Virtualization return virtual pids where required Serge Hallyn
                   ` (19 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F2-define-task-virt-access-functions.patch --]
[-- Type: text/plain, Size: 1116 bytes --]

Introduce task access functions for the virtual pid domain
for pid/ppid/tgid/process_group/sessionids

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 sched.h |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+)

Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:03.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:03.000000000 -0500
@@ -887,6 +887,26 @@
 	return p->__tgid;
 }
 
+static inline pid_t task_vpid(const struct task_struct *p)
+{
+	return task_pid(p);
+}
+
+static inline pid_t task_vppid(const struct task_struct *p)
+{
+	return task_pid(p->parent);
+}
+
+static inline pid_t task_vtgid(const struct task_struct *p)
+{
+	return task_tgid(p);
+}
+
+static inline pid_t virt_process_group(const struct task_struct *p)
+{
+	return process_group(p);
+}
+
 extern void free_task(struct task_struct *tsk);
 extern void __put_task_struct(struct task_struct *tsk);
 #define get_task_struct(tsk) do { atomic_inc(&(tsk)->usage); } while(0)

--



* RFC [patch 16/34] PID Virtualization return virtual pids where required
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (14 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 15/34] PID Virtualization task virtual pid access functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 17/34] PID Virtualization return virtual process group ids Serge Hallyn
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F3-replace-pid-kernel-access-with-virt-access.patch --]
[-- Type: text/plain, Size: 8651 bytes --]

In this patch we identify where in the kernel code a virtual pid
(or tgid, ppid, etc.) conceptually needs to be returned to userspace.
This is at the kernel/user interfaces. We need to identify all
locations where pids are returned; broadly, they fall into 3 categories:
(a) syscall return parameter,
(b) syscall return code,
(c) a data structure filled in by a syscall.

The process_group virtualization will be done in the following patch.
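
The three categories can be sketched in user space as follows. This is
only an illustration under stated assumptions: "struct task", its field
names, and the mock_* and report_child_tid() helpers are hypothetical,
and the virtual accessors are plain pass-throughs, as they are at this
point in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* pid_t */

/* Hypothetical stand-in for struct task_struct. */
struct task {
	pid_t pid_nr;
	pid_t tgid_nr;
};

static inline pid_t task_vpid(const struct task *p)
{
	return p->pid_nr;
}

static inline pid_t task_vtgid(const struct task *p)
{
	return p->tgid_nr;
}

/* (a) return parameter: the id is written through a caller-supplied
 *     pointer, as with put_user(..., parent_tidptr) in fork.c. */
static inline int report_child_tid(const struct task *child, pid_t *tidptr)
{
	if (tidptr == NULL)
		return -1;
	*tidptr = task_vpid(child);
	return 0;
}

/* (b) return code: the id is the return value itself, as in sys_getpid(). */
static inline long mock_getpid(const struct task *cur)
{
	return task_vtgid(cur);
}

/* (c) data structure: the id is one field of a record, as with si_pid. */
struct mock_siginfo {
	int si_signo;
	pid_t si_pid;
};

static inline void fill_siginfo(struct mock_siginfo *info, int sig,
				const struct task *sender)
{
	assert(info != NULL);
	info->si_signo = sig;
	info->si_pid = task_vpid(sender);
}
```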

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 arch/ia64/kernel/signal.c |    2 +-
 fs/binfmt_elf.c           |    8 ++++----
 fs/proc/array.c           |   12 ++++++------
 fs/proc/base.c            |    8 ++++----
 kernel/exit.c             |    4 ++--
 kernel/fork.c             |    4 ++--
 kernel/sched.c            |    2 +-
 kernel/signal.c           |   10 +++++-----
 kernel/timer.c            |    4 ++--
 9 files changed, 27 insertions(+), 27 deletions(-)

Index: linux-2.6.15/arch/ia64/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/signal.c	2006-01-17 08:37:02.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/signal.c	2006-01-17 08:37:04.000000000 -0500
@@ -270,7 +270,7 @@
 	si.si_signo = SIGSEGV;
 	si.si_errno = 0;
 	si.si_code = SI_KERNEL;
-	si.si_pid = task_pid(current);
+	si.si_pid = task_vpid(current);
 	si.si_uid = current->uid;
 	si.si_addr = sc;
 	force_sig_info(SIGSEGV, &si, current);
Index: linux-2.6.15/fs/binfmt_elf.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_elf.c	2006-01-17 08:36:57.000000000 -0500
+++ linux-2.6.15/fs/binfmt_elf.c	2006-01-17 08:37:04.000000000 -0500
@@ -1270,8 +1270,8 @@
 	prstatus->pr_info.si_signo = prstatus->pr_cursig = signr;
 	prstatus->pr_sigpend = p->pending.signal.sig[0];
 	prstatus->pr_sighold = p->blocked.sig[0];
-	prstatus->pr_pid = task_pid(p);
-	prstatus->pr_ppid = task_pid(p->parent);
+	prstatus->pr_pid = task_vpid(p);
+	prstatus->pr_ppid = task_vppid(p);
 	prstatus->pr_pgrp = process_group(p);
 	prstatus->pr_sid = p->signal->session;
 	if (thread_group_leader(p)) {
@@ -1316,8 +1316,8 @@
 			psinfo->pr_psargs[i] = ' ';
 	psinfo->pr_psargs[len] = 0;
 
-	psinfo->pr_pid = task_pid(p);
-	psinfo->pr_ppid = task_pid(p->parent);
+	psinfo->pr_pid = task_vpid(p);
+	psinfo->pr_ppid = task_vppid(p);
 	psinfo->pr_pgrp = process_group(p);
 	psinfo->pr_sid = p->signal->session;
 
Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:36:57.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:37:04.000000000 -0500
@@ -174,10 +174,10 @@
 		"Gid:\t%d\t%d\t%d\t%d\n",
 		get_task_state(p),
 		(p->sleep_avg/1024)*100/(1020000000/1024),
-	       	task_tgid(p),
-		task_pid(p), pid_alive(p) ?
-			task_tgid(p->group_leader->real_parent) : 0,
-		pid_alive(p) && p->ptrace ? task_pid(p->parent) : 0,
+	       	task_vtgid(p),
+		task_vpid(p), pid_alive(p) ?
+			task_vtgid(p->group_leader->real_parent) : 0,
+		pid_alive(p) && p->ptrace ? task_vpid(p->parent) : 0,
 		p->uid, p->euid, p->suid, p->fsuid,
 		p->gid, p->egid, p->sgid, p->fsgid);
 	read_unlock(&tasklist_lock);
@@ -390,7 +390,7 @@
 		it_real_value = task->signal->it_real_value;
 	}
 	ppid = pid_alive(task) ?
-		task_tgid(task->group_leader->real_parent) : 0;
+		task_vtgid(task->group_leader->real_parent) : 0;
 	read_unlock(&tasklist_lock);
 
 	if (!whole || num_threads<2)
@@ -417,7 +417,7 @@
 	res = sprintf(buffer,"%d (%s) %c %d %d %d %d %d %lu %lu \
 %lu %lu %lu %lu %lu %ld %ld %ld %ld %d %ld %llu %lu %ld %lu %lu %lu %lu %lu \
 %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu\n",
-		task_pid(task),
+		task_vpid(task),
 		tcomm,
 		state,
 		ppid,
Index: linux-2.6.15/fs/proc/base.c
===================================================================
--- linux-2.6.15.orig/fs/proc/base.c	2006-01-17 08:36:57.000000000 -0500
+++ linux-2.6.15/fs/proc/base.c	2006-01-17 08:37:04.000000000 -0500
@@ -1878,14 +1878,14 @@
 			      int buflen)
 {
 	char tmp[30];
-	sprintf(tmp, "%d", task_tgid(current));
+	sprintf(tmp, "%d", task_vtgid(current));
 	return vfs_readlink(dentry,buffer,buflen,tmp);
 }
 
 static void *proc_self_follow_link(struct dentry *dentry, struct nameidata *nd)
 {
 	char tmp[30];
-	sprintf(tmp, "%d", task_tgid(current));
+	sprintf(tmp, "%d", task_vtgid(current));
 	return ERR_PTR(vfs_follow_link(nd,tmp));
 }	
 
@@ -2100,7 +2100,7 @@
 		p = next_task(&init_task);
 
 	for ( ; p != &init_task; p = next_task(p)) {
-		int tgid = task_pid(p);
+		int tgid = task_vpid(p);
 		if (!pid_alive(p))
 			continue;
 		if (--index >= 0)
@@ -2133,7 +2133,7 @@
 	 * via next_thread().
 	 */
 	if (pid_alive(task)) do {
-		int tid = task_pid(task);
+		int tid = task_vpid(task);
 
 		if (--index >= 0)
 			continue;
Index: linux-2.6.15/kernel/exit.c
===================================================================
--- linux-2.6.15.orig/kernel/exit.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/exit.c	2006-01-17 08:37:04.000000000 -0500
@@ -1143,7 +1143,7 @@
 		p->exit_state = EXIT_ZOMBIE;
 		return retval;
 	}
-	retval = task_pid(p);
+	retval = task_vpid(p);
 	if (p->real_parent != p->parent) {
 		write_lock_irq(&tasklist_lock);
 		/* Double-check with lock held.  */
@@ -1278,7 +1278,7 @@
 	if (!retval && infop)
 		retval = put_user(p->uid, &infop->si_uid);
 	if (!retval)
-		retval = task_pid(p);
+		retval = task_vpid(p);
 	put_task_struct(p);
 
 	BUG_ON(!retval);
Index: linux-2.6.15/kernel/fork.c
===================================================================
--- linux-2.6.15.orig/kernel/fork.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/fork.c	2006-01-17 08:37:04.000000000 -0500
@@ -848,7 +848,7 @@
 {
 	current->clear_child_tid = tidptr;
 
-	return task_pid(current);
+	return task_vpid(current);
 }
 
 /*
@@ -928,7 +928,7 @@
 	p->__pid = pid;
 	retval = -EFAULT;
 	if (clone_flags & CLONE_PARENT_SETTID)
-		if (put_user(task_pid(p), parent_tidptr))
+		if (put_user(task_vpid(p), parent_tidptr))
 			goto bad_fork_cleanup;
 
 	p->proc_dentry = NULL;
Index: linux-2.6.15/kernel/sched.c
===================================================================
--- linux-2.6.15.orig/kernel/sched.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/sched.c	2006-01-17 08:37:04.000000000 -0500
@@ -1653,7 +1653,7 @@
 	preempt_enable();
 #endif
 	if (current->set_child_tid)
-		put_user(task_pid(current), current->set_child_tid);
+		put_user(task_vpid(current), current->set_child_tid);
 }
 
 /*
Index: linux-2.6.15/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/kernel/signal.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/signal.c	2006-01-17 08:37:04.000000000 -0500
@@ -800,7 +800,7 @@
 			q->info.si_signo = sig;
 			q->info.si_errno = 0;
 			q->info.si_code = SI_USER;
-			q->info.si_pid = task_pid(current);
+			q->info.si_pid = task_vpid(current);
 			q->info.si_uid = current->uid;
 			break;
 		case (unsigned long) SEND_SIG_PRIV:
@@ -1469,7 +1469,7 @@
 
 	info.si_signo = sig;
 	info.si_errno = 0;
-	info.si_pid = task_pid(tsk);
+	info.si_pid = task_vpid(tsk);
 	info.si_uid = tsk->uid;
 
 	/* FIXME: find out whether or not this is supposed to be c*time. */
@@ -1534,7 +1534,7 @@
 
 	info.si_signo = SIGCHLD;
 	info.si_errno = 0;
-	info.si_pid = task_pid(tsk);
+	info.si_pid = task_vpid(tsk);
 	info.si_uid = tsk->uid;
 
 	/* FIXME: find out whether or not this is supposed to be c*time. */
@@ -2245,7 +2245,7 @@
 	info.si_signo = sig;
 	info.si_errno = 0;
 	info.si_code = SI_USER;
-	info.si_pid = task_tgid(current);
+	info.si_pid = task_vtgid(current);
 	info.si_uid = current->uid;
 
 	return kill_something_info(sig, &info, pid);
@@ -2261,7 +2261,7 @@
 	info.si_signo = sig;
 	info.si_errno = 0;
 	info.si_code = SI_TKILL;
-	info.si_pid = task_tgid(current);
+	info.si_pid = task_vtgid(current);
 	info.si_uid = current->uid;
 
 	read_lock(&tasklist_lock);
Index: linux-2.6.15/kernel/timer.c
===================================================================
--- linux-2.6.15.orig/kernel/timer.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/timer.c	2006-01-17 08:37:04.000000000 -0500
@@ -941,7 +941,7 @@
  */
 asmlinkage long sys_getpid(void)
 {
-	return task_tgid(current);
+	return task_vtgid(current);
 }
 
 /*
@@ -1115,7 +1115,7 @@
 /* Thread ID - the internal kernel "pid" */
 asmlinkage long sys_gettid(void)
 {
-	return task_pid(current);
+	return task_vpid(current);
 }
 
 static long __sched nanosleep_restart(struct restart_block *restart)

--



* RFC [patch 17/34] PID Virtualization return virtual process group ids
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (15 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 16/34] PID Virtualization return virtual pids where required Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 18/34] PID Virtualization code enhancements for virtual pids in /proc Serge Hallyn
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F4-replace-process-group-access-with-virt-access.patch --]
[-- Type: text/plain, Size: 3165 bytes --]

In this patch we identify where in the kernel code a virtual process
group conceptually needs to be returned to userspace. This is simply
an extension of the previous patch, which only dealt with identifying
the locations where virtual pids/tgids/ppids are returned.

As in that patch, these locations are at the kernel/user interfaces,
and broadly they fall into 3 categories:
(a) syscall return parameter,
(b) syscall return code,
(c) a data structure filled in by a syscall.

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 fs/binfmt_elf.c |    4 ++--
 fs/proc/array.c |    2 +-
 kernel/sys.c    |    8 ++++----
 3 files changed, 7 insertions(+), 7 deletions(-)

Index: linux-2.6.15/fs/binfmt_elf.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_elf.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/fs/binfmt_elf.c	2006-01-17 08:37:04.000000000 -0500
@@ -1272,7 +1272,7 @@
 	prstatus->pr_sighold = p->blocked.sig[0];
 	prstatus->pr_pid = task_vpid(p);
 	prstatus->pr_ppid = task_vppid(p);
-	prstatus->pr_pgrp = process_group(p);
+	prstatus->pr_pgrp = virt_process_group(p);
 	prstatus->pr_sid = p->signal->session;
 	if (thread_group_leader(p)) {
 		/*
@@ -1318,7 +1318,7 @@
 
 	psinfo->pr_pid = task_vpid(p);
 	psinfo->pr_ppid = task_vppid(p);
-	psinfo->pr_pgrp = process_group(p);
+	psinfo->pr_pgrp = virt_process_group(p);
 	psinfo->pr_sid = p->signal->session;
 
 	i = p->state ? ffz(~p->state) + 1 : 0;
Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:37:04.000000000 -0500
@@ -374,7 +374,7 @@
 			tty_pgrp = task->signal->tty->pgrp;
 			tty_nr = new_encode_dev(tty_devnum(task->signal->tty));
 		}
-		pgid = process_group(task);
+		pgid = virt_process_group(task);
 		sid = task->signal->session;
 		cmin_flt = task->signal->cmin_flt;
 		cmaj_flt = task->signal->cmaj_flt;
Index: linux-2.6.15/kernel/sys.c
===================================================================
--- linux-2.6.15.orig/kernel/sys.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/sys.c	2006-01-17 08:37:04.000000000 -0500
@@ -1155,7 +1155,7 @@
 asmlinkage long sys_getpgid(pid_t pid)
 {
 	if (!pid) {
-		return process_group(current);
+		return virt_process_group(current);
 	} else {
 		int retval;
 		struct task_struct *p;
@@ -1167,7 +1167,7 @@
 		if (p) {
 			retval = security_task_getpgid(p);
 			if (!retval)
-				retval = process_group(p);
+				retval = virt_process_group(p);
 		}
 		read_unlock(&tasklist_lock);
 		return retval;
@@ -1179,7 +1179,7 @@
 asmlinkage long sys_getpgrp(void)
 {
 	/* SMP - assuming writes are word atomic this is fine */
-	return process_group(current);
+	return virt_process_group(current);
 }
 
 #endif
@@ -1225,7 +1225,7 @@
 	__set_special_pids(task_pid(current), task_pid(current));
 	current->signal->tty = NULL;
 	current->signal->tty_old_pgrp = 0;
-	err = process_group(current);
+	err = virt_process_group(current);
 out:
 	write_unlock_irq(&tasklist_lock);
 	up(&tty_sem);

--



* RFC [patch 18/34] PID Virtualization code enhancements for virtual pids in /proc
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (16 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 17/34] PID Virtualization return virtual process group ids Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 19/34] PID Virtualization Define pid_to_vpid functions Serge Hallyn
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F5-code-cleanup-procarray.patch --]
[-- Type: text/plain, Size: 1493 bytes --]

To avoid ugly parameter specifications for the sprintf statement
we pull the ppid and tpid computations out. Later these statements
will get a tiny bit more elaborate, because we need to deal with
the special case of an illegal task_vvpid virtualization (the task
is not in the same container). This is simply in preparation for that.
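
The shape of the cleanup, as a user-space sketch: compute the two ids
into locals first, then pass plain variables to the formatter. The
"struct task" fields, format_status(), and the output labels are
assumptions for illustration only, and the group_leader indirection of
the real code is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>	/* pid_t */

/* Hypothetical, reduced stand-in for struct task_struct. */
struct task {
	pid_t vpid;
	pid_t vtgid;
	int alive;			/* stand-in for pid_alive(p) */
	int ptrace;			/* non-zero when being traced */
	const struct task *parent;	/* the tracer while ptraced */
	const struct task *real_parent;
};

static inline int format_status(char *buf, size_t len, const struct task *p)
{
	pid_t ppid, tpid;

	assert(buf != NULL && p != NULL);

	/* Pulled out of the argument list; a container check could go here. */
	if (p->alive)
		ppid = p->real_parent->vtgid;
	else
		ppid = 0;
	if (p->alive && p->ptrace)
		tpid = p->parent->vpid;
	else
		tpid = 0;

	return snprintf(buf, len,
			"Tgid:\t%d\nPid:\t%d\nPPid:\t%d\nTracerPid:\t%d\n",
			(int)p->vtgid, (int)p->vpid, (int)ppid, (int)tpid);
}
```

Keeping the conditionals outside the call means a later per-container
check only touches the two if/else blocks, not the format arguments.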

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 array.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:37:04.000000000 -0500
@@ -161,8 +161,17 @@
 	struct group_info *group_info;
 	int g;
 	struct fdtable *fdt = NULL;
+	pid_t ppid, tpid;
 
 	read_lock(&tasklist_lock);
+	if (pid_alive(p))
+		ppid = task_vtgid(p->group_leader->real_parent);
+	else
+		ppid = 0;
+	if (pid_alive(p) && p->ptrace)
+		tpid = task_vppid(p);
+	else
+		tpid = 0;
 	buffer += sprintf(buffer,
 		"State:\t%s\n"
 		"SleepAVG:\t%lu%%\n"
@@ -175,9 +184,8 @@
 		get_task_state(p),
 		(p->sleep_avg/1024)*100/(1020000000/1024),
 	       	task_vtgid(p),
-		task_vpid(p), pid_alive(p) ?
-			task_vtgid(p->group_leader->real_parent) : 0,
-		pid_alive(p) && p->ptrace ? task_vpid(p->parent) : 0,
+		task_vpid(p),
+		ppid, tpid,
 		p->uid, p->euid, p->suid, p->fsuid,
 		p->gid, p->egid, p->sgid, p->fsgid);
 	read_unlock(&tasklist_lock);

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 19/34] PID Virtualization Define pid_to_vpid functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (17 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 18/34] PID Virtualization code enhancements for virtual pids in /proc Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 20/34] PID Virtualization Use pid_to_vpid conversion functions Serge Hallyn
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F6-define-pid-to-vpid-translation.patch --]
[-- Type: text/plain, Size: 1122 bytes --]

This patch introduces conversion functions that translate
pids into virtual pids. It defines only the API, not the
implementation yet.
Subsequent patches use these internal functions to rewrite
the task virtual pid/ppid/tgid access functions, so that in
the end only these conversion functions have to be rewritten
to obtain the actual pid virtualization.
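
A minimal userspace sketch of this staging, assuming nothing beyond what
the patch shows: the stub is an identity function, and a call site
written against it keeps working unchanged once the stub body is
replaced. struct demo_queue and demo_report_last_sender() are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Identity stub with the same shape as the pid_to_vpid() added here. */
static inline pid_t demo_pid_to_vpid(pid_t pid)
{
	return pid;
}

/* Hypothetical object that remembers a kernel pid. */
struct demo_queue {
	pid_t last_sender;
};

/*
 * Hypothetical call site at the user/kernel boundary: the stored kernel
 * pid goes through the conversion before it is reported to the caller.
 */
static pid_t demo_report_last_sender(const struct demo_queue *q)
{
	assert(q != NULL);
	return demo_pid_to_vpid(q->last_sender);
}
```

While the stub is an identity function the reported value equals the
stored one, so converting call sites first does not change behaviour.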

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 sched.h |   14 ++++++++++++++
 1 files changed, 14 insertions(+)

Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:03.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:05.000000000 -0500
@@ -865,6 +865,20 @@
 }
 
 /**
+ *  pid domain translation functions:
+ *	- from kernel to user pid domain
+ */
+static inline pid_t pid_to_vpid(pid_t pid)
+{
+	return pid;
+}
+
+static inline pid_t pgid_to_vpgid(pid_t pid)
+{
+	return pid;
+}
+
+/**
  * pid_alive - check that a task structure is not stale
  * @p: Task structure to be checked.
  *

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 20/34] PID Virtualization Use pid_to_vpid conversion functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (18 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 19/34] PID Virtualization Define pid_to_vpid functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 21/34] PID Virtualization file owner pid virtualization Serge Hallyn
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F7-pid-to-vpid-translation.patch --]
[-- Type: text/plain, Size: 6107 bytes --]

Use the pid_to_vpid translation function
to return a virtual pid to userspace.
It needs to be applied where the previously defined
task access functions cannot be used.

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 drivers/char/tty_io.c |    2 +-
 fs/binfmt_elf.c       |    4 ++--
 fs/proc/array.c       |    4 ++--
 ipc/msg.c             |    8 ++++----
 ipc/shm.c             |    8 ++++----
 kernel/fork.c         |    9 ++++++---
 kernel/sys.c          |    4 ++--
 7 files changed, 21 insertions(+), 18 deletions(-)

Index: linux-2.6.15/drivers/char/tty_io.c
===================================================================
--- linux-2.6.15.orig/drivers/char/tty_io.c	2006-01-17 08:36:54.000000000 -0500
+++ linux-2.6.15/drivers/char/tty_io.c	2006-01-17 08:37:05.000000000 -0500
@@ -2158,7 +2158,7 @@
 	 */
 	if (tty == real_tty && current->signal->tty != real_tty)
 		return -ENOTTY;
-	return put_user(real_tty->pgrp, p);
+	return put_user(pid_to_vpid(real_tty->pgrp), p);
 }
 
 static int tiocspgrp(struct tty_struct *tty, struct tty_struct *real_tty, pid_t __user *p)
Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:37:05.000000000 -0500
@@ -379,11 +379,11 @@
 	}
 	if (task->signal) {
 		if (task->signal->tty) {
-			tty_pgrp = task->signal->tty->pgrp;
+			tty_pgrp = pid_to_vpid(task->signal->tty->pgrp);
 			tty_nr = new_encode_dev(tty_devnum(task->signal->tty));
 		}
 		pgid = virt_process_group(task);
-		sid = task->signal->session;
+		sid = pid_to_vpid(task->signal->session);
 		cmin_flt = task->signal->cmin_flt;
 		cmaj_flt = task->signal->cmaj_flt;
 		cutime = task->signal->cutime;
Index: linux-2.6.15/ipc/msg.c
===================================================================
--- linux-2.6.15.orig/ipc/msg.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/ipc/msg.c	2006-01-17 08:37:05.000000000 -0500
@@ -416,8 +416,8 @@
 		tbuf.msg_cbytes = msq->q_cbytes;
 		tbuf.msg_qnum   = msq->q_qnum;
 		tbuf.msg_qbytes = msq->q_qbytes;
-		tbuf.msg_lspid  = msq->q_lspid;
-		tbuf.msg_lrpid  = msq->q_lrpid;
+		tbuf.msg_lspid  = pid_to_vpid(msq->q_lspid);
+		tbuf.msg_lrpid  = pid_to_vpid(msq->q_lrpid);
 		msg_unlock(msq);
 		if (copy_msqid_to_user(buf, &tbuf, version))
 			return -EFAULT;
@@ -821,8 +821,8 @@
 			  msq->q_perm.mode,
 			  msq->q_cbytes,
 			  msq->q_qnum,
-			  msq->q_lspid,
-			  msq->q_lrpid,
+			  pid_to_vpid(msq->q_lspid),
+			  pid_to_vpid(msq->q_lrpid),
 			  msq->q_perm.uid,
 			  msq->q_perm.gid,
 			  msq->q_perm.cuid,
Index: linux-2.6.15/ipc/shm.c
===================================================================
--- linux-2.6.15.orig/ipc/shm.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/ipc/shm.c	2006-01-17 08:37:05.000000000 -0500
@@ -508,8 +508,8 @@
 		tbuf.shm_atime	= shp->shm_atim;
 		tbuf.shm_dtime	= shp->shm_dtim;
 		tbuf.shm_ctime	= shp->shm_ctim;
-		tbuf.shm_cpid	= shp->shm_cprid;
-		tbuf.shm_lpid	= shp->shm_lprid;
+		tbuf.shm_cpid	= pid_to_vpid(shp->shm_cprid);
+		tbuf.shm_lpid	= pid_to_vpid(shp->shm_lprid);
 		if (!is_file_hugepages(shp->shm_file))
 			tbuf.shm_nattch	= shp->shm_nattch;
 		else
@@ -896,8 +896,8 @@
 			  shp->id,
 			  shp->shm_flags,
 			  shp->shm_segsz,
-			  shp->shm_cprid,
-			  shp->shm_lprid,
+			  pid_to_vpid(shp->shm_cprid),
+			  pid_to_vpid(shp->shm_lprid),
 			  is_file_hugepages(shp->shm_file) ? (file_count(shp->shm_file) - 1) : shp->shm_nattch,
 			  shp->shm_perm.uid,
 			  shp->shm_perm.gid,
Index: linux-2.6.15/kernel/fork.c
===================================================================
--- linux-2.6.15.orig/kernel/fork.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/kernel/fork.c	2006-01-17 08:37:05.000000000 -0500
@@ -1238,6 +1238,7 @@
 	struct task_struct *p;
 	int trace = 0;
 	long pid = alloc_pidmap();
+	long vpid;
 
 	if (pid < 0)
 		return -EAGAIN;
@@ -1268,13 +1269,15 @@
 			set_tsk_thread_flag(p, TIF_SIGPENDING);
 		}
 
+		vpid = pid_to_vpid(pid);
+
 		if (!(clone_flags & CLONE_STOPPED))
 			wake_up_new_task(p, clone_flags);
 		else
 			p->state = TASK_STOPPED;
 
 		if (unlikely (trace)) {
-			current->ptrace_message = pid;
+			current->ptrace_message = vpid;
 			ptrace_notify ((trace << 8) | SIGTRAP);
 		}
 
@@ -1285,9 +1288,9 @@
 		}
 	} else {
 		free_pidmap(pid);
-		pid = PTR_ERR(p);
+		vpid = PTR_ERR(p);
 	}
-	return pid;
+	return vpid;
 }
 
 void __init proc_caches_init(void)
Index: linux-2.6.15/kernel/sys.c
===================================================================
--- linux-2.6.15.orig/kernel/sys.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/kernel/sys.c	2006-01-17 08:37:05.000000000 -0500
@@ -1187,7 +1187,7 @@
 asmlinkage long sys_getsid(pid_t pid)
 {
 	if (!pid) {
-		return current->signal->session;
+		return pid_to_vpid(current->signal->session);
 	} else {
 		int retval;
 		struct task_struct *p;
@@ -1199,7 +1199,7 @@
 		if(p) {
 			retval = security_task_getsid(p);
 			if (!retval)
-				retval = p->signal->session;
+				retval = pid_to_vpid(p->signal->session);
 		}
 		read_unlock(&tasklist_lock);
 		return retval;
Index: linux-2.6.15/fs/binfmt_elf.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_elf.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/fs/binfmt_elf.c	2006-01-17 08:37:05.000000000 -0500
@@ -1273,7 +1273,7 @@
 	prstatus->pr_pid = task_vpid(p);
 	prstatus->pr_ppid = task_vppid(p);
 	prstatus->pr_pgrp = virt_process_group(p);
-	prstatus->pr_sid = p->signal->session;
+	prstatus->pr_sid = pid_to_vpid(p->signal->session);
 	if (thread_group_leader(p)) {
 		/*
 		 * This is the record for the group leader.  Add in the
@@ -1319,7 +1319,7 @@
 	psinfo->pr_pid = task_vpid(p);
 	psinfo->pr_ppid = task_vppid(p);
 	psinfo->pr_pgrp = virt_process_group(p);
-	psinfo->pr_sid = p->signal->session;
+	psinfo->pr_sid = pid_to_vpid(p->signal->session);
 
 	i = p->state ? ffz(~p->state) + 1 : 0;
 	psinfo->pr_state = i;

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 21/34] PID Virtualization file owner pid virtualization
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (19 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 20/34] PID Virtualization Use pid_to_vpid conversion functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 22/34] PID Virtualization define vpid_to_pid functions Serge Hallyn
                   ` (13 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F8-pgid-to-vpgid-translation.patch --]
[-- Type: text/plain, Size: 792 bytes --]

Use the internal pgid_to_vpgid function for the
process group id. This is specifically for the owner of
a file, which is returned through the fcntl
system call.

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 fcntl.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.15/fs/fcntl.c
===================================================================
--- linux-2.6.15.orig/fs/fcntl.c	2006-01-17 08:17:29.000000000 -0500
+++ linux-2.6.15/fs/fcntl.c	2006-01-17 08:37:05.000000000 -0500
@@ -316,7 +316,7 @@
 		 * current syscall conventions, the only way
 		 * to fix this will be in libc.
 		 */
-		err = filp->f_owner.pid;
+		err = pgid_to_vpgid(filp->f_owner.pid);
 		force_successful_syscall_return();
 		break;
 	case F_SETOWN:

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 22/34] PID Virtualization define vpid_to_pid functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (20 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 21/34] PID Virtualization file owner pid virtualization Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 23/34] PID Virtualization Use " Serge Hallyn
                   ` (12 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: F9-define-vpid-to-pid-translation.patch --]
[-- Type: text/plain, Size: 1168 bytes --]

Introduce the reverse conversion functions, namely from the
user virtual pid to the kernel pid.
Again, we only specify the API here. Subsequent patches use
the API at the appropriate locations and finally provide a
real implementation of the virtualization behind these
functions, together with the pid_to_vpid conversion.
Any pid passed through the syscall interface from userspace
is virtual and therefore must pass through this conversion
before it can be used as a kernel pid.
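
To make the two directions concrete, here is a toy userspace model. It
assumes a per-container table in which slot i holds the kernel pid of
the task with vpid i + 1; that table, struct demo_container and both
demo_* functions are assumptions for illustration and not what the
series implements.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

#define DEMO_MAX_TASKS 8

/* Hypothetical container: slot i holds the kernel pid for vpid i + 1. */
struct demo_container {
	pid_t kernel_pid[DEMO_MAX_TASKS];
	size_t count;
};

/* Kernel pid -> virtual pid; -1 if the pid is not in the container. */
static pid_t demo_pid_to_vpid(const struct demo_container *c, pid_t pid)
{
	size_t i;

	assert(c != NULL);
	for (i = 0; i < c->count; i++)
		if (c->kernel_pid[i] == pid)
			return (pid_t)(i + 1);
	return -1;
}

/* Virtual pid -> kernel pid; -1 if the container has no such vpid. */
static pid_t demo_vpid_to_pid(const struct demo_container *c, pid_t vpid)
{
	assert(c != NULL);
	if (vpid < 1 || (size_t)vpid > c->count)
		return -1;
	return c->kernel_pid[vpid - 1];
}
```

In this model, converting a valid vpid to a kernel pid and back returns
the original vpid, and anything userspace passes in that is not a valid
vpid is rejected before it is used as a kernel pid.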

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 sched.h |   10 ++++++++++
 1 files changed, 10 insertions(+)

Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:06.000000000 -0500
@@ -878,6 +878,16 @@
 	return pid;
 }
 
+static inline pid_t vpid_to_pid(pid_t pid)
+{
+	return pid;
+}
+
+static inline pid_t vpgid_to_pgid(pid_t pid)
+{
+	return pid;
+}
+
 /**
  * pid_alive - check that a task structure is not stale
  * @p: Task structure to be checked.

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 23/34] PID Virtualization Use vpid_to_pid functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (21 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 22/34] PID Virtualization define vpid_to_pid functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 24/34] PID Virtualization use vpgid_to_pgid function Serge Hallyn
                   ` (11 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: FA-vpid-to-pid-translation.patch --]
[-- Type: text/plain, Size: 6966 bytes --]

We now use the vpid_to_pid function wherever
a pid is passed from user space and needs to be converted
into a kernel pid.
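
One recurring pattern in the hunks below is an argument of 0 meaning
"the calling task". A small userspace sketch of that pattern follows;
demo_resolve_who() is a hypothetical helper and the stub is an identity
function, as in the previous patch.

```c
#include <assert.h>
#include <sys/types.h>

/* Identity stub standing in for vpid_to_pid(). */
static inline pid_t demo_vpid_to_pid(pid_t vpid)
{
	return vpid;
}

/*
 * Resolve the "who" argument of a setpriority-style call.
 * current_pid is already a kernel pid, so it is used as is; only a
 * value that really came from userspace goes through the conversion.
 */
static pid_t demo_resolve_who(pid_t who, pid_t current_pid)
{
	assert(current_pid > 0);
	if (!who)
		return current_pid;
	return demo_vpid_to_pid(who);
}
```

Keeping the conversion in the else branch avoids translating a kernel
pid a second time once the stub is replaced by a real mapping.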

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 arch/ia64/kernel/ptrace.c |    1 +
 arch/s390/kernel/ptrace.c |    1 +
 drivers/char/tty_io.c     |    1 +
 fs/proc/base.c            |    2 ++
 kernel/capability.c       |    1 +
 kernel/exit.c             |    2 ++
 kernel/ptrace.c           |    1 +
 kernel/sched.c            |    6 +++++-
 kernel/signal.c           |    7 +++++--
 kernel/sys.c              |   14 ++++++++++++++
 10 files changed, 33 insertions(+), 3 deletions(-)

Index: linux-2.6.15/arch/ia64/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/ia64/kernel/ptrace.c	2006-01-17 08:17:29.000000000 -0500
+++ linux-2.6.15/arch/ia64/kernel/ptrace.c	2006-01-17 08:37:06.000000000 -0500
@@ -1419,6 +1419,7 @@
 	struct switch_stack *sw;
 	long ret;
 
+	pid = vpid_to_pid(pid);
 	lock_kernel();
 	ret = -EPERM;
 	if (request == PTRACE_TRACEME) {
Index: linux-2.6.15/arch/s390/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/arch/s390/kernel/ptrace.c	2006-01-17 08:17:29.000000000 -0500
+++ linux-2.6.15/arch/s390/kernel/ptrace.c	2006-01-17 08:37:06.000000000 -0500
@@ -711,6 +711,7 @@
 	struct task_struct *child;
 	int ret;
 
+	pid = vpid_to_pid(pid);
 	lock_kernel();
 
 	if (request == PTRACE_TRACEME) {
Index: linux-2.6.15/drivers/char/tty_io.c
===================================================================
--- linux-2.6.15.orig/drivers/char/tty_io.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/drivers/char/tty_io.c	2006-01-17 08:37:06.000000000 -0500
@@ -2176,6 +2176,7 @@
 		return -ENOTTY;
 	if (get_user(pgrp, p))
 		return -EFAULT;
+	pgrp = vpid_to_pid(pgrp);
 	if (pgrp < 0)
 		return -EINVAL;
 	if (session_of_pgrp(pgrp) != current->signal->session)
Index: linux-2.6.15/fs/proc/base.c
===================================================================
--- linux-2.6.15.orig/fs/proc/base.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/fs/proc/base.c	2006-01-17 08:37:06.000000000 -0500
@@ -1975,6 +1975,7 @@
 	tgid = name_to_int(dentry);
 	if (tgid == ~0U)
 		goto out;
+	tgid = vpid_to_pid(tgid);
 
 	read_lock(&tasklist_lock);
 	task = find_task_by_pid(tgid);
@@ -2032,6 +2033,7 @@
 	unsigned tid;
 
 	tid = name_to_int(dentry);
+	tid = vpid_to_pid(tid);
 	if (tid == ~0U)
 		goto out;
 
Index: linux-2.6.15/kernel/capability.c
===================================================================
--- linux-2.6.15.orig/kernel/capability.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/capability.c	2006-01-17 08:37:06.000000000 -0500
@@ -63,6 +63,7 @@
      if (pid < 0) 
              return -EINVAL;
 
+     pid = vpid_to_pid(pid);
      spin_lock(&task_capability_lock);
      read_lock(&tasklist_lock); 
 
Index: linux-2.6.15/kernel/exit.c
===================================================================
--- linux-2.6.15.orig/kernel/exit.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/kernel/exit.c	2006-01-17 08:37:06.000000000 -0500
@@ -1529,10 +1529,12 @@
 	case P_PID:
 		if (pid <= 0)
 			return -EINVAL;
+		pid = vpid_to_pid(pid);
 		break;
 	case P_PGID:
 		if (pid <= 0)
 			return -EINVAL;
+		pid = vpid_to_pid(pid);
 		pid = -pid;
 		break;
 	default:
Index: linux-2.6.15/kernel/sched.c
===================================================================
--- linux-2.6.15.orig/kernel/sched.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/kernel/sched.c	2006-01-17 08:37:06.000000000 -0500
@@ -3680,7 +3680,11 @@
  */
 static inline task_t *find_process_by_pid(pid_t pid)
 {
-	return pid ? find_task_by_pid(pid) : current;
+	if (pid) {
+		pid = vpid_to_pid(pid);
+		return find_task_by_pid(pid);
+	}
+	return current;
 }
 
 /* Actually do priority change: must hold rq lock. */
Index: linux-2.6.15/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/kernel/signal.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/kernel/signal.c	2006-01-17 08:37:06.000000000 -0500
@@ -1209,9 +1209,9 @@
 		read_unlock(&tasklist_lock);
 		return count ? retval : -ESRCH;
 	} else if (pid < 0) {
-		return kill_pg_info(sig, info, -pid);
+		return kill_pg_info(sig, info, vpid_to_pid(-pid));
 	} else {
-		return kill_proc_info(sig, info, pid);
+		return kill_proc_info(sig, info, vpid_to_pid(pid));
 	}
 }
 
@@ -2264,6 +2264,8 @@
 	info.si_pid = task_vtgid(current);
 	info.si_uid = current->uid;
 
+	pid  = vpid_to_pid(pid);
+	tgid = vpid_to_pid(tgid);
 	read_lock(&tasklist_lock);
 	p = find_task_by_pid(pid);
 	if (p && (tgid <= 0 || task_tgid(p) == tgid)) {
@@ -2331,6 +2333,7 @@
 	info.si_signo = sig;
 
 	/* POSIX.1b doesn't mention process groups.  */
+	pid = vpid_to_pid(pid);
 	return kill_proc_info(sig, &info, pid);
 }
 
Index: linux-2.6.15/kernel/sys.c
===================================================================
--- linux-2.6.15.orig/kernel/sys.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/kernel/sys.c	2006-01-17 08:37:06.000000000 -0500
@@ -269,6 +269,8 @@
 		case PRIO_PROCESS:
 			if (!who)
 				who = task_pid(current);
+			else
+				who = vpid_to_pid(who);
 			p = find_task_by_pid(who);
 			if (p)
 				error = set_one_prio(p, niceval, error);
@@ -276,6 +278,8 @@
 		case PRIO_PGRP:
 			if (!who)
 				who = process_group(current);
+			else
+				who = vpid_to_pid(who);
 			do_each_task_pid(who, PIDTYPE_PGID, p) {
 				error = set_one_prio(p, niceval, error);
 			} while_each_task_pid(who, PIDTYPE_PGID, p);
@@ -322,6 +326,8 @@
 		case PRIO_PROCESS:
 			if (!who)
 				who = task_pid(current);
+			else
+				who = vpid_to_pid(who);
 			p = find_task_by_pid(who);
 			if (p) {
 				niceval = 20 - task_nice(p);
@@ -332,6 +338,8 @@
 		case PRIO_PGRP:
 			if (!who)
 				who = process_group(current);
+			else
+				who = vpid_to_pid(who);
 			do_each_task_pid(who, PIDTYPE_PGID, p) {
 				niceval = 20 - task_nice(p);
 				if (niceval > retval)
@@ -1088,8 +1096,12 @@
 
 	if (!pid)
 		pid = task_pid(current);
+	else
+		pid = vpid_to_pid(pid);
 	if (!pgid)
 		pgid = pid;
+	else
+		pgid = vpid_to_pid(pgid);
 	if (pgid < 0)
 		return -EINVAL;
 
@@ -1160,6 +1172,7 @@
 		int retval;
 		struct task_struct *p;
 
+		pid = vpid_to_pid(pid);
 		read_lock(&tasklist_lock);
 		p = find_task_by_pid(pid);
 
@@ -1192,6 +1205,7 @@
 		int retval;
 		struct task_struct *p;
 
+		pid = vpid_to_pid(pid);
 		read_lock(&tasklist_lock);
 		p = find_task_by_pid(pid);
 
Index: linux-2.6.15/kernel/ptrace.c
===================================================================
--- linux-2.6.15.orig/kernel/ptrace.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/ptrace.c	2006-01-17 08:37:06.000000000 -0500
@@ -440,6 +440,7 @@
 	/*
 	 * You may not mess with init
 	 */
+	pid = vpid_to_pid(pid);
 	if (pid == 1)
 		return -EPERM;
 

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 24/34] PID Virtualization use vpgid_to_pgid function
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (22 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 23/34] PID Virtualization Use " Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 25/34] PID Virtualization Context for pid_to_vpid conversition functions Serge Hallyn
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: FB-vpgid-to-pgid-translation.patch --]
[-- Type: text/plain, Size: 1766 bytes --]

Same as the previous patch for pids, but here we focus on virtual
ids that are interpreted as process group ids. Since process
group ids can be passed as negative values, they get their own
conversion function, which has to deal with the negative value.
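
The stub in this series still returns its argument unchanged. One way a
later implementation might treat the sign is sketched below, purely as
an assumption: the magnitude is translated and the sign is put back,
while 0 and -1 pass through untouched. DEMO_PID_OFFSET and both demo_*
functions are hypothetical.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical fixed offset standing in for a real per-container lookup. */
#define DEMO_PID_OFFSET 4000

/* Toy translation of a strictly positive virtual id. */
static pid_t demo_vpid_to_pid(pid_t vpid)
{
	assert(vpid > 0);
	return vpid + DEMO_PID_OFFSET;
}

/*
 * Translate an id that may encode a process group as a negative number.
 * Only the magnitude is translated; the sign is preserved.
 */
static pid_t demo_vpgid_to_pgid(pid_t id)
{
	if (id == 0 || id == -1)
		return id;	/* special values carry no id to translate */
	if (id < 0)
		return -demo_vpid_to_pid(-id);
	return demo_vpid_to_pid(id);
}
```

The pass-through for -1 matches the explicit check in the wait4 hunk
below; treating 0 the same way is an extra assumption of this sketch.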

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 fs/fcntl.c          |    1 +
 kernel/capability.c |    1 +
 kernel/exit.c       |    2 ++
 3 files changed, 4 insertions(+)

Index: linux-2.6.15/fs/fcntl.c
===================================================================
--- linux-2.6.15.orig/fs/fcntl.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/fs/fcntl.c	2006-01-17 08:37:06.000000000 -0500
@@ -267,6 +267,7 @@
 	if (err)
 		return err;
 
+	arg = vpgid_to_pgid(arg);
 	f_modown(filp, arg, current->uid, current->euid, force);
 	return 0;
 }
Index: linux-2.6.15/kernel/capability.c
===================================================================
--- linux-2.6.15.orig/kernel/capability.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/kernel/capability.c	2006-01-17 08:37:06.000000000 -0500
@@ -188,6 +188,7 @@
      if (get_user(pid, &header->pid))
 	     return -EFAULT; 
 
+     pid = vpgid_to_pgid(pid);
      if (pid && pid != task_pid(current) && !capable(CAP_SETPCAP))
              return -EPERM;
 
Index: linux-2.6.15/kernel/exit.c
===================================================================
--- linux-2.6.15.orig/kernel/exit.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/kernel/exit.c	2006-01-17 08:37:06.000000000 -0500
@@ -1556,6 +1556,8 @@
 	if (options & ~(WNOHANG|WUNTRACED|WCONTINUED|
 			__WNOTHREAD|__WCLONE|__WALL))
 		return -EINVAL;
+	if (pid != -1)
+		pid = vpgid_to_pgid(pid);
 	ret = do_wait(pid, options | WEXITED, NULL, stat_addr, ru);
 
 	/* avoid REGPARM breakage on x86: */

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 25/34] PID Virtualization Context for pid_to_vpid conversition functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (23 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 24/34] PID Virtualization use vpgid_to_pgid function Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 26/34] PID Virtualization Documentation Serge Hallyn
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: FC-context-for-pid2vpid.patch --]
[-- Type: text/plain, Size: 13206 bytes --]

pid_to_vpid conversion requires the context task relative to which
the conversion should take place. For instance, the virtual init process
of a container is vpid=1 relative to the tasks of that container,
vpid=-1 from within a different container and vpid=pid in the global context.

By default we assume that the virtual access functions are called
within the context of the task's container itself.
This patch provides the context for the pid_to_vpid translations,
since vpids are always defined with respect to a context task.
We therefore only identify the call sites where the context differs
from the default.
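
The three cases named above (same container, different container,
global context) can be modelled in userspace as follows. This is a
sketch under stated assumptions, not the series' implementation:
struct demo_task, its container field and the rule that container 0 is
the default system container are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical task; container 0 plays the default system container. */
struct demo_task {
	pid_t pid;	/* kernel pid */
	pid_t vpid;	/* pid as seen inside the task's own container */
	int container;
};

/* Virtual pid of task t as seen from the context task ctx. */
static pid_t demo_task_vpid_ctx(const struct demo_task *t,
				const struct demo_task *ctx)
{
	assert(t != NULL && ctx != NULL);
	if (ctx->container == 0)
		return t->pid;	/* global view sees kernel pids */
	if (ctx->container == t->container)
		return t->vpid;	/* same container: container-local pid */
	return -1;		/* different container: not visible */
}
```

For a container init task with kernel pid 4100, a peer in the same
container gets 1, a task in another container gets -1 and a task in the
default container gets 4100.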

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 fs/binfmt_elf.c       |    4 ++--
 fs/fcntl.c            |    2 +-
 fs/proc/array.c       |   18 +++++++++---------
 fs/proc/base.c        |    4 ++--
 include/linux/sched.h |   41 +++++++++++++++++++++++++++++++++++++----
 ipc/msg.c             |    8 ++++----
 ipc/sem.c             |    2 +-
 ipc/shm.c             |    8 ++++----
 kernel/exit.c         |    4 ++--
 kernel/fork.c         |    3 ++-
 kernel/signal.c       |    4 ++--
 kernel/sys.c          |    7 ++++---
 kernel/timer.c        |    2 +-
 13 files changed, 71 insertions(+), 36 deletions(-)

Index: linux-2.6.15/fs/proc/base.c
===================================================================
--- linux-2.6.15.orig/fs/proc/base.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/fs/proc/base.c	2006-01-17 08:37:07.000000000 -0500
@@ -2102,7 +2102,7 @@
 		p = next_task(&init_task);
 
 	for ( ; p != &init_task; p = next_task(p)) {
-		int tgid = task_vpid(p);
+		int tgid = task_vpid_ctx(p, current);
 		if (!pid_alive(p))
 			continue;
 		if (--index >= 0)
@@ -2135,7 +2135,7 @@
 	 * via next_thread().
 	 */
 	if (pid_alive(task)) do {
-		int tid = task_vpid(task);
+		int tid = task_vpid_ctx(task, current);
 
 		if (--index >= 0)
 			continue;
Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:37:07.000000000 -0500
@@ -165,11 +165,11 @@
 
 	read_lock(&tasklist_lock);
 	if (pid_alive(p))
-		ppid = task_vtgid(p->group_leader->real_parent);
+		ppid = task_vtgid_ctx(p->group_leader->real_parent, current);
 	else
 		ppid = 0;
 	if (pid_alive(p) && p->ptrace)
-		tpid = task_vppid(p);
+		tpid = task_vppid_ctx(p, current);
 	else
 		tpid = 0;
 	buffer += sprintf(buffer,
@@ -183,8 +183,8 @@
 		"Gid:\t%d\t%d\t%d\t%d\n",
 		get_task_state(p),
 		(p->sleep_avg/1024)*100/(1020000000/1024),
-	       	task_vtgid(p),
-		task_vpid(p),
+	       	task_vtgid_ctx(p,current),
+		task_vpid_ctx(p,current),
 		ppid, tpid,
 		p->uid, p->euid, p->suid, p->fsuid,
 		p->gid, p->egid, p->sgid, p->fsgid);
@@ -379,11 +379,11 @@
 	}
 	if (task->signal) {
 		if (task->signal->tty) {
-			tty_pgrp = pid_to_vpid(task->signal->tty->pgrp);
+			tty_pgrp = pid_to_vpid_ctx(task->signal->tty->pgrp, current);
 			tty_nr = new_encode_dev(tty_devnum(task->signal->tty));
 		}
-		pgid = virt_process_group(task);
-		sid = pid_to_vpid(task->signal->session);
+		pgid = pid_to_vpid_ctx(process_group(task), current);
+		sid = pid_to_vpid_ctx(task->signal->session, current);
 		cmin_flt = task->signal->cmin_flt;
 		cmaj_flt = task->signal->cmaj_flt;
 		cutime = task->signal->cutime;
@@ -398,7 +398,7 @@
 		it_real_value = task->signal->it_real_value;
 	}
 	ppid = pid_alive(task) ?
-		task_vtgid(task->group_leader->real_parent) : 0;
+		pid_to_vpid_ctx(task_tgid(task->group_leader->real_parent), current) : 0;
 	read_unlock(&tasklist_lock);
 
 	if (!whole || num_threads<2)
@@ -425,7 +425,7 @@
 	res = sprintf(buffer,"%d (%s) %c %d %d %d %d %d %lu %lu \
 %lu %lu %lu %lu %lu %ld %ld %ld %ld %d %ld %llu %lu %ld %lu %lu %lu %lu %lu \
 %lu %lu %lu %lu %lu %lu %lu %lu %d %d %lu %lu\n",
-		task_vpid(task),
+		task_vpid_ctx(task,current),
 		tcomm,
 		state,
 		ppid,
Index: linux-2.6.15/fs/fcntl.c
===================================================================
--- linux-2.6.15.orig/fs/fcntl.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/fs/fcntl.c	2006-01-17 08:37:07.000000000 -0500
@@ -317,7 +317,7 @@
 		 * current syscall conventions, the only way
 		 * to fix this will be in libc.
 		 */
-		err = pgid_to_vpgid(filp->f_owner.pid);
+		err = pgid_to_vpgid_ctx(filp->f_owner.pid, current);
 		force_successful_syscall_return();
 		break;
 	case F_SETOWN:
Index: linux-2.6.15/ipc/msg.c
===================================================================
--- linux-2.6.15.orig/ipc/msg.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/ipc/msg.c	2006-01-17 08:37:07.000000000 -0500
@@ -416,8 +416,8 @@
 		tbuf.msg_cbytes = msq->q_cbytes;
 		tbuf.msg_qnum   = msq->q_qnum;
 		tbuf.msg_qbytes = msq->q_qbytes;
-		tbuf.msg_lspid  = pid_to_vpid(msq->q_lspid);
-		tbuf.msg_lrpid  = pid_to_vpid(msq->q_lrpid);
+		tbuf.msg_lspid  = pid_to_vpid_ctx(msq->q_lspid, current);
+		tbuf.msg_lrpid  = pid_to_vpid_ctx(msq->q_lrpid, current);
 		msg_unlock(msq);
 		if (copy_msqid_to_user(buf, &tbuf, version))
 			return -EFAULT;
@@ -821,8 +821,8 @@
 			  msq->q_perm.mode,
 			  msq->q_cbytes,
 			  msq->q_qnum,
-			  pid_to_vpid(msq->q_lspid),
-			  pid_to_vpid(msq->q_lrpid),
+			  pid_to_vpid_ctx(msq->q_lspid, current),
+			  pid_to_vpid_ctx(msq->q_lrpid, current),
 			  msq->q_perm.uid,
 			  msq->q_perm.gid,
 			  msq->q_perm.cuid,
Index: linux-2.6.15/ipc/shm.c
===================================================================
--- linux-2.6.15.orig/ipc/shm.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/ipc/shm.c	2006-01-17 08:37:07.000000000 -0500
@@ -508,8 +508,8 @@
 		tbuf.shm_atime	= shp->shm_atim;
 		tbuf.shm_dtime	= shp->shm_dtim;
 		tbuf.shm_ctime	= shp->shm_ctim;
-		tbuf.shm_cpid	= pid_to_vpid(shp->shm_cprid);
-		tbuf.shm_lpid	= pid_to_vpid(shp->shm_lprid);
+		tbuf.shm_cpid	= pid_to_vpid_ctx(shp->shm_cprid, current);
+		tbuf.shm_lpid	= pid_to_vpid_ctx(shp->shm_lprid, current);
 		if (!is_file_hugepages(shp->shm_file))
 			tbuf.shm_nattch	= shp->shm_nattch;
 		else
@@ -896,8 +896,8 @@
 			  shp->id,
 			  shp->shm_flags,
 			  shp->shm_segsz,
-			  pid_to_vpid(shp->shm_cprid),
-			  pid_to_vpid(shp->shm_lprid),
+			  pid_to_vpid_ctx(shp->shm_cprid, current),
+			  pid_to_vpid_ctx(shp->shm_lprid, current),
 			  is_file_hugepages(shp->shm_file) ? (file_count(shp->shm_file) - 1) : shp->shm_nattch,
 			  shp->shm_perm.uid,
 			  shp->shm_perm.gid,
Index: linux-2.6.15/ipc/sem.c
===================================================================
--- linux-2.6.15.orig/ipc/sem.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/ipc/sem.c	2006-01-17 08:37:07.000000000 -0500
@@ -723,7 +723,7 @@
 		err = curr->semval;
 		goto out_unlock;
 	case GETPID:
-		err = curr->sempid;
+		err = pid_to_vpid_ctx(curr->sempid, current);
 		goto out_unlock;
 	case GETNCNT:
 		err = count_semncnt(sma,semnum);
Index: linux-2.6.15/kernel/exit.c
===================================================================
--- linux-2.6.15.orig/kernel/exit.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/kernel/exit.c	2006-01-17 08:37:07.000000000 -0500
@@ -1143,7 +1143,7 @@
 		p->exit_state = EXIT_ZOMBIE;
 		return retval;
 	}
-	retval = task_vpid(p);
+	retval = task_vpid_ctx(p, current);
 	if (p->real_parent != p->parent) {
 		write_lock_irq(&tasklist_lock);
 		/* Double-check with lock held.  */
@@ -1278,7 +1278,7 @@
 	if (!retval && infop)
 		retval = put_user(p->uid, &infop->si_uid);
 	if (!retval)
-		retval = task_vpid(p);
+		retval = task_vpid_ctx(p, current);
 	put_task_struct(p);
 
 	BUG_ON(!retval);
Index: linux-2.6.15/kernel/fork.c
===================================================================
--- linux-2.6.15.orig/kernel/fork.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/kernel/fork.c	2006-01-17 08:37:07.000000000 -0500
@@ -926,9 +926,10 @@
 	p->did_exec = 0;
 	copy_flags(clone_flags, p);
 	p->__pid = pid;
+
 	retval = -EFAULT;
 	if (clone_flags & CLONE_PARENT_SETTID)
-		if (put_user(task_vpid(p), parent_tidptr))
+		if (put_user(task_vpid_ctx(p, current), parent_tidptr))
 			goto bad_fork_cleanup;
 
 	p->proc_dentry = NULL;
Index: linux-2.6.15/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/kernel/signal.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/kernel/signal.c	2006-01-17 08:37:07.000000000 -0500
@@ -1469,7 +1469,7 @@
 
 	info.si_signo = sig;
 	info.si_errno = 0;
-	info.si_pid = task_vpid(tsk);
+	info.si_pid = task_vpid_ctx(tsk, tsk->parent);
 	info.si_uid = tsk->uid;
 
 	/* FIXME: find out whether or not this is supposed to be c*time. */
@@ -1534,7 +1534,7 @@
 
 	info.si_signo = SIGCHLD;
 	info.si_errno = 0;
-	info.si_pid = task_vpid(tsk);
+	info.si_pid = task_vpid_ctx(tsk, tsk->parent);
 	info.si_uid = tsk->uid;
 
 	/* FIXME: find out whether or not this is supposed to be c*time. */
Index: linux-2.6.15/kernel/sys.c
===================================================================
--- linux-2.6.15.orig/kernel/sys.c	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/kernel/sys.c	2006-01-17 08:37:07.000000000 -0500
@@ -1180,7 +1180,8 @@
 		if (p) {
 			retval = security_task_getpgid(p);
 			if (!retval)
-				retval = virt_process_group(p);
+				retval = pid_to_vpid_ctx(process_group(p),
+							 current);
 		}
 		read_unlock(&tasklist_lock);
 		return retval;
@@ -1200,7 +1201,7 @@
 asmlinkage long sys_getsid(pid_t pid)
 {
 	if (!pid) {
-		return pid_to_vpid(current->signal->session);
+		return pid_to_vpid_ctx(current->signal->session,current);
 	} else {
 		int retval;
 		struct task_struct *p;
@@ -1213,7 +1214,7 @@
 		if(p) {
 			retval = security_task_getsid(p);
 			if (!retval)
-				retval = pid_to_vpid(p->signal->session);
+				retval = pid_to_vpid_ctx(p->signal->session, current);
 		}
 		read_unlock(&tasklist_lock);
 		return retval;
Index: linux-2.6.15/kernel/timer.c
===================================================================
--- linux-2.6.15.orig/kernel/timer.c	2006-01-17 08:37:04.000000000 -0500
+++ linux-2.6.15/kernel/timer.c	2006-01-17 08:37:07.000000000 -0500
@@ -968,7 +968,7 @@
 
 	parent = me->group_leader->real_parent;
 	for (;;) {
-		pid = task_tgid(parent);
+		pid = task_vtgid_ctx(parent, current);
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT)
 {
 		struct task_struct *old = parent;
Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:06.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:07.000000000 -0500
@@ -868,14 +868,29 @@
  *  pid domain translation functions:
  *	- from kernel to user pid domain
  */
+static inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx)
+{
+	return pid;
+}
+
 static inline pid_t pid_to_vpid(pid_t pid)
 {
+	return pid_to_vpid_ctx(pid, current);
+}
+
+static inline pid_t pgid_to_vpgid_ctx(pid_t pid, const struct task_struct *ctx)
+{
+	int isgrp = (pid < 0) ;
+
+	if (isgrp) pid = -pid;
+	pid = pid_to_vpid_ctx(pid, ctx);
+	if (isgrp) pid = -pid;
 	return pid;
 }
 
 static inline pid_t pgid_to_vpgid(pid_t pid)
 {
-	return pid;
+	return pgid_to_vpgid_ctx(pid, current);
 }
 
 static inline pid_t vpid_to_pid(pid_t pid)
@@ -911,19 +926,37 @@
 	return p->__tgid;
 }
 
-static inline pid_t task_vpid(const struct task_struct *p)
+static inline pid_t task_vpid_ctx(const struct task_struct *p,
+				   const struct task_struct *ctx)
 {
 	return task_pid(p);
 }
 
+static inline pid_t task_vpid(const struct task_struct *p)
+{
+	return task_vpid_ctx(p, p);
+}
+
+static inline pid_t task_vppid_ctx(const struct task_struct *p,
+			      	   const struct task_struct *ctx)
+{
+	return task_vpid_ctx(p->parent, ctx);
+}
+
 static inline pid_t task_vppid(const struct task_struct *p)
 {
-	return task_pid(p->parent);
+	return task_vppid_ctx(p, p);
+}
+
+static inline pid_t task_vtgid_ctx(const struct task_struct *p,
+				    const struct task_struct *ctx)
+{
+	return pid_to_vpid_ctx(task_tgid(p), ctx);
 }
 
 static inline pid_t task_vtgid(const struct task_struct *p)
 {
-	return task_tgid(p);
+	return task_vtgid_ctx(p, p);
 }
 
 static inline pid_t virt_process_group(const struct task_struct *p)
Index: linux-2.6.15/fs/binfmt_elf.c
===================================================================
--- linux-2.6.15.orig/fs/binfmt_elf.c	2006-01-17 08:37:05.000000000 -0500
+++ linux-2.6.15/fs/binfmt_elf.c	2006-01-17 08:37:07.000000000 -0500
@@ -1273,7 +1273,7 @@
 	prstatus->pr_pid = task_vpid(p);
 	prstatus->pr_ppid = task_vppid(p);
 	prstatus->pr_pgrp = virt_process_group(p);
-	prstatus->pr_sid = pid_to_vpid(p->signal->session);
+	prstatus->pr_sid = pid_to_vpid_ctx(p->signal->session, p);
 	if (thread_group_leader(p)) {
 		/*
 		 * This is the record for the group leader.  Add in the
@@ -1319,7 +1319,7 @@
 	psinfo->pr_pid = task_vpid(p);
 	psinfo->pr_ppid = task_vppid(p);
 	psinfo->pr_pgrp = virt_process_group(p);
-	psinfo->pr_sid = pid_to_vpid(p->signal->session);
+	psinfo->pr_sid = pid_to_vpid_ctx(p->signal->session, p);
 
 	i = p->state ? ffz(~p->state) + 1 : 0;
 	psinfo->pr_state = i;

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 26/34] PID Virtualization Documentation
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (24 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 25/34] PID Virtualization Context for pid_to_vpid conversition functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 27/34] PID Virtualization pidspace Serge Hallyn
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G0-documentation.patch --]
[-- Type: text/plain, Size: 2292 bytes --]

First (incomplete) attempt at documentation.
Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 containers.txt |   64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 64 insertions(+)

Index: linux-2.6.15/Documentation/containers.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15/Documentation/containers.txt	2006-01-17 08:37:07.000000000 -0500
@@ -0,0 +1,64 @@
+This document describes the basics of containers.
+
+Hubertus Franke	<frankeh@watson.ibm.com>
+Serge E Hallyn	<serue@us.ibm.com>
+Cedric Legoater <clg@fr.ibm.com>
+
+Applications and their associated processes can be containerized into
+"isolated" soft partitions. The goal is to make containers
+transparently migratable. To do so, certain resource identifiers
+need to be virtualized.
+These include
+	- pids, gids,
+	- SysV ids
+	- procfs
+Only resources belonging to a container can be accessed within
+the container.
+
+A "container" is created through a helper program <contexe>,
+which is supplied separately.
+A process moves itself into a new container by writing
+the name of the container to create to /proc/container.
+Doing so makes the calling process the pseudo init process
+of the container.
+
+
+For example, "contexe -j2 /bin/bash" spawns a bash within
+a new container <cont_2> and makes the contexe process
+the container's virtual init process.
+
+
+PID-VIRTUALIZATION:
+-------------------
+
+Let process <A> be the currently running process (e.g. bash with pid 913).
+Each container has an associated pidspace id. Each pidspace
+is managed like the standard pid range in Linux.
+
+We obtain the following tree, where < pidspace | vpid > denotes the
+internal pid, which is obtained by bitmasking.
+
+A == some older bash == < 0 | 913 >
+	|
+	\/
+B == contexe == < 0 | 1087 >      ( also container->init_proc := B
+				   	 container->init_pid  := 1087 )
+	|
+	\/
+C == /bin/bash == < 1 | 2 >
+
+
+Let's define the results we expect here:
+
+C in context of B:      vpid = 2
+B in context of C:	vpid = 1
+
+B in context of A:	vpid = pid = 1087
+C in context of A:	vpid = pid = < 1 | 2 >
+
+A in context of B:	vpid = pid = 913
+A in context of C:	vpid = -1
+
+< More to Follow >
+
+

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 27/34] PID Virtualization pidspace
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (25 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 26/34] PID Virtualization Documentation Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 28/34] PID Virtualization container object and functions Serge Hallyn
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G1-pidspace.patch --]
[-- Type: text/plain, Size: 7933 bytes --]

This patch introduces pidspaces to provide pid virtualization
capabilities. A pidspace is allocated for each container
and destroyed (its resources freed) when the container
terminates.

The global pid range (32 bit) is partitioned into
PID_MAX_LIMIT-sized pidspaces. The virtualization
is defined as kernel_pid ::= < pidspace_id, vpid >.

This patch reuses the existing pid management,
i.e. allocation and hashing. Each pidspace id gets its own
pidmap, managed the same way the single global pidmap was before.

This patch eliminates the explicit management of vpids and allows
continued usage of the existing pid hashing and lookup functions.
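
As an illustration of the kernel_pid ::= < pidspace_id, vpid > encoding,
here is a minimal user-space sketch of the three helpers this patch adds
to include/linux/pid.h. It assumes PID_MAX_LIMIT_SHIFT is 22 (the 64-bit,
!CONFIG_BASE_SMALL value); types are simplified to int, and the assert()
is an addition for the sketch, not part of the patch.

```c
#include <assert.h>

/* Assumed value: 22 on 64-bit builds without CONFIG_BASE_SMALL. */
#define PID_MAX_LIMIT_SHIFT	22
#define PID_MAX_LIMIT		(1 << PID_MAX_LIMIT_SHIFT)

/* The upper bits of a kernel pid select the pidspace. */
static int pid_to_pidspace(int pid)
{
	return pid >> PID_MAX_LIMIT_SHIFT;
}

/* Combine a pidspace id and a vpid into one kernel pid. */
static int pidspace_vpid_to_pid(int pidspace_id, int vpid)
{
	/* Sketch-only sanity check: the vpid must fit in the low bits. */
	assert(vpid >= 0 && vpid < PID_MAX_LIMIT);
	return (pidspace_id << PID_MAX_LIMIT_SHIFT) | vpid;
}

/* The lower bits of a kernel pid are the vpid. */
static int pidspace_pid_to_vpid(int pid)
{
	return pid & (PID_MAX_LIMIT - 1);
}
```

Under that assumption, vpid 2 in pidspace 1 is kernel pid
(1 << 22) | 2 = 4194306, and any pid in pidspace 0 is its own vpid.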

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 include/linux/pid.h     |   27 +++++++++++-
 include/linux/threads.h |   17 +++++--
 kernel/fork.c           |    2 
 kernel/pid.c            |  105 +++++++++++++++++++++++++++++++++++++++++++-----
 4 files changed, 135 insertions(+), 16 deletions(-)

Index: linux-2.6.15/kernel/fork.c
===================================================================
--- linux-2.6.15.orig/kernel/fork.c	2006-01-17 08:37:07.000000000 -0500
+++ linux-2.6.15/kernel/fork.c	2006-01-17 08:37:08.000000000 -0500
@@ -1238,7 +1238,7 @@
 {
 	struct task_struct *p;
 	int trace = 0;
-	long pid = alloc_pidmap();
+	long pid = alloc_pidmap(DEFAULT_PIDSPACE);
 	long vpid;
 
 	if (pid < 0)
Index: linux-2.6.15/include/linux/pid.h
===================================================================
--- linux-2.6.15.orig/include/linux/pid.h	2006-01-17 08:17:29.000000000 -0500
+++ linux-2.6.15/include/linux/pid.h	2006-01-17 08:37:08.000000000 -0500
@@ -36,7 +36,7 @@
  */
 extern struct pid *FASTCALL(find_pid(enum pid_type, int));
 
-extern int alloc_pidmap(void);
+extern int alloc_pidmap(int pidspace_id);
 extern void FASTCALL(free_pidmap(int));
 extern void switch_exec_pids(struct task_struct *leader, struct task_struct *thread);
 
@@ -51,5 +51,30 @@
 			prefetch((task)->pids[type].pid_list.next),	\
 			hlist_unhashed(&(task)->pids[type].pid_chain));	\
 	}								\
+/*
+ * Pidspace related definition for translation  real <-> virtual
+ * and initialization functions
+ */
+
+#define DEFAULT_PIDSPACE	0
+
+extern int pidspace_init(int pidspace_id);
+extern int pidspace_free(int pidspace_id);
+
+static inline int pid_to_pidspace(int pid)
+{
+	return (pid >> PID_MAX_LIMIT_SHIFT);
+}
+
+static inline int pidspace_vpid_to_pid(int pidspace_id, pid_t pid)
+{
+	return (pidspace_id << PID_MAX_LIMIT_SHIFT) | pid;
+}
+
+static inline int pidspace_pid_to_vpid(pid_t pid)
+{
+	return (pid & (PID_MAX_LIMIT-1));
+}
+
 
 #endif /* _LINUX_PID_H */
Index: linux-2.6.15/include/linux/threads.h
===================================================================
--- linux-2.6.15.orig/include/linux/threads.h	2006-01-17 08:17:29.000000000 -0500
+++ linux-2.6.15/include/linux/threads.h	2006-01-17 08:37:08.000000000 -0500
@@ -25,12 +25,21 @@
 /*
  * This controls the default maximum pid allocated to a process
  */
-#define PID_MAX_DEFAULT (CONFIG_BASE_SMALL ? 0x1000 : 0x8000)
+#define PID_MAX_DEFAULT_SHIFT	(CONFIG_BASE_SMALL ? 12 : 15)
+#define PID_MAX_DEFAULT 	(1<< PID_MAX_DEFAULT_SHIFT)
 
 /*
- * A maximum of 4 million PIDs should be enough for a while:
+ * The entire global pid range is divided into pidspaces,
+ * each able to hold up to PID_MAX_LIMIT pids.
+ * A maximum of 512 pidspaces should be enough for a while.
+ * A maximum of 4 million PIDs per pidspace should be enough for a while;
+ * we keep the high bit reserved for negative values.
  */
-#define PID_MAX_LIMIT (CONFIG_BASE_SMALL ? PAGE_SIZE * 8 : \
-	(sizeof(long) > 4 ? 4 * 1024 * 1024 : PID_MAX_DEFAULT))
+#define PID_MAX_LIMIT_SHIFT (CONFIG_BASE_SMALL ? PAGE_SHIFT + 3 : \
+	(sizeof(long) > 4 ? 22 : PID_MAX_DEFAULT_SHIFT))
+#define PID_MAX_LIMIT 		(1<<PID_MAX_LIMIT_SHIFT)
+
+#define MAX_NR_PIDSPACES 	(PID_MAX_LIMIT_SHIFT > 22 ?   \
+				 1<<(32-PID_MAX_LIMIT_SHIFT-1) : 512)
 
 #endif
Index: linux-2.6.15/kernel/pid.c
===================================================================
--- linux-2.6.15.orig/kernel/pid.c	2006-01-17 08:36:59.000000000 -0500
+++ linux-2.6.15/kernel/pid.c	2006-01-17 08:37:08.000000000 -0500
@@ -35,6 +35,7 @@
 int last_pid;
 
 #define RESERVED_PIDS		300
+#define RESERVED_PIDS_NON_DFLT    1
 
 int pid_max_min = RESERVED_PIDS + 1;
 int pid_max_max = PID_MAX_LIMIT;
@@ -57,29 +58,103 @@
 	void *page;
 } pidmap_t;
 
-static pidmap_t pidmap_array[PIDMAP_ENTRIES] =
+struct pidspace {
+	int last_pid;
+	pidmap_t *pidmap_array;
+};
+
+static pidmap_t dflt_pidmap_array[PIDMAP_ENTRIES] =
 	 { [ 0 ... PIDMAP_ENTRIES-1 ] = { ATOMIC_INIT(BITS_PER_PAGE), NULL } };
 
+static struct pidspace pid_spaces[MAX_NR_PIDSPACES] =
+	{ { 0, dflt_pidmap_array } };
+
 static  __cacheline_aligned_in_smp DEFINE_SPINLOCK(pidmap_lock);
 
+int pidspace_init(int pidspace_id)
+{
+	pidmap_t *map;
+	struct pidspace *pid_space =  &pid_spaces[pidspace_id];
+	int i;
+	int rc;
+
+	if (unlikely(pid_space->pidmap_array))
+		return -EBUSY;
+
+	map = kmalloc(PIDMAP_ENTRIES*sizeof(pidmap_t), GFP_KERNEL);
+	if (!map)
+		return -ENOMEM;
+
+	for (i=0 ; i< PIDMAP_ENTRIES ; i++)
+		map[i] = (pidmap_t){ ATOMIC_INIT(BITS_PER_PAGE), NULL };
+
+	/*
+	 * Free the pidspace if someone raced with us
+	 * installing it:
+	 */
+
+	spin_lock(&pidmap_lock);
+	if (pid_space->pidmap_array) {
+		kfree(map);
+		rc = -EAGAIN;
+	} else {
+		pid_space->pidmap_array = map;
+		pid_space->last_pid = RESERVED_PIDS_NON_DFLT;
+		rc = 0;
+	}
+	spin_unlock(&pidmap_lock);
+	return rc;
+}
+
+int pidspace_free(int pidspace_id)
+{
+	struct pidspace *pid_space =  &pid_spaces[pidspace_id];
+	pidmap_t *map;
+	int i;
+
+	spin_lock(&pidmap_lock);
+	BUG_ON(pid_space->pidmap_array == NULL);
+	map = pid_space->pidmap_array;
+	pid_space->pidmap_array = NULL;
+	spin_unlock(&pidmap_lock);
+
+	for ( i=0; i<PIDMAP_ENTRIES; i++)
+		free_page((unsigned long)map[i].page);
+	kfree(map);
+	return 0;
+}
+
 fastcall void free_pidmap(int pid)
 {
-	pidmap_t *map = pidmap_array + pid / BITS_PER_PAGE;
-	int offset = pid & BITS_PER_PAGE_MASK;
+	pidmap_t *map, *pidmap_array;
+	int offset;
+
+	pidmap_array = pid_spaces[pid_to_pidspace(pid)].pidmap_array;
+	pid = pidspace_pid_to_vpid(pid);
+	map = pidmap_array + pid / BITS_PER_PAGE;
+	offset = pid & BITS_PER_PAGE_MASK;
 
 	clear_bit(offset, map->page);
 	atomic_inc(&map->nr_free);
 }
 
-int alloc_pidmap(void)
+int alloc_pidmap(int pidspace_id)
 {
-	int i, offset, max_scan, pid, last = last_pid;
-	pidmap_t *map;
+	int i, offset, max_scan, pid, last;
+	struct pidspace *pid_space;
+	pidmap_t *map, *pidmap_array;
 
+	pid_space = &pid_spaces[pidspace_id];
+	last = pid_space->last_pid;
 	pid = last + 1;
-	if (pid >= pid_max)
-		pid = RESERVED_PIDS;
+	if (pid >= pid_max) {
+		if (pidspace_id == DEFAULT_PIDSPACE)
+			pid = RESERVED_PIDS;
+		else
+			pid = RESERVED_PIDS_NON_DFLT;
+	}
 	offset = pid & BITS_PER_PAGE_MASK;
+	pidmap_array = pid_space->pidmap_array;
 	map = &pidmap_array[pid/BITS_PER_PAGE];
 	max_scan = (pid_max + BITS_PER_PAGE - 1)/BITS_PER_PAGE - !offset;
 	for (i = 0; i <= max_scan; ++i) {
@@ -102,7 +177,12 @@
 			do {
 				if (!test_and_set_bit(offset, map->page)) {
 					atomic_dec(&map->nr_free);
-					last_pid = pid;
+					pid_space->last_pid = pid;
+					if (pidspace_id == 0) {
+						last_pid = pid;
+						return pid;
+					}
+					pid = pidspace_vpid_to_pid(pidspace_id, pid);
 					return pid;
 				}
 				offset = find_next_offset(map, offset);
@@ -122,7 +202,10 @@
 			offset = 0;
 		} else {
 			map = &pidmap_array[0];
-			offset = RESERVED_PIDS;
+			if (pidspace_id == DEFAULT_PIDSPACE)
+				offset = RESERVED_PIDS;
+			else
+				offset = RESERVED_PIDS_NON_DFLT;
 			if (unlikely(last == offset))
 				break;
 		}
@@ -279,6 +362,8 @@
 {
 	int i;
 
+	pidmap_t *pidmap_array = dflt_pidmap_array;
+
 	pidmap_array->page = (void *)get_zeroed_page(GFP_KERNEL);
 	set_bit(0, pidmap_array->page);
 	atomic_dec(&pidmap_array->nr_free);

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 28/34] PID Virtualization container object and functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (26 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 27/34] PID Virtualization pidspace Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 29/34] PID Virtualization container attach/detach calls Serge Hallyn
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G2-container.patch --]
[-- Type: text/plain, Size: 7299 bytes --]

Introduce the container object and its management functions,
in particular the creation/deletion of containers and the
linkage between the container object and the task.
By default, if the task->container object is NULL, then the task belongs
to the default global container. 

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 include/linux/container.h |   37 ++++++++++++
 include/linux/sched.h     |   10 +++
 kernel/Makefile           |    3 
 kernel/container.c        |  140 ++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 189 insertions(+), 1 deletion(-)

Index: linux-2.6.15/include/linux/container.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15/include/linux/container.h	2006-01-17 08:37:08.000000000 -0500
@@ -0,0 +1,37 @@
+
+#ifndef _LINUX_CONTAINER_H
+#define _LINUX_CONTAINER_H
+
+/* number of containers will depend on many constraints, which will have to
+ * be integrated here as they become apparent
+ */
+
+
+#define MAX_NR_CONTAINERS		MAX_NR_PIDSPACES
+
+#define MAX_CONTAINER_NAME_LEN 		32
+
+struct container_struct {
+	spinlock_t	    lock;
+	char		    name[MAX_CONTAINER_NAME_LEN];
+	int		    pidspace_id;
+	struct task_struct *init_proc;			/* root proc   */
+	int		    init_pid;			/* pid of root */
+	atomic_t	    tcount;			/* thread count */
+
+	/* and all the other things that will be necessary to track
+	 * for a container
+	 */
+};
+
+/****************************************************************
+ *      Container Management Functions
+ ****************************************************************/
+
+extern struct container_struct *container_find(const char *container_name);
+extern int    container_new     (const char *container_name);
+extern void   container_attach  (struct task_struct *task);
+extern void   container_detach  (struct task_struct *task);
+
+#endif
+
Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:07.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:08.000000000 -0500
@@ -36,6 +36,7 @@
 #include <linux/seccomp.h>
 
 #include <linux/auxvec.h>	/* For AT_VECTOR_SIZE */
+#include <linux/container.h>
 
 struct exec_domain;
 
@@ -857,6 +858,7 @@
 	int cpuset_mems_generation;
 #endif
 	atomic_t fs_excl;	/* holding fs exclusive resources */
+	struct container_struct *container;
 };
 
 static inline pid_t process_group(const struct task_struct *tsk)
@@ -964,6 +966,14 @@
 	return process_group(p);
 }
 
+static inline unsigned int task_pidspace_id(const struct task_struct *p)
+{
+	if (p->container)
+		return p->container->pidspace_id;
+	else
+		return DEFAULT_PIDSPACE;
+}
+
 extern void free_task(struct task_struct *tsk);
 extern void __put_task_struct(struct task_struct *tsk);
 #define get_task_struct(tsk) do { atomic_inc(&(tsk)->usage); } while(0)
Index: linux-2.6.15/kernel/Makefile
===================================================================
--- linux-2.6.15.orig/kernel/Makefile	2006-01-17 08:17:29.000000000 -0500
+++ linux-2.6.15/kernel/Makefile	2006-01-17 08:37:08.000000000 -0500
@@ -7,7 +7,8 @@
 	    sysctl.o capability.o ptrace.o timer.o user.o \
 	    signal.o sys.o kmod.o workqueue.o pid.o \
 	    rcupdate.o intermodule.o extable.o params.o posix-timers.o \
-	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o
+	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o \
+	    container.o
 
 obj-$(CONFIG_FUTEX) += futex.o
 obj-$(CONFIG_GENERIC_ISA_DMA) += dma.o
Index: linux-2.6.15/kernel/container.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15/kernel/container.c	2006-01-17 08:37:08.000000000 -0500
@@ -0,0 +1,140 @@
+/*
+ * Management of Containers
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2005 <frankeh@watson.ibm.com>
+ *
+ */
+
+/* Changes
+ *
+ * 11/22/2005:  Created
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <asm/uaccess.h>
+#include <linux/proc_fs.h>
+#include <linux/timer.h>
+#include <linux/mm.h>
+#include <linux/container.h>
+
+#define DPRINTK( fmt, args... ) // printk( "%s: " fmt, __FUNCTION__, ##args )
+
+static struct container_struct *containers[MAX_NR_CONTAINERS];
+static DEFINE_SPINLOCK(container_lock);
+
+/****************************************************************
+ *      Container Management
+ ****************************************************************/
+
+void container_attach(struct task_struct *task)
+{
+	struct container_struct *container = task->container;
+
+	if (!container)
+		return;
+	atomic_inc(&container->tcount);
+
+	DPRINTK("c=<%p:%s> atask=<%x:%x:%s>\n",
+		container, container->name,
+		task_pid(task), task_vpid(task), task->comm);
+}
+
+void container_detach(struct task_struct *task)
+{
+	struct container_struct *container = task->container;
+	unsigned long flags;
+	int empty;
+
+	if (!container)
+		return;
+
+	DPRINTK("c=<%p:%s> dtask=<%x:%x:%s>\n",
+		container, container->name,
+		task_pid(task), task_vpid(task), task->comm);
+
+	task->container = NULL;
+	if (unlikely(task == container->init_proc)) {
+		container->init_proc = NULL;
+		container->init_pid  = 0;
+		memset(container->name, 0, MAX_CONTAINER_NAME_LEN);
+	}
+	empty = atomic_dec_and_test(&container->tcount);
+	if (!empty)
+		return;
+
+	/* we are the last process, so lets destroy the container */
+
+	DPRINTK("c=<%p:%s> destroy container exiting root proc\n",
+		container, container->name);
+
+	spin_lock_irqsave(&container_lock,flags);
+	containers[container->pidspace_id] = NULL;
+	pidspace_free(container->pidspace_id);
+
+	spin_lock(&container->lock);
+	/* ANYTHING UNDER THE LOCK */
+	spin_unlock(&container->lock);
+
+	spin_unlock_irqrestore(&container_lock,flags);
+
+	kfree(container);
+}
+
+/*
+ * create a new container and make the caller the virtual init_proc
+ * of the container
+ */
+
+int container_new(const char *container_name)
+{
+	struct container_struct *newc = NULL;
+	unsigned long flags;
+	int i;
+	int rc;
+
+	newc = kmalloc(sizeof(struct container_struct),GFP_KERNEL);
+	if (newc == NULL)
+		return -ENOMEM;
+	memset(newc,0,sizeof(struct container_struct));
+	strncpy(newc->name, container_name, MAX_CONTAINER_NAME_LEN-1);
+	newc->init_proc = current;
+	newc->init_pid  = task_pid(current);
+	atomic_set(&newc->tcount,0);
+
+	spin_lock_irqsave(&container_lock,flags);
+	for ( i=1; i<MAX_NR_CONTAINERS; i++) {
+		struct container_struct *cptr = containers[i];
+
+		if (cptr == NULL)
+			break;
+		if (strncmp(container_name, cptr->name, MAX_CONTAINER_NAME_LEN) == 0) {
+			rc = -EEXIST;
+			goto out_unlock_free;
+		}
+	}
+	if ( i == MAX_NR_CONTAINERS ) {
+		rc = -ENOMEM;
+		goto out_unlock_free;
+	}
+
+	spin_lock_init(&newc->lock);
+	pidspace_init(i);
+	newc->pidspace_id = i;
+	containers[i] = newc;
+	DPRINTK("created container #%d: %s\n", newc->pidspace_id, newc->name);
+	current->container = newc;
+	container_attach(current);
+	rc = 0;
+	goto out_unlock;
+
+out_unlock_free:
+	kfree(newc);
+out_unlock:
+	spin_unlock_irqrestore(&container_lock,flags);
+	return rc;
+}
+

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 29/34] PID Virtualization container attach/detach calls
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (27 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 28/34] PID Virtualization container object and functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 30/34] PID Virtualization /proc/container filesystem Serge Hallyn
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G3-container-fork-exit.patch --]
[-- Type: text/plain, Size: 1773 bytes --]

Call the container attach and detach functions at their respective
locations. This happens during the fork and exit functions.
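
The resulting lifecycle can be modelled in user space as below. This is
only a sketch of the reference counting done by container_attach() and
container_detach() in the previous patch: the model_* names are
hypothetical, locking and the init_proc handling are left out, and
malloc/free stand in for kmalloc/kfree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for container_struct and task_struct. */
struct model_container {
	int tcount;			/* number of attached tasks */
};

struct model_task {
	struct model_container *container;	/* NULL: global container */
};

/* Called on fork: a task in a container takes one reference. */
static void model_attach(struct model_task *t)
{
	if (!t->container)
		return;
	t->container->tcount++;
}

/*
 * Called on exit (or on a failed fork): drop the reference.
 * Returns 1 if this was the last task and the container was freed.
 */
static int model_detach(struct model_task *t)
{
	struct model_container *c = t->container;

	if (!c)
		return 0;
	t->container = NULL;
	assert(c->tcount > 0);
	if (--c->tcount > 0)
		return 0;
	free(c);
	return 1;
}
```

A child inherits its parent's container pointer through the task_struct
copy in fork, so the container lives until the last task attached to it
has exited.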

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 exit.c |    1 +
 fork.c |    5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6.15/kernel/exit.c
===================================================================
--- linux-2.6.15.orig/kernel/exit.c	2006-01-17 08:37:07.000000000 -0500
+++ linux-2.6.15/kernel/exit.c	2006-01-17 08:37:08.000000000 -0500
@@ -101,6 +101,7 @@
 		zap_leader = (leader->exit_signal == -1);
 	}
 
+	container_detach(p);
 	sched_exit(p);
 	write_unlock_irq(&tasklist_lock);
 	spin_unlock(&p->proc_lock);
Index: linux-2.6.15/kernel/fork.c
===================================================================
--- linux-2.6.15.orig/kernel/fork.c	2006-01-17 08:37:08.000000000 -0500
+++ linux-2.6.15/kernel/fork.c	2006-01-17 08:37:08.000000000 -0500
@@ -43,6 +43,7 @@
 #include <linux/rmap.h>
 #include <linux/acct.h>
 #include <linux/cn_proc.h>
+#include <linux/container.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -999,6 +1000,7 @@
 		goto bad_fork_cleanup_mm;
 	if ((retval = copy_namespace(clone_flags, p)))
 		goto bad_fork_cleanup_keys;
+	container_attach(p);
 	retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
 	if (retval)
 		goto bad_fork_cleanup_namespace;
@@ -1175,6 +1177,7 @@
 	mpol_free(p->mempolicy);
 #endif
 bad_fork_cleanup:
+	container_detach(p);
 	if (p->binfmt)
 		module_put(p->binfmt->module);
 bad_fork_cleanup_put_domain:
@@ -1238,7 +1241,7 @@
 {
 	struct task_struct *p;
 	int trace = 0;
-	long pid = alloc_pidmap(DEFAULT_PIDSPACE);
+	long pid = alloc_pidmap(task_pidspace_id(current));
 	long vpid;
 
 	if (pid < 0)

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 30/34] PID Virtualization /proc/container filesystem
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (28 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 29/34] PID Virtualization container attach/detach calls Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 31/34] PID Virtualization Implementation of low level virtualization functions Serge Hallyn
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G4-container-procfs.patch --]
[-- Type: text/plain, Size: 4253 bytes --]

Provide the /proc/container file to
containerize a process or retrieve the name of its container.
We need a reasonably quick mechanism to trigger container creation.

A process becomes the root of a container if it writes
a unique name to the /proc/container file. If the process does
not already belong to a container and the name is unique, 
a container is created and the calling process becomes the root.
Reading from the file returns the name of the container.
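
From user space, the interface described above could be exercised
roughly as follows. This is only a sketch: container_enter() and
container_name() are hypothetical helper names, not part of this patch
set or of contexe, and the path is a parameter so that callers pass
"/proc/container" themselves.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define MAX_CONTAINER_NAME_LEN	32	/* mirrors include/linux/container.h */

/* Write the container name to path. Returns 0 on success, -1 on failure. */
static int container_enter(const char *path, const char *name)
{
	size_t len;
	ssize_t n;
	int fd;

	assert(path && name);
	len = strlen(name);
	/* container_write() rejects count >= MAX_CONTAINER_NAME_LEN. */
	if (len == 0 || len >= MAX_CONTAINER_NAME_LEN)
		return -1;
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;
	n = write(fd, name, len);
	close(fd);
	return n == (ssize_t)len ? 0 : -1;
}

/* Read the container name back into buf, without the trailing newline. */
static int container_name(const char *path, char *buf, size_t size)
{
	ssize_t n;
	int fd;

	assert(path && buf && size > 0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	n = read(fd, buf, size - 1);
	close(fd);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

A caller would use container_enter("/proc/container", "cont_2") and then
container_name("/proc/container", buf, sizeof(buf)). A second write from
the same process is expected to fail with EPERM, since container_write()
refuses callers that already belong to a container.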

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 Makefile        |    2 
 container_api.c |  116 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+), 1 deletion(-)

Index: linux-2.6.15/kernel/Makefile
===================================================================
--- linux-2.6.15.orig/kernel/Makefile	2006-01-17 08:37:08.000000000 -0500
+++ linux-2.6.15/kernel/Makefile	2006-01-17 08:37:09.000000000 -0500
@@ -8,7 +8,7 @@
 	    signal.o sys.o kmod.o workqueue.o pid.o \
 	    rcupdate.o intermodule.o extable.o params.o posix-timers.o \
 	    kthread.o wait.o kfifo.o sys_ni.o posix-cpu-timers.o \
-	    container.o
+	    container.o container_api.o
 
 obj-$(CONFIG_FUTEX) += futex.o
 obj-$(CONFIG_GENERIC_ISA_DMA) += dma.o
Index: linux-2.6.15/kernel/container_api.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.15/kernel/container_api.c	2006-01-17 08:37:09.000000000 -0500
@@ -0,0 +1,116 @@
+/*
+ * External Interface to containers
+ *
+ * This is only for quick bootstrapping the container support
+ * A proper external API needs to be found
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2005 <frankeh@watson.ibm.com>
+ *
+ */
+
+/* Changes
+ *
+ * 11/22/2005:  Created
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <asm/uaccess.h>
+#include <linux/proc_fs.h>
+#include <linux/timer.h>
+#include <linux/mm.h>
+#include <linux/container.h>
+
+MODULE_LICENSE("GPL");
+
+#define DPRINTK( fmt, args...)  // printk( "%s: " fmt, __FUNCTION__, ##args)
+
+/****************************************************************
+ *		P R O C   F S   S T U F F
+ ****************************************************************/
+
+static ssize_t container_write(struct file *file, const char __user *ubuf,
+			       size_t count, loff_t *p)
+{
+	const char *delims = " \t\n";
+	char kbuf[MAX_CONTAINER_NAME_LEN];
+	char *cptr;
+	char *cname;
+	int rc;
+
+	if (current->container)
+		return -EPERM;
+	if (count >= MAX_CONTAINER_NAME_LEN)
+		return -EINVAL;
+	if (copy_from_user(kbuf, ubuf, count))
+		return -EFAULT;
+	kbuf[count] = '\0';
+
+	cptr = kbuf;
+	cname = strsep(&cptr,delims);
+	DPRINTK("<%s:%d>: <%s>\n", current->comm, task_pid(current), cname);
+	rc = container_new(cname);
+	if (rc < 0)
+		return rc;
+	return count;
+}
+
+static ssize_t container_read(struct file *file, char __user *ubuf,
+	       		      size_t count, loff_t *ppos)
+{
+	char kbuf[MAX_CONTAINER_NAME_LEN];
+	int len;
+	char *cname;
+	loff_t __ppos = *ppos;
+
+	cname = current->container ? current->container->name : "";
+	len = sprintf(kbuf,"%s\n",cname);
+	if (__ppos >= len)
+		return 0;
+	if (count > len-__ppos)
+		count = len-__ppos;
+	if (copy_to_user(ubuf, kbuf+__ppos, count))
+		return -EFAULT;
+	*ppos = __ppos + count;
+	DPRINTK("%s: caller <%s:%d>: <%s>\n",
+		current->comm, task_pid(current), cname);
+	return count;
+}
+
+static struct file_operations container_proc_operations = {
+	.read  = container_read,
+	.write = container_write,
+};
+
+/****************************************************************
+ *
+ ****************************************************************/
+
+static int __init container_init(void)
+{
+	int rc = 0;
+	struct proc_dir_entry *entry;
+
+	entry = create_proc_entry("container", S_IWUGO|S_IRUGO, NULL);
+	if (entry)
+		entry->proc_fops = &container_proc_operations;
+	else
+		rc = -EINVAL;
+
+	/* Other initialization */
+
+	if (rc)
+		remove_proc_entry("container", NULL);
+	return rc;
+}
+
+static void __exit container_exit(void)
+{
+}
+
+module_init(container_init);
+module_exit(container_exit);
+

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 31/34] PID Virtualization Implementation of low level virtualization functions
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (29 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 30/34] PID Virtualization /proc/container filesystem Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 32/34] PID Virtualization Handle special case vpid return cases Serge Hallyn
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G5-virtfunct-impl.patch --]
[-- Type: text/plain, Size: 3906 bytes --]

We finally utilize the pid space implementation to obtain a real virtualization
inside the pid/vpid conversion functions. Care has been taken to retain
the fast path (either in global context or in the same pidspace) as inline, 
while the exception case (typically involves checking for container root)
is handled separately.
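
For illustration only, here is a minimal user-space model of the fast
path / exception path split described above. It is a sketch, not the
kernel code: the model_* names are hypothetical, and the pid encoding
(pidspace id stored in the bits above PIDSPACE_SHIFT) is an assumption
made so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: the pidspace id lives in the bits above PIDSPACE_SHIFT. */
#define PIDSPACE_SHIFT	20
#define PIDSPACE_MASK	((1 << PIDSPACE_SHIFT) - 1)

/* Hypothetical stand-ins for the container and task structures. */
struct model_container {
	int pidspace_id;	/* pidspace owned by this container */
	int init_pid;		/* real pid of the container's init (pidspace 0) */
};

struct model_task {
	int pid;				/* real (kernel) pid */
	const struct model_container *container; /* NULL: global context */
};

static inline int model_make_pid(int psid, int vpid)
{
	assert(vpid >= 0 && vpid <= PIDSPACE_MASK);
	return (psid << PIDSPACE_SHIFT) | vpid;
}

static inline int model_pid_to_pidspace(int pid)
{
	return pid >> PIDSPACE_SHIFT;
}

static inline int model_pidspace_pid_to_vpid(int pid)
{
	return pid & PIDSPACE_MASK;
}

/* Exception path: viewer and target live in different pidspaces. */
static inline int model_pid_to_vpid_excp(int vpid, int psid_pid,
					 const struct model_task *ctx)
{
	/* The container's init sees the tasks of its own pidspace. */
	if (ctx->pid == ctx->container->init_pid)
		return psid_pid == ctx->container->pidspace_id ? vpid : -1;
	/* Everybody else only sees the container's init, as vpid 1. */
	if (vpid == ctx->container->init_pid)
		return 1;
	return -1;
}

/* Fast path: global context, or viewer and target share a pidspace. */
static inline int model_pid_to_vpid_ctx(int pid, const struct model_task *ctx)
{
	int psid_ctx, psid_pid, vpid;

	if (!ctx->container)
		return pid;

	psid_ctx = model_pid_to_pidspace(ctx->pid);
	psid_pid = model_pid_to_pidspace(pid);
	vpid     = model_pidspace_pid_to_vpid(pid);

	if (psid_ctx == psid_pid)
		return vpid;

	return model_pid_to_vpid_excp(vpid, psid_pid, ctx);
}
```

In this model a task in the global context sees real pids unchanged,
two tasks in the same pidspace see each other's low bits, and a task
looking across pidspaces gets either 1 (its container's init) or -1.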

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 include/linux/sched.h |   49 +++++++++++++++++++++++++++++++++++++++++++------
 kernel/container.c    |   28 ++++++++++++++++++++++++++++
 2 files changed, 71 insertions(+), 6 deletions(-)

Index: linux-2.6.15/include/linux/sched.h
===================================================================
--- linux-2.6.15.orig/include/linux/sched.h	2006-01-17 08:37:08.000000000 -0500
+++ linux-2.6.15/include/linux/sched.h	2006-01-17 08:37:09.000000000 -0500
@@ -870,9 +870,25 @@
  *  pid domain translation functions:
  *	- from kernel to user pid domain
  */
+
+extern pid_t __pid_to_vpid_ctx_excp(pid_t pid, int psid_pid,
+				     const struct task_struct *ctx);
+
 static inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx)
 {
-	return pid;
+	int psid_pid, psid_ctx;
+
+	if (!ctx->container)
+		return pid;
+
+	psid_ctx = pid_to_pidspace(ctx->__pid);
+	psid_pid = pid_to_pidspace(pid);
+	pid      = pidspace_pid_to_vpid(pid);
+
+	if (likely(psid_ctx == psid_pid))
+		return pid;
+
+	return __pid_to_vpid_ctx_excp(pid, psid_pid, ctx);
 }
 
 static inline pid_t pid_to_vpid(pid_t pid)
@@ -884,9 +900,11 @@
 {
 	int isgrp = (pid < 0) ;
 
-	if (isgrp) pid = -pid;
+	if (isgrp)
+		pid = -pid;
 	pid = pid_to_vpid_ctx(pid, ctx);
-	if (isgrp) pid = -pid;
+	if (isgrp && pid != -1)
+		pid = -pid;
 	return pid;
 }
 
@@ -895,13 +913,32 @@
 	return pgid_to_vpgid_ctx(pid, current);
 }
 
+extern pid_t __vpid_to_pid_excp(pid_t pid);
+
 static inline pid_t vpid_to_pid(pid_t pid)
 {
-	return pid;
+	if (!current->container)
+		return pid;
+
+	if (pid == 1)
+		return current->container->init_pid;
+
+	if (!pid_to_pidspace(pid)) {
+		int psid = pid_to_pidspace(current->__pid);
+		return pidspace_vpid_to_pid(psid, pid);
+	}
+	return __vpid_to_pid_excp(pid);
 }
 
 static inline pid_t vpgid_to_pgid(pid_t pid)
 {
+	int isgrp = (pid < 0) ;
+
+	if (isgrp)
+		pid = -pid;
+	pid = vpid_to_pid(pid);
+	if (isgrp && pid != -1)
+		pid = -pid;
 	return pid;
 }
 
@@ -931,7 +968,7 @@
 static inline pid_t task_vpid_ctx(const struct task_struct *p,
 				   const struct task_struct *ctx)
 {
-	return task_pid(p);
+	return pid_to_vpid_ctx(task_pid(p), ctx);
 }
 
 static inline pid_t task_vpid(const struct task_struct *p)
@@ -963,7 +1000,7 @@
 
 static inline pid_t virt_process_group(const struct task_struct *p)
 {
-	return process_group(p);
+	return pid_to_vpid(process_group(p));
 }
 
 static inline unsigned int task_pidspace_id(const struct task_struct *p)
Index: linux-2.6.15/kernel/container.c
===================================================================
--- linux-2.6.15.orig/kernel/container.c	2006-01-17 08:37:08.000000000 -0500
+++ linux-2.6.15/kernel/container.c	2006-01-17 08:37:09.000000000 -0500
@@ -138,3 +138,31 @@
 	return rc;
 }
 
+pid_t __pid_to_vpid_ctx_excp(pid_t pid, int pidspace_id,
+			     const struct task_struct *ctx)
+{
+	/* Figure out whether pid (virtual to the pidspace_id pidspace) is
+	 * meaningful to ctx, which is in a different pidspace. Note that
+	 * a container's init_proc resides physically in pidspace 0.
+	 */
+	if (unlikely(ctx == ctx->container->init_proc)) {
+		if (pidspace_id != ctx->container->pidspace_id)
+			pid = -1;
+		return pid;
+	}
+	if (pid == ctx->container->init_pid)
+		return 1;
+	return -1;
+}
+
+pid_t __vpid_to_pid_excp(pid_t pid)
+{
+	/* We only let a real pid pass through as a vpid if it belongs to
+	 * the pidspace of current's container; anything else is rejected.
+	 */
+	if (current->container->pidspace_id == pid_to_pidspace(pid))
+		return pid;
+	return -1;
+}
+
+

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 32/34] PID Virtualization Handle special case vpid return cases
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (30 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 31/34] PID Virtualization Implementation of low level virtualization functions Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 33/34] PID Virtualization per container /proc filesystem Serge Hallyn
                   ` (2 subsequent siblings)
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G6-vpid-rc-special-handling.patch --]
[-- Type: text/plain, Size: 2586 bytes --]

Certain places in the virtual pid return locations need special handling
to return the appropriate information back to the user.

Signed-off-by: Hubertus Franke <frankeh@watson.ibm.com>
---
 fs/proc/array.c |   15 +++++++++------
 fs/proc/base.c  |    2 ++
 kernel/signal.c |    8 ++++++--
 3 files changed, 17 insertions(+), 8 deletions(-)

Index: linux-2.6.15/fs/proc/base.c
===================================================================
--- linux-2.6.15.orig/fs/proc/base.c	2006-01-17 08:37:07.000000000 -0500
+++ linux-2.6.15/fs/proc/base.c	2006-01-17 08:37:09.000000000 -0500
@@ -2103,6 +2103,8 @@
 
 	for ( ; p != &init_task; p = next_task(p)) {
 		int tgid = task_vpid_ctx(p, current);
+		if (tgid < 0)
+			continue;
 		if (!pid_alive(p))
 			continue;
 		if (--index >= 0)
Index: linux-2.6.15/fs/proc/array.c
===================================================================
--- linux-2.6.15.orig/fs/proc/array.c	2006-01-17 08:37:07.000000000 -0500
+++ linux-2.6.15/fs/proc/array.c	2006-01-17 08:37:09.000000000 -0500
@@ -164,13 +164,16 @@
 	pid_t ppid, tpid;
 
 	read_lock(&tasklist_lock);
-	if (pid_alive(p))
+	if (pid_alive(p)) {
 		ppid = task_vtgid_ctx(p->group_leader->real_parent, current);
-	else
+		if (ppid < 0) ppid = 1;
+	} else {
 		ppid = 0;
-	if (pid_alive(p) && p->ptrace)
+	}
+	if (pid_alive(p) && p->ptrace) {
 		tpid = task_vppid_ctx(p, current);
-	else
+		if (tpid < 0) tpid = 0;
+	} else
 		tpid = 0;
 	buffer += sprintf(buffer,
 		"State:\t%s\n"
@@ -183,8 +186,8 @@
 		"Gid:\t%d\t%d\t%d\t%d\n",
 		get_task_state(p),
 		(p->sleep_avg/1024)*100/(1020000000/1024),
-	       	task_vtgid_ctx(p,current),
-		task_vpid_ctx(p,current),
+	       	task_vtgid_ctx(p, current),
+		task_vpid_ctx(p, current),
 		ppid, tpid,
 		p->uid, p->euid, p->suid, p->fsuid,
 		p->gid, p->egid, p->sgid, p->fsgid);
Index: linux-2.6.15/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/kernel/signal.c	2006-01-17 08:37:07.000000000 -0500
+++ linux-2.6.15/kernel/signal.c	2006-01-17 08:37:09.000000000 -0500
@@ -2257,6 +2257,12 @@
 	struct siginfo info;
 	struct task_struct *p;
 
+	pid  = vpid_to_pid(pid);
+	if (pid < 0)
+		return pid;
+	tgid = vpid_to_pid(tgid);
+	if (tgid < 0)
+		return tgid;
 	error = -ESRCH;
 	info.si_signo = sig;
 	info.si_errno = 0;
@@ -2264,8 +2270,6 @@
 	info.si_pid = task_vtgid(current);
 	info.si_uid = current->uid;
 
-	pid  = vpid_to_pid(pid);
-	tgid = vpid_to_pid(tgid);
 	read_lock(&tasklist_lock);
 	p = find_task_by_pid(pid);
 	if (p && (tgid <= 0 || task_tgid(p) == tgid)) {

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 33/34] PID Virtualization per container /proc filesystem
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (31 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 32/34] PID Virtualization Handle special case vpid return cases Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 14:33 ` RFC [patch 34/34] PID Virtualization pidspace parent : signal behavior Serge Hallyn
  2006-01-17 16:19 ` RFC [patch 00/34] PID Virtualization Overview Suleiman Souhlal
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G7-percontainer-procfs.patch --]
[-- Type: text/plain, Size: 2577 bytes --]

Provide the interception and virtualization of the proc interface.
In particular, within a container, processes need to be identified by
their virtual pids under /proc, and the processes shown need to be
limited to those in the container.
NOTE: This is only temporary, since it exhibits some performance problems.
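
As a rough illustration of the idea, the following user-space sketch
models a name comparison that also checks a per-container marker, so
that two cached entries with the same name can coexist as long as they
belong to different containers. The model_* names are hypothetical and
the marker field merely stands in for d_fsdata; like d_compare, the
function returns 0 on a match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a cached /proc/<pid> dentry. */
struct model_dentry {
	const char *name;	/* entry name, e.g. "1" */
	size_t len;		/* length of name */
	const void *marker;	/* container that created the entry */
};

/*
 * Follows the d_compare convention: 0 means "match", non-zero means
 * "no match".  An entry matches only if both the name and the
 * container marker of the viewer agree.
 */
static inline int model_root_compare(const struct model_dentry *cached,
				     const char *name, size_t len,
				     const void *viewer)
{
	assert(cached && name);
	if (cached->len != len || memcmp(cached->name, name, len))
		return 1;
	return cached->marker != viewer;
}
```

A lookup of "1" from a different container therefore misses the cached
entry and falls through to a fresh lookup, which is presumably one
source of the performance cost mentioned in the note above.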

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Serge E Hallyn <serue@us.ibm.com>
---
 base.c  |    2 ++
 inode.c |   28 ++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

Index: linux-2.6.15/fs/proc/inode.c
===================================================================
--- linux-2.6.15.orig/fs/proc/inode.c	2006-01-17 08:17:28.000000000 -0500
+++ linux-2.6.15/fs/proc/inode.c	2006-01-17 08:37:10.000000000 -0500
@@ -190,6 +190,33 @@
 	return NULL;
 }			
 
+/* This service performs checks on virtualization marker to allow multiple
+ * dentries with the same name in the dcache.
+ */
+
+#define procpid_check_marker(task, data) (task->container == data)
+static int proc_root_compare(struct dentry *dentry, struct qstr *a,
+			      struct qstr *b)
+{
+	/* CAUTION: to evaluate pointer of target dentry, we assume parameter
+	 * 'a' is its 'd_name' field. This is always the case anyway.
+	 */
+	struct dentry* d = (struct dentry *)
+		((unsigned long) a -
+		((unsigned long) &dentry->d_name - (unsigned long) dentry));
+	int result = 1;
+
+	if (a->len == b->len && !memcmp(a->name, b->name, a->len))
+		result = !procpid_check_marker(current, d->d_fsdata);
+
+	return result;
+}
+
+static struct dentry_operations root_dentry_operations =
+{
+	.d_compare	= proc_root_compare,
+};
+
 int proc_fill_super(struct super_block *s, void *data, int silent)
 {
 	struct inode * root_inode;
@@ -213,6 +240,7 @@
 	s->s_root = d_alloc_root(root_inode);
 	if (!s->s_root)
 		goto out_no_root;
+	s->s_root->d_op = &root_dentry_operations;
 	return 0;
 
 out_no_root:
Index: linux-2.6.15/fs/proc/base.c
===================================================================
--- linux-2.6.15.orig/fs/proc/base.c	2006-01-17 08:37:09.000000000 -0500
+++ linux-2.6.15/fs/proc/base.c	2006-01-17 08:37:10.000000000 -0500
@@ -1497,6 +1497,7 @@
 	inode->i_op = &proc_pid_link_inode_operations;
 	inode->i_size = 64;
 	ei->op.proc_get_link = proc_fd_link;
+	dentry->d_fsdata = current->container;
 	dentry->d_op = &tid_fd_dentry_operations;
 	d_add(dentry, inode);
 	return NULL;
@@ -2002,6 +2003,7 @@
 	inode->i_nlink = 4;
 #endif
 
+	dentry->d_fsdata = current->container;
 	dentry->d_op = &pid_base_dentry_operations;
 
 	died = 0;

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC [patch 34/34] PID Virtualization pidspace parent : signal behavior
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (32 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 33/34] PID Virtualization per container /proc filesystem Serge Hallyn
@ 2006-01-17 14:33 ` Serge Hallyn
  2006-01-17 16:19 ` RFC [patch 00/34] PID Virtualization Overview Suleiman Souhlal
  34 siblings, 0 replies; 136+ messages in thread
From: Serge Hallyn @ 2006-01-17 14:33 UTC (permalink / raw)
  To: linux-kernel
  Cc: Hubertus Franke, Cedric Le Goater, Dave Hansen, Serge E Hallyn

[-- Attachment #1: G8-prohibit-init-kill.patch --]
[-- Type: text/plain, Size: 765 bytes --]

Make sure a process that is the parent of a pidspace discards signals
sent from processes in that pidspace.

Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
---
 signal.c |    4 ++++
 1 files changed, 4 insertions(+)

Index: linux-2.6.15/kernel/signal.c
===================================================================
--- linux-2.6.15.orig/kernel/signal.c	2006-01-17 08:37:09.000000000 -0500
+++ linux-2.6.15/kernel/signal.c	2006-01-17 08:37:10.000000000 -0500
@@ -642,6 +642,10 @@
 	if (!valid_signal(sig))
 		return error;
 	error = -EPERM;
+
+	if (task_vpid_ctx(t, current) == 1)
+		return error;
+
 	if ((info == SEND_SIG_NOINFO || (!is_si_special(info) && SI_FROMUSER(info)))
 	    && ((sig != SIGCONT) ||
 		(current->signal->session != t->signal->session))

--


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 14:33 ` RFC [patch 13/34] PID Virtualization Define new task_pid api Serge Hallyn
@ 2006-01-17 15:32   ` Arjan van de Ven
  2006-01-17 15:56     ` Serge E. Hallyn
  0 siblings, 1 reply; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-17 15:32 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: linux-kernel, Hubertus Franke, Cedric Le Goater, Dave Hansen

On Tue, 2006-01-17 at 08:33 -0600, Serge Hallyn wrote:
> plain text document attachment (BC-define-pid-handlers)
> Actually define the task_pid() and task_tgid() functions.  Also
> replace pid with __pid so as to make sure any missed accessors are
> caught.

This question was asked a few times before without satisfactory answer:
*WHY* this abstraction.
There is *NO* point. Really. 

(And if the answer is "but we want to play tricks later", just make a
current->realpid or whatever, but leave current->pid be the virtual pid.
Your abstraction helps NOTHING there. Zero Nada Noppes).



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 15:32   ` Arjan van de Ven
@ 2006-01-17 15:56     ` Serge E. Hallyn
  2006-01-17 16:02       ` Arjan van de Ven
  2006-01-17 16:03       ` Alan Cox
  0 siblings, 2 replies; 136+ messages in thread
From: Serge E. Hallyn @ 2006-01-17 15:56 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Serge Hallyn, linux-kernel, Hubertus Franke, Cedric Le Goater,
	Dave Hansen

Quoting Arjan van de Ven (arjan@infradead.org):
> On Tue, 2006-01-17 at 08:33 -0600, Serge Hallyn wrote:
> > plain text document attachment (BC-define-pid-handlers)
> > Actually define the task_pid() and task_tgid() functions.  Also
> > replace pid with __pid so as to make sure any missed accessors are
> > caught.
> 
> This question was asked a few times before without satisfactory answer:
> *WHY* this abstraction.
> There is *NO* point. Really. 
> 
> (And if the answer is "but we want to play tricks later", just make a
> current->realpid or whatever, but leave current->pid be the virtual pid.
> Your abstraction helps NOTHING there. Zero Nada Noppes).

The virtual pid is different depending on who is asking.  So simply
storing current->realpid and current->pid isn't helpful, as we would
still need to call a function when a pid crosses user->kernel boundary.

However we could make the patch far less invasive by skipping the
task_pid() macro altogether.  Switching current->pid to current->__pid
was to make sure we catch any ->pid accesses which we may have missed,
during compilation.

Is that approach (keeping task->pid as the real pid and dropping the
task_pid() macro) preferred by all?

-serge


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 15:56     ` Serge E. Hallyn
@ 2006-01-17 16:02       ` Arjan van de Ven
  2006-01-17 16:03       ` Alan Cox
  1 sibling, 0 replies; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-17 16:02 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel, Hubertus Franke, Cedric Le Goater, Dave Hansen


> Is that approach (keeping task->pid as the real pid and dropping the
> task_pid() macro) preferred by all?

it sure is what I think is the best approach




^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 15:56     ` Serge E. Hallyn
  2006-01-17 16:02       ` Arjan van de Ven
@ 2006-01-17 16:03       ` Alan Cox
  2006-01-17 17:16         ` Kyle Moffett
  2006-01-17 17:25         ` Dave Hansen
  1 sibling, 2 replies; 136+ messages in thread
From: Alan Cox @ 2006-01-17 16:03 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Arjan van de Ven, linux-kernel, Hubertus Franke,
	Cedric Le Goater, Dave Hansen

On Maw, 2006-01-17 at 09:56 -0600, Serge E. Hallyn wrote:
> The virtual pid is different depending on who is asking.  So simply
> storing current->realpid and current->pid isn't helpful, as we would
> still need to call a function when a pid crosses user->kernel boundary.

This is an obscure, weird piece of functionality for some special case
usages most of which are going to be eliminated by Xen. I don't see the
kernel side justification for it at all.

Maybe you should remap it the other side of the user->kernel boundary ?


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
                   ` (33 preceding siblings ...)
  2006-01-17 14:33 ` RFC [patch 34/34] PID Virtualization pidspace parent : signal behavior Serge Hallyn
@ 2006-01-17 16:19 ` Suleiman Souhlal
  2006-01-17 17:08   ` Dave Hansen
  34 siblings, 1 reply; 136+ messages in thread
From: Suleiman Souhlal @ 2006-01-17 16:19 UTC (permalink / raw)
  To: Serge Hallyn; +Cc: linux-kernel, Hubertus Franke, Cedric Le Goater, Dave Hansen

Serge Hallyn wrote:

> The mechanism to start a container 
> is to 'echo "container_name" > /proc/container'  which creates a new
> container and associates the calling process with it. All subsequently
> forked tasks then belong to that container.
> There is a separate pid space associated with each container.
> Only processes/task belonging to the same container "see" each other.

Why does there need to be a separate pid space for each container?
You don't really need one to make sure that only processes in the same 
containers can see each other.

-- Suleiman

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-17 16:19 ` RFC [patch 00/34] PID Virtualization Overview Suleiman Souhlal
@ 2006-01-17 17:08   ` Dave Hansen
  2006-01-17 18:09     ` Suleiman Souhlal
  0 siblings, 1 reply; 136+ messages in thread
From: Dave Hansen @ 2006-01-17 17:08 UTC (permalink / raw)
  To: Suleiman Souhlal
  Cc: Serge Hallyn, linux-kernel, Hubertus Franke, Cedric Le Goater

On Tue, 2006-01-17 at 08:19 -0800, Suleiman Souhlal wrote:
> Serge Hallyn wrote:
> > The mechanism to start a container 
> > is to 'echo "container_name" > /proc/container'  which creates a new
> > container and associates the calling process with it. All subsequently
> > forked tasks then belong to that container.
> > There is a separate pid space associated with each container.
> > Only processes/task belonging to the same container "see" each other.
> 
> Why does there need to be a separate pid space for each container?
> You don't really need one to make sure that only processes in the same 
> containers can see each other.

One use for containers might be to pick a container from a system, wrap
it up, and transport it to another system where it would continue to
run.  We would have to make sure that the pids did not collide with any
containers running on the target system.

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 16:03       ` Alan Cox
@ 2006-01-17 17:16         ` Kyle Moffett
  2006-01-17 17:25         ` Dave Hansen
  1 sibling, 0 replies; 136+ messages in thread
From: Kyle Moffett @ 2006-01-17 17:16 UTC (permalink / raw)
  To: Alan Cox, Suleiman Souhlal
  Cc: Serge E. Hallyn, Arjan van de Ven, LKML Kernel, Hubertus Franke,
	Cedric Le Goater, Dave Hansen

On Jan 17, 2006, at 11:19, Suleiman Souhlal wrote:
> Why does there need to be a separate pid space for each container?  You  
> don't really need one to make sure that only processes in the same  
> containers can see each other.

On Jan 17, 2006, at 11:03, Alan Cox wrote:
> This is an obscure, weird piece of functionality for some special  
> case usages most of which are going to be eliminated by Xen. I  
> don't see the kernel side justification for it at all.
>
> Maybe you should remap it the other side of the user->kernel  
> boundary ?

To answer both questions at the same time, this is to make it  
practical to reliably freeze and restore whole process-trees/jobs,  
possibly restarting or migrating to a different computer in-between.   
Such freeze/restore code is mostly possible now, except for programs  
that store a pid internally to send a signal to another process.  The  
usage would be something like this:

start:
	create container:
		run jobs

freeze
	for each process/thread in the container:
		send SIGSTOP to halt execution
	for each process in the container:
		store process data, filehandles, vpid, etc

restore:
	create container:
		iterate over the frozen processes in the freeze file:
			clone_with_pid(flags, original_vpid);
			adjust session, connect filehandles, etc
			remap shared memory, etc
			block on atomic "resume" variable in mmap-ed file
		
		set "resume" to 1
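
The first half of the "freeze" step above can be sketched in user space
roughly as follows. This is only an illustration under simplifying
assumptions: the helper name freeze_children() is made up, and it
assumes the caller is the parent of every process in the list, since
waitpid() can only report on children. A real checkpoint tool would
need another way to confirm that unrelated processes have stopped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Stop every process in pids[] and wait until each one is reported
 * as stopped.  Returns 0 on success, -1 on the first failure.
 * ASSUMPTION: all pids are children of the caller.
 */
static inline int freeze_children(const pid_t *pids, size_t n)
{
	size_t i;

	assert(pids || n == 0);

	/* First pass: ask everything to stop. */
	for (i = 0; i < n; i++)
		if (kill(pids[i], SIGSTOP) < 0)
			return -1;

	/* Second pass: confirm each child has actually stopped. */
	for (i = 0; i < n; i++) {
		int status;

		if (waitpid(pids[i], &status, WUNTRACED) < 0)
			return -1;
		if (!WIFSTOPPED(status))
			return -1;
	}
	return 0;
}
```

Only once this returns 0 is it safe to start storing per-process state,
because nothing in the set can run and change it underneath you.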

The end result is that you could freeze and resume _any_ process tree  
in a container, even ones that do weird things with filehandles,  
sockets, pids, etc.  Personally I would find this useful to migrate  
an extensive memory-leak-debugging session in a large application  
over from laptop to desktop or vice versa.  You could also freeze/ 
migrate/restore a whole X session (not including the X process  
itself, but all the client apps).  You could not do this at all for  
statically-linked applications without kernel support, and it would  
be rather inefficient to do even for dynamically-linked ones.

The one other option would be to allow opening a file /proc/$PID/ 
control, to which you could write a signal number, and require  
freezable programs to use that interface to reliably send signals  
(This also makes signals non-racy if you're reusing all 60000+ pids  
on a regular basis).  This has the disadvantage of not working  
for existing dynamically-linked programs either.

Cheers,
Kyle Moffett

--
There is no way to make Linux robust with unreliable memory  
subsystems, sorry.  It would be like trying to make a human more  
robust with an unreliable O2 supply. Memory just has to work.
   -- Andi Kleen



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 16:03       ` Alan Cox
  2006-01-17 17:16         ` Kyle Moffett
@ 2006-01-17 17:25         ` Dave Hansen
  2006-01-18  4:54           ` Greg KH
                             ` (2 more replies)
  1 sibling, 3 replies; 136+ messages in thread
From: Dave Hansen @ 2006-01-17 17:25 UTC (permalink / raw)
  To: Alan Cox
  Cc: Serge E. Hallyn, Arjan van de Ven, linux-kernel, Hubertus Franke,
	Cedric Le Goater

On Tue, 2006-01-17 at 16:03 +0000, Alan Cox wrote: 
> On Maw, 2006-01-17 at 09:56 -0600, Serge E. Hallyn wrote:
> > The virtual pid is different depending on who is asking.  So simply
> > storing current->realpid and current->pid isn't helpful, as we would
> > still need to call a function when a pid crosses user->kernel boundary.
> 
> This is an obscure, weird piece of functionality for some special case
> usages most of which are going to be eliminated by Xen. I don't see the
> kernel side justification for it at all.

At least OpenVZ and vserver want very similar functionality.  They're
both working with out-of-tree patch sets.  We each want to do subtly
different things with tsk->pid, and task_pid() seemed to be a decent
place to start.  OpenVZ has a very similar concept in its pid_task()
function.

Arjan had a very good point last time we posted these: we should
consider getting rid of as many places in the kernel where pids are used
to uniquely identify tasks, and just stick with task_struct pointers.  

> Maybe you should remap it the other side of the user->kernel boundary ?

We would very much like to run unmodified applications and libraries.
Doing it in a patched libc is certainly a possibility, but it wouldn't
make any future customers very happy.

I also wonder if RedHat or SUSE would ever ship and support a special
set of libraries for us.  Oh, and there are always statically linked
apps... :)

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-17 17:08   ` Dave Hansen
@ 2006-01-17 18:09     ` Suleiman Souhlal
  2006-01-17 18:12       ` Dave Hansen
  0 siblings, 1 reply; 136+ messages in thread
From: Suleiman Souhlal @ 2006-01-17 18:09 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Serge Hallyn, linux-kernel, Hubertus Franke, Cedric Le Goater

Dave Hansen wrote:
> One use for containers might be to pick a container from a system, wrap
> it up, and transport it to another system where it would continue to
> run.  We would have to make sure that the pids did not collide with any
> containers running on the target system.

Couldn't you assign new pids when the container is transported to the 
other system?

-- Suleiman

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-17 18:09     ` Suleiman Souhlal
@ 2006-01-17 18:12       ` Dave Hansen
  2006-01-17 18:29         ` Alan Cox
  0 siblings, 1 reply; 136+ messages in thread
From: Dave Hansen @ 2006-01-17 18:12 UTC (permalink / raw)
  To: Suleiman Souhlal
  Cc: Serge Hallyn, linux-kernel, Hubertus Franke, Cedric Le Goater

On Tue, 2006-01-17 at 10:09 -0800, Suleiman Souhlal wrote:
> Dave Hansen wrote:
> > One use for containers might be to pick a container from a system, wrap
> > it up, and transport it to another system where it would continue to
> > run.  We would have to make sure that the pids did not collide with any
> > containers running on the target system.
> 
> Couldn't you assign new pids when the container is transported to the 
> other system?

You do assign new pids, at least as far as the kernel is concerned.
However, any processes that continue to run would get confused if their
pid changed.  You have to make sure that the tasks have a _consistent_
view of which process is which pid.
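
To make that concrete, here is a small user-space sketch of the kind of
per-container table this implies: the vpid a task had before the move
stays fixed, while the kernel pid behind it is whatever the target
system hands out. The vpid_map names and the fixed-size array are
assumptions for illustration only and do not reflect how the patches
store the mapping.

```c
#include <assert.h>
#include <stddef.h>

#define VPID_MAP_MAX 64

/* Hypothetical per-container table: stable vpid -> current kernel pid. */
struct vpid_map {
	size_t n;
	int vpid[VPID_MAP_MAX];
	int pid[VPID_MAP_MAX];
};

/* Kernel pid currently backing vpid, or -1 if the vpid is unknown. */
static inline int vpid_map_to_pid(const struct vpid_map *m, int vpid)
{
	size_t i;

	for (i = 0; i < m->n; i++)
		if (m->vpid[i] == vpid)
			return m->pid[i];
	return -1;
}

/* vpid under which kernel pid is visible in the container, or -1. */
static inline int vpid_map_to_vpid(const struct vpid_map *m, int pid)
{
	size_t i;

	for (i = 0; i < m->n; i++)
		if (m->pid[i] == pid)
			return m->vpid[i];
	return -1;
}

/* Record a mapping; fails if the table is full or the vpid is taken. */
static inline int vpid_map_add(struct vpid_map *m, int vpid, int pid)
{
	assert(m);
	if (m->n >= VPID_MAP_MAX || vpid_map_to_pid(m, vpid) != -1)
		return -1;
	m->vpid[m->n] = vpid;
	m->pid[m->n] = pid;
	m->n++;
	return 0;
}
```

On restart, each task would be entered with its original vpid and its
newly assigned kernel pid, so that pids stored inside the applications
keep referring to the same tasks.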

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-17 18:12       ` Dave Hansen
@ 2006-01-17 18:29         ` Alan Cox
  2006-01-18 19:01           ` Dave Hansen
  0 siblings, 1 reply; 136+ messages in thread
From: Alan Cox @ 2006-01-17 18:29 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Suleiman Souhlal, Serge Hallyn, linux-kernel, Hubertus Franke,
	Cedric Le Goater

On Maw, 2006-01-17 at 10:12 -0800, Dave Hansen wrote:
> You do assign new pids, at least as far as the kernel is concerned.
> However, any processes that continue to run would get confused if their
> pid changed.  You have to make sure that the tasks have a _consistent_
> view of which process is which pid.

Don't reassign the pid at all. Keep task->container and do the job
explicitly. Most task searches for a pid are abstracted already and most
users of ->pid who try and use it for comparing two tasks for equality
or for keeping a task reference are already terminally racy and want
fixing anyway.

It raises a few other minor questions - one is /proc - but if container
0 was the usual one then putting the other containers into a subdir
would break nothing. Alternatively proc could allow multiple mounts and
a container = option to get the fs view right in chroot trees. The
subdirectories would be nice for management views.

You'd also need some process management items for other contexts - kill
etc but most of that can be done just by having a fork_into_container()
ability.

Alan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 17:25         ` Dave Hansen
@ 2006-01-18  4:54           ` Greg KH
  2006-01-18  4:55           ` Greg KH
  2006-02-02 14:49           ` Kirill Korotaev
  2 siblings, 0 replies; 136+ messages in thread
From: Greg KH @ 2006-01-18  4:54 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alan Cox, Serge E. Hallyn, Arjan van de Ven, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Tue, Jan 17, 2006 at 09:25:14AM -0800, Dave Hansen wrote:
> 
> I also wonder if RedHat or SUSE would ever ship and support a special
> set of libraries for us.

Don't both companies do that today with their s390 releases?  :)

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 17:25         ` Dave Hansen
  2006-01-18  4:54           ` Greg KH
@ 2006-01-18  4:55           ` Greg KH
  2006-01-18 16:23             ` Dave Hansen
  2006-02-02 14:49           ` Kirill Korotaev
  2 siblings, 1 reply; 136+ messages in thread
From: Greg KH @ 2006-01-18  4:55 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alan Cox, Serge E. Hallyn, Arjan van de Ven, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Tue, Jan 17, 2006 at 09:25:14AM -0800, Dave Hansen wrote:
> 
> Arjan had a very good point last time we posted these: we should
> consider getting rid of as many places in the kernel where pids are used
> to uniquely identify tasks, and just stick with task_struct pointers.  

That's a very good idea, why didn't you do that?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-18  4:55           ` Greg KH
@ 2006-01-18 16:23             ` Dave Hansen
  2006-01-20 17:00               ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Dave Hansen @ 2006-01-18 16:23 UTC (permalink / raw)
  To: Greg KH
  Cc: Alan Cox, Serge E. Hallyn, Arjan van de Ven, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Tue, 2006-01-17 at 20:55 -0800, Greg KH wrote:
> On Tue, Jan 17, 2006 at 09:25:14AM -0800, Dave Hansen wrote:
> > 
> > Arjan had a very good point last time we posted these: we should
> > consider getting rid of as many places in the kernel where pids are used
> > to uniquely identify tasks, and just stick with task_struct pointers.  
> 
> That's a very good idea, why didn't you do that?

Because we were being stupid and shouldn't have posted this massive set
of patches to LKML again before addressing the comments we got last
time, or doing _anything_ new with them.

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-17 18:29         ` Alan Cox
@ 2006-01-18 19:01           ` Dave Hansen
  2006-01-18 19:28             ` Arjan van de Ven
  2006-01-21 10:31             ` RFC [patch 00/34] PID Virtualization Overview Pavel Machek
  0 siblings, 2 replies; 136+ messages in thread
From: Dave Hansen @ 2006-01-18 19:01 UTC (permalink / raw)
  To: Alan Cox
  Cc: Suleiman Souhlal, Serge Hallyn, linux-kernel, Hubertus Franke,
	Cedric Le Goater

On Tue, 2006-01-17 at 18:29 +0000, Alan Cox wrote:
> On Maw, 2006-01-17 at 10:12 -0800, Dave Hansen wrote:
> > You do assign new pids, at least as far as the kernel is concerned.
> > However, any processes that continue to run would get confused if their
> > pid changed.  You have to make sure that the tasks have a _consistent_
> > view of which process is which pid.
> 
> Don't reassign the pid at all. Keep task->container and do the job
> explicitly. Most task searches for a pid are abstracted already and most
> users of ->pid who try and use it for comparing two tasks for equality
> or for keeping a task reference are already terminally racey and want
> fixing anyway.

Other than searches, there appear to be quite a number of drivers and
subsystems that like to print out pids.  I can't find any cases yet
where these are integral to functionality, but I wonder what approach we
should take.  Should we deprecate printk'ing of pids?  Make a special
function or % modifier to turn a task_struct into something printable?

A function would run into issues of having buffers in which to print the
output.  But, we'd be able to do things like:

	sprintf(buffer, "%d:%d", tsk->container, tsk->pid);

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-18 19:01           ` Dave Hansen
@ 2006-01-18 19:28             ` Arjan van de Ven
  2006-01-18 19:38               ` Dave Hansen
  2006-01-21 10:31             ` RFC [patch 00/34] PID Virtualization Overview Pavel Machek
  1 sibling, 1 reply; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-18 19:28 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alan Cox, Suleiman Souhlal, Serge Hallyn, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Wed, 2006-01-18 at 11:01 -0800, Dave Hansen wrote:
> On Tue, 2006-01-17 at 18:29 +0000, Alan Cox wrote:
> > On Maw, 2006-01-17 at 10:12 -0800, Dave Hansen wrote:
> > > You do assign new pids, at least as far as the kernel is concerned.
> > > However, any processes that continue to run would get confused if their
> > > pid changed.  You have to make sure that the tasks have a _consistent_
> > > view of which process is which pid.
> > 
> > Don't reassign the pid at all. Keep task->container and do the job
> > explicitly. Most task searches for a pid are abstracted already and most
> > users of ->pid who try and use it for comparing two tasks for equality
> > or for keeping a task reference are already terminally racey and want
> > fixing anyway.
> 
> > Other than searches, there appear to be quite a number of drivers and
> subsystems that like to print out pids.  I can't find any cases yet
> where these are integral to functionality, but I wonder what approach we
> should take. 

those should obviously print out the REAL pid, not the application
pid ... so no changes needed.



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-18 19:28             ` Arjan van de Ven
@ 2006-01-18 19:38               ` Dave Hansen
  2006-01-18 19:50                 ` Arjan van de Ven
  2006-01-18 22:54                 ` Alan Cox
  0 siblings, 2 replies; 136+ messages in thread
From: Dave Hansen @ 2006-01-18 19:38 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Suleiman Souhlal, Serge Hallyn, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Wed, 2006-01-18 at 20:28 +0100, Arjan van de Ven wrote:
> On Wed, 2006-01-18 at 11:01 -0800, Dave Hansen wrote:
> > Other than searches, there appear to be quite a number of drivers and
> > subsystems that like to print out pids.  I can't find any cases yet
> > where these are integral to functionality, but I wonder what approach we
> > should take. 
> 
> those should obviously print out the REAL pid, not the application
> pid ... so no changes needed.

One suggestion was to make all pid comparisons meaningless without some
kind of "container" context along with it.  The thought is that using
pids is inherently racy, and relatively meaningless anyway, so the
kernel shouldn't be dealing with them. (The obvious exception being in
userspace interfaces)

This would let tsk->pid be anything that it likes as long as it has a
unique pid in its container.

But, it seems that many drivers like to print out pids as a unique
identifier for the task.  Should we just let them print those
potentially non-unique identifiers, deprecate and kill them, or provide
a replacement with something else which is truly unique?

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-18 19:38               ` Dave Hansen
@ 2006-01-18 19:50                 ` Arjan van de Ven
  2006-01-18 22:54                 ` Alan Cox
  1 sibling, 0 replies; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-18 19:50 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alan Cox, Suleiman Souhlal, Serge Hallyn, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Wed, 2006-01-18 at 11:38 -0800, Dave Hansen wrote:
> On Wed, 2006-01-18 at 20:28 +0100, Arjan van de Ven wrote:
> > On Wed, 2006-01-18 at 11:01 -0800, Dave Hansen wrote:
> > > Other than searches, there appear to be quite a number of drivers and
> > > subsystems that like to print out pids.  I can't find any cases yet
> > > where these are integral to functionality, but I wonder what approach we
> > > should take. 
> > 
> > those should obviously print out the REAL pid, not the application
> > pid ... so no changes needed.
> 
> One suggestion was to make all pid comparisons meaningless without some
> kind of "container" context along with it.  The thought is that using
> pids is inherently racy

current->pid sure isn't racey, you yourself KNOW you're not going
away :)






^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-18 19:38               ` Dave Hansen
  2006-01-18 19:50                 ` Arjan van de Ven
@ 2006-01-18 22:54                 ` Alan Cox
  2006-01-19  7:15                   ` Arjan van de Ven
  2006-01-20 19:53                   ` RFC: Multiple instances of kernel namespaces Eric W. Biederman
  1 sibling, 2 replies; 136+ messages in thread
From: Alan Cox @ 2006-01-18 22:54 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Arjan van de Ven, Suleiman Souhlal, Serge Hallyn, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Mer, 2006-01-18 at 11:38 -0800, Dave Hansen wrote:
> But, it seems that many drivers like to print out pids as a unique
> identifier for the task.  Should we just let them print those
> potentially non-unique identifiers, deprecate and kill them, or provide
> a replacement with something else which is truly unique?

Pick a format for container number + pid and document/stick with it -
something like container::pid (eg 0::114) or 114[0] whatever so long as
it is consistent



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-18 22:54                 ` Alan Cox
@ 2006-01-19  7:15                   ` Arjan van de Ven
  2006-01-20  5:11                     ` Eric W. Biederman
  2006-01-20 19:53                   ` RFC: Multiple instances of kernel namespaces Eric W. Biederman
  1 sibling, 1 reply; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-19  7:15 UTC (permalink / raw)
  To: Alan Cox
  Cc: Dave Hansen, Suleiman Souhlal, Serge Hallyn, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On Wed, 2006-01-18 at 22:54 +0000, Alan Cox wrote:
> On Mer, 2006-01-18 at 11:38 -0800, Dave Hansen wrote:
> > But, it seems that many drivers like to print out pids as a unique
> > identifier for the task.  Should we just let them print those
> > potentially non-unique identifiers, deprecate and kill them, or provide
> > a replacement with something else which is truly unique?
> 
> Pick a format for container number + pid and document/stick with it -
> something like container::pid (eg 0::114) or 114[0] whatever so long as
> it is consistent

having a pid_to_string(<task struct>) or maybe task_to_string() thing
for convenient printing of pids/tasks.. I'm all for that. Means you can
even configure how verbose you want it to be (include ->comm or not,
->state maybe etc)


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-19  7:15                   ` Arjan van de Ven
@ 2006-01-20  5:11                     ` Eric W. Biederman
  2006-01-20 20:23                       ` Serge E. Hallyn
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-20  5:11 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Dave Hansen, Suleiman Souhlal, Serge Hallyn,
	linux-kernel, Hubertus Franke, Cedric Le Goater

Arjan van de Ven <arjan@infradead.org> writes:

> On Wed, 2006-01-18 at 22:54 +0000, Alan Cox wrote:
>> On Mer, 2006-01-18 at 11:38 -0800, Dave Hansen wrote:
>> > But, it seems that many drivers like to print out pids as a unique
>> > identifier for the task.  Should we just let them print those
>> > potentially non-unique identifiers, deprecate and kill them, or provide
>> > a replacement with something else which is truly unique?
>> 
>> Pick a format for container number + pid and document/stick with it -
>> something like container::pid (eg 0::114) or 114[0] whatever so long as
>> it is consistent
>
> having a pid_to_string(<task struct>) or maybe task_to_string() thing
> for convenient printing of pids/tasks.. I'm all for that. Means you can
> even configure how verbose you want it to be (include ->comm or not,
> ->state maybe etc)

The only way I can see to sanely do this is to pass it the temporary
buffer it writes its contents into.
Something like:
printk(KERN_XXX "%s\n", task_to_string(buf, tsk)); ?
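
A minimal user-space sketch of that calling convention, assuming a
hypothetical 'struct task' with integer 'container' and 'pid' fields
(not the real task_struct) and the container::pid format suggested
earlier in the thread.  The extra length argument is an addition of
this sketch so the helper cannot overrun the caller's buffer:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for task_struct; only the two fields
 * discussed in this thread are modelled, both assumed to be ints. */
struct task {
	int container;
	int pid;
};

/* Enough for two 32-bit ints, the "::" separator and the NUL. */
#define TASK_STR_LEN 32

/*
 * Format tsk as "container::pid" into the caller-supplied buffer and
 * return that buffer, so the call can sit directly in a printf-style
 * argument list.  Output is truncated if the buffer is too small.
 */
const char *task_to_string(char *buf, size_t len, const struct task *tsk)
{
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%d::%d", tsk->container, tsk->pid);
	return buf;
}
```

A caller would declare 'char buf[TASK_STR_LEN];' on its own stack and
write printf("%s\n", task_to_string(buf, sizeof(buf), tsk)), which
keeps the helper free of static storage and therefore reentrant.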


Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-18 16:23             ` Dave Hansen
@ 2006-01-20 17:00               ` Eric W. Biederman
  2006-01-20 20:18                 ` Hubertus Franke
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-20 17:00 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	linux-kernel, Hubertus Franke, Cedric Le Goater

Dave Hansen <haveblue@us.ibm.com> writes:

> On Tue, 2006-01-17 at 20:55 -0800, Greg KH wrote:
>> On Tue, Jan 17, 2006 at 09:25:14AM -0800, Dave Hansen wrote:
>> > 
>> > Arjan had a very good point last time we posted these: we should
>> > consider getting rid of as many places in the kernel where pids are used
>> > to uniquely identify tasks, and just stick with task_struct pointers.  
>> 
>> That's a very good idea, why didn't you do that?
>
> Because we were being stupid and shouldn't have posted this massive set
> of patches to LKML again before addressing the comments we got last
> time, or doing _anything_ new with them.

Actually a little progress has been made.  I think the patch set
continues to the point of usability this time or at least is close.

Although it feels like there are still some gaps when I read through
it.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* RFC: Multiple instances of kernel namespaces.
  2006-01-18 22:54                 ` Alan Cox
  2006-01-19  7:15                   ` Arjan van de Ven
@ 2006-01-20 19:53                   ` Eric W. Biederman
  2006-01-20 20:13                     ` Serge E. Hallyn
  1 sibling, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-20 19:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Serge Hallyn,
	Hubertus Franke, Cedric Le Goater


At this point I have to confess I have been working on something
similar to IBM's pid virtualization work.  But I have what is, at
least for me, a unifying concept that makes things easier to think
about.

The idea is to think about things in terms of namespaces.  Currently
in the kernel we have the fs/mount namespace already implemented.

Partly this helps decide what the interface for creating a new namespace
instance should be, 'clone(CLONE_NEW<NAMESPACE_TYPE>)', and how
it should be managed in the kernel data structures.

Partly thinking of things as namespaces helps me scope the problem.

Does this sound like a sane approach?

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
  2006-01-20 19:53                   ` RFC: Multiple instances of kernel namespaces Eric W. Biederman
@ 2006-01-20 20:13                     ` Serge E. Hallyn
  2006-01-20 20:22                       ` Hubertus Franke
  2006-01-21 10:04                       ` Eric W. Biederman
  0 siblings, 2 replies; 136+ messages in thread
From: Serge E. Hallyn @ 2006-01-20 20:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Serge Hallyn,
	Hubertus Franke, Cedric Le Goater

Quoting Eric W. Biederman (ebiederm@xmission.com):
> 
> At this point I have to confess I have been working on something
> similar, to IBM's pid virtualization work.  But I have what is at
> least for me a unifying concept, that makes things easier to think
> about.
> 
> The idea is to think about things in terms of namespaces.  Currently
> in the kernel we have the fs/mount namespace already implemented.
> 
> Partly this helps on what the interface for creating a new namespace
> instance should be.  'clone(CLONE_NEW<NAMESPACE_TYPE>)', and how
> it should be managed from the kernel data structures.
> 
> Partly thinking of things as namespaces helps me scope the problem.
> 
> Does this sound like a sane approach?

And a bonus of this is that for security and vserver-type applications,
the CLONE_NEWPID and CLONE_NEWFS will often happen at the same time.

How do you (or do you?) address naming namespaces?  This would be
necessary for transitioning into an existing namespace, performing
actions on existing namespaces (i.e. checkpoint, migrate to another
machine, enter the namespace and kill pid 521), and would just be
useful for accounting purposes, i.e. how else do you have a
"ps --all-namespaces" specify a process' namespace?

Doubt we want to add an argument to clone(), so do we just add a new
proc, sysfs, or syscall for setting a pid-namespace name?

Do we need a new syscall for transitioning into an existing namespace?

thanks,
-serge

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-20 17:00               ` Eric W. Biederman
@ 2006-01-20 20:18                 ` Hubertus Franke
  2006-01-21 10:25                   ` Eric W. Biederman
                                     ` (2 more replies)
  0 siblings, 3 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-01-20 20:18 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Eric W. Biederman wrote:
> Dave Hansen <haveblue@us.ibm.com> writes:
> 
> 
>>On Tue, 2006-01-17 at 20:55 -0800, Greg KH wrote:
>>
>>>On Tue, Jan 17, 2006 at 09:25:14AM -0800, Dave Hansen wrote:
>>>
>>>>Arjan had a very good point last time we posted these: we should
>>>>consider getting rid of as many places in the kernel where pids are used
>>>>to uniquely identify tasks, and just stick with task_struct pointers.  
>>>
>>>That's a very good idea, why didn't you do that?
>>
>>Because we were being stupid and shouldn't have posted this massive set
>>of patches to LKML again before addressing the comments we got last
>>time, or doing _anything_ new with them.
> 
> 
> Actually a little progress has been made.  I think the patch set
> continues to the point of usability this time or at least is close.
> 
> Although it feels like there are still some gaps when I read through
> it.
> 
> Eric
> 

Let me just summarize the discussion that has taken place so far
and the consequences I/we seem to be drawing out of it.

We discussed the various approaches that are floating around now, enough
has been said about each, so I leave it at that ...
(a) GLIBC intercept LD_PRELOAD	
(b) Binary Rewrite of glibc
(c) syscall table intercept		(see ZAP)
(d) vpid approach			(see "IBM" patches posted)
(e) <pid,container> approach 		(see below, suggested by Alan?.. )

There are several issues that came up in the email exchange ( Arjan, Alan Cox, .. ).
[ Please feel free to tell me if I/we captured or are misinterpreting these wrong ]

1st:	
====
Issue: we don't need all the task_pid() etc. functions; just stick to what
it was: task->pid !

Consensus: It seems consensus is forming on that ..
Actions: remove patches 1-12/34 and adopt the rest straightforwardly

2nd:
====	
Issue: we don't need pid virtualization, instead simply use <container,pid> pair.

This requires a bit more thought. Essentially that's what I was doing, but I mangled
them into the same pid and used masking to add/remove the container for internal use.
As pointed out by Alan(?), we can indeed reuse the same pid internally many times
as long as we can distinguish them during the pid-to-task_struct lookup. This is easily
done because the caller provides the context, hence the container, for the lookup.
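
As a rough user-space illustration of such a lookup (not the kernel's
actual pid hash), assume a hypothetical 'struct task' carrying an
integer container id, and a flat array standing in for the task list.
The same pid may appear once per container, and the caller's own
container selects which one is meant:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct. */
struct task {
	int container;		/* container the task belongs to */
	int pid;		/* unique only within that container */
	const char *comm;
};

/* Return the task with this pid inside this container, or NULL. */
const struct task *find_task_by_pid_in(const struct task *tasks, size_t n,
				       int container, int pid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (tasks[i].container == container && tasks[i].pid == pid)
			return &tasks[i];
	return NULL;
}

/*
 * Look up pid on behalf of caller: the container is never passed in
 * explicitly, it is taken from the caller's own context.
 */
const struct task *find_task_for_caller(const struct task *tasks, size_t n,
					const struct task *caller, int pid)
{
	return find_task_by_pid_in(tasks, n, caller->container, pid);
}
```

The sketch deliberately leaves out the global view of container 0; in
this model a caller in container 0 only sees tasks tagged with 0.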

Actions: The vpid_to_pid will disappear and the check for whether we are in the same
container needs to be pushed down into the task lookup. The question remains
whether the context of the task lookup will always be the caller.

Doing so has an implication, namely that we are moving over to "system containers".
The current implementation requires the vpid/pid only for the boundary condition at the
top of the container (to rewrite pid=1) and its parent and the fact that we wanted
a global look through container=0.
If said boundary were eliminated and we simply made a container a child of the
initproc (pid=1), this would be unnecessary.

Altogether this would provide private namespaces (as just suggested by Eric).

The feeling is that large parts of the patch could be reduced by this.

What we need is a new system call (similar to vserver), or maybe we can continue
the /proc approach for now...

sys_exec_container(const char *container_name, pid_t pid, unsigned int flags, const char *argv, const char *envp);

exec_container creates a new container (if indicated in flags) and a new task in it that reports to parent initproc.
If a non-zero pid is specified we use that pid, otherwise the system will allocate it. Finally
it creates a new session id, chroots, and execs the specified program.

What we lose with this is the session and the tty, which Cedric described as application
container...

The sys_exec_container(...)  seems to be similar to what Eric just called clone_namespace()

-- Hubertus

______________________________________________________


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
  2006-01-20 20:13                     ` Serge E. Hallyn
@ 2006-01-20 20:22                       ` Hubertus Franke
       [not found]                         ` <20060120203555.GC13265@sergelap.austin.ibm.com>
  2006-01-21 10:04                       ` Eric W. Biederman
  1 sibling, 1 reply; 136+ messages in thread
From: Hubertus Franke @ 2006-01-20 20:22 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Eric W. Biederman, linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Cedric Le Goater

Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
> 
>>At this point I have to confess I have been working on something
>>similar, to IBM's pid virtualization work.  But I have what is at
>>least for me a unifying concept, that makes things easier to think
>>about.
>>
>>The idea is to think about things in terms of namespaces.  Currently
>>in the kernel we have the fs/mount namespace already implemented.
>>
>>Partly this helps on what the interface for creating a new namespace
>>instance should be.  'clone(CLONE_NEW<NAMESPACE_TYPE>)', and how
>>it should be managed from the kernel data structures.
>>
>>Partly thinking of things as namespaces helps me scope the problem.
>>
>>Does this sound like a sane approach?
> 
> 
> And a bonus of this is that for security and vserver-type applications,
> the CLONE_NEWPID and CLONE_NEWFS will often happen at the same time.
> 
> How do you (or do you?) address naming namespaces?  This would be
> necessary for transitioning into an existing namespace, performing
> actions on existing namespaces (i.e. checkpoint, migrate to another
> machine, enter the namespace and kill pid 521), and would just be
> useful for accounting purposes, i.e. how else do you have a
> "ps --all-namespaces" specify a process' namespace?
> 
> Doubt we want to add an argument to clone(), so do we just add a new
> proc, sysfs, or syscall for setting a pid-namespace name?
> 
> Do we need a new syscall for transitioning into an existing namespace?
> 
> thanks,
> -serge
> 


Just addressed a few of these in my previous reply to the other thread.

However, the question here is whether the container (as we used it) provides
the "binding" object for these clones. One question for me then is
whether cloning of namespaces is always done in tandem.
As you are bringing the migration up, we can only clone fully contained
namespaces ! One could make that a condition of the migration or build
it right into the initial structure. Any thoughts on that ?

-- Hubertus


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-20  5:11                     ` Eric W. Biederman
@ 2006-01-20 20:23                       ` Serge E. Hallyn
  2006-01-20 20:33                         ` Hubertus Franke
  0 siblings, 1 reply; 136+ messages in thread
From: Serge E. Hallyn @ 2006-01-20 20:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Arjan van de Ven, Alan Cox, Dave Hansen, Suleiman Souhlal,
	Serge Hallyn, linux-kernel, Hubertus Franke, Cedric Le Goater

Quoting Eric W. Biederman (ebiederm@xmission.com):
> Arjan van de Ven <arjan@infradead.org> writes:
> 
> > On Wed, 2006-01-18 at 22:54 +0000, Alan Cox wrote:
> >> On Mer, 2006-01-18 at 11:38 -0800, Dave Hansen wrote:
> >> > But, it seems that many drivers like to print out pids as a unique
> >> > identifier for the task.  Should we just let them print those
> >> > potentially non-unique identifiers, deprecate and kill them, or provide
> >> > a replacement with something else which is truly unique?
> >> 
> >> Pick a format for container number + pid and document/stick with it -
> >> something like container::pid (eg 0::114) or 114[0] whatever so long as
> >> it is consistent
> >
> > having a pid_to_string(<task struct>) or maybe task_to_string() thing
> > for convenient printing of pids/tasks.. I'm all for that. Means you can
> > even configure how verbose you want it to be (include ->comm or not,
> > ->state maybe etc)
> 
> The only way I can see to sanely do this is to pass it the temporary
> buffer it writes its contents into.
> Something like:
> printk(KERN_XXX "%s\n", task_to_string(buf, tsk)); ?

That's kind of neat :)

The only other thing I can think of is to do something like

#define task_str(tsk) tsk->container_id, tsk->pid
or
#define task_str(tsk) tsk->container_id, ":", tsk->pid

and have it be used as

printk(KERN_XXX "%s::%s\n", task_str(tsk));
or
printk(KERN_XXX "%s%s%s\n", task_str(tsk));

The only reason I point it out is that we don't risk memory corruption
if the printk caller forgets to give the extra '%s's, like we do if
the caller forgets they need char buf[PID_CONTAINER_MAXLENGTH] instead
of 'char *buf;' or 'char buf;'.
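
For illustration, here is a user-space sketch of the macro variant.  It
assumes a hypothetical 'struct task' whose container_id and pid are
plain ints, so the conversions become %d rather than %s, and it adds a
companion TASK_FMT macro (my own addition) so that the format string
and the argument list cannot drift apart:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for task_struct; both ids assumed to be ints. */
struct task {
	int container_id;
	int pid;
};

/* Format fragment matching the two arguments task_str() expands to. */
#define TASK_FMT "%d::%d"

/* Expands to two comma-separated arguments, not to a single value. */
#define task_str(tsk) (tsk)->container_id, (tsk)->pid

/*
 * Example caller: writes "task <container>::<pid>" into out and
 * returns what snprintf returns (the untruncated length).
 */
int describe_task(char *out, size_t len, const struct task *tsk)
{
	return snprintf(out, len, "task " TASK_FMT, task_str(tsk));
}
```

No temporary buffer is needed for the task itself.  The price is that
task_str() is only usable inside an argument list, since it does not
evaluate to one value that could be stored or passed on.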

-serge

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-20 20:23                       ` Serge E. Hallyn
@ 2006-01-20 20:33                         ` Hubertus Franke
  2006-01-21 10:34                           ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Hubertus Franke @ 2006-01-20 20:33 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Eric W. Biederman, Arjan van de Ven, Alan Cox, Dave Hansen,
	Suleiman Souhlal, linux-kernel, Cedric Le Goater

Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
> 
>>Arjan van de Ven <arjan@infradead.org> writes:
>>
>>
>>>On Wed, 2006-01-18 at 22:54 +0000, Alan Cox wrote:
>>>
>>>>On Mer, 2006-01-18 at 11:38 -0800, Dave Hansen wrote:
>>>>
>>>>>But, it seems that many drivers like to print out pids as a unique
>>>>>identifier for the task.  Should we just let them print those
>>>>>potentially non-unique identifiers, deprecate and kill them, or provide
>>>>>a replacement with something else which is truly unique?
>>>>
>>>>Pick a format for container number + pid and document/stick with it -
>>>>something like container::pid (eg 0::114) or 114[0] whatever so long as
>>>>it is consistent
>>>
>>>having a pid_to_string(<task struct>) or maybe task_to_string() thing
>>>for convenient printing of pids/tasks.. I'm all for that. Means you can
>>>even configure how verbose you want it to be (include ->comm or not,
>>>->state maybe etc)
>>
>>The only way I can see to sanely do this is to pass it the temporary
>>buffer it writes its contents into.
>>Something like:
>>printk(KERN_XXX "%s\n", task_to_string(buf, tsk)); ?
> 
> 
> That's kind of neat :)
> 
> The only other thing I can think of is to do something like
> 
> #define task_str(tsk) tsk->container_id, tsk->pid
> or
> #define task_str(tsk) tsk->container_id, ":", tsk->pid
> 
> and have it be used as
> 
> printk(KERN_XXX "%s::%s\n", task_str(tsk));
> or
> printk(KERN_XXX "%s%s%s\n", task_str(tsk));
> 
> The only reason I point it out is that we don't risk memory corruption
> if the printk caller forgets to give the extra '%s's, like we do if
> the caller forgets they need char buf[PID_CONTAINER_MAXLENGTH] instead
> of 'char *buf;' or 'char buf;'.
> 
> -serge
> 

As odd as this looks, it does have its benefits, as does anything that avoids
potential problems.

On the other hand you might run into problems with the following.

		char *str = task_str(tsk);

Either way .. I don't think these are the big fish to fry now :-)

-- Hubertus


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
       [not found]                         ` <20060120203555.GC13265@sergelap.austin.ibm.com>
@ 2006-01-20 21:47                           ` Hubertus Franke
  0 siblings, 0 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-01-20 21:47 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Eric W. Biederman, linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Cedric Le Goater

Serge E. Hallyn wrote:
> Quoting Hubertus Franke (frankeh@watson.ibm.com):
> 
>>However, question here is whether the container (as we used it) provides
>>the "binding" object for these clones. One question for me then is
>>whether cloning of namespaces is always done in tandem.
> 
> 
> No.

Thought so..

> 
> 
>>As you are bringing the migration up, we can only clone fully contained
> 
> 
> By clone do you actually mean clone(), or did you mean restart from
> checkpoint?

clone_<namespace>, so it's neither ...
Essentially creating a new namespace ! That's what Eric was suggesting.

> 
> If clone, then I don't understand the problem.
> 
> If restart from checkpoint/migrate, then I think the answer has to be
> that that is a special case which we have to handle.  Note that to clone
> a fs namespace, you need CAP_SYS_ADMIN.  We could add another check in
> there to deny CLONE_NEWNS when CLONE_NEWPID is not specified IF and ONLY
> IF we are already no longer in container_id==0.  Or even better, when
> a pid-namespace has been designated as migrateable.
> 
> Anything other than that would be too limiting.  Note that fs namespaces
> are going to be used for multi-level directories, for instance.

That's a reasonable approach. Give the general capability (since C/R + migration
is an additional capability that might not be utilized by many) and leave it to
the sys_admin to specify what is allowed or not
>  
>>namespaces ! One could make that a condition of the migration or build
>>it right into the initial structure. Any thoughts on that ?
>  
> So in other words I'm saying that this is the admin/user's problem to
> keep straight.  Dealing with fs-namespaces in this sense could perhaps be
> dealt with later by hand in checkpoint/migrate/restore code by
> 	a) at checkpoint:
> 		i) checking the fs-namespace of each process or thread
> 		ii) storing /proc/mounts for each fs-namespace
> 	b) at restore, do CLONE_NEWNS for each process which needs it,
> 		and using the stored /proc/mounts to rebuild the
> 		namespace.
> 
Something like it .. yes...

> Of course /proc mounts is itself relative to a namespace in the
> case of bind mounts, so I'm actually not sure this is feasible.
> 
> -serge
> 



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
  2006-01-20 20:13                     ` Serge E. Hallyn
  2006-01-20 20:22                       ` Hubertus Franke
@ 2006-01-21 10:04                       ` Eric W. Biederman
  2006-01-26 19:47                         ` Herbert Poetzl
  1 sibling, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-21 10:04 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Hubertus Franke,
	Cedric Le Goater

"Serge E. Hallyn" <serue@us.ibm.com> writes:

> Quoting Eric W. Biederman (ebiederm@xmission.com):
>> 
>> At this point I have to confess I have been working on something
>> similar, to IBM's pid virtualization work.  But I have what is at
>> least for me a unifying concept, that makes things easier to think
>> about.
>> 
>> The idea is to think about things in terms of namespaces.  Currently
>> in the kernel we have the fs/mount namespace already implemented.
>> 
>> Partly this helps on what the interface for creating a new namespace
>> instance should be.  'clone(CLONE_NEW<NAMESPACE_TYPE>)', and how
>> it should be managed from the kernel data structures.
>> 
>> Partly thinking of things as namespaces helps me scope the problem.
>> 
>> Does this sound like a sane approach?
>
> And a bonus of this is that for security and vserver-type applications,
> the CLONE_NEWPID and CLONE_NEWFS will often happen at the same time.
>
> How do you (or do you?) address naming namespaces?  This would be
> necessary for transitioning into an existing namespace, performing
> actions on existing namespaces (i.e. checkpoint, migrate to another
> machine, enter the namespace and kill pid 521), and would just be
> useful for accounting purposes, i.e. how else do you have a
> "ps --all-namespaces" specify a process' namespace?

So I address naming indirectly.  The last thing I want to have
is to add yet another namespace to the kernel for naming namespaces.
We have enough namespaces already.

In any sane context for a pid-namespace we need a pid that
we can call waitpid on, so we don't break the process tree.
Which means at least the init process has 2 pids, one
that its parent sees, and another (1) that it and its
children see.

So I name pidspaces like we do process groups and
sessions: by the pid of the leader.

So in the simple case I have names like:
1178/1632
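
A minimal sketch of how such a name could be built, assuming a name is
simply the leader pids joined with '/', outermost pidspace first.  The
function pidspace_name() and that layout are illustrative assumptions,
not code from any posted patch.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper: join the leader pids of nested pidspaces,
 * outermost first, into a name such as "1178/1632".
 * Returns the name length, or -1 if buf is too small.
 */
int pidspace_name(const int *leader_pids, size_t depth,
		  char *buf, size_t buflen)
{
	size_t used = 0;

	if (buflen == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < depth; i++) {
		/* Every component after the first gets a '/' separator. */
		int n = snprintf(buf + used, buflen - used,
				 i ? "/%d" : "%d", leader_pids[i]);

		if (n < 0 || (size_t)n >= buflen - used)
			return -1;	/* would not fit */
		used += (size_t)n;
	}
	return (int)used;
}
```

Called with the leader pids { 1178, 1632 } and a large enough buffer,
this produces the name shown above.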

> Doubt we want to add an argument to clone(), so do we just add a new
> proc, sysfs, or syscall for setting a pid-namespace name?

That shouldn't be necessary.

> Do we need a new syscall for transitioning into an existing namespace?

That is a good question.  The FS namespaces that we already have
have much the same problem.  A completely different solution to
this problem seems to have been implemented but I don't grasp it
yet.

Inherently transitioning to an existing namespace is something
that is straightforward to implement, so it is worth thinking
about.

If I want a guest that can keep secrets from the host sysadmin I don't
want transitioning into a guest namespace to come too easily.

Currently I can always just create an extra child of pid 1
that will be my slave.  The problem is that this is an extra
process lying around.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-20 20:18                 ` Hubertus Franke
@ 2006-01-21 10:25                   ` Eric W. Biederman
  2006-01-23 18:38                     ` Hubertus Franke
  2006-01-21 14:42                   ` Eric W. Biederman
  2006-01-31 21:02                   ` RFC [patch 13/34] PID Virtualization Define new task_pid api Linus Torvalds
  2 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-21 10:25 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Hubertus Franke <frankeh@watson.ibm.com> writes:

> Eric W. Biederman wrote:

> Let me just summarize the discussion that has taken place so far
> and the consequences I/we seem to be drawing out of it.
>
> We discussed the various approaches that are floating around now, enough
> has been said about each, so I leave it at that ...
> (a) GLIBC intercept LD_PRELOAD	
> (b) Binary Rewrite of glibc
> (c) syscall table intercept		(see ZAP)
> (d) vpid approach			(see "IBM" patches posted)
> (e) <pid,container> approach 		(see below, suggested by Alan?.. )
>     That seems to have been an observation of the current patchset.
>
> There are several issues that came up in the email exchange ( Arjan, Alan Cox,
> .. ).
> [ Please feel free to tell me if I/we captured or misinterpreted these wrong ]

...
> Actions: The vpid_to_pid will disappear and the check for whether we are in
> the same container needs to be pushed down into the task lookup. question
> remains to figure out whether the context of the task lookup (will always
> remain the caller ?).

You don't need a same container check.  If something is in another container
it becomes invisible to you.

> Doing so has an implication, namely that we are moving over to "system
> containers".  The current implementation requires the vpid/pid only for the
> boundary condition at the top of the container (to rewrite pid=1) and its
> parent and the fact that we wanted a global look through container=0.
> If said boundary would be eliminated and we simply make a container a child
> of the initproc (pid=1), this would be unnecessary.
>
> all together this would provide private namespaces (as just suggested by Eric).
>
> The feeling would be that large parts of the patch could be reduced by this.

I concur.  Except I think the initial impact could still be large.
It may be worth breaking all users of pids just so we audit them.

But that will certainly result in no long term cost, or runtime overhead.

> What we need is a new system call (similar to vserver) or maybe we can continue
> the /proc approach for now...
>
> sys_exec_container(const char *container_name, pid_t pid, unsigned int flags,
> 		   const char *argv, const char *envp);
>
> exec_container creates a new container (if indicated in flags) and a new task in
> it that reports to parent initproc.  If a non-zero pid is specified we use that
> pid, otherwise the system will allocate it.  Finally it creates a new session
> id, chroots and execs the specified program.
>
> What we lose with this is the session and the tty, which Cedric described as
> application container...
>
> The sys_exec_container(...)  seems to be similar to what Eric just called
> clone_namespace()

Similar. But I was actually talking about just adding another flag to
sys_clone, the syscall underlying fork().  Basically it is just another
resource to share or not share.
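
A user-space sketch of that share-or-not-share decision, under loud
assumptions: SKETCH_CLONE_NEWPID, struct pidspace and sketch_clone() are
made-up names for illustration, and the child's second pid in the parent's
pidspace is left out for brevity.

```c
#include <stdlib.h>

#define SKETCH_CLONE_NEWPID 0x1u	/* hypothetical flag, not a kernel flag */

/* Hypothetical pid namespace: a refcount plus the next pid to hand out. */
struct pidspace {
	int refcount;
	int next_pid;
};

struct task {
	struct pidspace *space;
	int pid;
};

/*
 * Mock of the clone decision: with the flag the child gets a fresh
 * pidspace and becomes pid 1 in it; without the flag it shares the
 * parent's pidspace, like any other shared resource.
 * Returns 0 on success, -1 on allocation failure.
 */
int sketch_clone(const struct task *parent, unsigned int flags,
		 struct task *child)
{
	if (flags & SKETCH_CLONE_NEWPID) {
		struct pidspace *space = malloc(sizeof(*space));

		if (!space)
			return -1;
		space->refcount = 1;
		space->next_pid = 1;
		child->space = space;
	} else {
		child->space = parent->space;
		child->space->refcount++;
	}
	child->pid = child->space->next_pid++;
	return 0;
}
```

The point of the sketch is only that the pidspace is handled the same way
as the other clone-able resources: one flag selects between taking a
reference and creating a new instance.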

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-18 19:01           ` Dave Hansen
  2006-01-18 19:28             ` Arjan van de Ven
@ 2006-01-21 10:31             ` Pavel Machek
  1 sibling, 0 replies; 136+ messages in thread
From: Pavel Machek @ 2006-01-21 10:31 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alan Cox, Suleiman Souhlal, Serge Hallyn, linux-kernel,
	Hubertus Franke, Cedric Le Goater

On St 18-01-06 11:01:52, Dave Hansen wrote:
> On Tue, 2006-01-17 at 18:29 +0000, Alan Cox wrote:
> > On Maw, 2006-01-17 at 10:12 -0800, Dave Hansen wrote:
> > > You do assign new pids, at least as far as the kernel is concerned.
> > > However, any processes that continue to run would get confused if their
> > > pid changed.  You have to make sure that the tasks have a _consistent_
> > > view of which process is which pid.
> > 
> > Don't reassign the pid at all. Keep task->container and do the job
> > explicitly. Most task searches for a pid are abstracted already and most
> > users of ->pid who try and use it for comparing two tasks for equality
> > or for keeping a task reference are already terminally racey and want
> > fixing anyway.
> 
> Other than searches, there appear to be quite a number of drivers and
> subsystems that like to print out pids.  I can't find any cases yet
> where these are integral to functionality, but I wonder what approach we
> should take.  Should we deprecate printk'ing of pids?  Make a special
> function or % modifier to turn a task_struct into something printable?
> 
> A function would run into issues of having buffers in which to print the
> output.  But, we'd be able to do things like:
> 
> 	sprintf(buffer, "%d:%d", tsk->container, tsk->pid);

What about first fixing all the drivers to use print_task() or
something like that, where print_task would print the name too (for
example)? That way, we get more useful data *now* and you can fix it
any way you want in future.

char *print_task() doing pretty-printing should be enough.
							Pavel



-- 
Thanks, Sharp!

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 00/34] PID Virtualization Overview
  2006-01-20 20:33                         ` Hubertus Franke
@ 2006-01-21 10:34                           ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-21 10:34 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Serge E. Hallyn, Arjan van de Ven, Alan Cox, Dave Hansen,
	Suleiman Souhlal, linux-kernel, Cedric Le Goater

Hubertus Franke <frankeh@watson.ibm.com> writes:

> As odd as this looks .. it does have the benefits and anything that avoids
> potential problems.
>
> On the other hand you might run into problems with the following.
>
> 		char *str = task_str(tsk);
>
> Either way .. I don't think these are the big fish to fry now :-)

Except there are really no small fish :)

This solves the one really ugly part of my current patch,
that I had simply not thought through.

There is already something similar for paths in the fs
namespace.

char * d_path(struct dentry *dentry, struct vfsmount *vfsmnt,
				char *buf, int buflen);
Which does exactly this.

Now frequently it is passed a page-sized buffer, so
it's not quite the same, but close enough.
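
A sketch of what a task_str() with the same caller-supplied-buffer
convention might look like.  The struct, the field names and the
"container:pid(name)" output format are all assumptions made up for this
illustration; only the buffer-passing shape is borrowed from d_path().

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the few task fields being printed. */
struct sketch_task {
	int container;
	int pid;
	const char *comm;
};

/*
 * Format a task into the caller's buffer, d_path() style: the caller
 * owns the storage, so nothing static or allocated is handed back.
 * Returns buf on success, NULL if the text does not fit.
 */
char *task_str(const struct sketch_task *tsk, char *buf, size_t buflen)
{
	int n = snprintf(buf, buflen, "%d:%d(%s)",
			 tsk->container, tsk->pid, tsk->comm);

	if (n < 0 || (size_t)n >= buflen)
		return NULL;
	return buf;
}
```

Because the buffer belongs to the caller, a one-argument form such as
"char *str = task_str(tsk);" cannot be written by accident, which avoids
the lifetime question raised in the quoted mail.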

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-20 20:18                 ` Hubertus Franke
  2006-01-21 10:25                   ` Eric W. Biederman
@ 2006-01-21 14:42                   ` Eric W. Biederman
  2006-01-22  6:43                     ` Kyle Moffett
  2006-01-23 18:50                     ` Hubertus Franke
  2006-01-31 21:02                   ` RFC [patch 13/34] PID Virtualization Define new task_pid api Linus Torvalds
  2 siblings, 2 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-21 14:42 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Hubertus Franke <frankeh@watson.ibm.com> writes:

> 2nd:
> ====	
> Issue: we don't need pid virtualization, instead simply use <container,pid>
> pair.
>
> This requires a bit more thought. Essentially that's what I was doing, but I
> mangled them into the same pid and using masking to add/remove the
> container for internal use.  As pointed out by Alan(?), we can
> indeed reuse the same pid internally many times as long as we can
> distinguish during the pid-to-task_struct lookup. This is easily
> done because, the caller provides the context hence the container for the
> lookup.
>
> Actions: The vpid_to_pid will disappear and the check for whether we
> are in the same container needs to be pushed down into the task
> lookup. question remains to  figure out  whether the context of the
> task lookup (will always remain the caller ?). 

Any place the kernel saves a pid and then proceeds to signal it later.
At that later point in time it is possible you will be in the wrong
context.

This probably justifies having a kpid_t that has both the process
space id and the pid in it.  For when the kernel is storing pids to
use as weak references, for signal purposes etc.

At least tty_io.c and fcntl.c have examples where the caller
may not have the proper context.

> Doing so has an implication, namely that we are moving over to "system
> containers". The current implementation requires the vpid/pid only
> for the boundary condition at the top of the container (to rewrite
> pid=1) and its parent and the fact that we wanted a global look
> through container=0. If said boundary would be eliminated and we
> simply make a container a child of the initproc (pid=1), this would
> be unnecessary. 
>
> all together this would provide private namespaces (as just suggested by Eric).
>
> The feeling would be that large parts of the patch could be reduced by
> this.

Simplified, and made easier to understand.  I don't know if the number
of lines affected can or should be reduced.  

One of my problems with your current approach is that it doesn't help
identify where you have problems.

I have found a specific example that your current patches get wrong,
because you make assumptions about which context is valid.

From kernel/kthread.c:
> static void keventd_create_kthread(void *_create)
> {
> 	struct kthread_create_info *create = _create;
> 	int pid;
> 
> 	/* We want our own signal handler (we take no signals by default). */
> 	pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
> 	if (pid < 0) {
> 		create->result = ERR_PTR(pid);
> 	} else {
> 		wait_for_completion(&create->started);
> 		create->result = find_task_by_pid(pid);
> 	}
> 	complete(&create->done);
> }

kernel_thread() is a light wrapper around do_fork().
do_fork returns a virtual pid.
find_task_by_pid takes a pid with the upper bits holding the process
   space id.

Therefore if this function or a cousin of it was ever triggered
by a userspace application in a virtual context find_task_by_pid
would fail to find the task structure.
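
The mismatch can be reproduced in a few lines of user-space C.  This is
only a sketch: the 22-bit split between process space id and virtual pid,
and every sketch_* name, are assumptions chosen for illustration and are
not taken from the posted patches.

```c
#include <stddef.h>

/* Assumed layout: space id in the upper bits, virtual pid in the lower 22. */
#define SKETCH_SPACE_SHIFT 22u
#define SKETCH_VPID_MASK   ((1u << SKETCH_SPACE_SHIFT) - 1u)

struct sketch_task {
	unsigned int kpid;	/* mangled <space, vpid> */
};

/* Combine a process space id and a virtual pid into one internal pid. */
unsigned int sketch_mangle(unsigned int space, unsigned int vpid)
{
	return (space << SKETCH_SPACE_SHIFT) | (vpid & SKETCH_VPID_MASK);
}

/* What a do_fork()-like call would hand back: the virtual pid only. */
unsigned int sketch_vpid(unsigned int kpid)
{
	return kpid & SKETCH_VPID_MASK;
}

/* Lookup that expects the mangled form, like find_task_by_pid() above. */
struct sketch_task *sketch_find_task_by_pid(struct sketch_task *tasks,
					    size_t n, unsigned int kpid)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i].kpid == kpid)
			return &tasks[i];
	return NULL;
}
```

For a task in space 2 with virtual pid 7, feeding sketch_vpid() straight
into the lookup returns NULL; only the re-mangled value finds the task.
Both values have the same C type, so the compiler has no way to object.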

The only way I know to make this change safely is to make compilation
of all functions that manipulate pids in possibly dangerous ways fail.
And then to manually and slowly fix them up.

That way, if something is missed, you get a compile error instead
of incorrect execution.
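
One way to get that compile error, sketched under assumptions: make the
kernel-side pid a distinct struct type so a plain integer cannot be passed
where it is expected.  The names upid_t, kpid_t, make_kpid() and
find_task_by_kpid() are hypothetical.

```c
#include <stddef.h>

typedef int upid_t;		/* hypothetical: pid as user space sees it */

/* Hypothetical kernel-side pid: process space id plus pid number. */
typedef struct {
	unsigned int space;
	int nr;
} kpid_t;

struct sketch_task {
	kpid_t kpid;
	const char *comm;
};

/* The only sanctioned way to turn a user-visible pid into a kpid_t. */
kpid_t make_kpid(unsigned int space, upid_t nr)
{
	kpid_t k = { space, nr };

	return k;
}

int kpid_eq(kpid_t a, kpid_t b)
{
	return a.space == b.space && a.nr == b.nr;
}

/*
 * Lookup keyed on the full <space, pid> pair.  Passing a bare int,
 * e.g. find_task_by_kpid(tasks, n, 521), is rejected by the compiler
 * because an int does not convert to a struct.
 */
struct sketch_task *find_task_by_kpid(struct sketch_task *tasks, size_t n,
				      kpid_t kpid)
{
	for (size_t i = 0; i < n; i++)
		if (kpid_eq(tasks[i].kpid, kpid))
			return &tasks[i];
	return NULL;
}
```

With this shape the same pid number can exist once per process space, and
every unconverted call site shows up as a build failure rather than as a
lookup that quietly returns the wrong task or none.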

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-21 14:42                   ` Eric W. Biederman
@ 2006-01-22  6:43                     ` Kyle Moffett
  2006-01-22 15:48                       ` Eric W. Biederman
  2006-01-23 18:50                     ` Hubertus Franke
  1 sibling, 1 reply; 136+ messages in thread
From: Kyle Moffett @ 2006-01-22  6:43 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

On Jan 21, 2006, at 09:42, Eric W. Biederman wrote:
> Hubertus Franke <frankeh@watson.ibm.com> writes:
>
>> Actions: The vpid_to_pid will disappear and the check for whether  
>> we are in the same container needs to be pushed down into the task  
>> lookup. question remains to  figure out  whether the context of  
>> the task lookup (will always remain the caller ?).
>
> Any place the kernel saves a pid and then proceeds to signal it  
> later. At that later point in time it is possible you will be in  
> the wrong context.
>
> This probably justifies having a kpid_t that has both the process
> space id and the pid in it.  For when the kernel is storing pids to
> use as weak references, for signal purposes etc.

The kernel should not be saving a PID.  The kernel should be sticking  
a pointer to a struct task_struct somewhere (with appropriate  
refcounting) and using that.

> The only way I know to make this change safely is to make  
> compilation of all functions that manipulate pids in possibly  
> dangerous ways fail. And then to manually and slowly fix them up.
>
> That way if something is missed.  You get a compile error instead  
> of incorrect execution.

I agree.  This is one of the things I really liked about the recent  
mutex patch; it added a lot of checks to various codepaths to verify  
at both compile time and run time that the code was correct.

My personal opinion is that we need to add a new race-free API, say  
open("/proc/fork"); that forks a process and returns an open "process  
handle", essentially a filehandle that references a particular  
process.  (Also, an open("/proc/self/handle") or something to return  
a current-process handle)   Through some method of signaling the  
kernel (syscall, ioctl, some other?) a process can send a signal to  
the process referenced by the handle, check its status, etc.  A  
process handle might be passed to other processes using a UNIX-domain  
socket.  You would be able to dup() a process handle and then  
restrict the set of valid operations on the new process handle, so  
that it could be passed to another process  without giving that  
process access to the full set of operations (check status only, not  
able to send a signal, for example).

Obviously we would need to maintain support for the old interface for  
some period of time, but I think the new one would make it much  
easier to write simple race-free programs.

Cheers,
Kyle Moffett

--
Simple things should be simple and complex things should be possible
   -- Alan Kay




^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-22  6:43                     ` Kyle Moffett
@ 2006-01-22 15:48                       ` Eric W. Biederman
  2006-01-22 15:55                         ` Arjan van de Ven
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-22 15:48 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Kyle Moffett <mrmacman_g4@mac.com> writes:

> On Jan 21, 2006, at 09:42, Eric W. Biederman wrote:
>> Hubertus Franke <frankeh@watson.ibm.com> writes:
>>
>>> Actions: The vpid_to_pid will disappear and the check for whether  we are in
>>> the same container needs to be pushed down into the task  lookup. question
>>> remains to  figure out  whether the context of  the task lookup (will always
>>> remain the caller ?).
>>
>> Any place the kernel saves a pid and then proceeds to signal it  later. At
>> that later point in time it is possible you will be in  the wrong context.
>>
>> This probably justifies having a kpid_t that has both the process
>> space id and the pid in it.  For when the kernel is storing pids to
>> use as weak references, for signal purposes etc.
>
> The kernel should not be saving a PID.  The kernel should be sticking a pointer
> to a struct task_struct somewhere (with appropriate refcounting) and using that.

That has all of the wrong semantics, and simply will not work.

>> The only way I know to make this change safely is to make  compilation of all
>> functions that manipulate pids in possibly  dangerous ways fail. And then to
>> manually and slowly fix them up.
>>
>> That way if something is missed.  You get a compile error instead  of
>> incorrect execution.
>
> I agree.  This is one of the things I really liked about the recent  mutex
> patch; it added a lot of checks to various codepaths to verify  at both compile
> time and run time that the code was correct.

And changing how we handle pids is if anything even more intrusive.
>
> My personal opinion is that we need to add a new race-free API, say
> open("/proc/fork"); that forks a process and returns an open "process  handle",
> essentially a filehandle that references a particular  process.  (Also, an
> open("/proc/self/handle") or something to return  a current-process handle)
> Through some method of signaling the  kernel (syscall, ioctl, some other?) a
> process can send a signal to  the process referenced by the handle, check its
> status, etc.  A  process handle might be passed to other processes using a
> UNIX-domain  socket.  You would be able to dup() a process handle and then
> restrict the set of valid operations on the new process handle, so  that it
> could be passed to another process  without giving that  process access to the
> full set of operations (check status only, not  able to send a signal, for
> example).

Ok. There are 2 sides to this, an internal kernel implementation,
and exporting to user space.  Until we have something inside
the kernel exporting it is silly.

A pointer to a task_struct, while it kind of sort of works, is not
a good solution.  The problem is that in a lot of cases we register
a pid to get a signal or something similar and then we never unregister
it.  So by using a pointer to a task_struct you effectively hold the
process in memory forever.

Then there is the second problem.  A pointer to a task_struct is
insufficient.  It does not handle the case of process groups which
are equally important.

Further a task_struct points at a thread not at a process so holding
a pointer to it would not do what you would expect.

Possibly holding a struct pid would be interesting.

> Obviously we would need to maintain support for the old interface for  some
> period of time, but I think the new one would make it much  easier to write
> simple race-free programs.

Well, since this is the user space interface we would need to maintain
the old interface for as long as the kernel runs on existing architectures
or there are user space programs using it.  Even in plan9 they weren't
creative enough to do away with PIDs.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-22 15:48                       ` Eric W. Biederman
@ 2006-01-22 15:55                         ` Arjan van de Ven
  2006-01-22 16:24                           ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-22 15:55 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kyle Moffett, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, linux-kernel, Cedric Le Goater


> 
> A pointer to a task_struct while it kind of sort of works.  Is not
> a good solution.  The problem is that in a lot of cases we register
> a pid to get a signal or something similar and then we never unregister
> it.  So by using a pointer to a task_struct you effectively hold the
> process in memory forever.

this is not right. Both the PID and the task struct have the exact same
lifetime rules, they HAVE to, to guard against pid reuse problems.



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-22 15:55                         ` Arjan van de Ven
@ 2006-01-22 16:24                           ` Eric W. Biederman
  2006-01-26 20:01                             ` Herbert Poetzl
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-22 16:24 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Kyle Moffett, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, linux-kernel, Cedric Le Goater

Arjan van de Ven <arjan@infradead.org> writes:

>> 
>> A pointer to a task_struct while it kind of sort of works.  Is not
>> a good solution.  The problem is that in a lot of cases we register
>> a pid to get a signal or something similar and then we never unregister
>> it.  So by using a pointer to a task_struct you effectively hold the
>> process in memory forever.
>
> this is not right. Both the PID and the task struct have the exact same
> lifetime rules, they HAVE to, to guard against pid reuse problems.

Yes, PIDs are reserved for the lifetime of the task_struct (barring minor
details).  There are actually a few races in /proc where it can
see the task_struct after the pid has been freed (see the pid_alive macro
in sched.h).

However when used as a reference the number can live as long as you
want.  The classic example is a pid file that can exist even after
you reboot a machine.

So currently a use of a PID as a reference to processes or process
groups can last forever.  An example of this in the kernel is
the result of fcntl(fd, F_SETOWN).  The session of a tty is similar.

The in-kernel references have a lifetime that is completely
different from the lifetime of a process or a PID, so it is
not safe to simply replace such references with a direct reference
to a task_struct (besides being technically impossible).  Adding
those references could potentially increase the lifespan of a task_struct
to the life of the kernel, depending on the reference.
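
A sketch of the alternative hinted at earlier in the thread, holding a
small pid object instead of the task itself.  Everything here is an
assumption for illustration: struct pid_ref and its helpers are made-up
names, there is no locking, and process groups are not modelled.

```c
#include <stdlib.h>

struct sketch_task;

/*
 * Hypothetical small, refcounted pid object.  Long-lived holders (an
 * F_SETOWN owner, a tty session) keep one of these, not the task.
 */
struct pid_ref {
	int count;
	int nr;
	struct sketch_task *task;	/* NULL once the task has exited */
};

struct sketch_task {
	struct pid_ref *ref;
};

struct pid_ref *pid_ref_get(struct pid_ref *r)
{
	r->count++;
	return r;
}

void pid_ref_put(struct pid_ref *r)
{
	if (--r->count == 0)
		free(r);
}

struct sketch_task *task_create(int nr)
{
	struct sketch_task *t = malloc(sizeof(*t));
	struct pid_ref *r = malloc(sizeof(*r));

	if (!t || !r) {
		free(t);
		free(r);
		return NULL;
	}
	r->count = 1;		/* the task's own reference */
	r->nr = nr;
	r->task = t;
	t->ref = r;
	return t;
}

/* On exit the task is freed at once; only the small pid_ref lingers. */
void task_exit(struct sketch_task *t)
{
	t->ref->task = NULL;
	pid_ref_put(t->ref);
	free(t);
}

/* Returns 0 if a task is still attached, -1 (think ESRCH) otherwise. */
int pid_ref_signal(const struct pid_ref *r)
{
	return r->task ? 0 : -1;
}
```

A stale holder then costs a few words of memory for as long as it keeps
its reference, and a signal sent through it after the task has exited
fails cleanly instead of reaching a reused pid.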

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-21 10:25                   ` Eric W. Biederman
@ 2006-01-23 18:38                     ` Hubertus Franke
  2006-01-23 18:48                       ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Hubertus Franke @ 2006-01-23 18:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Eric W. Biederman wrote:
> Hubertus Franke <frankeh@watson.ibm.com> writes:
> 

> ...
> 
>>Actions: The vpid_to_pid will disappear and the check for whether we are in
>>the same container needs to be pushed down into the task lookup. question
>>remains to figure out whether the context of the task lookup (will always
>>remain the caller ?).
> 
> 
> You don't need a same container check.  If something is in another container
> it becomes invisible to you.
> 

Eric, agreed.... that was implied by me (but poorly worded). What I meant (let's try this
again) is that the context defines/provides the namespace in which the lookup
is performed, hence, as you state, things in different containers
(namespaces) are naturally invisible to you..

> 
>>Doing so has an implication, namely that we are moving over to "system
>>containers".  The current implementation requires the vpid/pid only for the
>>boundary condition at the top of the container (to rewrite pid=1) and its
>>parent and the fact that we wanted a global look through container=0.
>>If said boundary would be eliminated and we simply make a container a child
>>of the initproc (pid=1), this would be unnecessary.
>>
>>all together this would provide private namespaces (as just suggested by Eric).
>>
>>The feeling would be that large parts of the patch could be reduced by this.
> 
> 
> I concur.  Except I think the initial impact could still be large.
> It may be worth breaking all users of pids just so we audit them.
> 
> But that will certainly result in no long term cost, or runtime overhead.
> 
> 
>>What we need is a new system call (similar to vserver) or maybe we can continue
>>the /proc approach for now...
>>
>>sys_exec_container(const char *container_name, pid_t pid, unsigned int flags,
>>		   const char *argv, const char *envp);
>>
>>exec_container creates a new container (if indicated in flags) and a new task in
>>it that reports to parent initproc.  If a non-zero pid is specified we use that
>>pid, otherwise the system will allocate it.  Finally it creates a new session
>>id, chroots and execs the specified program.
>>
>>What we lose with this is the session and the tty, which Cedric described as
>>application container...
>>
>>The sys_exec_container(...)  seems to be similar to what Eric just called
>>clone_namespace()
> 
> 
> Similar. But I was actually talking about just adding another flag to
> sys_clone the syscall underlying fork().  Basically it is just another
> resource not share or not-share.
> 
> Eric
> 

That's a good idea .. right now we simply did this through a flag left by the call
to the /proc/container fs ... (awkward at best, but didn't break the API).
I have a concern wrt doing it during fork, namely the sharing of resources.
We obviously are looking at some constraints here wrt sharing. We need to
ensure that this ain't a thread etc. that will share resources
across "containers" (which then later aren't migratable due to that sharing).
So doing the fork_exec() atomically would avoid that problem.




^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 18:38                     ` Hubertus Franke
@ 2006-01-23 18:48                       ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-23 18:48 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Hubertus Franke <frankeh@watson.ibm.com> writes:

> That's a good idea .. right now we simply did this through a flag left by the
> call to the /proc/container fs ... (awkward at best, but didn't break the API).
> I have a concern wrt doing it during fork, namely the sharing of resources.
> We obviously are looking at some constraints here wrt sharing. We need to
> ensure that this ain't a thread etc. that will share resources
> across "containers" (which then later aren't migratable due to that sharing).
> So doing the fork_exec() atomically would avoid that problem.

Checking that we aren't sharing things to become a thread is fairly
straightforward. do_fork already has similar checks in place.

This sounds like a classic case of "if you don't want that, then don't
do that".

Eric






^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-21 14:42                   ` Eric W. Biederman
  2006-01-22  6:43                     ` Kyle Moffett
@ 2006-01-23 18:50                     ` Hubertus Franke
  2006-01-23 19:28                       ` Eric W. Biederman
       [not found]                       ` <m1k6cqlmfe.fsf_-_@ebiederm.dsl.xmission.com>
  1 sibling, 2 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-01-23 18:50 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Eric W. Biederman wrote:
> Hubertus Franke <frankeh@watson.ibm.com> writes:
> 
> 
> 
> Any place the kernel saves a pid and then proceeds to signal it later.
> At that later point in time it is possible you will be in the wrong
> context.
> 

Yes, that's possible.. In the current patch that is not a problem, because
the internal pid (aka kpid) == <vpid,containerid>  mangled together.
So in those cases, the kernel would have to keep <pid, container_id>

> This probably justifies having a kpid_t that has both the process
> space id and the pid in it.  For when the kernel is storing pids to
> use as weak references, for signal purposes etc.
>

And that's what the current patch does. Only thing is we did not rename
everything to kpid_t!

> At least tty_io.c and fcntl.c have examples where the caller
> may not have the proper context.

Can you point those out directly .. thanks..

> 
> 
>>Doing so has an implication, namely that we are moving over to "system
>>containers". The current implementation requires the vpid/pid only
>>for the boundary condition at the top of the container (to rewrite
>>pid=1) and its parent and the fact that we wanted a global look
>>through container=0. If said boundary would be eliminated and we
>>simply make a container a child of the initproc (pid=1), this would
>>be unnecessary. 
>>
>>all together this would provide private namespaces (as just suggested by Eric).
>>
>>The feeling would be that large parts of the patch could be reduced by
>>this.
> 
> 
> Simplified, and made easier to understand.  I don't know if the number
> of lines affected can or should be reduced.

Unless we do it .. we won't know for sure.....

> 
> One of my problems with your current approach is that it doesn't help
> identify where you have problems.
> 
> I have found a specific example that your current patches get wrong,
> because you make assumptions about which context is valid.
> 
> From kernel/kthread.c:
> 
>>static void keventd_create_kthread(void *_create)
>>{
>>	struct kthread_create_info *create = _create;
>>	int pid;
>>
>>	/* We want our own signal handler (we take no signals by default). */
>>	pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
>>	if (pid < 0) {
>>		create->result = ERR_PTR(pid);
>>	} else {
>>		wait_for_completion(&create->started);
>>		create->result = find_task_by_pid(pid);
>>	}
>>	complete(&create->done);
>>}
> 

So what we were thinking (and have experimented a bit with) is using
SPARSE for this. Didn't go too far, but it has potential.
Another possibility is to really introduce a pid_t (user perspective)
and a kpid_t (kernel pid <spaceid, pid>) type explicitly.
Then this would be solved.

> 
> kernel_thread() is a light wrapper around do_fork().
> do_fork returns a virtual pid.
> find_task_by_pid takes a pid with the upper bits holding the process
>    space id.
> 
> Therefore if this function or a cousin of it was ever triggered
> by a userspace application in a virtual context find_task_by_pid
> would fail to find the task structure.
> 
> The only way I know to make this change safely is to make compilation
> of all functions that manipulate pids in possibly dangerous ways fail.
> And then to manually and slowly fix them up.

See above (SPARSE or type strictness).
And yes, we internally discussed that, but the changes might be huge
to change all the occurrences.
It would be good to know whether going this route will lead us to the
promised land or not.

> 
> That way if something is missed.  You get a compile error instead
> of incorrect execution.


> 
> Eric
> 



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 18:50                     ` Hubertus Franke
@ 2006-01-23 19:28                       ` Eric W. Biederman
  2006-01-23 21:11                         ` Alan Cox
       [not found]                       ` <m1k6cqlmfe.fsf_-_@ebiederm.dsl.xmission.com>
  1 sibling, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-23 19:28 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Hubertus Franke <frankeh@watson.ibm.com> writes:

> Eric W. Biederman wrote:
>> Hubertus Franke <frankeh@watson.ibm.com> writes:
>> Any place the kernel saves a pid and then proceeds to signal it later.
>> At that later point in time it is possible you will be in the wrong
>> context.
>>
>
> Yes, that's possible.. In the current patch that is not a problem, because
> the internal pid (aka kpid) == <vpid,containerid>  mangled together.
> So in those cases, the kernel would have to keep <pid, container_id>

Agreed, and for the internal implementation I think having them mangled
together makes sense, so long as we never export that form to userspace.

>> This probably justifies having a kpid_t that has both the process
>> space id and the pid in it.  For when the kernel is storing pids to
>> use as weak references, for signal purposes etc.
>>
>
> And that's what the current patch does. Only thing is we did not rename
> everything to kpid_t!

Yep. But because of that you couldn't detect mixing of pid and kpid.

>> At least tty_io.c and fcntl.c have examples where the caller
>> may not have the proper context.
>
> Can you point those out directly .. thanks..

Short version.  ttys send signals on hangup, and f_setown can trigger signals
being sent.

Eric




^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 19:28                       ` Eric W. Biederman
@ 2006-01-23 21:11                         ` Alan Cox
  2006-01-23 21:30                           ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Alan Cox @ 2006-01-23 21:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

On Llu, 2006-01-23 at 12:28 -0700, Eric W. Biederman wrote:
> > Yes, that's possible.. In the current patch that is not a problem, because
> > the internal pid (aka kpid) == <vpid,containerid>  mangled together.
> > So in those cases, the kernel would have to keep <pid, container_id>
> 
> Agreed, and for the internal implementation I think having them mangled
> together makes sense, so long as we never export that form to userspace.

You have to refcount the container ids anyway or you may have stale
container references and end up reusing them.
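
A rough sketch of what refcounting a container id buys. Everything here
(MAX_SPACES, space_alloc, space_get, space_put) is a hypothetical user-space
model, not code from the patches: an id only becomes available for reuse
once every holder, including any stored pid that embeds it, has dropped its
reference.

```c
#include <assert.h>

#define MAX_SPACES 4                 /* tiny table, for illustration only */

static int space_refs[MAX_SPACES];   /* 0 means the id is free */

/* Hand out the lowest free id with one reference held; -1 if none is free. */
static inline int space_alloc(void)
{
	for (int id = 0; id < MAX_SPACES; id++) {
		if (space_refs[id] == 0) {
			space_refs[id] = 1;
			return id;
		}
	}
	return -1;
}

/* Take an extra reference, e.g. when a pid naming this space is stored. */
static inline void space_get(int id)
{
	assert(id >= 0 && id < MAX_SPACES && space_refs[id] > 0);
	space_refs[id]++;
}

/* Drop a reference; the id becomes reusable only when the count hits 0. */
static inline void space_put(int id)
{
	assert(id >= 0 && id < MAX_SPACES && space_refs[id] > 0);
	space_refs[id]--;
}
```

Without the extra reference taken by space_get(), the id would be handed
out again while a stale stored pid still named it.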


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 21:11                         ` Alan Cox
@ 2006-01-23 21:30                           ` Eric W. Biederman
  2006-01-23 22:15                             ` Hubertus Franke
  2006-01-24  0:22                             ` Alan Cox
  0 siblings, 2 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-23 21:30 UTC (permalink / raw)
  To: Alan Cox
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Llu, 2006-01-23 at 12:28 -0700, Eric W. Biederman wrote:
>> > Yes, that's possible.. In the current patch that is not a problem, because
>> > the internal pid (aka kpid) == <vpid,containerid>  mangled together.
>> > So in those cases, the kernel would have to keep <pid, container_id>
>> 
>> Agreed, and for the internal implementation I think having them mangled
>> together makes sense, so long as we never export that form to userspace.
>
> You have to refcount the container ids anyway or you may have stale
> container references and end up reusing them.

The short observation is currently we use at most 22bits of the pid
space, and we don't need a huge number of containers so combining them
into one integer makes sense for an efficient implementation, and it
is cheaper than comparing pointers.
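
To illustrate the combined form, here is a small sketch that assumes a 32-bit
unsigned value with the low 22 bits holding the pid and the remaining 10 bits
holding the pid space id. The 22-bit figure is from the observation above;
the names kpid_pack, kpid_space and kpid_pid, and the exact split, are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PID_BITS   22u                                   /* bits for the pid */
#define PID_MASK   ((UINT32_C(1) << PID_BITS) - 1)
#define SPACE_MAX  ((UINT32_C(1) << (32u - PID_BITS)) - 1)

/* Combine a pid space id and a pid into one integer. */
static inline uint32_t kpid_pack(uint32_t space, uint32_t pid)
{
	assert(space <= SPACE_MAX && pid <= PID_MASK);
	return (space << PID_BITS) | pid;
}

/* Upper bits: which pid space the task lives in. */
static inline uint32_t kpid_space(uint32_t kpid)
{
	return kpid >> PID_BITS;
}

/* Lower bits: the pid as seen inside that pid space. */
static inline uint32_t kpid_pid(uint32_t kpid)
{
	return kpid & PID_MASK;
}
```

Comparing two such values is a single integer compare, and pid space 0
leaves the numeric pid unchanged in this layout.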

Additional identifiers are really not necessary to user space and providing
them is one more thing that needs to be virtualized.  We can already
talk about them indirectly by referring to processes that use them.

And there will be at least one process id assigned to the pid space
from the outside pid space unless we choose to break waitpid, and friends.

I just don't want a neat implementation trick to cause us maintenance grief.

Eric


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: [PATCH] pids as weak references.
       [not found]                       ` <m1k6cqlmfe.fsf_-_@ebiederm.dsl.xmission.com>
@ 2006-01-23 21:57                         ` Dave Hansen
  0 siblings, 0 replies; 136+ messages in thread
From: Dave Hansen @ 2006-01-23 21:57 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hubertus Franke, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater, Vserver

On Mon, 2006-01-23 at 13:27 -0700, Eric W. Biederman wrote:
> So currently I can see two justifications for introducing
> a separation between kpid_t and pid_t.
> 1) pid virtualization
> 2) In kernel pids that act as weak references, and avoid
>    the problems of pid wrap-around.

It is an interesting approach.  But, in its current state, it is very,
very hard to review.  For starters, could you break it up so that the
meat of the patch is separate from the easy
s/foo->pid/pid_nr(foo->pid)/ stuff?

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 21:30                           ` Eric W. Biederman
@ 2006-01-23 22:15                             ` Hubertus Franke
  2006-01-24  6:56                               ` Arjan van de Ven
  2006-01-24 19:34                               ` Eric W. Biederman
  2006-01-24  0:22                             ` Alan Cox
  1 sibling, 2 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-01-23 22:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Eric W. Biederman wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> 
>>On Llu, 2006-01-23 at 12:28 -0700, Eric W. Biederman wrote:
>>
>>>>Yes, that's possible.. In the current patch that is not a problem, because
>>>>the internal pid (aka kpid) == <vpid,containerid>  mangled together.
>>>>So in those cases, the kernel would have to keep <pid, container_id>
>>>
>>>Agreed, and for the internal implementation I think having them mangled
>>>together makes sense, so long as we never export that form to userspace.
>>
>>You have to refcount the container ids anyway or you may have stale
>>container references and end up reusing them.
> 
> 
> The short observation is currently we use at most 22bits of the pid
> space, and we don't need a huge number of containers so combining them
> into one integer makes sense for an efficient implementation, and it
> is cheaper than comparing pointers.
> Additional identifiers are really not necessary to user space and providing
> them is one more thing that needs to be virtualized.  We can already
> talk about them indirectly by referring to processes that use them.
> 
> And there will be at least one process id assigned to the pid space
> from the outside pid space unless we choose to break waitpid, and friends.
> 
> I just don't want a neat implementation trick to cause us maintenance grief.
> 
> Eric
> 

In that case, I think we do require the current vpid_to_pid(translations)
in order to transfer the external user pid ( relative to the namespace )
into one that combines namespace (aka container_id) with the external pid.
Exactly how it is done today.
What will slightly change is the low level implementations of the

inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx);
pid_t __pid_to_vpid_ctx_excp(pid_t pid, int pidspace_id,const struct task_struct *ctx);

and reverse.
The VPID_2_PID and PID_2_VPID still remain at same locations.

Did I get your comments correctly, Eric ?..

Thanks as usual
-- Hubertus




^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 21:30                           ` Eric W. Biederman
  2006-01-23 22:15                             ` Hubertus Franke
@ 2006-01-24  0:22                             ` Alan Cox
  2006-01-24 19:26                               ` Eric W. Biederman
  1 sibling, 1 reply; 136+ messages in thread
From: Alan Cox @ 2006-01-24  0:22 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

On Llu, 2006-01-23 at 14:30 -0700, Eric W. Biederman wrote:
> The short observation is currently we use at most 22bits of the pid
> space, and we don't need a huge number of containers so combining them
> into one integer makes sense for an efficient implementation, and it
> is cheaper than comparing pointers.

Currently. In addition it becomes more costly the moment you have to
start masking them. Remember the point of this was to virtualise the
pid, so you are going to add a ton of masking versus a cheap extra
comparison from the same cache line. And you lose pid space you may well
need in the future for the sake of a quick hack.

> And there will be at least one process id assigned to the pid space
> from the outside pid space unless we choose to break waitpid, and friends.

That comes out in the wash because it is already done by process tree
pointers anyway. It has to be because using ->ppid would be racy.

Alan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 22:15                             ` Hubertus Franke
@ 2006-01-24  6:56                               ` Arjan van de Ven
  2006-01-24 19:34                               ` Eric W. Biederman
  1 sibling, 0 replies; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-24  6:56 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Eric W. Biederman, Alan Cox, Dave Hansen, Greg KH,
	Serge E. Hallyn, linux-kernel, Cedric Le Goater


> 
> In that case, I think we do require the current vpid_to_pid(translations)
> in order to transfer the external user pid ( relative to the namespace )
> into one that combines namespace (aka container_id) with the external pid.
> Exactly how it is done today.
> What will slightly change is the low level implementations of the
> 
> inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx);
> pid_t __pid_to_vpid_ctx_excp(pid_t pid, int pidspace_id,const struct task_struct *ctx);
> 
> and reverse.
> The VPID_2_PID and PID_2_VPID still remain at same locations.
> 
> Did I get your comments correctly, Eric ?..

please call it 'userpid' not 'vpid', to make more clear what the pid is
used for/what domain it is in.



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24  0:22                             ` Alan Cox
@ 2006-01-24 19:26                               ` Eric W. Biederman
  2006-01-24 21:11                                 ` Alan Cox
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-24 19:26 UTC (permalink / raw)
  To: Alan Cox
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Llu, 2006-01-23 at 14:30 -0700, Eric W. Biederman wrote:
>> The short observation is currently we use at most 22bits of the pid
>> space, and we don't need a huge number of containers so combining them
>> into one integer makes sense for an efficient implementation, and it
>> is cheaper than comparing pointers.
>
> Currently. In addition it becomes more costly the moment you have to
> start masking them. Remember the point of this was to virtualise the
> pid, so you are going to add a ton of masking versus a cheap extra
> comparison from the same cache line. And you lose pid space you may well
> need in the future for the sake of a quick hack.

I disagree that what I am envisioning will get in the way, but I
do agree that putting them in the unsigned long may be overkill.

There is at least NFS lockd that appreciates having a single integer
per process unique identifier.  So there is a practical basis for
wanting such a thing.

At least for this first round I think talking about a kpid
as a container, pid pair makes a lot of sense for the moment, as
the other implementations just confuse things.

>> And there will be at least one process id assigned to the pid space
>> from the outside pid space unless we choose to break waitpid, and friends.
>
> That comes out in the wash because it is already done by process tree
> pointers anyway. It has to be because using ->ppid would be racy.

Possibly.  Again, it is one of the more interesting cases, to get
just right.

However it looks to me that the biggest challenge right now about
development is the size of a patch to change any one of these things.
So it looks to me like the first step is to add wrappers for common idioms
that use pids, like printing the name of a task or testing if it is the
init task or if it is an idle task.

Or can you think of a case where it would be wise to leave
both the type and size of current->pid alone?

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-23 22:15                             ` Hubertus Franke
  2006-01-24  6:56                               ` Arjan van de Ven
@ 2006-01-24 19:34                               ` Eric W. Biederman
  2006-01-24 21:09                                 ` Hubertus Franke
  1 sibling, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-24 19:34 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Alan Cox, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Hubertus Franke <frankeh@watson.ibm.com> writes:

> In that case, I think we do require the current vpid_to_pid(translations)
> in order to transfer the external user pid ( relative to the namespace )
> into one that combines namespace (aka container_id) with the external pid.
> Exactly how it is done today.
> What will slightly change is the low level implementations of the
>
> inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx);
> pid_t __pid_to_vpid_ctx_excp(pid_t pid, int pidspace_id,const struct task_struct
> *ctx);
>
> and reverse.
> The VPID_2_PID and PID_2_VPID still remain at same locations.
>
> Did I get your comments correctly, Eric ?..

Well we may need that.   For the moment let's consider putting both
a kpid and a upid in the task_struct, and elsewhere. Basically I don't think
translation is necessary in the common case.

However let's look at a single practical case to see how it would need
to be implemented.

struct fown_struct.  Every file has one and you can modify it both on
a socket with ioctls FIOSETOWN, SIOCSPGRP, FIOGETOWN, SIOCGPGRP, and on
a normal file handle with fcntl with F_SETOWN and F_GETOWN.

Since a struct file can be passed between processes in different
pid spaces using unix domain sockets we cannot count on the context
of the signaler to be the same as the context of the setter.

So we need to look at how to handle this case cleanly, and safely.

Eric


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24 19:34                               ` Eric W. Biederman
@ 2006-01-24 21:09                                 ` Hubertus Franke
  0 siblings, 0 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-01-24 21:09 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Eric W. Biederman wrote:
> Hubertus Franke <frankeh@watson.ibm.com> writes:
> 
> 
>>In that case, I think we do require the current vpid_to_pid(translations)
>>in order to transfer the external user pid ( relative to the namespace )
>>into one that combines namespace (aka container_id) with the external pid.
>>Exactly how it is done today.
>>What will slightly change is the low level implementations of the
>>
>>inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx);
>>pid_t __pid_to_vpid_ctx_excp(pid_t pid, int pidspace_id,const struct task_struct
>>*ctx);
>>
>>and reverse.
>>The VPID_2_PID and PID_2_VPID still remain at same locations.
>>
>>Did I get your comments correctly, Eric ?..
> 
> 
> Well we may need that.   For the moment let's consider putting both
> a kpid and a upid in the task_struct, and elsewhere. Basically I don't think
> translation is necessary in the common case.

OK, for discussion purposes no problem .. whatever is best in the end
is the GO.
Abstractly speaking, mangling the <container,upid> tuple into the same
long int is an implementation detail.

> 
> However let's look at a single practical case to see how it would need
> to be implemented.
> 
> struct fown_struct.  Every file has one and you can modify it both on
> a socket with ioctls FIOSETOWN, SIOCSPGRP, FIOGETOWN, SIOCGPGRP, and on
> a normal file handle with fcntl with F_SETOWN and F_GETOWN.
> 
> Since a struct file can be passed between processes in different
> pid spaces using unix domain sockets we cannot count on the context
> of the signaler to be the same as the context of the setter.

If you follow the patch set, we do distinguish the context case (we might have
missed a few here and there, as you already pointed out). Going into
the kernel we always take the context of the caller; coming out of the
kernel (kpid -> upid) we use the appropriate context.

static inline pid_t pid_to_vpid_ctx(pid_t pid, const struct task_struct *ctx)

> 
> So we need to look at how to handle this case cleanly, and safely.

-- Hubertus


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24 19:26                               ` Eric W. Biederman
@ 2006-01-24 21:11                                 ` Alan Cox
  2006-01-24 21:15                                   ` Arjan van de Ven
                                                     ` (2 more replies)
  0 siblings, 3 replies; 136+ messages in thread
From: Alan Cox @ 2006-01-24 21:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:
> There is at least NFS lockd that appreciates having a single integer
> per process unique identifier.  So there is a practical basis for
> wanting such a thing.

Which gets us back to refcounting.

> At least for this first round I think talking about a kpid
> as a container, pid pair makes a lot of sense for the moment, as
> the other implementations just confuse things.

As an abstract object a kpid to me means a single identifier which
uniquely identifies the process and which in its component parts be they
pointers or not uniquely identifies the process in the container and the
container in the system, both correctly refcounted against re-use.

> However it looks to me that the biggest challenge right now about
> development is the size of a patch to change any one of these things.

That's where we disagree strongly. Wrappers hide, confuse and obscure. We
want the workings brutally and clearly visible so that people don't make
assumptions and have nasty accidents. It's like typedefs and overuse of
defines.

Alan


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24 21:11                                 ` Alan Cox
@ 2006-01-24 21:15                                   ` Arjan van de Ven
  2006-01-25  9:58                                     ` Eric W. Biederman
  2006-01-25  9:13                                   ` Eric W. Biederman
  2006-01-25  9:51                                   ` Eric W. Biederman
  2 siblings, 1 reply; 136+ messages in thread
From: Arjan van de Ven @ 2006-01-24 21:15 UTC (permalink / raw)
  To: Alan Cox
  Cc: Eric W. Biederman, Hubertus Franke, Dave Hansen, Greg KH,
	Serge E. Hallyn, linux-kernel, Cedric Le Goater

On Tue, 2006-01-24 at 21:11 +0000, Alan Cox wrote:
> On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:
> > There is at least NFS lockd that appreciates having a single integer
> > per process unique identifier.  So there is a practical basis for
> > wanting such a thing.
> 
> Which gets us back to refcounting.
> 
> > At least for this first round I think talking about a kpid
> > as a container, pid pair makes a lot of sense for the moment, as
> > the other implementations just confuse things.
> 
> As an abstract object a kpid to me means a single identifier which
> uniquely identifies the process and which in its component parts be they
> pointers or not uniquely identifies the process in the container and the
> container in the system, both correctly refcounted against re-use.

then why not just straight use the task struct pointer for this? It's
guaranteed unique.. ;)



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24 21:11                                 ` Alan Cox
  2006-01-24 21:15                                   ` Arjan van de Ven
@ 2006-01-25  9:13                                   ` Eric W. Biederman
  2006-01-25  9:51                                   ` Eric W. Biederman
  2 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-25  9:13 UTC (permalink / raw)
  To: Alan Cox
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> However it looks to me that the biggest challenge right now about
>> development is the size of a patch to change any one of these things.
>
> That's where we disagree strongly. Wrappers hide, confuse and obscure. We
> want the workings brutally and clearly visible so that people don't make
> assumptions and have nasty accidents. It's like typedefs and overuse of
> defines.

I totally agree that we want the uses of pids to be clear and not over-abstracted.
However most places that reference pids are debug statements that are simply
line noise in any patch.

I threw together a casual patch for purposes of discussion earlier.
Not fully polished, just a rough draft for discussion and it was so big
most people didn't even receive it.  So regardless of other considerations
the sheer size is a problem even if abstractions can't help.

I think adding a helper for the common debugging idiom of printing out
the current task is generally useful, and is worth considering on its own merits.

In particular, does this look like a sane piece of code to add to the kernel?
This is based on analogy with dev_printk.

Either tsk_printk or something like NIP_QUAD is what I am thinking about.
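
For anyone who wants to play with the idea outside the kernel, here is a
user-space sketch of the same idiom. struct task, tsk_snprintf and
tsk_describe are stand-ins invented for this example, and snprintf replaces
printk so the result can be inspected; the "comm(pid): " prefix is the only
part taken from the proposal. The `##__VA_ARGS__` form is a GNU extension
accepted by gcc and clang.

```c
#include <assert.h>
#include <stdio.h>

/* Stand-in for the two task_struct fields the helper prints. */
struct task {
	const char *comm;
	int pid;
};

/* Format "comm(pid): message" into buf; evaluates to snprintf's result. */
#define tsk_snprintf(buf, size, tsk, format, ...) \
	snprintf((buf), (size), "%s(%d): " format, \
		 (tsk)->comm, (tsk)->pid, ##__VA_ARGS__)

/* Convenience wrapper for a plain string message. */
static inline int tsk_describe(char *buf, size_t size,
			       const struct task *tsk, const char *what)
{
	assert(buf != NULL && tsk != NULL && what != NULL);
	return tsk_snprintf(buf, size, tsk, "%s", what);
}
```

The point of the wrapper is that a later change to how a task is identified
only touches the macro, not every caller.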

Eric


diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0cfcd1c..d5dcbde 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -880,6 +880,19 @@ static inline pid_t process_group(struct
        return tsk->signal->pgrp;
 }

+/* debugging and troubleshooting/diagnostic helpers. */
+#define tsk_printk(level, tsk, format, arg...) \
+       printk(level "%s(%d): " format , (tsk)->comm , (tsk)->pid , ## arg)
+
+#define tsk_dbg(tsk, format, arg...)           \
+       tsk_printk(KERN_DEBUG , tsk , format, ## arg)
+#define tsk_err(tsk, format, arg...)           \
+       tsk_printk(KERN_ERR , tsk , format, ## arg)
+#define tsk_info(tsk, format, arg...)          \
+       tsk_printk(KERN_INFO , tsk , format, ## arg)
+#define tsk_warn(tsk, format, arg...)          \
+       tsk_printk(KERN_WARNING , tsk , format, ## arg)
+
 /**
  * pid_alive - check that a task structure is not stale
  * @p: Task structure to be checked.


^ permalink raw reply related	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24 21:11                                 ` Alan Cox
  2006-01-24 21:15                                   ` Arjan van de Ven
  2006-01-25  9:13                                   ` Eric W. Biederman
@ 2006-01-25  9:51                                   ` Eric W. Biederman
  2006-01-26 20:23                                     ` Herbert Poetzl
  2 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-25  9:51 UTC (permalink / raw)
  To: Alan Cox
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:

>> At least for this first round I think talking about a kpid
>> as a container, pid pair makes a lot of sense for the moment, as
>> the other implementations just confuse things.
>
> As an abstract object a kpid to me means a single identifier which
> uniquely identifies the process and which in its component parts be they
> pointers or not uniquely identifies the process in the container and the
> container in the system, both correctly refcounted against re-use.

Correct.

Currently by using pids internally we are not correctly refcounted
against reuse.  Nor in the process group case do we even have an
object off of which we can hang a reference count.

In the case of multiple instances of a process space the problem
is much more acute as we must properly ref count the pid space as
well.

Now to further make this fun we have variables like spawnpid in 
drivers/char/vt_ioctl.c and drivers/char/keyboard.c that
persist indefinitely, which causes problems for most traditional
reference counting techniques.

Further in cases where the references persist indefinitely we don't
want to pin the task_struct in memory indefinitely even after
the task has exited and its zombie has been reaped.

So how do we solve this problem?

There are two possible approaches I can see to solving this problem.
1) Use a non-pointer based kpid and simply accept identifier
   wrap-around problems with kpids just like we currently accept
   these problems with pids.

2) Implement weak references for kpids.

Semantically a weak reference is a pointer that becomes NULL when the
object it refers to goes away.

A couple days ago I conducted an experiment, to see if I could
implement this in the kernel and surprisingly it is fairly
straightforward to do.  First you define a weak kpid as a kpid with a
list_head attached, and whenever you set up a weak kpid you
register it with the pid hash table.

Then in detach_pid when the last reference to the pid goes away, you
walk the list of weak kpids and you NULL the appropriate entries.

This seems to solve the reference counting problem neatly and
without needing to disturb the logic of the existing code.  Even
outside the context of multiple pid spaces then I think weak
kpids are desirable.
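
A stripped-down model of the weak reference scheme, for discussion only.
The names struct pid_entry, weak_kpid_set and pid_entry_detach are made up,
and a plain singly linked list stands in for list_head and the pid hash
table. Each weak reference is linked from the pid it refers to, and
detaching the pid walks that list and clears every reference.

```c
#include <assert.h>
#include <stddef.h>

struct pid_entry;

/* A weak reference: valid until the pid it points at is detached. */
struct weak_kpid {
	struct pid_entry *target;   /* becomes NULL when the pid goes away */
	struct weak_kpid *next;     /* link in the target's list of weak refs */
};

struct pid_entry {
	int nr;
	struct weak_kpid *weak_refs;   /* every weak reference to this pid */
};

/* Point w at pid and register it so that detach can find it later. */
static inline void weak_kpid_set(struct weak_kpid *w, struct pid_entry *pid)
{
	assert(w != NULL && pid != NULL);
	w->target = pid;
	w->next = pid->weak_refs;
	pid->weak_refs = w;
}

/* Last reference to the pid is gone: NULL out every weak reference. */
static inline void pid_entry_detach(struct pid_entry *pid)
{
	struct weak_kpid *w = pid->weak_refs;

	while (w != NULL) {
		struct weak_kpid *next = w->next;

		w->target = NULL;
		w->next = NULL;
		w = next;
	}
	pid->weak_refs = NULL;
}
```

A holder such as a long-lived driver variable then checks target for NULL
before signalling, and nothing is pinned in memory after the task is gone.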

Thoughts?

from kernel/pid.c:
> void fastcall detach_pid(task_t *task, enum pid_type type)
> {
> 	int tmp, nr;
> 
> 	nr = __detach_pid(task, type);
> 	if (!nr)
> 		return;

Walk the list of weak kpids here.

> 
> 	for (tmp = PIDTYPE_MAX; --tmp >= 0; )
> 		if (tmp != type && find_pid(tmp, nr))
> 			return;
> 
> 	free_pidmap(nr);
> }


Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-24 21:15                                   ` Arjan van de Ven
@ 2006-01-25  9:58                                     ` Eric W. Biederman
  2006-01-25 15:10                                       ` Trond Myklebust
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-25  9:58 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Alan Cox, Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	linux-kernel, Cedric Le Goater

Arjan van de Ven <arjan@infradead.org> writes:

> On Tue, 2006-01-24 at 21:11 +0000, Alan Cox wrote:
>> On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:
>> > There is at least NFS lockd that appreciates having a single integer
>> > per process unique identifier.  So there is a practical basis for
>> > wanting such a thing.
>> 
>> Which gets us back to refcounting.
>> 
>> > At least for this first round I think talking about a kpid
>> > as a container, pid pair makes a lot of sense for the moment, as
>> > the other implementations just confuse things.
>> 
>> As an abstract object a kpid to me means a single identifier which
>> uniquely identifies the process and which in its component parts be they
>> pointers or not uniquely identifies the process in the container and the
>> container in the system, both correctly refcounted against re-use.
>
> then why not just straight use the task struct pointer for this? It's
> guaranteed unique.. ;)

Actually I think that is a very sensible solution to this problem.
It does double or triple the length of the string passed to lockd,
and it leaks information about which kernel addresses you are
running out of, which may be undesirable from a security perspective,
but I think that will fix the practical problem.

Reference counting in this case is not an issue, as these are
per process locks and should be freed up when everything goes.

I have a weird memory that simply making the string long and using
%p (current) didn't work as well as %d (current->kpid), but that is something
very hard to test, as usually even with multiple pid spaces you don't
get pid reuse, and the errors from NFS do not make it at all clear that pid
reuse is what is causing problems.  So I don't have good data on
that situation.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-25  9:58                                     ` Eric W. Biederman
@ 2006-01-25 15:10                                       ` Trond Myklebust
  2006-01-25 18:01                                         ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Trond Myklebust @ 2006-01-25 15:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Arjan van de Ven, Alan Cox, Hubertus Franke, Dave Hansen,
	Greg KH, Serge E. Hallyn, linux-kernel, Cedric Le Goater

On Wed, 2006-01-25 at 02:58 -0700, Eric W. Biederman wrote:
> >> On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:
> >> > There is at least NFS lockd that appreciates having a single integer
> >> > per process unique identifier.  So there is a practical basis for
> >> > wanting such a thing.

The NFS lock manager mainly wants a unique 32-bit identifier that can
follow clone(CLONE_FILES). The reason is that the Linux VFS is forced to
use the pointer to the file table as the "process identifier" for posix
locks (i.e. fcntl() locks).

Cheers,
  Trond


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-25 15:10                                       ` Trond Myklebust
@ 2006-01-25 18:01                                         ` Eric W. Biederman
  2006-01-25 19:30                                           ` Trond Myklebust
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-25 18:01 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Arjan van de Ven, Alan Cox, Hubertus Franke, Dave Hansen,
	Greg KH, Serge E. Hallyn, linux-kernel, Cedric Le Goater

Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> On Wed, 2006-01-25 at 02:58 -0700, Eric W. Biederman wrote:
>> >> On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:
>> >> > There is at least NFS lockd that appreciates having a single integer
>> >> > per process unique identifier.  So there is a practical basis for
>> >> > wanting such a thing.
>
> The NFS lock manager mainly wants a unique 32-bit identifier that can
> follow clone(CLONE_FILES). The reason is that the Linux VFS is forced to
> use the pointer to the file table as the "process identifier" for posix
> locks (i.e. fcntl() locks).

Ok.  I think I was thinking of a different case, but if I missed one
this could explain the weirdness I was seeing.

Let me list the cases I know of and see if I hit what
you are thinking of.

fs/nfs/nfs3proc.c:nfs3_proc_create()
For O_EXCL we have arg.verifier = current->pid.


fs/lockd/clntproc.c:nlmclnt_setlockargs()
We have:	lock->oh.len  = sprintf(req->a_owner, "%d@%s",
					current->pid, system_utsname.nodename);

I think this is the fcntl() case.
I would suggest fl_pid might have something to do with it
but that is part of flock based locking.

So I'm not certain I see the part of NFS you are referring to.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-25 18:01                                         ` Eric W. Biederman
@ 2006-01-25 19:30                                           ` Trond Myklebust
  2006-01-25 21:59                                             ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Trond Myklebust @ 2006-01-25 19:30 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Arjan van de Ven, Alan Cox, Hubertus Franke, Dave Hansen,
	Greg KH, Serge E. Hallyn, linux-kernel, Cedric Le Goater

On Wed, 2006-01-25 at 11:01 -0700, Eric W. Biederman wrote:
> fs/nfs/nfs3proc.c:nfs3_proc_create()
> For O_EXCL we have arg.verifier = current->pid.

Yes, but that does not result in any permanent state that would be tied
to the pid on the server. The verifier here is used only to ensure
idempotency of the exclusive create RPC call.

> fs/lockd/clntproc.c:nlmclnt_setlockargs()
> We have:	lock->oh.len  = sprintf(req->a_owner, "%d@%s",
> 					current->pid, system_utsname.nodename);
> 
> I think this is the fcntl() case.
> I would suggest fl_pid might have something to do with it 
> but that is part flock based locking.

That name is not interpreted by the NLM server. It is, AFAIK, only used
for debugging purposes.
nlm_find_lockowner() is used to define a unique identifier that is
supposed to be sent to the server as the 'pid'.

Cheers,
  Trond


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-25 19:30                                           ` Trond Myklebust
@ 2006-01-25 21:59                                             ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-25 21:59 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Arjan van de Ven, Alan Cox, Hubertus Franke, Dave Hansen,
	Greg KH, Serge E. Hallyn, linux-kernel, Cedric Le Goater

Trond Myklebust <trond.myklebust@fys.uio.no> writes:

> On Wed, 2006-01-25 at 11:01 -0700, Eric W. Biederman wrote:
>> fs/nfs/nfs3proc.c:nfs3_proc_create()
>> For O_EXCL we have arg.verifier = current->pid.
>
> Yes, but that does not result in any permanent state that would be tied
> to the pid on the server. The verifier here is used only to ensure
> idempotency of the exclusive create RPC call.
>
>> fs/lockd/clntproc.c:nlmclnt_setlockargs()
>> We have:	lock->oh.len  = sprintf(req->a_owner, "%d@%s",
>> 					current->pid, system_utsname.nodename);
>> 
>> I think this is the fcntl() case.
>> I would suggest fl_pid might have something to do with it 
>> but that is part flock based locking.
>
> That name is not interpreted by the NLM server. It is, AFAIK, only used
> for debugging purposes.

Ok, I thought it might have been compared for equality someplace.

> nlm_find_lockowner() is used to define a unique identifier that is
> supposed to be sent to the server as the 'pid'.

Ok interesting.

All I know for certain was that with 2 pidspaces using the
same nfs mount I was confusing something, with regards to locking.


Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
  2006-01-21 10:04                       ` Eric W. Biederman
@ 2006-01-26 19:47                         ` Herbert Poetzl
  2006-01-26 20:13                           ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Herbert Poetzl @ 2006-01-26 19:47 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Hubertus Franke,
	Cedric Le Goater

On Sat, Jan 21, 2006 at 03:04:16AM -0700, Eric W. Biederman wrote:
> "Serge E. Hallyn" <serue@us.ibm.com> writes:
> 
> > Quoting Eric W. Biederman (ebiederm@xmission.com):
> >> 
> >> At this point I have to confess I have been working on something
> >> similar, to IBM's pid virtualization work.  But I have what is at
> >> least for me a unifying concept, that makes things easier to think
> >> about.
> >> 
> >> The idea is to think about things in terms of namespaces.  Currently
> >> in the kernel we have the fs/mount namespace already implemented.
> >> 
> >> Partly this helps on what the interface for creating a new namespace
> >> instance should be.  'clone(CLONE_NEW<NAMESPACE_TYPE>)', and how
> >> it should be managed from the kernel data structures.
> >> 
> >> Partly thinking of things as namespaces helps me scope the problem.
> >> 
> >> Does this sound like a sane approach?
> >
> > And a bonus of this is that for security and vserver-type applications,
> > the CLONE_NEWPID and CLONE_NEWFS will often happen at the same time.
> >
> > How do you (or do you?) address naming namespaces?  This would be
> > necessary for transitioning into an existing namespace, performing
> > actions on existing namespaces (i.e. checkpoint, migrate to another
> > machine, enter the namespace and kill pid 521), and would just be
> > useful for accounting purposes, i.e. how else do you have a
> > "ps --all-namespaces" specify a process' namespace?
> 
> So I address naming indirectly.  The last thing I want to have
> is to add yet another namespace to the kernel for naming namespaces.
> We have enough namespaces already.
> 
> In any sane context for a pid-namespace we need a pid that
> we can call waitpid on, so we don't break the process tree.
> Which means at least the init process has 2 pids, one
> that it's parent sees, and another (1) that it and it's
> children see.
> 
> So I name pidspaces like we do sessions of process groups
> and sessions by the pid of the leader.
> 
> So in the simple case I have names like:
> 1178/1632

which is a new namespace in itself, but it doesn't matter
as long as it uniquely and persistently identifies the
namespace for the time it exists ... just leaves the
question how to retrieve a list of all namespaces :)

> > Doubt we want to add an argument to clone(), so do we just add a new
> > proc, sysfs, or syscall for setting a pid-namespace name?
> 
> That shouldn't be necessary.
> 
> > Do we need a new syscall for transitioning into an existing namespace?
> 
> That is a good question.  The FS namespaces that we already have
> has much the same problem.  A completely different solution to
> this problem seems to have been implemented but I don't grasp it
> yet.
> 
> Inherently transitioning to an existing namespace is something
> that is straight forward to implement, so it is worth thinking
> about.
> 
> If I want a guest that can keep secrets from the host sysadmin I don't
> want transitioning into a guest namespace to come too easily.

which can easily be achieved by 'marking' the namespace
as private and/or applying certain rules/checks to the
'enter' procedure ...

best,
Herbert

> Currently I can always just create an extra child of pid 1
> that I will be my slave.  The problem is that this is an extra
> process laying around.
> 
> Eric
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-22 16:24                           ` Eric W. Biederman
@ 2006-01-26 20:01                             ` Herbert Poetzl
  2006-01-27  9:04                               ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Herbert Poetzl @ 2006-01-26 20:01 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Arjan van de Ven, Kyle Moffett, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, linux-kernel,
	Cedric Le Goater

On Sun, Jan 22, 2006 at 09:24:27AM -0700, Eric W. Biederman wrote:
> Arjan van de Ven <arjan@infradead.org> writes:
> 
> >> 
> >> A pointer to a task_struct while it kind of sort of works. Is
> >> not a good solution. The problem is that in a lot of cases we
> >> register a pid to get a signal or something similar and then we
> >> never unregister it. So by using a pointer to a trask_struct you
> >> effectively hold the process in memory forever.

> > this is not right. Both the PID and the task struct have the exact
> > same lifetime rules, they HAVE to, to guard against pid reuse
> > problems.

> Yes PIDs reserved for the lifetime of the task_struct (baring minor
> details). There are actually a few races in /proc where it can see the
> task_struct after the pid has been freed (see the pid_alive macro in
> sched.h)
>
> However when used as a reference the number can live as long as you
> want. The classic example is a pid file that can exist even after you
> reboot a machine.
>
> So currently a use of a PID as a reference to processes or process
> groups can last forever. An example of this is the kernel is the
> result of fcntl(fd, F_SETOWN). The session of a tty is similar.
>
> Since the in kernel references have a lifetime that is completely
> different than the lifetime of a process or a PID. It is not safe
> to simply replace such references with a direct reference to a
> task_struct (besides being technically impossible). Adding those
> references could potentially increase the lifespan of a task_struct
> for to the life of the kernel depending on the reference.

well, yes, but wouldn't that be the RightThing(tm)
anyway? because 'referencing' something via a pid, then
letting the task holding the pid go away and even be
replaced by a new one (with the same pid) which then
will get suddenly signaled from somewhere, just because
the pid matches seems very broken to me ...

best,
Herbert

> Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
  2006-01-26 19:47                         ` Herbert Poetzl
@ 2006-01-26 20:13                           ` Eric W. Biederman
  2006-01-26 20:27                             ` Herbert Poetzl
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-26 20:13 UTC (permalink / raw)
  To: Herbert Poetzl
  Cc: Serge E. Hallyn, linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Hubertus Franke,
	Cedric Le Goater

Herbert Poetzl <herbert@13thfloor.at> writes:

> On Sat, Jan 21, 2006 at 03:04:16AM -0700, Eric W. Biederman wrote:
>> So in the simple case I have names like:
>> 1178/1632
>
> which is a new namespace in itself, but it doesn't matter
> as long as it uniquely and persistently identifies the
> namespace for the time it exists ... just leaves the
> question how to retrieve a list of all namespaces :)

Yes but the name of the namespace is still in the original pid namespace.
And more importantly to me it isn't a new kind of namespace.

>> If I want a guest that can keep secrets from the host sysadmin I don't
>> want transitioning into a guest namespace to come too easily.
>
> which can easily be achieved by 'marking' the namespace
> as private and/or applying certain rules/checks to the
> 'enter' procedure ...

Right.  The trick here is that you must be able to deny
transitioning into a namespace from inside the namespace.
Or else a guest could never trust it.  Something one of my
coworkers pointed out to me.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-25  9:51                                   ` Eric W. Biederman
@ 2006-01-26 20:23                                     ` Herbert Poetzl
  2006-01-27  8:28                                       ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Herbert Poetzl @ 2006-01-26 20:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Alan Cox, Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

On Wed, Jan 25, 2006 at 02:51:22AM -0700, Eric W. Biederman wrote:
> Alan Cox <alan@lxorguk.ukuu.org.uk> writes:
> 
> > On Maw, 2006-01-24 at 12:26 -0700, Eric W. Biederman wrote:
> 
> >> At least for this first round I think talking about a kpid
> >> as a container, pid pair makes a lot of sense for the moment, as
> >> the other implementations just confuse things.
> >
> > As an abstract object a kpid to me means a single identifier which
> > uniquely identifies the process and which in its component parts be they
> > pointers or not uniquely identifies the process in the container and the
> > container in the system, both correctly refcounted against re-use.
> 
> Correct.
> 
> Currently by using pids internally we are not correctly refcounted
> against reuse.  Nor in the process group case do we even have an
> object off of which we can hang a reference count.
> 
> In the case of a multiple instances of a process space the problem
> is much more acute as we must properly ref count the pid space as
> well.
> 
> Now to further make this fun we have variables like spawnpid in 
> drivers/char/vt_ioctl.c and drivers/char/keyboard.c that
> persist indefinitely.  Which cause problems for most traditional
> reference counting techniques.
> 
> Further in cases where the references persist indefinitely we don't
> want to pin the task_struct in memory indefinitely even after
> the task has exited and it's zombie has been reaped.
> 
> So how do we solve this problem?
> 
> There are two possible approaches I can see to solving this problem.
> 1) Use a non-pointer based kpid and simply accept identifier
>    wrap-around problems with kpids just like we currently accept
>    these problems with pids.

sounds like a poor approach (well similar to the
current one, except that the issues might get more
complex when processes are signalled or referenced
across pid spaces :) ...

anyway, if that would be the aim, it could be done
much simpler by 'just' adding a v/upid field to the
task struct and use that for everything userspace
related (i.e. locating tasks, sending signals, etc)
no need to change the current *pid entries at all

best,
Herbert

> 2) Implement weak references for kpids.
> 
> Semantically a weak reference is a pointer that becomes NULL when the
> object it refers to goes away.
> 
> A couple days ago I conducted an experiment, to see if I could
> implement this in the kernel and surprisingly it is fairly straight
> forward to do.  First you define a weak kpid as a kpid with a
> list_head attached, and whenever you setup a weak kpid you
> register it with the pid hash table.  
> 
> Then in detach_pid when the last reference to the pid goes away, you
> walk the list of weak kpids and you NULL the appropriate entries.
> 
> This seems to solve the reference counting problem neatly and
> without needing to disturb the logic of the existing code.  Even
> outside the context of multiple pid spaces then I think weak
> kpids are desirable.
> 
> Thoughts?
> 
> from kernel/pid.c:
> > void fastcall detach_pid(task_t *task, enum pid_type type)
> > {
> > 	int tmp, nr;
> > 
> > 	nr = __detach_pid(task, type);
> > 	if (!nr)
> > 		return;
> 
> Walk the list of weak kpids here.
> 
> > 
> > 	for (tmp = PIDTYPE_MAX; --tmp >= 0; )
> > 		if (tmp != type && find_pid(tmp, nr))
> > 			return;
> > 
> > 	free_pidmap(nr);
> > }
> 
> 
> Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC: Multiple instances of kernel namespaces.
  2006-01-26 20:13                           ` Eric W. Biederman
@ 2006-01-26 20:27                             ` Herbert Poetzl
  0 siblings, 0 replies; 136+ messages in thread
From: Herbert Poetzl @ 2006-01-26 20:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Serge E. Hallyn, linux-kernel,
	Alan Cox <alan@lxorguk.ukuu.org.uk> Dave Hansen,
	Arjan van de Ven, Suleiman Souhlal, Hubertus Franke,
	Cedric Le Goater

On Thu, Jan 26, 2006 at 01:13:45PM -0700, Eric W. Biederman wrote:
> Herbert Poetzl <herbert@13thfloor.at> writes:
> 
> > On Sat, Jan 21, 2006 at 03:04:16AM -0700, Eric W. Biederman wrote:
> >> So in the simple case I have names like:
> >> 1178/1632
> >
> > which is a new namespace in itself, but it doesn't matter
> > as long as it uniquely and persistently identifies the
> > namespace for the time it exists ... just leaves the
> > question how to retrieve a list of all namespaces :)
> 
> Yes but the name of the namespace is still in the original pid namespace.
> And more importantly to me it isn't a new kind of namespace.
> 
> >> If I want a guest that can keep secrets from the host sysadmin I don't
> >> want transitioning into a guest namespace to come too easily.
> >
> > which can easily be achieved by 'marking' the namespace
> > as private and/or applying certain rules/checks to the
> > 'enter' procedure ...
> 
> Right.  The trick here is that you must be able to deny
> transitioning into a namespace from the inside the namespace.
> Or else a guest could never trust it.  Something one of my
> coworkers pointed out to me.

not necessarily, for example have a 'private' flag, which
can only be set once (usually from outside), ensuring that
the namespace will not be entered. this flag could be
checked from inside ...

best,
Herbert

> Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-26 20:23                                     ` Herbert Poetzl
@ 2006-01-27  8:28                                       ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-27  8:28 UTC (permalink / raw)
  To: Herbert Poetzl
  Cc: Alan Cox, Hubertus Franke, Dave Hansen, Greg KH, Serge E. Hallyn,
	Arjan van de Ven, linux-kernel, Cedric Le Goater

Herbert Poetzl <herbert@13thfloor.at> writes:

> anyway, if that would be the aim, it could be done
> much simpler by 'just' adding a v/upid field to the
> task struct and use that for everything userspace
> related (i.e. locating tasks, sending signals, etc)
> no need to change the current *pid entries at all

Yes and no.  Changing the current pid entries as opposed
to adding the kpid/upid separation is a slightly different
problem.  

In particular there are 4 uses we need to change.  
 - Printing the pid in debug messages.
 - Comparing pids (because we need to add a context comparison)
 - Sending signals/locating tasks.
 - Entering a code path that wants to do one of the above.

Printing the pid in debug messages seems to be confined to
performing the action with reference to a task_struct, and
is a completeness issue not a correctness issue.

Sending signals and locating tasks by pid is fairly straight
forward: change the interface of all affected functions, which
forces an audit of all callers.  Doing that recursively also
works well.

Comparing pids is where I think things get sticky but arguably
that case is rare enough we may be able to catch all of the users
with a code audit.
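
To illustrate why a bare integer comparison stops being enough, here is
a minimal userspace sketch of comparing pids with a context comparison
added. The names struct pidspace, struct kpid, make_kpid() and
kpid_equal() are hypothetical and exist only for this example; they are
not taken from any patch in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pid space (container). */
struct pidspace {
	int id;
};

/* Hypothetical kpid: the pid space plus the number inside that space. */
struct kpid {
	const struct pidspace *space;
	int nr;
};

/* Build a kpid; a kpid without a pid space is meaningless here. */
static struct kpid make_kpid(const struct pidspace *space, int nr)
{
	struct kpid k;

	assert(space != NULL);
	k.space = space;
	k.nr = nr;
	return k;
}

/* Two kpids match only if both the pid space and the number match. */
static bool kpid_equal(struct kpid a, struct kpid b)
{
	return a.space == b.space && a.nr == b.nr;
}
```

With this shape, pid 521 in one pid space never compares equal to pid 521
in another, which is exactly the case a plain "a == b" on integers gets
wrong.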

The only change I would really advocate at the moment beyond
adding the kpid fields to the task struct is to rename
(pid, tgid, pgrp, session) to (upid, utgid, upgrp, usession)
so we catch and break the users.  This would flush into
the open all of the users that are doing weird things like comparing
pids, and would leave any rare untested and unspotted case broken so
that it will not compile.

Arguably it is overkill to break all of the users to catch the
stragglers that we can't easily spot with a code review.  Likely I would
be satisfied with not breaking the code until I found a straggler that
affects correctness and that made it through a kernel code review.

So far I have yet to see a version of the code that does not miss
important stragglers.  Which is why to be correct I suspect we need
to break all users.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-26 20:01                             ` Herbert Poetzl
@ 2006-01-27  9:04                               ` Eric W. Biederman
  2006-01-27 12:27                                 ` Kyle Moffett
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-27  9:04 UTC (permalink / raw)
  To: Herbert Poetzl
  Cc: Arjan van de Ven, Kyle Moffett, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, linux-kernel,
	Cedric Le Goater

Herbert Poetzl <herbert@13thfloor.at> writes:

> well, yes, but wouldn't that be the RightThing(tm) 
> anway? because 'referencing' something via a pid, then
> letting the task holding the pid go away and even be
> replaced by a new one (with the same pid) which then
> will get suddenly signaled from somewhere, just because
> the pid matches seems very broken to me ...

Agreed, but that describes the current state of the kernel.

Using a task_struct for referencing kernel threads where there
is tight collaboration seems sane.  However using a task_struct
is impossible when referring to process groups, and it feels
like a bad idea to reference user space processes.

Basically my concern is that by using task structs internally
the kernel will start collecting invisible zombies.  And
with a case like struct fown_struct we could force RLIMIT_NOFILE task
structs into memory, per hostile process.  Usually this is much more
than RLIMIT_NPROC which limits the total number of live processes
and zombies a single user may create.

So assuming RLIMIT_NPROC == 100 and RLIMIT_NOFILE == 1024

Which means something like 100*1024*sizeof(struct task_struct) bytes.
sizeof(struct task_struct) is somewhere between 512 and 1K bytes
on a 32bit platform.

So 100*1024*512 to 100*1024*1024 = 50 to 100MB.
Being able to pin 100MB with modest ulimits does not sound like an
obvious fix to me.

Given what a hostile user can potentially accomplish I think anything that
approaches using struct task_struct pointers as a replacement for pids
should be approached carefully.
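
The arithmetic above can be written out as a small sketch. The helper
name pinned_bytes() is hypothetical, and task_size is only a stand-in
for sizeof(struct task_struct), which this userspace example does not
know.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: worst-case bytes pinned if every open file of
 * every process a user may own held a reference to a distinct
 * task_struct.  task_size stands in for sizeof(struct task_struct).
 */
static unsigned long long pinned_bytes(unsigned long nproc,
				       unsigned long nofile,
				       size_t task_size)
{
	assert(task_size > 0);
	return (unsigned long long)nproc * nofile * task_size;
}
```

Under those assumptions, pinned_bytes(100, 1024, 512) is 52428800 bytes
(50MB) and pinned_bytes(100, 1024, 1024) is 104857600 bytes (100MB),
matching the range given above.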

Eric


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-27  9:04                               ` Eric W. Biederman
@ 2006-01-27 12:27                                 ` Kyle Moffett
  2006-01-27 13:15                                   ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Kyle Moffett @ 2006-01-27 12:27 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Herbert Poetzl, Arjan van de Ven, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, linux-kernel,
	Cedric Le Goater

On Jan 27, 2006, at 04:04, Eric W. Biederman wrote:
> Basically my concern is that by using task structs internally the  
> kernel will start collecting invisible zombies.

So come up with a task_struct weakref system.  Maintain an (RCU?)  
linked list of struct task_weakref in the struct task_struct, and  
when the task struct is about to go away, run around all of the  
weakrefs and change their pointers to NULL.  The user of the weakref  
should check if the pointer is NULL and handle accordingly.  Sure, it  
would be tricky to get the locking right, but a couple extra bytes  
for a struct task_weakref would be a lot better than a whole pinned  
struct task_struct.
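
A minimal userspace sketch of that idea, assuming a singly linked list
and no locking at all (the locking and RCU parts are precisely what is
left out here). struct task is a stand-in for struct task_struct, and
weakref_get(), task_release_weakrefs() and weakref_pid() are
hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct task;				/* stand-in for struct task_struct */

/* Hypothetical weak reference: a pointer plus a link on the task's list. */
struct task_weakref {
	struct task *task;		/* becomes NULL when the task goes away */
	struct task_weakref *next;	/* next weakref to the same task */
};

struct task {
	int pid;
	struct task_weakref *weakrefs;	/* head of the weakref list */
};

/* Attach ref to t.  No locking here; a real version would need it. */
static void weakref_get(struct task_weakref *ref, struct task *t)
{
	assert(ref != NULL && t != NULL);
	ref->task = t;
	ref->next = t->weakrefs;
	t->weakrefs = ref;
}

/* Called when the task is about to go away: NULL every weak pointer. */
static void task_release_weakrefs(struct task *t)
{
	struct task_weakref *ref = t->weakrefs;

	while (ref != NULL) {
		struct task_weakref *next = ref->next;

		ref->task = NULL;
		ref->next = NULL;
		ref = next;
	}
	t->weakrefs = NULL;
}

/* Users must check for NULL before touching the task. */
static int weakref_pid(const struct task_weakref *ref)
{
	return ref->task ? ref->task->pid : -1;
}
```

The sketch omits dropping a single weakref while the task is still
alive; with a singly linked list that needs a walk, which is one reason
a real version would more likely use a doubly linked list.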

Cheers,
Kyle Moffett

--
Someone asked me why I work on this free
(http://www.fsf.org/philosophy/) software stuff and not get a real
job. Charles Schulz had the best answer:

"Why do musicians compose symphonies and poets write poems? They do  
it because life wouldn't have any meaning for them if they didn't.  
That's why I draw cartoons. It's my life."
   -- Charles Schulz



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-27 12:27                                 ` Kyle Moffett
@ 2006-01-27 13:15                                   ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-01-27 13:15 UTC (permalink / raw)
  To: Kyle Moffett
  Cc: Herbert Poetzl, Arjan van de Ven, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, linux-kernel,
	Cedric Le Goater

Kyle Moffett <mrmacman_g4@mac.com> writes:

> On Jan 27, 2006, at 04:04, Eric W. Biederman wrote:
>> Basically my concern is that by using task structs internally the kernel will
>> start collecting invisible zombies.
>
> So come up with a task_struct weakref system.  Maintain an (RCU?)  linked list
> of struct task_weakref in the struct task_struct, and  when the task struct is
> about to go away, run around all of the  weakrefs and change their pointers to
> NULL.  The user of the weakref  should check if the pointer is NULL and handle
> accordingly.  Sure, it  would be tricky to get the locking right, but a couple
> extra bytes for a struct task_weakref would be a lot better than a whole pinned
> struct task_struct.

Right.

I'm working on it.

Somehow it was lodged in my mind that a task_struct cannot represent
a process group, but it can obviously be the first in a linked list.
Because of that, so far I have been approaching the weak reference
problem from the pid hash table side.  While using pid hash tables
can come out fairly clean, my best solution so far doubled the size
of the pid hash table.

I have two things that still concern me.
- The size of the linked list in pathological cases.
- Consistently picking a leader for a process group.

But I don't know if either one of them will actually be a problem,
so I think I will walk down that implementation path and see where
it takes me.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-20 20:18                 ` Hubertus Franke
  2006-01-21 10:25                   ` Eric W. Biederman
  2006-01-21 14:42                   ` Eric W. Biederman
@ 2006-01-31 21:02                   ` Linus Torvalds
  2006-02-01  0:01                     ` Hubertus Franke
                                       ` (2 more replies)
  2 siblings, 3 replies; 136+ messages in thread
From: Linus Torvalds @ 2006-01-31 21:02 UTC (permalink / raw)
  To: Hubertus Franke
  Cc: Eric W. Biederman, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater


(I'm coming in late, it's not been a high priority for me)

On Fri, 20 Jan 2006, Hubertus Franke wrote:
> 
> 2nd:
> ====	Issue: we don't need pid virtualization, instead simply use
> <container,pid> pair.
> 
> This requires a bit more thought. Essentially that's what I was doing, 
> but I mangled them into the same pid and using masking to add/remove the 
> container for internal use. As pointed out by Alan(?), we can indeed 
> reused the same pid internally many times as long as we can distinguish 
> during the pid-to-task_struct lookup. This is easily done because, the 
> caller provides the context hence the container for the lookup.

This is my preferred approach BY FAR.

Doing a <container,pid> approach is very natural, and avoids almost all 
issues. At most, you might want to have a new system call (most naturally 
just the one that is limited to the "init container" - ie the one that we 
boot up with) that can specify both container and pid explicitly, and see 
all processes and access all processes. But all "normal" system calls 
would only ever operate within their container.

The fact is, we want "containers" anyway for any virtualization thing, ie 
vserver already adds them. And if we have containers, then it's very easy 
("easyish") to split up the current static "pid_hash[]", "pidmap_array[]" 
and "pidmap_lock", and make them per-container, and have a pointer to the 
container for each "struct task_struct".

After that, there wouldn't even be a lot else to do. The normal system 
calls would just use their own container, and the (few) places that save 
away pid's for later (ie things that use "kill_proc_info_as_uid()" and 
"struct fown_struct" friends) would have to also squirrel away the 
container, but then you should be pretty much done.
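
A rough userspace sketch of what a per-container pid lookup could look
like. Everything here is an assumption for illustration: struct
container, PID_HASH_SIZE, container_add_task() and find_task() are
hypothetical names, and the simple modulo hash is not the kernel's pid
hash.

```c
#include <assert.h>
#include <stddef.h>

#define PID_HASH_SIZE 16	/* illustrative size, not the kernel's */

struct container;

struct task {
	int pid;			/* unique only within its container */
	struct container *container;	/* the container this task lives in */
	struct task *hash_next;		/* chain within one hash bucket */
};

struct container {
	struct task *pid_hash[PID_HASH_SIZE];	/* per-container, not global */
};

/* Register t under pid inside container c. */
static void container_add_task(struct container *c, struct task *t, int pid)
{
	unsigned int slot = (unsigned int)pid % PID_HASH_SIZE;

	assert(pid > 0);
	t->pid = pid;
	t->container = c;
	t->hash_next = c->pid_hash[slot];
	c->pid_hash[slot] = t;
}

/* The caller supplies the container, so equal pids never collide. */
static struct task *find_task(const struct container *c, int pid)
{
	struct task *t;

	if (pid <= 0)
		return NULL;
	for (t = c->pid_hash[(unsigned int)pid % PID_HASH_SIZE]; t; t = t->hash_next)
		if (t->pid == pid)
			return t;
	return NULL;
}
```

In this sketch a normal lookup would pass the caller's own container,
while anything that saves a pid for later would save the container
pointer next to it.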

Of course, you'll have to do the system calls to _create_ the containers 
in the first place, but that's at a higher level and involves much more 
than just the pid-space (ie a container would normally have more than just 
the uid mappings, it would have any network knowledge too etc - hostname, 
perhaps list of network devices associated with that context etc etc)

			Linus

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-31 21:02                   ` RFC [patch 13/34] PID Virtualization Define new task_pid api Linus Torvalds
@ 2006-02-01  0:01                     ` Hubertus Franke
  2006-02-01  4:18                     ` Eric W. Biederman
  2006-02-02 14:48                     ` Kirill Korotaev
  2 siblings, 0 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-02-01  0:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Linus Torvalds wrote:
> (I'm coming in late, it's not been a high priority for me)
> 
> On Fri, 20 Jan 2006, Hubertus Franke wrote:
> 
>>2nd:
>>====	Issue: we don't need pid virtualization, instead simply use
>><container,pid> pair.
>>
>>This requires a bit more thought. Essentially that's what I was doing, 
>>but I mangled them into the same pid and using masking to add/remove the 
>>container for internal use. As pointed out by Alan(?), we can indeed 
>>reused the same pid internally many times as long as we can distinguish 
>>during the pid-to-task_struct lookup. This is easily done because, the 
>>caller provides the context hence the container for the lookup.
> 
> 
> This is my preferred approach BY FAR.
> 
> Doing a <container,pid> approach is very natural, and avoids almost all 
> issues. At most, you might want to have a new system call (most naturally 
> just the one that is limited to the "init container" - it the one that we 
> boot up with) that can specify both container and pid explicitly, and see 
> all processes and access all processes. But all "normal" system calls 
> would only ever operate within their container.

That's what the current patch set does.
One "global container" that sees and accesses all and the rest is limited
to their respective "container".

> 
> The fact is, we want "containers" anyway for any virtualization thing, ie 
> vserver already adds them. And if we have containers, then it's very easy 
> ("easyish") to split up the current static "pid_hash[]", "pidmap_array[]" 
> and "pidmap_lock", and make them per-container, and have a pointer to the 
> container for each "struct task_struct".

We are very close to that .. the pidmap_array is already organized that way.
This was done so as not to make the container an object that penetrates
everywhere in the code. Now that the discussion is fleshing out, actually
accessing those entities through the container of the context-task would
be the next logical restructuring of the code.

> 
> After that, there wouldn't even be a lot else to do. The normal system 
> calls would just use their own container, and the (few) places that save 
> away pid's for later (ie things that use "kill_proc_info_as_uid()" and 
> "struct fown_struct" friends) would have to also squirrell away the 
> container, but then you should be pretty much done.

Agreed.

> 
> Of course, you'll have to do the system calls to _create_ the containers 
> in the first place, but that's at a higher level and involves much more 
> than just the pid-space (ie a container would normally have more than just 
> the uid mappings, it would have any network knowledge too etc - hostname, 
> perhaps list of network devices associated with that context etc etc)

Right now we do it simply through a poor man's /proc/container fs approach
that should be reasonably straightforward to convert to a syscall.

> 
> 			Linus
> 

Finally, I presume you followed the discussion on the conversion from task->pid
to access_functions that stirred some criticism. That part would disappear.

-- Hubertus


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-31 21:02                   ` RFC [patch 13/34] PID Virtualization Define new task_pid api Linus Torvalds
  2006-02-01  0:01                     ` Hubertus Franke
@ 2006-02-01  4:18                     ` Eric W. Biederman
  2006-02-01  4:39                       ` Linus Torvalds
  2006-02-01 16:29                       ` Greg
  2006-02-02 14:48                     ` Kirill Korotaev
  2 siblings, 2 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-01  4:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, Linux Kernel Mailing List, Cedric Le Goater

Linus Torvalds <torvalds@osdl.org> writes:

> (I'm coming in late, it's not been a high priority for me)

Thanks for taking the time.

> On Fri, 20 Jan 2006, Hubertus Franke wrote:
>> 
>> 2nd:
>> ====	Issue: we don't need pid virtualization, instead simply use
>> <container,pid> pair.
>> 
>> This requires a bit more thought. Essentially that's what I was doing, 
>> but I mangled them into the same pid and using masking to add/remove the 
>> container for internal use. As pointed out by Alan(?), we can indeed 
>> reused the same pid internally many times as long as we can distinguish 
>> during the pid-to-task_struct lookup. This is easily done because, the 
>> caller provides the context hence the container for the lookup.
>
> This is my preferred approach BY FAR.
>
> Doing a <container,pid> approach is very natural, and avoids almost all 
> issues. At most, you might want to have a new system call (most naturally 
> just the one that is limited to the "init container" - it the one that we 
> boot up with) that can specify both container and pid explicitly, and see 
> all processes and access all processes. But all "normal" system calls 
> would only ever operate within their container.

On this front I have been planning on using sys_clone as it allows
pieces of the virtualization to be incrementally built, it already
supports the FS namespace, and it supports flexibly specifying what
you want to contain.

> The fact is, we want "containers" anyway for any virtualization thing, ie 
> vserver already adds them. And if we have containers, then it's very easy 
> ("easyish") to split up the current static "pid_hash[]", "pidmap_array[]" 
> and "pidmap_lock", and make them per-container, and have a pointer to the 
> container for each "struct task_struct".

Well, hash tables with their giant allocations are hard to split; it has
been easier to add a container tag.

> After that, there wouldn't even be a lot else to do. The normal system 
> calls would just use their own container, and the (few) places that save 
> away pid's for later (ie things that use "kill_proc_info_as_uid()" and 
> "struct fown_struct" friends) would have to also squirrell away the 
> container, but then you should be pretty much done.

Yes.  Although there are a few container lifetime problems with that
approach.  Do you want your container alive for a long time after every
process using it has exited just because someone has squirrelled away their
pid?  While container lifetime issues crop up elsewhere as well, PIDs are
by far the worst, because it is currently safe to store a PID indefinitely 
with nothing worse than PID wrap around.

> Of course, you'll have to do the system calls to _create_ the containers 
> in the first place, but that's at a higher level and involves much more 
> than just the pid-space (ie a container would normally have more than just 
> the uid mappings, it would have any network knowledge too etc - hostname, 
> perhaps list of network devices associated with that context etc etc)

Yes.  A list of network devices seems to work well.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01  4:18                     ` Eric W. Biederman
@ 2006-02-01  4:39                       ` Linus Torvalds
  2006-02-01  7:14                         ` Eric W. Biederman
                                           ` (2 more replies)
  2006-02-01 16:29                       ` Greg
  1 sibling, 3 replies; 136+ messages in thread
From: Linus Torvalds @ 2006-02-01  4:39 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, Linux Kernel Mailing List, Cedric Le Goater



On Tue, 31 Jan 2006, Eric W. Biederman wrote:
> 
> Yes.  Although there are a few container lifetimes problems with that
> approach.  Do you want your container alive for a long time after every
> process using it has exited just because someone has squirrelled away their
> pid.  While container lifetime issues crop up elsewhere as well PIDs are
> by far the worst, because it is current safe to store a PID indefinitely 
> with nothing worse that PID wrap around.

Are people really expecting to have a huge turn-over on containers? It 
sounds like this shouldn't be a problem in any normal circumstance: 
especially if you don't even do the "big hash-table per container" 
approach, who really cares if a container lives on after the last process 
exited?

I'd have expected that the major user for this would end up being ISP's 
and the like, and I would not expect the virtual machines to be brought up 
all the time. 

If it's a problem, you can do the same thing that the "struct mm_struct" 
does: it has life-time issues because a mm_struct actually has to live for 
potentially a _long_ time (zombies) but at the same time we want to free 
the data structures allocated to the mm_struct as soon as possible, 
notably the VMA's and the page tables.

So a mm_struct uses a two-level counter, with the "real" users (who need 
the page tables etc) incrementing one ("mm_users"), and the "secondary" 
ones (who just need to have an mm_struct pinned, but are ok with an empty 
VM being attached) incrementing the other ("mm_count").

The same approach might be valid for "containers": you can destroy most of 
the associated container when the actual processes are gone, but keep just 
the empty container around until all secondary references are finally also 
gone.

It's pretty simple: the secondary reference starts at 1 - with the 
"primary" counter being the single ref to the secondary. Then freeing a 
primary does:

	if (atomic_dec_and_test(&container->primary_counter)) {
		.. free the core resources here ..

		/* then release the ref from the primary to secondary */
		secondary_free(container);
	}

(for "mm_struct", the primary is dropped with "mmput()" and the secondary is 
dropped with "mmdrop()", which is absolutely horrid naming. Please name 
things better than I did ;)
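
[Editor's sketch, not part of the original message.] The two-level counter
described above can be written out as a small userspace example. This is
only a sketch under stated assumptions: it uses C11 atomics rather than the
kernel's atomic_t, and "struct container", its fields, and every function
name here are hypothetical. The "freed" pointer exists only so that a caller
can observe when the final free happens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct container {
	atomic_int users;	/* primary: live processes (cf. mm_users) */
	atomic_int count;	/* secondary: anything pinning the struct (cf. mm_count) */
	void *core;		/* stand-in for the expensive per-container state */
	bool *freed;		/* hypothetical hook: set when the struct is freed */
};

static struct container *container_alloc(bool *freed)
{
	struct container *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	atomic_init(&c->users, 1);
	atomic_init(&c->count, 1);	/* the single ref held by the primary side */
	c->core = malloc(4096);
	c->freed = freed;
	return c;
}

/* Secondary reference: keeps the (possibly empty) struct alive. */
static void container_pin(struct container *c)
{
	atomic_fetch_add(&c->count, 1);
}

static void container_unpin(struct container *c)
{
	/* fetch_sub returns the old value, so 1 means we dropped the last ref */
	if (atomic_fetch_sub(&c->count, 1) == 1) {
		if (c->freed)
			*c->freed = true;
		free(c);
	}
}

/* Primary reference: taken by each process living in the container. */
static void container_get(struct container *c)
{
	atomic_fetch_add(&c->users, 1);
}

static void container_put(struct container *c)
{
	if (atomic_fetch_sub(&c->users, 1) == 1) {
		free(c->core);		/* free the core resources here */
		c->core = NULL;
		container_unpin(c);	/* release the primary's ref on the secondary */
	}
}
```

A saved pid would hold only a pin: once the last process calls
container_put() the core state is gone, but the empty struct stays valid
until the last container_unpin().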

			Linus

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01  4:39                       ` Linus Torvalds
@ 2006-02-01  7:14                         ` Eric W. Biederman
  2006-02-01 16:41                         ` Dave Hansen
  2006-02-02  5:14                         ` Herbert Poetzl
  2 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-01  7:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hubertus Franke, Dave Hansen, Greg KH, Alan Cox, Serge E. Hallyn,
	Arjan van de Ven, Linux Kernel Mailing List, Cedric Le Goater

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 31 Jan 2006, Eric W. Biederman wrote:
>> 
>> Yes.  Although there are a few container lifetimes problems with that
>> approach.  Do you want your container alive for a long time after every
>> process using it has exited just because someone has squirrelled away their
>> pid.  While container lifetime issues crop up elsewhere as well PIDs are
>> by far the worst, because it is current safe to store a PID indefinitely 
>> with nothing worse that PID wrap around.
>
> Are people really expecting to have a huge turn-over on containers? It 
> sounds like this shouldn't be a problem in any normal circumstance: 
> especially if you don't even do the "big hash-table per container" 
> approach, who really cares if a container lives on after the last process 
> exited?

Turnover rate is a good argument for not worrying about things too much.
I guess it only really becomes a problem if you have large amounts
of resources locked up.

> I'd have expected that the major user for this would end up being ISP's 
> and the like, and I would not expect the virtual machines to be brought up 
> all the time. 

People doing server consolidation are one of the big user bases.  The other,
and possibly a bigger driving force right now, is people dealing with large
high performance clusters.  There the interest is in encapsulating applications
so that you can checkpoint or migrate them.

One container per batch job might not be too high, but if it wound up being used
for short jobs as well as long ones you could get as high as one container
every couple of minutes.

The scary part with lifetime issues is that if you aren't careful you can
have lots of system resources used with no obvious source.

> If it's a problem, you can do the same thing that the "struct mm_struct" 
> does: it has life-time issues because a mm_struct actually has to live for 
> potentially a _long_ time (zombies) but at the same time we want to free 
> the data structures allocated to the mm_struct as soon as possible, 
> notably the VMA's and the page tables.
>
> So a mm_struct uses a two-level counter, with the "real" users (who need 
> the page tables etc) incrementing one ("mm_users"), and the "secondary" 
> ones (who just need to have an mm_struct pinned, but are ok with an empty 
> VM being attached) incrementing the other ("mm_count").

Neat.  I had not realized that was what was going on.  Having cleaned up a bunch
of cases there ages ago I was about to feel silly, but I just realized mmdrop
is more recent than my comment explaining the difference between mmput and
mm_release.

One of the suggestions that has been floating around was to replace the
saved pids with references to task structures in places like fown_struct.
If we were to take that approach we would have nasty lifetime issues, because
we would continue to pin processes even after they were no longer zombies,
and we can potentially get a lot of fown_structs.

So I am considering introducing an intermediary (on very similar lines
to what you were suggesting) a struct task_ref that is just:
struct task_ref
{
	atomic_t count;
	enum pid_type type;
	struct task_struct *task;
};

That can be used to track tasks and process groups.  I posted fairly
complete patches for review a few days ago.  The interesting thing
with this case is that it can solve the pid wrap around issues as well
as container reference issues, by completely removing the need for
them.
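
[Editor's sketch, not part of the original message.] One way such an
intermediary could behave is shown below. This is not the posted patch; it
is a userspace illustration under stated assumptions. The helper names
"task_ref_get()", "task_ref_put()" and "task_exit()", the plain int counter
(atomic_t in the struct above), and the rule that the task itself owns one
reference are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum pid_type { PIDTYPE_PID, PIDTYPE_PGID };	/* subset, for illustration */

struct task_ref;

struct task_struct {
	int pid;
	struct task_ref *ref;	/* lazily allocated intermediary */
};

struct task_ref {
	int count;			/* plain int here; atomic in real code */
	enum pid_type type;
	struct task_struct *task;	/* NULL once the task has exited */
};

/* Take a reference that can outlive the task without pinning it. */
static struct task_ref *task_ref_get(struct task_struct *t)
{
	if (!t->ref) {
		struct task_ref *r = malloc(sizeof(*r));

		if (!r)
			return NULL;
		r->count = 1;		/* reference owned by the task itself */
		r->type = PIDTYPE_PID;
		r->task = t;
		t->ref = r;
	}
	t->ref->count++;
	return t->ref;
}

static void task_ref_put(struct task_ref *r)
{
	if (--r->count == 0)
		free(r);
}

/* On exit the task detaches; holders then see ->task == NULL. */
static void task_exit(struct task_struct *t)
{
	if (t->ref) {
		t->ref->task = NULL;
		task_ref_put(t->ref);
		t->ref = NULL;
	}
}
```

A holder such as fown_struct would keep the small task_ref rather than a
pid or the task itself, so a recycled pid can never be mistaken for the
original task and the task is not pinned after it exits.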

The other technique that has served me well in my network
virtualization work was to set up a notifier and have everyone who
cared register with it and drop their references when the notifier
was called.  For a low number of things that care, this works very
well.
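
[Editor's sketch, not part of the original message.] The notifier pattern
can be illustrated with a hand-rolled callback chain. This is a sketch only:
it does not use the kernel's notifier API, and "struct notifier",
"struct saved_owner" and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct notifier {
	void (*call)(struct notifier *self);	/* invoked on teardown */
	struct notifier *next;
};

struct notifier_chain {
	struct notifier *head;
};

static void notifier_register(struct notifier_chain *ch, struct notifier *n)
{
	n->next = ch->head;
	ch->head = n;
}

/* Called when the container goes away: every subscriber drops its ref. */
static void notifier_call_all(struct notifier_chain *ch)
{
	struct notifier *n = ch->head, *next;

	while (n) {
		next = n->next;	/* read before the callback may reuse n */
		n->call(n);
		n = next;
	}
}

/* Hypothetical subscriber that had squirrelled away a container pointer. */
struct saved_owner {
	struct notifier nb;	/* first member, so the cast below is valid */
	void *container;
};

static void saved_owner_release(struct notifier *self)
{
	struct saved_owner *o = (struct saved_owner *)self;

	o->container = NULL;	/* drop the saved reference */
}
```

The cost is one callback per subscriber at teardown, which matches the
remark that this works well when only a few things care.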

> (for "mm_struct", the primary is dropped "mmput()" and the secondary is 
> dropped with "mmdrop()", which is absolutely horrid naming. Please name 
> things better than I did ;)

Well, it is a challenge; there aren't that many good names around and
it is hard work to find them. :)

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01  4:18                     ` Eric W. Biederman
  2006-02-01  4:39                       ` Linus Torvalds
@ 2006-02-01 16:29                       ` Greg
  2006-02-01 16:44                         ` Eric W. Biederman
  1 sibling, 1 reply; 136+ messages in thread
From: Greg @ 2006-02-01 16:29 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Eric W. Biederman wrote:
> 
> On this front I have been planning on using sys_clone as it allows
> pieces of the virtualization to be incrementally built, it already
> supports the FS namespace, and it supports flexibly specifying what
> you want to contain.
> 

What would you do to handle the following case:

pid = getpid();
if (sys_clone(CLONE_CONTAINER) == 0) {
	ppid = getppid();
	assert(ppid == pid);
}

Most of the calls involving resource ids will return values that aren't
*consistent* with ids already stored in userland... could possibly break some
piece of code. Perhaps a sys_exec() should also be enforced to reset the process
memory.

-Greg-

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01  4:39                       ` Linus Torvalds
  2006-02-01  7:14                         ` Eric W. Biederman
@ 2006-02-01 16:41                         ` Dave Hansen
  2006-02-02  5:14                         ` Herbert Poetzl
  2 siblings, 0 replies; 136+ messages in thread
From: Dave Hansen @ 2006-02-01 16:41 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, Hubertus Franke, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

On Tue, 2006-01-31 at 20:39 -0800, Linus Torvalds wrote:
> On Tue, 31 Jan 2006, Eric W. Biederman wrote:
> > Yes.  Although there are a few container lifetimes problems with that
> > approach.  Do you want your container alive for a long time after every
> > process using it has exited just because someone has squirrelled away their
> > pid.  While container lifetime issues crop up elsewhere as well PIDs are
> > by far the worst, because it is current safe to store a PID indefinitely 
> > with nothing worse that PID wrap around.
> 
> Are people really expecting to have a huge turn-over on containers? It 
> sounds like this shouldn't be a problem in any normal circumstance: 
> especially if you don't even do the "big hash-table per container" 
> approach, who really cares if a container lives on after the last process 
> exited?

Other than testing, I can't imagine a case when we'd need them created
and destroyed very often.  In fact, one of the biggest cases for needing
checkpoint/restart on a container is a very long-lived process that is
doing important work.

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01 16:29                       ` Greg
@ 2006-02-01 16:44                         ` Eric W. Biederman
  2006-02-02 13:50                           ` Greg
  0 siblings, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-01 16:44 UTC (permalink / raw)
  To: Greg
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Greg <gkurz@fr.ibm.com> writes:

> Eric W. Biederman wrote:
>> 
>> On this front I have been planning on using sys_clone as it allows
>> pieces of the virtualization to be incrementally built, it already
>> supports the FS namespace, and it supports flexibly specifying what
>> you want to contain.
>> 
>
> What would you do to handle the following case:
>
> pid = getpid();
> if (sys_clone(CLONE_CONTAINER) == 0) {
> 	ppid = getppid();
> 	assert(ppid == pid);
> }
>
> Most of the calls involving resource ids will return values that aren't
> *consistent* with ids already stored in userland... could possibly break some
> piece of code. Perhaps a sys_exec() should also be enforced to reset the process
> memory.

Well that assertion will fail.
At that point getppid() will return 0, and getpid() will return 1.

Processes getting confused is their own problem.

Now there will be a pid that the parent sees that will not be 0.
And that is what the parent will see in the context of wait.

In my code I introduced a wid (wait id) for that purpose.

This makes it possible to manage a container using the usual unix
process semantics.  Which is very important.
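
[Editor's sketch, not part of the original message.] The behaviour
described above (getpid() == 1 and getppid() == 0 inside the new pid space,
while the parent waits on a separate non-zero id) can be modelled as follows.
This is a sketch inferred only from this message, not taken from the
linux-2.6-ns tree; "struct pspace", the field layout and the function names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct pspace {
	int id;			/* hypothetical identifier of a pid space */
};

struct task {
	struct pspace *space;	/* pid space the task lives in */
	int pid;		/* pid as seen inside ->space */
	int wid;		/* id a parent in another space waits on */
	struct task *parent;
};

/* What getpid() would report to the task itself. */
static int view_getpid(const struct task *t)
{
	return t->pid;
}

/* What getppid() would report: 0 if the parent is outside the pid space. */
static int view_getppid(const struct task *t)
{
	if (!t->parent || t->parent->space != t->space)
		return 0;
	return t->parent->pid;
}

/* The id the parent sees for its child, e.g. in the context of wait. */
static int view_child_id(const struct task *parent, const struct task *child)
{
	return child->space == parent->space ? child->pid : child->wid;
}
```

In this model the assert(ppid == pid) from the question fails exactly as
stated: the child sees 0 for its parent, while the parent still has a
non-zero id to wait on.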


Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01  4:39                       ` Linus Torvalds
  2006-02-01  7:14                         ` Eric W. Biederman
  2006-02-01 16:41                         ` Dave Hansen
@ 2006-02-02  5:14                         ` Herbert Poetzl
  2 siblings, 0 replies; 136+ messages in thread
From: Herbert Poetzl @ 2006-02-02  5:14 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, Hubertus Franke, Dave Hansen, Greg KH,
	Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

On Tue, Jan 31, 2006 at 08:39:19PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 31 Jan 2006, Eric W. Biederman wrote:
> > 
> > Yes. Although there are a few container lifetimes problems with
> > that approach. Do you want your container alive for a long time
> > after every process using it has exited just because someone has
> > squirrelled away their pid. While container lifetime issues crop up
> > elsewhere as well PIDs are by far the worst, because it is current
> > safe to store a PID indefinitely with nothing worse that PID wrap
> > around.
>
> Are people really expecting to have a huge turn-over on containers? It
> sounds like this shouldn't be a problem in any normal circumstance:
> especially if you don't even do the "big hash-table per container"
> approach, who really cares if a container lives on after the last
> process exited?
>
> I'd have expected that the major user for this would end up being     
> ISP's and the like, and I would not expect the virtual machines to be 
> brought up all the time.                                              

well, really depends, as far as I can tell the 
number of guest (container) (re)starts can be as
high as one per second (in extreme cases) while
the entire setup doesn't have more than 50-100
containers at the same time, and usually 'runs'
for more than a few months without reboot ...

but agreed, the typical number of container
creations and deletions will be around one per
hour or day ...

> If it's a problem, you can do the same thing that the "struct
> mm_struct" does: it has life-time issues because a mm_struct actually
> has to live for potentially a _long_ time (zombies) but at the same
> time we want to free the data structures allocated to the mm_struct as
> soon as possible, notably the VMA's and the page tables.
>
> So a mm_struct uses a two-level counter, with the "real" users
> (who need the page tables etc) incrementing one ("mm_users"), and
> the "secondary" ones (who just need to have an mm_struct pinned,
> but are ok with an empty VM being attached) incrementing the other
> ("mm_count").

yes, we already do something very similar in 
Linux-VServer, basically differentiating between 
'active users' and 'passive references' ...

> The same approach might be valid for "containers": you can destroy most of 
> the associated container when the actual processes are gone, but keep just 
> the empty container around until all secondary references are finally also 
> gone.
> 
> It's pretty simple: the secondary reference starts at 1 - with the 
> "primary" counter being the single ref to the secondary. Then freeing a 
> primary does:
> 
> 	if (atomic_dec_and_test(&container->primary_counter)) {
> 		.. free the core resources here ..
> 
> 		/* then release the ref from the primary to secondary */
> 		secondary_free(container);
> 	}
> 
> (for "mm_struct", the primary is dropped "mmput()" and the secondary is 
> dropped with "mmdrop()", which is absolutely horrid naming. Please name 
> things better than I did ;)

best,
Herbert

> 			Linus

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-01 16:44                         ` Eric W. Biederman
@ 2006-02-02 13:50                           ` Greg
  2006-02-02 14:09                             ` Eric W. Biederman
  0 siblings, 1 reply; 136+ messages in thread
From: Greg @ 2006-02-02 13:50 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Eric W. Biederman wrote:
> Greg <gkurz@fr.ibm.com> writes:
> 
> 
>>Eric W. Biederman wrote:
>>
>>>On this front I have been planning on using sys_clone as it allows
>>>pieces of the virtualization to be incrementally built, it already
>>>supports the FS namespace, and it supports flexibly specifying what
>>>you want to contain.
>>>
>>
>>What would you do to handle the following case:
>>
>>pid = getpid();
>>if (sys_clone(CLONE_CONTAINER) == 0) {
>>	ppid = getppid();
>>	assert(ppid == pid);
>>}
>>
>>Most of the calls involving resource ids will return values that aren't
>>*consistent* with ids already stored in userland... could possibly break some
>>piece of code. Perhaps a sys_exec() should also be enforced to reset the process
>>memory.
> 
> 
> Well that assertion will fail.
> At that point getppid() will return 0, and getpid() will return 1.
> 
> Processes getting confused is their own problem.
> 

This flavour of clone should be used with great care then since it breaks the
usual unix process semantics. :)

> Now there will be a pid that the parent sees that will not be 0.
> And that is what the parent will see in the context of wait.
> 
> In my code I introduced a wid (wait id) for that purpose.
> 

Is it possible to see the code ?

> This makes it possible to manage a container using the usual unix
> process semantics.  Which is very important.
> 
> 
> Eric
> 

Thanks.

-Greg-

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 13:50                           ` Greg
@ 2006-02-02 14:09                             ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-02 14:09 UTC (permalink / raw)
  To: Greg
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Greg <gkurz@fr.ibm.com> writes:

>> Well that assertion will fail.
>> At that point getppid() will return 0, and getpid() will return 1.
>> 
>> Processes getting confused is their own problem.
>> 
>
> This flavour of clone should be used with great care then since it breaks the
> usual unix process semantics. :)

Do you know of a flavor of clone (besides fork) that doesn't share that
property?

For the most part I am not breaking the usual process semantics; I work
very hard to preserve them.  It is simply that the pids you see are different,
which is what would be expected.

>> Now there will be a pid that the parent sees that will not be 0.
>> And that is what the parent will see in the context of wait.
>> 
>> In my code I introduced a wid (wait id) for that purpose.
>> 
>
> Is it possible to see the code ?

git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-ns.git/

The tree is still at the proof of concept level but it does come fairly close
to what has been discussed.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-31 21:02                   ` RFC [patch 13/34] PID Virtualization Define new task_pid api Linus Torvalds
  2006-02-01  0:01                     ` Hubertus Franke
  2006-02-01  4:18                     ` Eric W. Biederman
@ 2006-02-02 14:48                     ` Kirill Korotaev
  2006-02-02 15:13                       ` Eric W. Biederman
  2 siblings, 1 reply; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-02 14:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hubertus Franke, Eric W. Biederman, Dave Hansen, Greg KH,
	Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

> This is my preferred approach BY FAR.
> 
> Doing a <container,pid> approach is very natural, and avoids almost all 
> issues. At most, you might want to have a new system call (most naturally 
> just the one that is limited to the "init container" - it the one that we 
> boot up with) that can specify both container and pid explicitly, and see 
> all processes and access all processes. But all "normal" system calls 
> would only ever operate within their container.
> 
> The fact is, we want "containers" anyway for any virtualization thing, ie 
> vserver already adds them. And if we have containers, then it's very easy 
> ("easyish") to split up the current static "pid_hash[]", "pidmap_array[]" 
> and "pidmap_lock", and make them per-container, and have a pointer to the 
> container for each "struct task_struct".
In fact this is almost what OpenVZ has been doing for half a year, both containers 
and VPIDs.
But it is very useful to see the process tree from the host system, to be able 
to use std tools to manage containers from the host (i.e. ps, kill, top, 
etc.). So it is much more convenient to have 2 pids: one globally 
unique, and one for the container.

> After that, there wouldn't even be a lot else to do. The normal system 
> calls would just use their own container, and the (few) places that save 
> away pid's for later (ie things that use "kill_proc_info_as_uid()" and 
> "struct fown_struct" friends) would have to also squirrell away the 
> container, but then you should be pretty much done.
> 
> Of course, you'll have to do the system calls to _create_ the containers 
> in the first place, but that's at a higher level and involves much more 
> than just the pid-space (ie a container would normally have more than just 
> the uid mappings, it would have any network knowledge too etc - hostname, 
> perhaps list of network devices associated with that context etc etc)

Kirill


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-01-17 17:25         ` Dave Hansen
  2006-01-18  4:54           ` Greg KH
  2006-01-18  4:55           ` Greg KH
@ 2006-02-02 14:49           ` Kirill Korotaev
  2 siblings, 0 replies; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-02 14:49 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alan Cox, Serge E. Hallyn, Arjan van de Ven, linux-kernel,
	Hubertus Franke, Cedric Le Goater

Hello all,

>>>The virtual pid is different depending on who is asking.  So simply
>>>storing current->realpid and current->pid isn't helpful, as we would
>>>still need to call a function when a pid crosses user->kernel boundary.
>>
>>This is an obscure, weird piece of functionality for some special case
>>usages most of which are going to be eliminated by Xen. I don't see the
>>kernel side justification for it at all.
> 
> 
> At least OpenVZ and vserver want very similar functionality.  They're
> both working with out-of-tree patch sets.  We each want to do subtly
> different things with tsk->pid, and task_pid() seemed to be a decent
> place to start.  OpenVZ has a very similar concept in its pid_task()
> function.
But our VPID patch is much less intrusive and shorter (thanks to Alexey 
Kuznetsov).
I will send broken-out patches today, CC-ing you.

BTW, have you tested it somehow (LTP, etc.)?

> Arjan had a very good point last time we posted these: we should
> consider getting rid of as many places in the kernel where pids are used
> to uniquely identify tasks, and just stick with task_struct pointers.  

Kirill


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 14:48                     ` Kirill Korotaev
@ 2006-02-02 15:13                       ` Eric W. Biederman
  2006-02-02 15:26                         ` Kirill Korotaev
  2006-02-06 20:15                         ` Pavel Machek
  0 siblings, 2 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-02 15:13 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Kirill Korotaev <dev@sw.ru> writes:

> In fact this is almost what OpenVZ does for half a year, both containers and
> VPIDs.
> But it is very useful to see process tree from host system. To be able to use
> std tools to manage containers from host (e.g. ps, kill, top, etc.). So it is
> much more convenient to have 2 pids. One globally unique, and one for container.

There are two issues here.
1) Monitoring.  (ps, top etc)
2) Control (kill).

For monitoring you might need to patch ps/top a little but it is doable without
2 pids.

For kill it is extremely rude to kill processes inside of a nested pid space.
There are other solutions to the problem.

It is undesirable to make it too easy to communicate through the barrier,
because then applications may start to take advantage of it and then depend
on it, and you will have lost the isolation that the container gives you.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 15:13                       ` Eric W. Biederman
@ 2006-02-02 15:26                         ` Kirill Korotaev
  2006-02-02 15:51                           ` Eric W. Biederman
  2006-02-06 20:15                         ` Pavel Machek
  1 sibling, 1 reply; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-02 15:26 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

>>In fact this is almost what OpenVZ does for half a year, both containers and
>>VPIDs.
>>But it is very useful to see process tree from host system. To be able to use
>>std tools to manage containers from host (e.g. ps, kill, top, etc.). So it is
>>much more convenient to have 2 pids. One globally unique, and one for container.
> 
> 
> There are two issues here.
> 1) Monitoring.  (ps, top etc)
> 2) Control (kill).
> 
> For monitoring you might need to patch ps/top a little but it is doable without
> 2 pids.
> 
> For kill it is extremely rude to kill processes inside of a nested pid space.
> There are other solutions to the problem.
It is not always a good idea to fix the tools every time new functionality
is introduced. Why do you think there are no tools other than ps, top, and
kill? Will you fix them all?

Another example: when you have problems in your VPS it is very
convenient to attach with strace/gdb/etc. from the host. Will you patch
them as well?

OpenVZ's big advantage is this ease of administration compared to the VM
approach, and it is not a good idea to forbid it. If you have a broken VM
you have problems; in OpenVZ you have control over VPSs.

> It is undesirable to make it too easy to communicate through the barrier,
> because then applications may start to take advantage of it and then depend
> on it, and you will have lost the isolation that the container gives you.
In OpenVZ we have VPSs fully isolated from each other.
But the host system has access to some VPS resources such as files,
processes, etc. I understand your concern, which is related to
checkpointing, yeah?

Kirill


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 15:26                         ` Kirill Korotaev
@ 2006-02-02 15:51                           ` Eric W. Biederman
  2006-02-02 16:05                             ` Kirill Korotaev
  2006-02-02 21:10                             ` Cedric Le Goater
  0 siblings, 2 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-02 15:51 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Kirill Korotaev <dev@sw.ru> writes:

>>>In fact this is almost what OpenVZ does for half a year, both containers and
>>>VPIDs.
>>>But it is very useful to see process tree from host system. To be able to use
>>>std tools to manage containers from host (e.g. ps, kill, top, etc.). So it is
>>>much more convenient to have 2 pids. One globally unique, and one for
>>>container.
>> There are two issues here.
>> 1) Monitoring.  (ps, top etc)
>> 2) Control (kill).
>> For monitoring you might need to patch ps/top a little but it is doable
>> without 2 pids.
>> For kill it is extremely rude to kill processes inside of a nested pid space.
>> There are other solutions to the problem.
> It is not always a good idea to fix the tools every time new functionality is
> introduced. Why do you think there are no tools other than ps, top, and kill?
> Will you fix them all?

No.  In my implementation the nested pid space has one pid that essentially
looks like a threaded process.  So ps and top can see that something is there
but you aren't spammed with all of the pids.  In addition, for top the cpu
utilization of all of the pids is shown for that one pid (just like the thread
group leader of a threaded process).  So there is no confusion.
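
A rough sketch of that accounting, not the actual implementation: the
struct layout, the field names, and pidspace_cpu_total() are all
hypothetical, chosen only to show how the usage of every task in a nested
pid space could be summed and reported under the single pid that is
visible from outside.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified task record; not the kernel's task_struct. */
struct task {
	int pid;                  /* pid inside its own pid space */
	int pidspace;             /* id of the pid space the task lives in */
	unsigned long cpu_ticks;  /* cpu time consumed by this task */
};

/*
 * Sum the cpu time of every task in the given pid space.  A tool such as
 * top, looking from outside, would show this total against the one pid
 * that represents the whole space, much as it does for a thread group.
 */
static unsigned long pidspace_cpu_total(const struct task *tasks, size_t n,
					int pidspace)
{
	unsigned long total = 0;
	size_t i;

	assert(tasks != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tasks[i].pidspace == pidspace)
			total += tasks[i].cpu_ticks;
	return total;
}
```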

If you want the detailed information you can either chroot to an environment
where the appropriate version of /proc is available, or you can modify your
tools.

> Another example: when you have problems in your VPS it is very convenient to
> attach with strace/gdb/etc. from the host. Will you patch them as well?
>
> OpenVZ's big advantage is this ease of administration compared to the VM
> approach, and it is not a good idea to forbid it. If you have a broken VM you
> have problems; in OpenVZ you have control over VPSs.
>
>> It is undesirable to make it too easy to communicate through the barrier,
>> because then applications may start to take advantage of it and then depend
>> on it, and you will have lost the isolation that the container gives you.
> In OpenVZ we have VPSs fully isolated from each other.
> But the host system has access to some VPS resources such as files, processes,
> etc. I understand your concern, which is related to checkpointing, yeah?

Three areas.
1) Checkpointing.
2) Isolation for security purposes.
   There may be secrets that the sysadmin should not have access to.
3) Nesting of containers, (so they are general purpose and not special hacks).

The vserver way of solving some of these problems is to provide a way
to enter the guest.  I would rather have some explicit operation that puts
you into the guest context so there is a single point where we can tackle
the nested security issues, than to have hundreds of places we have to
look at individually.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 15:51                           ` Eric W. Biederman
@ 2006-02-02 16:05                             ` Kirill Korotaev
  2006-02-02 16:27                               ` Eric W. Biederman
  2006-02-02 21:10                             ` Cedric Le Goater
  1 sibling, 1 reply; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-02 16:05 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

> Three areas.
> 1) Checkpointing.
> 2) Isolation for security purposes.
>    There may be secrets that the sysadmin should not have access to.
I hope you understand that such things do not make anything secure.
The administrator of the node will always have access to /proc/kcore,
devices, KERNEL CODE(!) etc. No security from this point of view.

> 3) Nesting of containers, (so they are general purpose and not special hacks).
Why are you interested in nesting? Are there any applications for it?
Until everything is virtualized in a nested way (including the TCP/IP stack,
routing etc.) I do not see much use for it.

> The vserver way of solving some of these problems is to provide a way
> to enter the guest.  I would rather have some explicit operation that puts
> you into the guest context so there is a single point where we can tackle
> the nested security issues, than to have hundreds of places we have to
> look at individually.
Huh, it sounds too easy. Just imagine that the VPS owner has deleted ps,
top, kill, bash and other tools. You won't be able to enter. Another
example: when the VPS is near its resource limits, you won't be able
to do anything after entering it.
Do you need other examples?

Kirill


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 16:05                             ` Kirill Korotaev
@ 2006-02-02 16:27                               ` Eric W. Biederman
  2006-02-02 21:32                                 ` Cedric Le Goater
  2006-02-03 10:52                                 ` Kirill Korotaev
  0 siblings, 2 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-02 16:27 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Kirill Korotaev <dev@sw.ru> writes:

>> Three areas.
>> 1) Checkpointing.
>> 2) Isolation for security purposes.
>>    There may be secrets that the sysadmin should not have access to.
> I hope you understand, that such things do not make anything
> secure. Administrator of the node will always have access to /proc/kcore,
> devices, KERNEL CODE(!) etc. No security from this point of view.

Only if they have CAP_SYS_RAWIO.  I admit it takes a lot more
to get there than just that.  But having a mechanism that has the
potential to be secured and is much simpler to understand
and to setup for minimal privileges than any of the other unix
addons I have seen is very interesting.


>> 3) Nesting of containers, (so they are general purpose and not special hacks).
> Why are you interested in nesting? Any applications for this?
> Until everything is virtualized in nesting way (including TCP/IP stack, routing
> etc.) I see no much use of it.

For everything except the PID namespace I am just interested in having multiple
separate namespaces.  For the PID namespace to keep the traditional unix
model you need a parent process so it is actually nesting.

I am interested because it is easy, and because if it is possible then
the range of applications you can apply containers to is much
larger.  At the far end of that spectrum is migrating a server running
on real hardware and bringing it up as a guest on a newer, much more
powerful machine, with the appearance that it had only been
unreachable for a few seconds.

>> The vserver way of solving some of these problems is to provide a way
>> to enter the guest.  I would rather have some explicit operation that puts
>> you into the guest context so there is a single point where we can tackle
>> the nested security issues, than to have hundreds of places we have to
>> look at individually.
> Huh, it sounds too easy. Just imagine that VPS owner has deleted ps, top, kill,
> bash and other tools. You won't be able to enter. 

Entering is different from execing a process on the inside.
Implementation wise it is changing the context pointer on your task.
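
To make the distinction concrete, here is a minimal sketch, assuming a
task simply carries a pointer to its context.  The struct names, the ctx
field, and enter_context() are hypothetical and stand in for whatever the
real patches use.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the names are assumptions for illustration. */
struct context {
	int id;
};

struct task {
	int pid;
	struct context *ctx;   /* the context (container) the task runs in */
};

/*
 * "Enter" a guest: repoint the task at the target context and return the
 * previous one so the caller can switch back.  No new process is exec'd,
 * so nothing inside the guest (not even a shell) has to exist.  A real
 * version would do its permission checks here, at this single point.
 */
static struct context *enter_context(struct task *tsk, struct context *target)
{
	struct context *old;

	assert(tsk != NULL && target != NULL);
	old = tsk->ctx;
	tsk->ctx = target;
	return old;
}
```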

> Another example when VPS owner
> is near its resource limits - you won't be able to do anything after VPS
> entering.

For debugging this is a good reason for being inside.  What if the
problem is that you are out of resources?  

I have no intention of requiring monitoring to work from the inside though.

> Do you need other examples?

No, I need to post patches.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 15:51                           ` Eric W. Biederman
  2006-02-02 16:05                             ` Kirill Korotaev
@ 2006-02-02 21:10                             ` Cedric Le Goater
  2006-02-02 21:24                               ` Eric W. Biederman
  1 sibling, 1 reply; 136+ messages in thread
From: Cedric Le Goater @ 2006-02-02 21:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill Korotaev, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List

Eric W. Biederman wrote:

> If you want the detailed information you can either chroot to an environment
> where the appropriate version of /proc is available.  Or you can modify your
> tools.

Did you modify /proc to be able to mount it multiple times on the same
system?

C.

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 21:10                             ` Cedric Le Goater
@ 2006-02-02 21:24                               ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-02 21:24 UTC (permalink / raw)
  To: Cedric Le Goater
  Cc: Kirill Korotaev, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List

Cedric Le Goater <clg@fr.ibm.com> writes:

> Eric W. Biederman wrote:
>
>> If you want the detailed information you can either chroot to an environment
>> where the appropriate version of /proc is available.  Or you can modify your
>> tools.
>
> Did you modify /proc to be able to mount it multiple times on the same
> system?

Yes.

Eric


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 16:27                               ` Eric W. Biederman
@ 2006-02-02 21:32                                 ` Cedric Le Goater
  2006-02-02 21:43                                   ` Hubertus Franke
  2006-02-02 21:46                                   ` Eric W. Biederman
  2006-02-03 10:52                                 ` Kirill Korotaev
  1 sibling, 2 replies; 136+ messages in thread
From: Cedric Le Goater @ 2006-02-02 21:32 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill Korotaev, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List

Eric W. Biederman wrote:

> For everything except the PID namespace I am just interested in having multiple
> separate namespaces.  For the PID namespace to keep the traditional unix
> model you need a parent process so it is actually nesting.
> 
> I am interested because, it is easy, because if it is possible than
> the range of applications you can apply a containers to is much
> larger.  At the far end of that spectrum is migrating a server running
> on real hardware and bringing it up as a guest on a newer much more
> powerful machine.  With the appearance that it had only been
> unreachable for a few seconds.

We gave a name to such containers. We call them 'application' containers,
to distinguish them from 'system' containers like vserver or openvz.

'application' containers are very useful in an HPC environment where you
can trigger checkpoint/restart through batch managers. But this model
really has some virtualization issues at the edges. Virtualizing the
parent process of such a container is tricky: it can be in multiple
containers at the same time, it can die (difficult to restart ...), it
could belong to another process group, etc. Plenty of annoying cases to
handle.

'system' containers, like vserver or openvz, are safer because they use a
private PID space.

Now, would it be possible to have an 'application' container using a
private PID space and being friendly to the usual unix process semantics?
We haven't found a solution yet ...

> Entering is different from execing a process on the inside.
> Implementation wise it is changing the context pointer on your task.

Yes, and you could restrict that privilege to some top-level container, like
container 0.

C.


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 21:32                                 ` Cedric Le Goater
@ 2006-02-02 21:43                                   ` Hubertus Franke
  2006-02-02 21:46                                   ` Eric W. Biederman
  1 sibling, 0 replies; 136+ messages in thread
From: Hubertus Franke @ 2006-02-02 21:43 UTC (permalink / raw)
  To: Cedric Le Goater
  Cc: Eric W. Biederman, Kirill Korotaev, Linus Torvalds, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List

On Thu, 2006-02-02 at 22:32 +0100, Cedric Le Goater wrote:
> Eric W. Biederman wrote:
> 
> > For everything except the PID namespace I am just interested in having multiple
> > separate namespaces.  For the PID namespace to keep the traditional unix
> > model you need a parent process so it is actually nesting.
> > 
> > I am interested because, it is easy, because if it is possible than
> > the range of applications you can apply a containers to is much
> > larger.  At the far end of that spectrum is migrating a server running
> > on real hardware and bringing it up as a guest on a newer much more
> > powerful machine.  With the appearance that it had only been
> > unreachable for a few seconds.
> 
> We gave a name to such containers. We call them 'application' containers
> just to make a difference with the 'system' containers, like vserver or openvz.
> 
> 'application' containers are very useful in an HPC environment where you
> can trig checkpoint/restart through batch managers. But, this model,
> really, has some virtualization issues on the edge. The virtualisation of
> the parent process of such a container is tricky, it can be in multiple
> containers at the same time, it can die (difficult to restart ...), it
> could belong to another process groups, etc. Plenty of annoying cases to
> handle.

I guess we all experienced that (see everybody's patches).

> 
> 'system' containers, like vserver or openvz, are safer because they use a
> private PID space.

Maybe we are talking about different things, but I thought our patch set
does provide private pid spaces (by putting the <pid,container> tuple into
the same pid). Yes, privacy is achieved only by guarding the edges, but
the next approach that was discussed on the mailing list was to indeed
make containers first-class objects, hang the pid management and pid
lookup off the container, and thus force the isolation.
Containers would then move up to be children of "init".
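
For readers who have not looked at the patches, here is a minimal sketch
of the tuple-in-one-pid idea and of guarding the edges.  The 22/10 bit
split, the helper names, and the use of container id 0 for the default
system container are assumptions made for illustration, not taken from
the patch set.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed split of a 32-bit pid: the widths are illustrative only. */
#define VPID_BITS 22u
#define VPID_MASK ((UINT32_C(1) << VPID_BITS) - 1u)
#define CID_MAX   ((UINT32_C(1) << (32u - VPID_BITS)) - 1u)

/* Fold <pid,container> into one kernel-wide pid value. */
static uint32_t pack_pid(uint32_t cid, uint32_t vpid)
{
	assert(cid <= CID_MAX && vpid <= VPID_MASK);
	return (cid << VPID_BITS) | vpid;
}

static uint32_t pid_to_cid(uint32_t kpid)  { return kpid >> VPID_BITS; }
static uint32_t pid_to_vpid(uint32_t kpid) { return kpid & VPID_MASK; }

/*
 * "Guarding the edges": a task in container `viewer` may see kpid only
 * if it is in the same container; container 0 is treated as the default
 * system container with a global view.
 */
static int pid_visible(uint32_t viewer, uint32_t kpid)
{
	return viewer == 0 || pid_to_cid(kpid) == viewer;
}
```

With this layout the check has to be repeated wherever a pid crosses the
user/kernel boundary, which is the weakness that hanging pid lookup off
the container object is meant to remove.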

>
> Now, would it be possible to have an 'application' container using a
> private PID space and being friendly to the usual unix process semantics ?
> We haven't found a solution yet ...

Could you spell out what specific UNIX semantics you are referring to?
That would help and we can discuss whether and how they can be
addressed.

> 
> > Entering is different from execing a process on the inside.
> > Implementation wise it is changing the context pointer on your task.
> 
> yes and you could restrict that privilege to some top level container, like
> the container 0.
> 

-- 
Hubertus Franke <frankeh@watson.ibm.com>


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 21:32                                 ` Cedric Le Goater
  2006-02-02 21:43                                   ` Hubertus Franke
@ 2006-02-02 21:46                                   ` Eric W. Biederman
  2006-02-03 10:07                                     ` Kirill Korotaev
  1 sibling, 1 reply; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-02 21:46 UTC (permalink / raw)
  To: Cedric Le Goater
  Cc: Kirill Korotaev, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List

Cedric Le Goater <clg@fr.ibm.com> writes:

> Now, would it be possible to have an 'application' container using a
> private PID space and being friendly to the usual unix process semantics ?
> We haven't found a solution yet ...

Well that is what I implemented.  So I am pretty certain it is solvable.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 21:46                                   ` Eric W. Biederman
@ 2006-02-03 10:07                                     ` Kirill Korotaev
  0 siblings, 0 replies; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-03 10:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Cedric Le Goater, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List

>>Now, would it be possible to have an 'application' container using a
>>private PID space and being friendly to the usual unix process semantics ?
>>We haven't found a solution yet ...
> 
> 
> Well that is what I implemented.  So I am pretty certain it is solvable.
Exactly. This is what our patch does as well. It is solvable, and has been
tested with LTP and numerous other tests, production systems, etc.

Kirill



^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 16:27                               ` Eric W. Biederman
  2006-02-02 21:32                                 ` Cedric Le Goater
@ 2006-02-03 10:52                                 ` Kirill Korotaev
  2006-02-03 11:09                                   ` Eric W. Biederman
  2006-02-03 15:45                                   ` Dave Hansen
  1 sibling, 2 replies; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-03 10:52 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

>>I hope you understand, that such things do not make anything
>>secure. Administrator of the node will always have access to /proc/kcore,
>>devices, KERNEL CODE(!) etc. No security from this point of view.
> Only if they have CAP_SYS_RAWIO.  I admit it takes a lot more
> to get there than just that.  But having a mechanism that has the
> potential to be secured and is much simpler to understand
> and to setup for minimal privileges than any of the other unix
> addons I have seen is very interesting.
OK. I suppose it can be done as an option. If required, access from the
host system can be allowed. If a "secure" environment is requested, it is
fully isolated.

>>>3) Nesting of containers, (so they are general purpose and not special hacks).
>>
>>Why are you interested in nesting? Any applications for this?
>>Until everything is virtualized in nesting way (including TCP/IP stack, routing
>>etc.) I see no much use of it.
> For everything except the PID namespace I am just interested in having multiple
> separate namespaces.  For the PID namespace to keep the traditional unix
> model you need a parent process so it is actually nesting.
Yes, but nesting can be one level deep as in OpenVZ, where a VPS is a
nested namespace inside the host system, or it can be a fully isolated,
separate traditional namespace.

By real nesting I mean hierarchical containers, where containers nested
inside other containers are allowed. This is hard to implement. For
example, for real containers you will need isolated TCP/IP stacks with
complex routing rules, etc.

> I am interested because, it is easy, because if it is possible than
> the range of applications you can apply a containers to is much
> larger.  At the far end of that spectrum is migrating a server running
> on real hardware and bringing it up as a guest on a newer much more
> powerful machine.  With the appearance that it had only been
> unreachable for a few seconds.
You can use fully isolated containers like OpenVZ VPSs for this. They
are naturally suited to it, because they provide not only PID isolation
but also isolation of IPC, sockets, etc.

How can you migrate an application which consists of two processes doing
IPC via signals? They are not tied together inside the kernel in any way,
and there is no way to automatically detect that both should be migrated
together. VPSs are what provide such boundaries of what should be
considered as a whole.

>>>The vserver way of solving some of these problems is to provide a way
>>>to enter the guest.  I would rather have some explicit operation that puts
>>>you into the guest context so there is a single point where we can tackle
>>>the nested security issues, than to have hundreds of places we have to
>>>look at individually.
>>
>>Huh, it sounds too easy. Just imagine that VPS owner has deleted ps, top, kill,
>>bash and other tools. You won't be able to enter. 

> Entering is different from execing a process on the inside.
> Implementation wise it is changing the context pointer on your task.
If I understand you correctly it is a fully insecure way of doing things.
After changing context without applying all the restrictions which
should be implied by the VPS, your process will be ptrace'able and so on.

>>Another example when VPS owner
>>is near its resource limits - you won't be able to do anything after VPS
>>entering.
> For debugging this is a good reason for being inside.  What if the
> problem is that you are out of resources?  
Debugging - yes, in production - no.
That is why OpenVZ allows the host system to access VPS resources - for
debugging in production.

> I have no intention of requiring monitoring to work from the inside though.
>>Do you need other examples?
> No I need to post patches.
Thanks a lot for valuable discussion and your time!

Kirill


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-03 10:52                                 ` Kirill Korotaev
@ 2006-02-03 11:09                                   ` Eric W. Biederman
  2006-02-03 15:45                                   ` Dave Hansen
  1 sibling, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-03 11:09 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Linus Torvalds, Hubertus Franke, Dave Hansen, Greg KH, Alan Cox,
	Serge E. Hallyn, Arjan van de Ven, Linux Kernel Mailing List,
	Cedric Le Goater

Kirill Korotaev <dev@sw.ru> writes:

>>>I hope you understand, that such things do not make anything
>>>secure. Administrator of the node will always have access to /proc/kcore,
>>>devices, KERNEL CODE(!) etc. No security from this point of view.
>> Only if they have CAP_SYS_RAWIO.  I admit it takes a lot more
>> to get there than just that.  But having a mechanism that has the
>> potential to be secured and is much simpler to understand
>> and to setup for minimal privileges than any of the other unix
>> addons I have seen is very interesting.
> ok. I suppose it can be done as an option. If required, access from host system
> can be allowed. If "secure" environment is requested - fully isolated.
>
>>>>3) Nesting of containers, (so they are general purpose and not special
>>>>hacks).
>>>
>>>Why are you interested in nesting? Any applications for this?
>>>Until everything is virtualized in nesting way (including TCP/IP stack,
>>>routing etc.) I see no much use of it.
>> For everything except the PID namespace I am just interested in having
>> multiple separate namespaces.  For the PID namespace to keep the traditional
>> unix model you need a parent process so it is actually nesting.
> Yes, but nesting can be one level as in OpenVZ, when VPS is a nested namespace
> inside host system or it can be a fully isolated separate traditional namespace.
>
> By real nesting I mean hierarchical containers, when containers inside multiple
> containers are allowed. This is hard to implement. For example, for real
> containers you will need to have isolated TCP/IP stacks and with complex rules
> of routing etc.

TCP/IP is a pain because it has so many global static variables, but otherwise
it is easier than PIDs.  You just need what looks like 2 instances
of the network stack, and, since you usually don't have enough real
hardware, a 2-headed network device that acts like 2 NICs plugged into
a crossover cable.
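
A toy model of such a 2-headed device, assuming each end has a
single-frame receive buffer.  All names here (pair_end, pair_link,
pair_xmit, FRAME_MAX) are hypothetical; the only point is that
transmitting on one end is the same thing as receiving on the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_MAX 64   /* illustrative frame size limit */

/* One end of the pair; each network stack instance would own one end. */
struct pair_end {
	struct pair_end *peer;
	unsigned char rx_buf[FRAME_MAX];
	size_t rx_len;   /* 0 means the receive slot is empty */
};

/* "Plug in the crossover cable": make the two ends each other's peer. */
static void pair_link(struct pair_end *a, struct pair_end *b)
{
	a->peer = b;
	b->peer = a;
	a->rx_len = 0;
	b->rx_len = 0;
}

/*
 * Transmit on one end: the frame shows up in the peer's receive slot.
 * Returns 0 on success, -1 if the frame is empty, too large, or the
 * peer has not yet consumed the previous frame.
 */
static int pair_xmit(struct pair_end *end, const void *frame, size_t len)
{
	struct pair_end *peer = end->peer;

	assert(peer != NULL);
	if (len == 0 || len > FRAME_MAX || peer->rx_len != 0)
		return -1;
	memcpy(peer->rx_buf, frame, len);
	peer->rx_len = len;
	return 0;
}
```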

>> I am interested because, it is easy, because if it is possible than
>> the range of applications you can apply a containers to is much
>> larger.  At the far end of that spectrum is migrating a server running
>> on real hardware and bringing it up as a guest on a newer much more
>> powerful machine.  With the appearance that it had only been
>> unreachable for a few seconds.

> You can use fully isolated containers like OpenVZ VPSs for this. They are
> naturally suitable for this, because provide you not PIDs isolation only, but
> also IPC, sockets, etc.

Exactly.

> How can you migrate application which consists of two processes doing IPC via
> signals? They are not tied inside kernel anyhow and there is no way to
> automatically detect that both should be migrated together.
> VPSs are what provide such kind of boundaries of what should be considered as
> a whole.

I always look at migration in terms of a container/VPS.

However the entire system can be considered one such container.  So on the
extreme side you can migrate everything to another machine.  It's not
a hard requirement but it would be nice if the mechanism wasn't so special
that it prevented that.


>>>
>>>Huh, it sounds too easy. Just imagine that VPS owner has deleted ps, top,
>>>kill, bash and other tools. You won't be able to enter.
>
>> Entering is different from execing a process on the inside.
>> Implementation wise it is changing the context pointer on your task.
> If I understand you correctly it is fully insecure way of doing things. After
> changing context without applying all the restrictions which should be implied
> by VPS your process will be ptrace'able and so on.

Not exactly insecure.   But something you need to be careful with.

It's an idea I don't like personally.  But I like it more than ad hoc
mechanisms for modifying what a guest gets to look at.


>>>Another example when VPS owner
>>>is near its resource limits - you won't be able to do anything after VPS
>>>entering.
>> For debugging this is a good reason for being inside.  What if the
>> problem is that you are out of resources?
> Debugging - yes, in production - no.

I was talking about sysadmin style of debugging.

> That is why OpenVZ allows host system to access VPS resources - for debugging in
> production.

This is something I will freely admit is up in the air: how this
should be accomplished.  But I don't want an assumption that the host
system will always be able to access guest resources.

> Thanks a lot for valuable discussion and your time!

Welcome.

Eric

^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-03 10:52                                 ` Kirill Korotaev
  2006-02-03 11:09                                   ` Eric W. Biederman
@ 2006-02-03 15:45                                   ` Dave Hansen
  2006-02-03 16:35                                     ` Kirill Korotaev
  1 sibling, 1 reply; 136+ messages in thread
From: Dave Hansen @ 2006-02-03 15:45 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Eric W. Biederman, Linus Torvalds, Hubertus Franke, Greg KH,
	Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

On Fri, 2006-02-03 at 13:52 +0300, Kirill Korotaev wrote:
> How can you migrate application which consists of two processes doing 
> IPC via signals? They are not tied inside kernel anyhow and there is
> no way to automatically detect that both should be migrated together.
> VPSs are what provide such kind of boundaries of what should be 
> considered as a whole.

Could you explain a little bit _how_ VPSs provide this?

-- Dave


^ permalink raw reply	[flat|nested] 136+ messages in thread

* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-03 15:45                                   ` Dave Hansen
@ 2006-02-03 16:35                                     ` Kirill Korotaev
  0 siblings, 0 replies; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-03 16:35 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Eric W. Biederman, Linus Torvalds, Hubertus Franke, Greg KH,
	Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

>>How can you migrate an application which consists of two processes doing
>>IPC via signals? They are not tied together inside the kernel in any way,
>>and there is no way to automatically detect that both should be migrated
>>together. VPSs are what provide you with such boundaries of what should be
>>considered as a whole.
> 
> 
> Could you explain a little bit _how_ VPSs provide this?
OpenVZ virtualizes kernel resources such as IPC, making them per-VPS.
Processes in a VPS deal with such virtualized resources, so they
effectively should be migrated together. That's it: a container
with its resources should be considered as a whole.

Signals are not virtualized in this manner since they are rather a
per-task resource, but OpenVZ introduces strict boundaries in the kernel
which make sure that no third task from another container (VPS) will
be participating in the communication.

Kirill



* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-02 15:13                       ` Eric W. Biederman
  2006-02-02 15:26                         ` Kirill Korotaev
@ 2006-02-06 20:15                         ` Pavel Machek
  2006-02-06 20:34                           ` Eric W. Biederman
  2006-02-06 20:36                           ` Kirill Korotaev
  1 sibling, 2 replies; 136+ messages in thread
From: Pavel Machek @ 2006-02-06 20:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Kirill Korotaev, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

Hi!

> There are two issues here.
> 1) Monitoring.  (ps, top etc)
> 2) Control (kill).
> 
> For monitoring you might need to patch ps/top a little but it is doable without
> 2 pids.
> 
> For kill it is extremely rude to kill processes inside of a nested pid space.
> There are other solutions to the problem.

Can you elaborate? If I have 10 containers with 1000 processes each,
it would be nice to be able to run top then kill 40 top cpu hogs....
-- 
Thanks, Sharp!


* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-06 20:15                         ` Pavel Machek
@ 2006-02-06 20:34                           ` Eric W. Biederman
  2006-02-06 20:36                           ` Kirill Korotaev
  1 sibling, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-06 20:34 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Kirill Korotaev, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

Pavel Machek <pavel@ucw.cz> writes:

> Hi!
>
>> There are two issues here.
>> 1) Monitoring.  (ps, top etc)
>> 2) Control (kill).
>> 
>> For monitoring you might need to patch ps/top a little but it is doable without
>> 2 pids.
>> 
>> For kill it is extremely rude to kill processes inside of a nested pid space.
>> There are other solutions to the problem.
>
> Can you elaborate? If I have 10 containers with 1000 processes each,
> it would be nice to be able to run top then kill 40 top cpu hogs....

So I just posted my patches to lkml if you want to see the details.
But the way I have implemented it, each container has a pid in its parent's
pid namespace.

To top, that pid looks essentially like a thread group leader, and top will
tell you which container contains the cpu hogs.

Currently from the outside your choice is to kill or not kill the entire
container.  Which lets you kill the cpu hogs.  The control is not broken
down so finely that it is easy to do something the sysadmin of the container
should be doing.

Eric


* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-06 20:15                         ` Pavel Machek
  2006-02-06 20:34                           ` Eric W. Biederman
@ 2006-02-06 20:36                           ` Kirill Korotaev
  2006-02-06 20:40                             ` Eric W. Biederman
  1 sibling, 1 reply; 136+ messages in thread
From: Kirill Korotaev @ 2006-02-06 20:36 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Eric W. Biederman, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

>>There are two issues here.
>>1) Monitoring.  (ps, top etc)
>>2) Control (kill).
>>
>>For monitoring you might need to patch ps/top a little but it is doable without
>>2 pids.
>>
>>For kill it is extremely rude to kill processes inside of a nested pid space.
>>There are other solutions to the problem.

> Can you elaborate? If I have 10 containers with 1000 processes each,
> it would be nice to be able to run top then kill 40 top cpu hogs....
This is exactly the reason why we allow the host system to see all the
containers/VPSs/processes.

Otherwise monitoring tools would have to be fixed for it, which doesn't look
good, and top/ps/kill are not the only tools in the world.
Without such functionality you can't understand whether your machine is
underloaded or overloaded.

Kirill



* Re: RFC [patch 13/34] PID Virtualization Define new task_pid api
  2006-02-06 20:36                           ` Kirill Korotaev
@ 2006-02-06 20:40                             ` Eric W. Biederman
  0 siblings, 0 replies; 136+ messages in thread
From: Eric W. Biederman @ 2006-02-06 20:40 UTC (permalink / raw)
  To: Kirill Korotaev
  Cc: Pavel Machek, Linus Torvalds, Hubertus Franke, Dave Hansen,
	Greg KH, Alan Cox, Serge E. Hallyn, Arjan van de Ven,
	Linux Kernel Mailing List, Cedric Le Goater

Kirill Korotaev <dev@sw.ru> writes:

> This is exactly the reason why we allow host system to see all the
> containers/VPSs/processes.

Which makes for a really hairy and noticeably different logical implementation.
At least that was my impression when glancing at your patches.  I haven't
had a chance to look at them in depth yet.

> Otherwise monitoring tools should be fixed for it, which doesn't look good and
> top/ps/kill are not the only tools in the world.
> Without such functionality you can't understand whether your machine is
> underloaded or overloaded.

Look at my code please.  I think it is a place in the problem domain
you haven't considered.

Except for detailed information everything is there for existing tools.

Eric


Thread overview: 136+ messages
2006-01-17 14:32 RFC [patch 00/34] PID Virtualization Overview Serge Hallyn
2006-01-17 14:32 ` RFC [patch 01/34] PID Virtualization Change pid accesses: drivers Serge Hallyn
2006-01-17 14:33 ` RFC [patch 02/34] PID Virtualization Change pid accesses: most archs Serge Hallyn
2006-01-17 14:33 ` RFC [patch 03/34] PID Virtualization Change pid accesses: filesystems Serge Hallyn
2006-01-17 14:33 ` RFC [patch 04/34] PID Virtualization Change pid accesses: include/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 05/34] PID Virtualization Change pid accesses: ipc Serge Hallyn
2006-01-17 14:33 ` RFC [patch 06/34] PID Virtualization Change pid accesses: kernel/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 07/34] PID Virtualization Change pid accesses: lib/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 08/34] PID Virtualization Change pid accesses: mm/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 09/34] PID Virtualization Change pid accesses: net/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 10/34] PID Virtualization Change pid accesses: security/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 11/34] PID Virtualization Change pid accesses: sound/ Serge Hallyn
2006-01-17 14:33 ` RFC [patch 12/34] PID Virtualization Change pid accesses: ia64 and mips Serge Hallyn
2006-01-17 14:33 ` RFC [patch 13/34] PID Virtualization Define new task_pid api Serge Hallyn
2006-01-17 15:32   ` Arjan van de Ven
2006-01-17 15:56     ` Serge E. Hallyn
2006-01-17 16:02       ` Arjan van de Ven
2006-01-17 16:03       ` Alan Cox
2006-01-17 17:16         ` Kyle Moffett
2006-01-17 17:25         ` Dave Hansen
2006-01-18  4:54           ` Greg KH
2006-01-18  4:55           ` Greg KH
2006-01-18 16:23             ` Dave Hansen
2006-01-20 17:00               ` Eric W. Biederman
2006-01-20 20:18                 ` Hubertus Franke
2006-01-21 10:25                   ` Eric W. Biederman
2006-01-23 18:38                     ` Hubertus Franke
2006-01-23 18:48                       ` Eric W. Biederman
2006-01-21 14:42                   ` Eric W. Biederman
2006-01-22  6:43                     ` Kyle Moffett
2006-01-22 15:48                       ` Eric W. Biederman
2006-01-22 15:55                         ` Arjan van de Ven
2006-01-22 16:24                           ` Eric W. Biederman
2006-01-26 20:01                             ` Herbert Poetzl
2006-01-27  9:04                               ` Eric W. Biederman
2006-01-27 12:27                                 ` Kyle Moffett
2006-01-27 13:15                                   ` Eric W. Biederman
2006-01-23 18:50                     ` Hubertus Franke
2006-01-23 19:28                       ` Eric W. Biederman
2006-01-23 21:11                         ` Alan Cox
2006-01-23 21:30                           ` Eric W. Biederman
2006-01-23 22:15                             ` Hubertus Franke
2006-01-24  6:56                               ` Arjan van de Ven
2006-01-24 19:34                               ` Eric W. Biederman
2006-01-24 21:09                                 ` Hubertus Franke
2006-01-24  0:22                             ` Alan Cox
2006-01-24 19:26                               ` Eric W. Biederman
2006-01-24 21:11                                 ` Alan Cox
2006-01-24 21:15                                   ` Arjan van de Ven
2006-01-25  9:58                                     ` Eric W. Biederman
2006-01-25 15:10                                       ` Trond Myklebust
2006-01-25 18:01                                         ` Eric W. Biederman
2006-01-25 19:30                                           ` Trond Myklebust
2006-01-25 21:59                                             ` Eric W. Biederman
2006-01-25  9:13                                   ` Eric W. Biederman
2006-01-25  9:51                                   ` Eric W. Biederman
2006-01-26 20:23                                     ` Herbert Poetzl
2006-01-27  8:28                                       ` Eric W. Biederman
     [not found]                       ` <m1k6cqlmfe.fsf_-_@ebiederm.dsl.xmission.com>
2006-01-23 21:57                         ` RFC: [PATCH] pids as weak references Dave Hansen
2006-01-31 21:02                   ` RFC [patch 13/34] PID Virtualization Define new task_pid api Linus Torvalds
2006-02-01  0:01                     ` Hubertus Franke
2006-02-01  4:18                     ` Eric W. Biederman
2006-02-01  4:39                       ` Linus Torvalds
2006-02-01  7:14                         ` Eric W. Biederman
2006-02-01 16:41                         ` Dave Hansen
2006-02-02  5:14                         ` Herbert Poetzl
2006-02-01 16:29                       ` Greg
2006-02-01 16:44                         ` Eric W. Biederman
2006-02-02 13:50                           ` Greg
2006-02-02 14:09                             ` Eric W. Biederman
2006-02-02 14:48                     ` Kirill Korotaev
2006-02-02 15:13                       ` Eric W. Biederman
2006-02-02 15:26                         ` Kirill Korotaev
2006-02-02 15:51                           ` Eric W. Biederman
2006-02-02 16:05                             ` Kirill Korotaev
2006-02-02 16:27                               ` Eric W. Biederman
2006-02-02 21:32                                 ` Cedric Le Goater
2006-02-02 21:43                                   ` Hubertus Franke
2006-02-02 21:46                                   ` Eric W. Biederman
2006-02-03 10:07                                     ` Kirill Korotaev
2006-02-03 10:52                                 ` Kirill Korotaev
2006-02-03 11:09                                   ` Eric W. Biederman
2006-02-03 15:45                                   ` Dave Hansen
2006-02-03 16:35                                     ` Kirill Korotaev
2006-02-02 21:10                             ` Cedric Le Goater
2006-02-02 21:24                               ` Eric W. Biederman
2006-02-06 20:15                         ` Pavel Machek
2006-02-06 20:34                           ` Eric W. Biederman
2006-02-06 20:36                           ` Kirill Korotaev
2006-02-06 20:40                             ` Eric W. Biederman
2006-02-02 14:49           ` Kirill Korotaev
2006-01-17 14:33 ` RFC [patch 14/34] PID Virtualization const parameter for process group Serge Hallyn
2006-01-17 14:33 ` RFC [patch 15/34] PID Virtualization task virtual pid access functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 16/34] PID Virtualization return virtual pids where required Serge Hallyn
2006-01-17 14:33 ` RFC [patch 17/34] PID Virtualization return virtual process group ids Serge Hallyn
2006-01-17 14:33 ` RFC [patch 18/34] PID Virtualization code enhancements for virtual pids in /proc Serge Hallyn
2006-01-17 14:33 ` RFC [patch 19/34] PID Virtualization Define pid_to_vpid functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 20/34] PID Virtualization Use pid_to_vpid conversion functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 21/34] PID Virtualization file owner pid virtualization Serge Hallyn
2006-01-17 14:33 ` RFC [patch 22/34] PID Virtualization define vpid_to_pid functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 23/34] PID Virtualization Use " Serge Hallyn
2006-01-17 14:33 ` RFC [patch 24/34] PID Virtualization use vpgid_to_pgid function Serge Hallyn
2006-01-17 14:33 ` RFC [patch 25/34] PID Virtualization Context for pid_to_vpid conversition functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 26/34] PID Virtualization Documentation Serge Hallyn
2006-01-17 14:33 ` RFC [patch 27/34] PID Virtualization pidspace Serge Hallyn
2006-01-17 14:33 ` RFC [patch 28/34] PID Virtualization container object and functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 29/34] PID Virtualization container attach/detach calls Serge Hallyn
2006-01-17 14:33 ` RFC [patch 30/34] PID Virtualization /proc/container filesystem Serge Hallyn
2006-01-17 14:33 ` RFC [patch 31/34] PID Virtualization Implementation of low level virtualization functions Serge Hallyn
2006-01-17 14:33 ` RFC [patch 32/34] PID Virtualization Handle special case vpid return cases Serge Hallyn
2006-01-17 14:33 ` RFC [patch 33/34] PID Virtualization per container /proc filesystem Serge Hallyn
2006-01-17 14:33 ` RFC [patch 34/34] PID Virtualization pidspace parent : signal behavior Serge Hallyn
2006-01-17 16:19 ` RFC [patch 00/34] PID Virtualization Overview Suleiman Souhlal
2006-01-17 17:08   ` Dave Hansen
2006-01-17 18:09     ` Suleiman Souhlal
2006-01-17 18:12       ` Dave Hansen
2006-01-17 18:29         ` Alan Cox
2006-01-18 19:01           ` Dave Hansen
2006-01-18 19:28             ` Arjan van de Ven
2006-01-18 19:38               ` Dave Hansen
2006-01-18 19:50                 ` Arjan van de Ven
2006-01-18 22:54                 ` Alan Cox
2006-01-19  7:15                   ` Arjan van de Ven
2006-01-20  5:11                     ` Eric W. Biederman
2006-01-20 20:23                       ` Serge E. Hallyn
2006-01-20 20:33                         ` Hubertus Franke
2006-01-21 10:34                           ` Eric W. Biederman
2006-01-20 19:53                   ` RFC: Multiple instances of kernel namespaces Eric W. Biederman
2006-01-20 20:13                     ` Serge E. Hallyn
2006-01-20 20:22                       ` Hubertus Franke
     [not found]                         ` <20060120203555.GC13265@sergelap.austin.ibm.com>
2006-01-20 21:47                           ` Hubertus Franke
2006-01-21 10:04                       ` Eric W. Biederman
2006-01-26 19:47                         ` Herbert Poetzl
2006-01-26 20:13                           ` Eric W. Biederman
2006-01-26 20:27                             ` Herbert Poetzl
2006-01-21 10:31             ` RFC [patch 00/34] PID Virtualization Overview Pavel Machek
