linux-kernel.vger.kernel.org archive mirror
* [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers
@ 2005-05-05 18:07 gh
  2005-05-05 18:07 ` [patch 01/21] CKRM: Core CKRM Event Callbacks gh
                   ` (20 more replies)
  0 siblings, 21 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--

Here are the core patches for CKRM updated to 2.6.12-rc3, along with
the most basic classification engine, a slightly more advanced derivative,
and several bug fixes to the core code.

All of these changes have been tested on IA32 and PPC64, with CONFIG_CKRM
on and off, including both basic functionality tests and a variety of
stress/performance tests.  There are still a few minor code cleanups that
are in progress, but nothing functional outstanding in this set.

Upcoming patches (soon to be included) are the memory controller (parts of
which are currently being discussed on linux-mm), the listen accept
queue, and an I/O controller (or maybe two).

Here is the current series file:

01-diff_ckrm_events
02-diff_delay_acct
03-diff_ckrm_core
04-diff_rcfs
05-diff_taskclass
06-diff_sockclass
07-diff_numtasks
10-diff_docs
03a-missing_unlock
06a-ckrm_net_cb
06b-ckrm_sockc
07a-numtasks_config
07c-numtasks_cleanup
07c2-numtasks-undo-delete
09-01-rbce_fs
09-02-rbce_fs-main
09-03-rbce_main-opt
09-04-rbce_opt-core
09-05-rbce_core-crbce
ckrm-printf-cleanup
compiler-warning-fix

--
gerrit



* [patch 01/21] CKRM: Core CKRM Event Callbacks
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 02/21] CKRM: Processor Delay Accounting gh
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=01-diff_ckrm_events


Core CKRM Event Callbacks.

On exec, fork, exit, and real/effective uid/gid changes, use CKRM to
associate tasks with the appropriate class.

Addressed all review comments except:
        Use of __bitwise and sparse in enums
        Use of kernel list type

Signed-off-by:  Shailabh Nagar <nagar@us.ibm.com>
Signed-off-by:  Hubertus Franke <frankeh@us.ibm.com>
Signed-off-by:  Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by:  Gerrit Huizenga <gh@us.ibm.com>


 fs/exec.c                   |    2 
 include/linux/ckrm_events.h |  192 ++++++++++++++++++++++++++++++++++++++++++++
 init/Kconfig                |   16 +++
 kernel/Makefile             |    1 
 kernel/ckrm/Makefile        |    5 +
 kernel/ckrm/ckrm_events.c   |   86 +++++++++++++++++++
 kernel/exit.c               |    3 
 kernel/fork.c               |    4 
 kernel/sys.c                |   10 ++
 9 files changed, 319 insertions(+)

Index: linux-2.6.12-rc3-ckrm5/fs/exec.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/exec.c	2005-05-05 09:32:55.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/exec.c	2005-05-05 09:34:55.000000000 -0700
@@ -48,6 +48,7 @@
 #include <linux/syscalls.h>
 #include <linux/rmap.h>
 #include <linux/acct.h>
+#include <linux/ckrm_events.h>
 
 #include <asm/uaccess.h>
 #include <asm/mmu_context.h>
@@ -1087,6 +1088,7 @@ int search_binary_handler(struct linux_b
 					fput(bprm->file);
 				bprm->file = NULL;
 				current->did_exec = 1;
+				ckrm_cb_exec(bprm->filename);
 				return retval;
 			}
 			read_lock(&binfmt_lock);
Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_events.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_events.h	2005-05-05 09:34:55.000000000 -0700
@@ -0,0 +1,192 @@
+/*
+ * ckrm_events.h - Class-based Kernel Resource Management (CKRM)
+ *                 event handling
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003,2004
+ *           (C) Shailabh Nagar,  IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *
+ *
+ * Provides a base header file including macros and basic data structures.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#ifndef _LINUX_CKRM_EVENTS_H
+#define _LINUX_CKRM_EVENTS_H
+
+#ifdef CONFIG_CKRM
+
+/*
+ * Data structure and function to get the list of registered
+ * resource controllers.
+ */
+
+/*
+ * CKRM defines a set of events at particular points in the kernel
+ * at which callbacks registered by various class types are called
+ */
+
+enum ckrm_event {
+	/*
+	 * we distinguish these events types:
+	 *
+	 * (a) CKRM_LATCHABLE_EVENTS
+	 *      events can be latched for event callbacks by classtypes
+	 *
+	 * (b) CKRM_NONLATCHABLE_EVENTS
+	 *     events cannot be latched but can be used to call classification
+	 *
+	 * (c) events that are used for notification purposes
+	 *     range: [ CKRM_EVENT_CANNOT_CLASSIFY .. )
+	 */
+
+	/* events (a) */
+
+	CKRM_LATCHABLE_EVENTS,
+
+	CKRM_EVENT_NEWTASK = CKRM_LATCHABLE_EVENTS,
+	CKRM_EVENT_FORK,
+	CKRM_EVENT_EXIT,
+	CKRM_EVENT_EXEC,
+	CKRM_EVENT_UID,
+	CKRM_EVENT_GID,
+	CKRM_EVENT_LOGIN,
+	CKRM_EVENT_USERADD,
+	CKRM_EVENT_USERDEL,
+	CKRM_EVENT_LISTEN_START,
+	CKRM_EVENT_LISTEN_STOP,
+	CKRM_EVENT_APPTAG,
+
+	/* events (b) */
+
+	CKRM_NONLATCHABLE_EVENTS,
+
+	CKRM_EVENT_RECLASSIFY = CKRM_NONLATCHABLE_EVENTS,
+
+	/* events (c) */
+
+	CKRM_NOTCLASSIFY_EVENTS,
+
+	CKRM_EVENT_MANUAL = CKRM_NOTCLASSIFY_EVENTS,
+
+	CKRM_NUM_EVENTS
+};
+
+/*
+ * CKRM event callback specification for the classtypes or resource controllers
+ *   typically an array is specified using CKRM_EVENT_SPEC terminated with
+ *   CKRM_EVENT_SPEC_LAST and then that array is registered using
+ *   ckrm_register_event_set.
+ *   Individual registration of event_cb is also possible
+ */
+
+struct ckrm_hook_cb {
+	void (*fct)(void *arg);
+	struct ckrm_hook_cb *next;
+};
+
+struct ckrm_event_spec {
+	enum ckrm_event ev;
+	struct ckrm_hook_cb cb;
+};
+
+int ckrm_register_event_set(struct ckrm_event_spec especs[]);
+int ckrm_unregister_event_set(struct ckrm_event_spec especs[]);
+int ckrm_register_event_cb(enum ckrm_event ev, struct ckrm_hook_cb *cb);
+int ckrm_unregister_event_cb(enum ckrm_event ev, struct ckrm_hook_cb *cb);
+
+extern void ckrm_invoke_event_cb_chain(enum ckrm_event ev, void *arg);
+
+/* forward declarations for function arguments */
+struct task_struct;
+struct sock;
+struct user_struct;
+
+static inline void ckrm_cb_fork(struct task_struct *p)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_FORK, p);
+}
+
+static inline void ckrm_cb_newtask(struct task_struct *p)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_NEWTASK, p);
+}
+
+static inline void ckrm_cb_exit(struct task_struct *p)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_EXIT, p);
+}
+
+static inline void ckrm_cb_exec(char *c)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_EXEC, c);
+}
+
+static inline void ckrm_cb_uid(void)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_UID, NULL);
+}
+
+static inline void ckrm_cb_gid(void)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_GID, NULL);
+}
+
+static inline void ckrm_cb_apptag(void)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_APPTAG, NULL);
+}
+
+static inline void ckrm_cb_login(void)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_LOGIN, NULL);
+}
+
+static inline void ckrm_cb_useradd(struct user_struct *u)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_USERADD, u);
+}
+
+static inline void ckrm_cb_userdel(struct user_struct *u)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_USERDEL, u);
+}
+
+static inline void ckrm_cb_listen_start(struct sock *s)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_LISTEN_START, s);
+}
+
+static inline void ckrm_cb_listen_stop(struct sock *s)
+{
+         ckrm_invoke_event_cb_chain(CKRM_EVENT_LISTEN_STOP, s);
+}
+
+#else /* !CONFIG_CKRM */
+
+static inline void ckrm_cb_fork(struct task_struct *p) { }
+static inline void ckrm_cb_newtask(struct task_struct *p) { }
+static inline void ckrm_cb_exit(struct task_struct *p) { }
+static inline void ckrm_cb_exec(const char *c) { }
+static inline void ckrm_cb_uid(void) { }
+static inline void ckrm_cb_gid(void) { }
+static inline void ckrm_cb_apptag(void) { }
+static inline void ckrm_cb_login(void) { }
+static inline void ckrm_cb_useradd(struct user_struct *u) { }
+static inline void ckrm_cb_userdel(struct user_struct *u) { }
+static inline void ckrm_cb_listen_start(struct sock *s) { }
+static inline void ckrm_cb_listen_stop(struct sock *s) { }
+
+#endif /* CONFIG_CKRM */
+
+#endif /* _LINUX_CKRM_EVENTS_H */
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:34:55.000000000 -0700
@@ -146,6 +146,22 @@ config BSD_PROCESS_ACCT_V3
 	  for processing it. A preliminary version of these tools is available
 	  at <http://www.physik3.uni-rostock.de/tim/kernel/utils/acct/>.
 
+menu "Class Based Kernel Resource Management"
+
+config CKRM
+	bool "Class Based Kernel Resource Management Core"
+	depends on EXPERIMENTAL
+	help
+	  Class-based Kernel Resource Management is a framework for controlling
+	  and monitoring resource allocation of user-defined groups of tasks or
+	  incoming socket connections. For more information, please visit
+	  http://ckrm.sf.net.
+
+	  If you say Y here, enable the Resource Class File System and at least
+	  one of the resource controllers below. Say N if you are unsure.
+
+endmenu
+
 config SYSCTL
 	bool "Sysctl support"
 	---help---
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_events.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_events.c	2005-05-05 09:34:55.000000000 -0700
@@ -0,0 +1,86 @@
+/* ckrm_events.c - Class-based Kernel Resource Management (CKRM)
+ *               - event handling routines
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003, 2004
+ *           (C) Chandra Seetharaman,  IBM Corp. 2003
+ *
+ *
+ * Provides API for event registration and handling for different
+ * classtypes.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/config.h>
+#include <linux/stddef.h>
+#include <linux/ckrm_events.h>
+
+/*******************************************************************
+ *   Event callback invocation
+ *******************************************************************/
+
+struct ckrm_hook_cb *ckrm_event_callbacks[CKRM_NONLATCHABLE_EVENTS];
+
+/* Registration / Deregistration / Invocation functions */
+
+int ckrm_register_event_cb(enum ckrm_event ev, struct ckrm_hook_cb *cb)
+{
+	struct ckrm_hook_cb **cbptr;
+
+	if ((ev < CKRM_LATCHABLE_EVENTS) || (ev >= CKRM_NONLATCHABLE_EVENTS))
+		return 1;
+	cbptr = &ckrm_event_callbacks[ev];
+	while (*cbptr != NULL)
+		cbptr = &((*cbptr)->next);
+	*cbptr = cb;
+	return 0;
+}
+
+int ckrm_unregister_event_cb(enum ckrm_event ev, struct ckrm_hook_cb *cb)
+{
+	struct ckrm_hook_cb **cbptr;
+
+	if ((ev < CKRM_LATCHABLE_EVENTS) || (ev >= CKRM_NONLATCHABLE_EVENTS))
+		return -1;
+	cbptr = &ckrm_event_callbacks[ev];
+	while ((*cbptr != NULL) && (*cbptr != cb))
+		cbptr = &((*cbptr)->next);
+	if (*cbptr)
+		(*cbptr)->next = cb->next;
+	return (*cbptr == NULL);
+}
+
+int ckrm_register_event_set(struct ckrm_event_spec especs[])
+{
+	struct ckrm_event_spec *espec = especs;
+
+	for (espec = especs; espec->ev != -1; espec++)
+		ckrm_register_event_cb(espec->ev, &espec->cb);
+	return 0;
+}
+
+int ckrm_unregister_event_set(struct ckrm_event_spec especs[])
+{
+	struct ckrm_event_spec *espec = especs;
+
+	for (espec = especs; espec->ev != -1; espec++)
+		ckrm_unregister_event_cb(espec->ev, &espec->cb);
+	return 0;
+}
+
+void ckrm_invoke_event_cb_chain(enum ckrm_event ev, void *arg)
+{
+	struct ckrm_hook_cb *cb, *anchor;
+
+	if ((anchor = ckrm_event_callbacks[ev]) != NULL) {
+		for (cb = anchor; cb; cb = cb->next)
+			(*cb->fct) (arg);
+	}
+}
+
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile	2005-05-05 09:34:55.000000000 -0700
@@ -0,0 +1,5 @@
+#
+# Makefile for CKRM
+#
+
+obj-y := ckrm_events.o
Index: linux-2.6.12-rc3-ckrm5/kernel/exit.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/exit.c	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/exit.c	2005-05-05 09:34:55.000000000 -0700
@@ -27,6 +27,7 @@
 #include <linux/mempolicy.h>
 #include <linux/cpuset.h>
 #include <linux/syscalls.h>
+#include <linux/ckrm_events.h>
 
 #include <asm/uaccess.h>
 #include <asm/unistd.h>
@@ -654,6 +655,8 @@ static void exit_notify(struct task_stru
 	struct task_struct *t;
 	struct list_head ptrace_dead, *_p, *_n;
 
+	ckrm_cb_exit(tsk);
+
 	if (signal_pending(tsk) && !(tsk->signal->flags & SIGNAL_GROUP_EXIT)
 	    && !thread_group_empty(tsk)) {
 		/*
Index: linux-2.6.12-rc3-ckrm5/kernel/fork.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/fork.c	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/fork.c	2005-05-05 09:34:55.000000000 -0700
@@ -41,6 +41,7 @@
 #include <linux/profile.h>
 #include <linux/rmap.h>
 #include <linux/acct.h>
+#include <linux/ckrm_events.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -174,6 +175,7 @@ static struct task_struct *dup_task_stru
 	tsk->thread_info = ti;
 	ti->task = tsk;
 
+	ckrm_cb_newtask(tsk);
 	/* One for us, one for whoever does the "release_task()" (usually parent) */
 	atomic_set(&tsk->usage,2);
 	return tsk;
@@ -1216,6 +1218,8 @@ long do_fork(unsigned long clone_flags,
 	if (!IS_ERR(p)) {
 		struct completion vfork;
 
+		ckrm_cb_fork(p);
+
 		if (clone_flags & CLONE_VFORK) {
 			p->vfork_done = &vfork;
 			init_completion(&vfork);
Index: linux-2.6.12-rc3-ckrm5/kernel/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/Makefile	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/Makefile	2005-05-05 09:34:55.000000000 -0700
@@ -28,6 +28,7 @@ obj-$(CONFIG_KPROBES) += kprobes.o
 obj-$(CONFIG_SYSFS) += ksysfs.o
 obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
 obj-$(CONFIG_SECCOMP) += seccomp.o
+obj-$(CONFIG_CKRM) += ckrm/
 
 ifneq ($(CONFIG_IA64),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
Index: linux-2.6.12-rc3-ckrm5/kernel/sys.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/sys.c	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/sys.c	2005-05-05 09:34:55.000000000 -0700
@@ -25,6 +25,7 @@
 #include <linux/dcookies.h>
 #include <linux/suspend.h>
 #include <linux/tty.h>
+#include <linux/ckrm_events.h>
 
 #include <linux/compat.h>
 #include <linux/syscalls.h>
@@ -534,6 +535,7 @@ asmlinkage long sys_setregid(gid_t rgid,
 	current->egid = new_egid;
 	current->gid = new_rgid;
 	key_fsgid_changed(current);
+	ckrm_cb_gid();
 	return 0;
 }
 
@@ -573,6 +575,7 @@ asmlinkage long sys_setgid(gid_t gid)
 		return -EPERM;
 
 	key_fsgid_changed(current);
+	ckrm_cb_gid();
 	return 0;
 }
   
@@ -663,6 +666,8 @@ asmlinkage long sys_setreuid(uid_t ruid,
 
 	key_fsuid_changed(current);
 
+	ckrm_cb_uid();
+
 	return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_RE);
 }
 
@@ -710,6 +715,8 @@ asmlinkage long sys_setuid(uid_t uid)
 
 	key_fsuid_changed(current);
 
+	ckrm_cb_uid();
+
 	return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_ID);
 }
 
@@ -758,6 +765,8 @@ asmlinkage long sys_setresuid(uid_t ruid
 
 	key_fsuid_changed(current);
 
+	ckrm_cb_uid();
+
 	return security_task_post_setuid(old_ruid, old_euid, old_suid, LSM_SETID_RES);
 }
 
@@ -809,6 +818,7 @@ asmlinkage long sys_setresgid(gid_t rgid
 		current->sgid = sgid;
 
 	key_fsgid_changed(current);
+	ckrm_cb_gid();
 	return 0;
 }
 

--
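The callback chains in kernel/ckrm/ckrm_events.c are singly linked lists walked through a pointer-to-pointer. The following is a minimal userspace sketch of that technique, not the patch's code: every `demo_*` name is hypothetical, the event enum is cut down to two entries, and both functions return -1 on error (the posted register and unregister functions use different conventions). One deliberate difference: when the entry is found, the sketch unlinks it with `*cbptr = cb->next`, whereas the posted ckrm_unregister_event_cb() assigns `(*cbptr)->next = cb->next`, which appears to leave the entry on the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel types in the patch. */
enum demo_event { DEMO_EVENT_FORK, DEMO_EVENT_EXIT, DEMO_NUM_EVENTS };

struct demo_hook_cb {
	void (*fct)(void *arg);
	struct demo_hook_cb *next;
};

/* One chain head per event, all initially empty. */
static struct demo_hook_cb *demo_callbacks[DEMO_NUM_EVENTS];

/* Append cb to the chain for ev; returns 0 on success, -1 on a bad event. */
static int demo_register_cb(enum demo_event ev, struct demo_hook_cb *cb)
{
	struct demo_hook_cb **cbptr;

	if ((int)ev < 0 || ev >= DEMO_NUM_EVENTS)
		return -1;
	cb->next = NULL;
	for (cbptr = &demo_callbacks[ev]; *cbptr != NULL; cbptr = &(*cbptr)->next)
		;
	*cbptr = cb;
	return 0;
}

/* Unlink cb from the chain for ev; returns 0 if found, -1 otherwise. */
static int demo_unregister_cb(enum demo_event ev, struct demo_hook_cb *cb)
{
	struct demo_hook_cb **cbptr;

	if ((int)ev < 0 || ev >= DEMO_NUM_EVENTS)
		return -1;
	for (cbptr = &demo_callbacks[ev]; *cbptr != NULL; cbptr = &(*cbptr)->next) {
		if (*cbptr == cb) {
			*cbptr = cb->next;	/* splice the entry out */
			cb->next = NULL;
			return 0;
		}
	}
	return -1;
}

/* Call every callback registered for ev, in registration order. */
static void demo_invoke_chain(enum demo_event ev, void *arg)
{
	struct demo_hook_cb *cb;

	for (cb = demo_callbacks[ev]; cb != NULL; cb = cb->next)
		cb->fct(arg);
}

static int demo_hits;
static void count_one(void *arg) { (void)arg; demo_hits += 1; }
static void count_two(void *arg) { (void)arg; demo_hits += 2; }

/* Register two hooks, fire the event, then remove them one at a time. */
int demo_run(void)
{
	struct demo_hook_cb a = { count_one, NULL };
	struct demo_hook_cb b = { count_two, NULL };

	demo_hits = 0;
	demo_register_cb(DEMO_EVENT_FORK, &a);
	demo_register_cb(DEMO_EVENT_FORK, &b);
	demo_invoke_chain(DEMO_EVENT_FORK, NULL);	/* a then b: +3 */
	demo_unregister_cb(DEMO_EVENT_FORK, &a);
	demo_invoke_chain(DEMO_EVENT_FORK, NULL);	/* b only: +2 */
	demo_unregister_cb(DEMO_EVENT_FORK, &b);
	demo_invoke_chain(DEMO_EVENT_FORK, NULL);	/* empty chain: +0 */
	return demo_hits;
}
```

Calling demo_run() from a small main() exercises the whole register, invoke, unregister cycle. The sketch does no locking, which matches the posted code but would presumably need attention if registration can race with invocation.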



* [patch 02/21] CKRM: Processor Delay Accounting
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
  2005-05-05 18:07 ` [patch 01/21] CKRM: Core CKRM Event Callbacks gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 03/21] CKRM: Core infrastructure gh
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=02-diff_delay_acct


CKRM processor scheduling delay accounting - provides a mechanism
to record the delays a task experiences.  In addition to counting the
frequency of delays, the total delay in ns is also recorded.  CPU delays
are specified as cpu-wait and cpu-run.  I/O delays are recorded for memory
and regular I/O.  Information is accessible through /proc/<pid>/delay.
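As an illustration of how the delay file might be consumed, here is a small userspace sketch that parses one line in the format written by proc_pid_delay() in this patch (runs, runcpu_total, waitcpu_total, num_iowaits, iowait_total, num_memwaits, mem_iowait_total). The struct and function names are hypothetical, and the sample line in demo_parse() is made up rather than captured from a running kernel.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical userspace mirror of the seven fields in /proc/<pid>/delay. */
struct delay_sample {
	unsigned int runs;		/* times the task was switched in */
	uint64_t runcpu_total;		/* total time on cpu */
	uint64_t waitcpu_total;		/* total time waiting for a cpu */
	unsigned int num_iowaits;	/* regular I/O waits */
	uint64_t iowait_total;		/* total regular I/O wait time */
	unsigned int num_memwaits;	/* memory-related I/O waits */
	uint64_t mem_iowait_total;	/* total memory-related I/O wait time */
};

/* Parse one line; returns 0 on success, -1 unless all 7 fields matched. */
static int parse_delay_line(const char *line, struct delay_sample *d)
{
	int n = sscanf(line,
		       "%u %" SCNu64 " %" SCNu64 " %u %" SCNu64 " %u %" SCNu64,
		       &d->runs, &d->runcpu_total, &d->waitcpu_total,
		       &d->num_iowaits, &d->iowait_total,
		       &d->num_memwaits, &d->mem_iowait_total);

	return n == 7 ? 0 : -1;
}

/* Parse a made-up sample line and print two of its fields. */
int demo_parse(void)
{
	struct delay_sample d;

	/* Made-up values for illustration only, not real kernel output. */
	if (parse_delay_line("12 3000000 450000 2 80000 1 20000\n", &d) != 0)
		return -1;
	printf("runs=%u waitcpu_total=%" PRIu64 "\n", d.runs, d.waitcpu_total);
	return 0;
}
```

A monitoring tool would read the line from /proc/<pid>/delay with fopen() and fgets() before handing it to parse_delay_line(); that part is omitted because the file only exists on a kernel built with CONFIG_DELAY_ACCT.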

Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 fs/proc/array.c            |   18 +++++++++
 fs/proc/base.c             |   17 ++++++++
 fs/proc/internal.h         |    1 
 include/linux/sched.h      |   89 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/taskdelays.h |   35 +++++++++++++++++
 init/Kconfig               |    8 ++++
 kernel/fork.c              |    1 
 kernel/sched.c             |   20 ++++++++++
 mm/memory.c                |    9 ++++
 9 files changed, 197 insertions(+), 1 deletion(-)

Index: linux-2.6.12-rc3-ckrm5/fs/proc/array.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/proc/array.c	2005-05-05 09:32:56.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/proc/array.c	2005-05-05 09:35:02.000000000 -0700
@@ -482,3 +482,21 @@ int proc_pid_statm(struct task_struct *t
 	return sprintf(buffer,"%d %d %d %d %d %d %d\n",
 		       size, resident, shared, text, lib, data, 0);
 }
+
+
+int proc_pid_delay(struct task_struct *task, char * buffer)
+{
+	int res;
+
+	res  = sprintf(buffer,"%u %llu %llu %u %llu %u %llu\n",
+		       (unsigned int) get_delay(task,runs),
+		       (uint64_t) get_delay(task,runcpu_total),
+		       (uint64_t) get_delay(task,waitcpu_total),
+		       (unsigned int) get_delay(task,num_iowaits),
+		       (uint64_t) get_delay(task,iowait_total),
+		       (unsigned int) get_delay(task,num_memwaits),
+		       (uint64_t) get_delay(task,mem_iowait_total)
+		);
+	return res;
+}
+
Index: linux-2.6.12-rc3-ckrm5/fs/proc/base.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/proc/base.c	2005-05-05 09:32:56.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/proc/base.c	2005-05-05 09:35:02.000000000 -0700
@@ -120,6 +120,10 @@ enum pid_directory_inos {
 #ifdef CONFIG_AUDITSYSCALL
 	PROC_TID_LOGINUID,
 #endif
+#ifdef CONFIG_DELAY_ACCT
+        PROC_TID_DELAY_ACCT,
+        PROC_TGID_DELAY_ACCT,
+#endif
 	PROC_TID_FD_DIR = 0x8000,	/* 0x8000-0xffff */
 	PROC_TID_OOM_SCORE,
 	PROC_TID_OOM_ADJUST,
@@ -155,6 +159,9 @@ static struct pid_entry tgid_base_stuff[
 #ifdef CONFIG_SECURITY
 	E(PROC_TGID_ATTR,      "attr",    S_IFDIR|S_IRUGO|S_IXUGO),
 #endif
+#ifdef CONFIG_DELAY_ACCT
+	E(PROC_TGID_DELAY_ACCT,"delay",   S_IFREG|S_IRUGO),
+#endif
 #ifdef CONFIG_KALLSYMS
 	E(PROC_TGID_WCHAN,     "wchan",   S_IFREG|S_IRUGO),
 #endif
@@ -191,6 +198,9 @@ static struct pid_entry tid_base_stuff[]
 #ifdef CONFIG_SECURITY
 	E(PROC_TID_ATTR,       "attr",    S_IFDIR|S_IRUGO|S_IXUGO),
 #endif
+#ifdef CONFIG_DELAY_ACCT
+	E(PROC_TID_DELAY_ACCT, "delay",   S_IFREG|S_IRUGO),
+#endif
 #ifdef CONFIG_KALLSYMS
 	E(PROC_TID_WCHAN,      "wchan",   S_IFREG|S_IRUGO),
 #endif
@@ -1564,6 +1574,13 @@ static struct dentry *proc_pident_lookup
 			ei->op.proc_read = proc_pid_wchan;
 			break;
 #endif
+#ifdef CONFIG_DELAY_ACCT
+		case PROC_TID_DELAY_ACCT:
+		case PROC_TGID_DELAY_ACCT:
+			inode->i_fop = &proc_info_file_operations;
+			ei->op.proc_read = proc_pid_delay;
+			break;
+#endif
 #ifdef CONFIG_SCHEDSTATS
 		case PROC_TID_SCHEDSTAT:
 		case PROC_TGID_SCHEDSTAT:
Index: linux-2.6.12-rc3-ckrm5/fs/proc/internal.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/proc/internal.h	2005-03-01 23:37:48.000000000 -0800
+++ linux-2.6.12-rc3-ckrm5/fs/proc/internal.h	2005-05-05 09:35:02.000000000 -0700
@@ -36,6 +36,7 @@ extern int proc_tid_stat(struct task_str
 extern int proc_tgid_stat(struct task_struct *, char *);
 extern int proc_pid_status(struct task_struct *, char *);
 extern int proc_pid_statm(struct task_struct *, char *);
+extern int proc_pid_delay(struct task_struct *, char*);
 
 static inline struct task_struct *proc_task(struct inode *inode)
 {
Index: linux-2.6.12-rc3-ckrm5/include/linux/sched.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/sched.h	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/linux/sched.h	2005-05-05 09:35:02.000000000 -0700
@@ -34,6 +34,7 @@
 #include <linux/percpu.h>
 #include <linux/topology.h>
 #include <linux/seccomp.h>
+#include <linux/taskdelays.h>
 
 struct exec_domain;
 
@@ -737,6 +738,9 @@ struct task_struct {
 	nodemask_t mems_allowed;
 	int cpuset_mems_generation;
 #endif
+#ifdef CONFIG_DELAY_ACCT
+	struct task_delay_info delays;
+#endif
 };
 
 static inline pid_t process_group(struct task_struct *tsk)
@@ -1033,6 +1037,9 @@ task_t *fork_idle(int);
 extern void set_task_comm(struct task_struct *tsk, char *from);
 extern void get_task_comm(char *to, struct task_struct *tsk);
 
+#define PF_MEMIO	0x00400000      /* I am potentially doing I/O for mem */
+#define PF_IOWAIT	0x00800000      /* I am waiting on disk I/O */
+
 #ifdef CONFIG_SMP
 extern void wait_task_inactive(task_t * p);
 #else
@@ -1267,6 +1274,88 @@ static inline int try_to_freeze(unsigned
 	return 0;
 }
 #endif /* CONFIG_PM */
+
+/* API for registering delay info */
+#ifdef CONFIG_DELAY_ACCT
+
+#define test_delay_flag(tsk,flg)	((tsk)->flags & (flg))
+#define set_delay_flag(tsk,flg)		((tsk)->flags |= (flg))
+#define clear_delay_flag(tsk,flg)	((tsk)->flags &= ~(flg))
+
+#define def_delay_var(var)		unsigned long long var
+#define get_delay(tsk,field)		((tsk)->delays.field)
+
+#define start_delay(var)		((var) = sched_clock())
+#define start_delay_set(var,flg)	(set_delay_flag(current,flg),(var) = \
+							sched_clock())
+
+#define inc_delay(tsk,field)		(((tsk)->delays.field)++)
+
+/* Because of hardware timer drift on SMP systems, a task may continue on a
+ * different cpu than the one where start_ts was taken, so it is possible that
+ * end_ts < start_ts by some usecs. In this case we ignore the diff
+ * and add nothing to the total.
+ */
+#ifdef CONFIG_SMP
+#define test_ts_integrity(start_ts,end_ts)  (likely((end_ts) > (start_ts)))
+#else
+#define test_ts_integrity(start_ts,end_ts)  (1)
+#endif
+
+#define add_delay_ts(tsk,field,start_ts,end_ts) \
+	do { if (test_ts_integrity(start_ts,end_ts)) (tsk)->delays.field += ((end_ts)-(start_ts)); } while (0)
+
+#define add_delay_clear(tsk,field,start_ts,flg)		\
+	do {						\
+		unsigned long long now = sched_clock();	\
+		add_delay_ts(tsk,field,start_ts,now);	\
+		clear_delay_flag(tsk,flg);		\
+	} while (0)
+
+static inline void add_io_delay(unsigned long long dstart)
+{
+	struct task_struct * tsk = current;
+	unsigned long long now = sched_clock();
+	unsigned long long val;
+
+	if (test_ts_integrity(dstart,now))
+		val = now - dstart;
+	else
+		val = 0;
+	if (test_delay_flag(tsk,PF_MEMIO)) {
+		tsk->delays.mem_iowait_total += val;
+		tsk->delays.num_memwaits++;
+	} else {
+		tsk->delays.iowait_total += val;
+		tsk->delays.num_iowaits++;
+	}
+	clear_delay_flag(tsk,PF_IOWAIT);
+}
+
+inline static void init_delays(struct task_struct *tsk)
+{
+	memset((void*)&tsk->delays,0,sizeof(tsk->delays));
+}
+
+#else
+
+#define test_delay_flag(tsk,flg)                (0)
+#define set_delay_flag(tsk,flg)                 do { } while (0)
+#define clear_delay_flag(tsk,flg)               do { } while (0)
+
+#define def_delay_var(var)
+#define get_delay(tsk,field)                    (0)
+
+#define start_delay(var)                        do { } while (0)
+#define start_delay_set(var,flg)                do { } while (0)
+
+#define inc_delay(tsk,field)                    do { } while (0)
+#define add_delay_ts(tsk,field,start_ts,now)    do { } while (0)
+#define add_delay_clear(tsk,field,start_ts,flg) do { } while (0)
+#define add_io_delay(dstart)			do { } while (0)
+#define init_delays(tsk)                        do { } while (0)
+#endif
+
 #endif /* __KERNEL__ */
 
 #endif
Index: linux-2.6.12-rc3-ckrm5/include/linux/taskdelays.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/taskdelays.h	2005-05-05 09:35:02.000000000 -0700
@@ -0,0 +1,35 @@
+/* taskdelays.h - for delay accounting
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003, 2004
+ *
+ * Has the data structure for delay counting.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ */
+
+#ifndef _LINUX_TASKDELAYS_H
+#define _LINUX_TASKDELAYS_H
+
+#include <linux/config.h>
+#include <linux/types.h>
+
+struct task_delay_info {
+	/* delay statistics in nsecs (sched_clock() units) */
+	uint64_t waitcpu_total;
+	uint64_t runcpu_total;
+	uint64_t iowait_total;
+	uint64_t mem_iowait_total;
+	uint32_t runs;
+	uint32_t num_iowaits;
+	uint32_t num_memwaits;
+};
+
+#endif /* _LINUX_TASKDELAYS_H */
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:34:55.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:35:02.000000000 -0700
@@ -261,6 +261,14 @@ menuconfig EMBEDDED
           environments which can tolerate a "non-standard" kernel.
           Only use this if you really know what you are doing.
 
+config DELAY_ACCT
+	bool "Enable delay accounting (EXPERIMENTAL)"
+	help
+	  In addition to counting frequency the total delay in ns is also
+	  recorded. CPU delays are specified as cpu-wait and cpu-run.
+	  I/O delays are recorded for memory and regular I/O.
+	  Information is accessible through /proc/<pid>/delay.
+
 config KALLSYMS
 	 bool "Load all symbols for debugging/kksymoops" if EMBEDDED
 	 default y
Index: linux-2.6.12-rc3-ckrm5/kernel/fork.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/fork.c	2005-05-05 09:34:55.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/fork.c	2005-05-05 09:35:02.000000000 -0700
@@ -901,6 +901,7 @@ static task_t *copy_process(unsigned lon
 	if (p->binfmt && !try_module_get(p->binfmt->module))
 		goto bad_fork_cleanup_put_domain;
 
+	init_delays(p);
 	p->did_exec = 0;
 	copy_flags(clone_flags, p);
 	p->pid = pid;
Index: linux-2.6.12-rc3-ckrm5/kernel/sched.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/sched.c	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/sched.c	2005-05-05 09:35:02.000000000 -0700
@@ -268,6 +268,8 @@ static DEFINE_PER_CPU(struct runqueue, r
 #define task_rq(p)		cpu_rq(task_cpu(p))
 #define cpu_curr(cpu)		(cpu_rq(cpu)->curr)
 
+#define task_is_running(p)	(this_rq() == task_rq(p))
+
 /*
  * Default context-switch locking:
  */
@@ -2749,6 +2751,7 @@ switch_tasks:
 
 	update_cpu_clock(prev, rq, now);
 
+	add_delay_ts(prev, runcpu_total, prev->timestamp, now);
 	prev->sleep_avg -= run_time;
 	if ((long)prev->sleep_avg <= 0)
 		prev->sleep_avg = 0;
@@ -2756,6 +2759,8 @@ switch_tasks:
 
 	sched_info_switch(prev, next);
 	if (likely(prev != next)) {
+		add_delay_ts(next, waitcpu_total, next->timestamp, now);
+		inc_delay(next, runs);
 		next->timestamp = now;
 		rq->nr_switches++;
 		rq->curr = next;
@@ -3799,9 +3804,12 @@ void __sched io_schedule(void)
 {
 	struct runqueue *rq = &per_cpu(runqueues, _smp_processor_id());
 
+	def_delay_var(dstart);
+	start_delay_set(dstart, PF_IOWAIT);
 	atomic_inc(&rq->nr_iowait);
 	schedule();
 	atomic_dec(&rq->nr_iowait);
+	add_io_delay(dstart);
 }
 
 EXPORT_SYMBOL(io_schedule);
@@ -3810,10 +3818,13 @@ long __sched io_schedule_timeout(long ti
 {
 	struct runqueue *rq = &per_cpu(runqueues, _smp_processor_id());
 	long ret;
+	def_delay_var(dstart);
 
+	start_delay_set(dstart,PF_IOWAIT);
 	atomic_inc(&rq->nr_iowait);
 	ret = schedule_timeout(timeout);
 	atomic_dec(&rq->nr_iowait);
+	add_io_delay(dstart);
 	return ret;
 }
 
@@ -5002,3 +5013,12 @@ void normalize_rt_tasks(void)
 }
 
 #endif /* CONFIG_MAGIC_SYSRQ */
+
+#ifdef CONFIG_DELAY_ACCT
+int task_running_sys(struct task_struct *p)
+{
+	return task_is_running(p);
+}
+EXPORT_SYMBOL_GPL(task_running_sys);
+#endif
+
Index: linux-2.6.12-rc3-ckrm5/mm/memory.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/mm/memory.c	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/mm/memory.c	2005-05-05 09:35:02.000000000 -0700
@@ -2031,6 +2031,7 @@ int handle_mm_fault(struct mm_struct *mm
 	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
+	int rc;
 
 	__set_current_state(TASK_RUNNING);
 
@@ -2044,6 +2045,9 @@ int handle_mm_fault(struct mm_struct *mm
 	 * and the SMP-safe atomic PTE updates.
 	 */
 	pgd = pgd_offset(mm, address);
+
+	set_delay_flag(current, PF_MEMIO);
+
 	spin_lock(&mm->page_table_lock);
 
 	pud = pud_alloc(mm, pgd, address);
@@ -2058,10 +2062,13 @@ int handle_mm_fault(struct mm_struct *mm
 	if (!pte)
 		goto oom;
 	
-	return handle_pte_fault(mm, vma, address, write_access, pte, pmd);
+	rc = handle_pte_fault(mm, vma, address, write_access, pte, pmd);
+	clear_delay_flag(current, PF_MEMIO);
+	return rc;
 
  oom:
 	spin_unlock(&mm->page_table_lock);
+	clear_delay_flag(current, PF_MEMIO);
 	return VM_FAULT_OOM;
 }
 

--
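The bookkeeping done by add_io_delay() and test_ts_integrity() can be sketched in userspace as follows. This is an illustration under stated assumptions, not the patch's code: the `demo_*` names are hypothetical, timestamps are passed in explicitly instead of being read from sched_clock(), the PF_MEMIO task flag is replaced by a plain `mem_io` argument, and the timestamp check is always applied, whereas the patch compiles it out on non-SMP builds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of struct task_delay_info. */
struct demo_delays {
	uint64_t iowait_total;
	uint64_t mem_iowait_total;
	uint32_t num_iowaits;
	uint32_t num_memwaits;
};

/*
 * Elapsed time between two stamps, or 0 when the end stamp is not after
 * the start stamp (e.g. clocks on two cpus that are slightly out of sync).
 */
static uint64_t demo_elapsed(uint64_t start_ts, uint64_t end_ts)
{
	return end_ts > start_ts ? end_ts - start_ts : 0;
}

/*
 * Account one I/O wait. As in add_io_delay(), the wait counter is bumped
 * even when the elapsed time was discarded, so only the total is affected
 * by an inconsistent timestamp pair.
 */
static void demo_add_io_delay(struct demo_delays *d, uint64_t start_ts,
			      uint64_t end_ts, int mem_io)
{
	uint64_t val = demo_elapsed(start_ts, end_ts);

	if (mem_io) {
		d->mem_iowait_total += val;
		d->num_memwaits++;
	} else {
		d->iowait_total += val;
		d->num_iowaits++;
	}
}
```

Dropping a negative delta rather than adding it keeps the unsigned totals from wrapping around to a huge value, at the cost of slightly undercounting when clocks disagree.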



* [patch 03/21] CKRM: Core infrastructure
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
  2005-05-05 18:07 ` [patch 01/21] CKRM: Core CKRM Event Callbacks gh
  2005-05-05 18:07 ` [patch 02/21] CKRM: Processor Delay Accounting gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 04/21] CKRM: Resource Control File System (rcfs) gh
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=03-diff_ckrm_core


This patch contains the core infrastructure code for CKRM.  It includes
the interfaces used by the classification engine and by the resource
control file system (rcfs).  Rcfs is the mechanism for setting class
assignments and policies within CKRM.
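
For reviewers who want the engine registration rule in one place: the
check done by ckrm_register_engine() in kernel/ckrm/ckrm.c can be modelled
in userspace roughly as follows.  This is only an illustrative sketch and
not part of the patch.  struct eng_callback_model and
eng_callbacks_valid() are made-up names, and the function pointers are
reduced to void * stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for struct ckrm_eng_callback.  Only the
 * fields examined by the registration check are modelled.
 */
struct eng_callback_model {
	unsigned long c_interest;	/* classification event mask */
	unsigned long n_interest;	/* notification event mask */
	void *classify;			/* stand-in for the classify hook */
	void *class_delete;		/* stand-in for the class_delete hook */
	void *notify;			/* stand-in for the notify hook */
};

/*
 * Returns 1 if the table passes the same test that ckrm_register_engine()
 * applies, 0 where the kernel code would return -EINVAL.
 */
static int eng_callbacks_valid(const struct eng_callback_model *e)
{
	int can_classify;
	int can_notify;

	assert(e != NULL);

	can_classify = e->classify != NULL && e->class_delete != NULL;
	can_notify = e->notify != NULL;

	/* the engine must be able to classify or to be notified */
	if (!can_classify && !can_notify)
		return 0;
	/* a non-empty interest mask needs its matching function pointer */
	if (e->c_interest && e->classify == NULL)
		return 0;
	if (e->n_interest && e->notify == NULL)
		return 0;
	return 1;
}
```

A table with only notify set is accepted.  A table that sets c_interest
without classify is rejected, which corresponds to the -EINVAL return in
the kernel code.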

Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>


 include/linux/ckrm_ce.h     |   95 ++++
 include/linux/ckrm_events.h |   38 +
 include/linux/ckrm_rc.h     |  345 +++++++++++++++++
 include/linux/rcfs.h        |   96 ++++
 include/linux/sched.h       |    5 
 init/main.c                 |    2 
 kernel/ckrm/Makefile        |    2 
 kernel/ckrm/ckrm.c          |  892 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/ckrm/ckrmutils.c     |  188 +++++++++
 9 files changed, 1648 insertions(+), 15 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_ce.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_ce.h	2005-05-05 09:35:04.000000000 -0700
@@ -0,0 +1,95 @@
+/*
+ *  ckrm_ce.h - Header file to be used by Classification Engine of CKRM
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Shailabh Nagar,  IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *
+ * Provides data structures, macros and kernel API of CKRM for
+ * classification engine.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#ifndef _LINUX_CKRM_CE_H
+#define _LINUX_CKRM_CE_H
+
+#ifdef CONFIG_CKRM
+
+#include <linux/ckrm_events.h>
+
+/*
+ * Action parameters identifying the cause of a task<->class notify callback.
+ * These can percolate up to a user daemon consuming records sent by the
+ * classification engine.
+ */
+
+typedef void *(*ce_classify_fct) (enum ckrm_event event, void *obj, ...);
+typedef void (*ce_notify_fct) (enum ckrm_event event, void *classobj,
+				 void *obj);
+
+struct ckrm_eng_callback {
+	/* general state information */
+	int always_callback;	/* set if CE should always be called back
+				   regardless of numclasses */
+
+	/* callbacks which are called without holding locks */
+
+	unsigned long c_interest;	/* set of classification events of
+					 * interest to CE
+					 */
+
+	/* generic classify */
+	ce_classify_fct classify;
+
+	/* class added */
+	void (*class_add) (const char *name, void *core, int classtype);
+
+	/* class deleted */
+	void (*class_delete) (const char *name, void *core, int classtype);
+
+	/* callbacks which are called while holding task_lock(tsk) */
+	unsigned long n_interest;	/* set of notification events of
+					 *  interest to CE
+					 */
+	/* notify on class switch */
+	ce_notify_fct notify;	
+};
+
+struct inode;
+struct dentry;
+
+struct rbce_eng_callback {
+	int (*mkdir) (struct inode *, struct dentry *, int);	/* mkdir */
+	int (*rmdir) (struct inode *, struct dentry *);		/* rmdir */
+	int (*mnt) (void);
+	int (*umnt) (void);
+};
+
+extern int ckrm_register_engine(const char *name, struct ckrm_eng_callback *);
+extern int ckrm_unregister_engine(const char *name);
+
+extern void *ckrm_classobj(char *, int *classtype);
+
+extern int rcfs_register_engine(struct rbce_eng_callback *);
+extern int rcfs_unregister_engine(struct rbce_eng_callback *);
+
+extern int ckrm_reclassify(int pid);
+
+#ifndef _LINUX_CKRM_RC_H
+
+extern void ckrm_core_grab(struct ckrm_core_class *core);
+extern void ckrm_core_drop(struct ckrm_core_class *core);
+#endif
+
+#endif /* CONFIG_CKRM */
+#endif /* _LINUX_CKRM_CE_H */
Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_events.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/ckrm_events.h	2005-05-05 09:34:55.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_events.h	2005-05-05 09:35:04.000000000 -0700
@@ -108,70 +108,78 @@ int ckrm_unregister_event_cb(enum ckrm_e
 extern void ckrm_invoke_event_cb_chain(enum ckrm_event ev, void *arg);
 
 /* forward declarations for function arguments */
-struct task_struct;
+
+#include <linux/sched.h>		/* for task_struct */
+
 struct sock;
 struct user_struct;
 
 static inline void ckrm_cb_fork(struct task_struct *p)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_FORK, p);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_FORK, p);
 }
 
 static inline void ckrm_cb_newtask(struct task_struct *p)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_NEWTASK, p);
+	
+	p->ce_data = NULL;
+	spin_lock_init(&p->ckrm_tsklock);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_NEWTASK, p);
 }
 
 static inline void ckrm_cb_exit(struct task_struct *p)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_EXIT, p);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_EXIT, p);
+	p->ce_data = NULL;
 }
 
 static inline void ckrm_cb_exec(char *c)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_EXEC, c);
-}
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_EXEC, c);
+ }
 
 static inline void ckrm_cb_uid(void)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_UID, NULL);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_UID, NULL);
 }
 
 static inline void ckrm_cb_gid(void)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_GID, NULL);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_GID, NULL);
 }
 
 static inline void ckrm_cb_apptag(void)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_APPTAG, NULL);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_APPTAG, NULL);
 }
 
 static inline void ckrm_cb_login(void)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_LOGIN, NULL);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_LOGIN, NULL);
 }
 
 static inline void ckrm_cb_useradd(struct user_struct *u)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_USERADD, u);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_USERADD, u);
 }
 
 static inline void ckrm_cb_userdel(struct user_struct *u)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_USERDEL, u);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_USERDEL, u);
 }
 
 static inline void ckrm_cb_listen_start(struct sock *s)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_LISTEN_START, s);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_LISTEN_START, s);
 }
 
 static inline void ckrm_cb_listen_stop(struct sock *s)
 {
-         ckrm_invoke_event_cb_chain(CKRM_EVENT_LISTEN_STOP, s);
+	ckrm_invoke_event_cb_chain(CKRM_EVENT_LISTEN_STOP, s);
 }
 
+extern void ckrm_init(void);
+
 #else /* !CONFIG_CKRM */
 
 static inline void ckrm_cb_fork(struct task_struct *p) { }
@@ -187,6 +195,8 @@ static inline void ckrm_cb_userdel(struc
 static inline void ckrm_cb_listen_start(struct sock *s) { }
 static inline void ckrm_cb_listen_stop(struct sock *s) { }
 
+#define ckrm_init()	do { } while (0)
+
 #endif /* CONFIG_CKRM */
 
 #endif /* _LINUX_CKRM_EVENTS_H */
Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_rc.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_rc.h	2005-05-05 09:35:04.000000000 -0700
@@ -0,0 +1,345 @@
+/*
+ *  ckrm_rc.h - Header file to be used by Resource controllers of CKRM
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Shailabh Nagar,  IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *	     (C) Vivek Kashyap , IBM Corp. 2004
+ *
+ * Provides data structures, macros and kernel API of CKRM for
+ * resource controllers.
+ *
+ * More details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#ifndef _LINUX_CKRM_RC_H
+#define _LINUX_CKRM_RC_H
+
+#ifdef CONFIG_CKRM
+
+#include <linux/list.h>
+#include <linux/ckrm_events.h>
+#include <linux/ckrm_ce.h>
+#include <linux/seq_file.h>
+
+#define CKRM_MAX_CLASSTYPES         32	/* maximum number of class types */
+#define CKRM_MAX_CLASSTYPE_NAME     32 	/* maximum classtype name length */
+
+#define CKRM_MAX_RES_CTLRS           8	/* maximum resource controllers per classtype */
+#define CKRM_MAX_RES_NAME          128	/* maximum resource controller name length */
+
+struct ckrm_core_class;
+struct ckrm_classtype;
+
+/*
+ * Share specifications
+ */
+
+struct ckrm_shares {
+	int my_guarantee;
+	int my_limit;
+	int total_guarantee;
+	int max_limit;
+	int unused_guarantee;	/* not used as parameters */
+	int cur_max_limit;	/* not used as parameters */
+};
+
+#define CKRM_SHARE_UNCHANGED	(-1)
+#define CKRM_SHARE_DONTCARE	(-2)
+#define CKRM_SHARE_DFLT_TOTAL_GUARANTEE	(100)
+#define CKRM_SHARE_DFLT_MAX_LIMIT	(100)
+
+/*
+ * RESOURCE CONTROLLERS
+ */
+
+/* resource controller callback structure */
+
+struct ckrm_res_ctlr {
+	char res_name[CKRM_MAX_RES_NAME];
+	int res_hdepth;		/* maximum hierarchy */
+	int resid;		/* (for now) same as the enum resid */
+	struct ckrm_classtype *classtype;    /* classtype owning this res ctlr */
+
+	/* allocate/free new resource class object for resource controller */
+	void *(*res_alloc) (struct ckrm_core_class * this,
+			    struct ckrm_core_class * parent);
+	void (*res_free) (void *);
+
+	/* set/get limits/guarantees for a resource controller class */
+	int (*set_share_values) (void *, struct ckrm_shares * shares);
+	int (*get_share_values) (void *, struct ckrm_shares * shares);
+
+	/* statistics and configuration access */
+	int (*get_stats) (void *, struct seq_file *);
+	int (*reset_stats) (void *);
+	int (*show_config) (void *, struct seq_file *);
+	int (*set_config) (void *, const char *cfgstr);
+
+	void (*change_resclass) (void *, void *, void *);
+};
+
+/*
+ * CKRM_CLASSTYPE
+ *
+ * A <struct ckrm_classtype> object describes a dimension for CKRM to classify
+ * along. Need to provide methods to create and manipulate class objects in
+ * this dimension
+ */
+
+/* predefined class types that are always recognized */
+#define CKRM_CLASSTYPE_TASK_CLASS    0
+#define CKRM_CLASSTYPE_SOCKET_CLASS  1
+#define CKRM_RESV_CLASSTYPES         2	/* always +1 of last known type */
+
+#define CKRM_MAX_TYPENAME_LEN       32
+
+struct ckrm_classtype {
+	/* TODO: Review for cache alignment */
+
+	/* resource controllers */
+
+	spinlock_t res_ctlrs_lock;  /* protect res ctlr related data */
+	int max_res_ctlrs;          /* max number of res ctlrs allowed */
+	int max_resid;              /* max resid used */
+	int resid_reserved;	    /* max number of reserved controllers */
+	long bit_res_ctlrs;	    /* bitmap of resource ID used */
+	atomic_t nr_resusers[CKRM_MAX_RES_CTLRS];
+	struct ckrm_res_ctlr *res_ctlrs[CKRM_MAX_RES_CTLRS];
+
+	/* state about my classes */
+
+	struct ckrm_core_class *default_class;	
+	struct list_head classes;  /* link all classes of this classtype */
+	int num_classes;
+
+	/* state about my ce interaction */
+	atomic_t ce_regd;		/* if CE registered */
+	int ce_cb_active;		/* if Callbacks active */
+	atomic_t ce_nr_users;		/* number of active transient calls */
+	struct ckrm_eng_callback ce_callbacks;	/* callback engine */
+
+	/* Begin classtype-rcfs private data. No rcfs/fs specific types used.  */
+
+	int mfidx;		/* Index into genmfdesc array used to initialize */
+	void *mfdesc;		/* Array of descriptors of root and magic files */
+	int mfcount;		/* length of above array */
+	void *rootde;		/* root dentry created by rcfs */
+	/* End rcfs private data */
+
+	char name[CKRM_MAX_TYPENAME_LEN]; /* currently same as mfdesc[0]->name  */
+	                                  /* but could be different */
+	int type_id;			  /* unique TypeID */
+	int maxdepth;			  /* maximum depth supported */
+
+	/* functions to be called on any class type by external APIs */
+
+	struct ckrm_core_class *(*alloc) (struct ckrm_core_class * parent,
+					  const char *name);	
+	int (*free) (struct ckrm_core_class * cls);	
+	int (*show_members) (struct ckrm_core_class *, struct seq_file *);
+	int (*show_stats) (struct ckrm_core_class *, struct seq_file *);
+	int (*show_config) (struct ckrm_core_class *, struct seq_file *);
+	int (*show_shares) (struct ckrm_core_class *, struct seq_file *);
+
+	int (*reset_stats) (struct ckrm_core_class *, const char *resname,
+			    const char *);
+	int (*set_config) (struct ckrm_core_class *, const char *resname,
+			   const char *cfgstr);
+	int (*set_shares) (struct ckrm_core_class *, const char *resname,
+			   struct ckrm_shares * shares);
+	int (*forced_reclassify) (struct ckrm_core_class *, const char *);
+
+	/* functions to be called on a class type by ckrm internals */
+
+	/* class initialization for new RC */
+	void (*add_resctrl) (struct ckrm_core_class *, int resid);	
+};
+
+/*
+ * CKRM CORE CLASS
+ *      common part to any class structure (i.e. instance of a classtype)
+ */
+
+/*
+ * basic definition of a hierarchy that is to be used by the CORE classes
+ * and can be used by the resource class objects
+ */
+
+#define CKRM_CORE_MAGIC		0xBADCAFFE
+
+struct ckrm_hnode {
+	struct ckrm_core_class *parent;
+	struct list_head siblings;	
+	struct list_head children;	
+};
+
+struct ckrm_core_class {
+	struct ckrm_classtype *classtype;	
+	void *res_class[CKRM_MAX_RES_CTLRS];	/* resource classes */
+	spinlock_t class_lock;	                /* protects list,array above */
+
+	struct list_head objlist;		/* generic object list */
+	struct list_head clslist;		/* peer classtype classes */
+	struct dentry *dentry;			/* dentry of inode in the RCFS */
+	int magic;
+
+	struct ckrm_hnode hnode;		/* hierarchy */
+	rwlock_t hnode_rwlock;			/* protects hnode above. */
+	atomic_t refcnt;
+	const char *name;
+	int delayed;				/* core deletion delayed  */
+						/* because of race conditions */
+};
+
+/* type coerce between derived class types and ckrm core class type */
+#define class_type(type,coreptr)   container_of(coreptr,type,core)
+#define class_core(clsptr)         (&(clsptr)->core)
+/* locking classes */
+#define class_lock(coreptr)        spin_lock(&(coreptr)->class_lock)
+#define class_unlock(coreptr)      spin_unlock(&(coreptr)->class_lock)
+/* which classtype a class is an instance of (ISA) */
+#define class_isa(clsptr)          (class_core(clsptr)->classtype)
+
+/*
+ * OTHER
+ */
+
+#define ckrm_get_res_class(rescls, resid, type) \
+	((type*) (((resid != -1) && ((rescls) != NULL) \
+			   && ((rescls) != (void *)-1)) ? \
+	 ((struct ckrm_core_class *)(rescls))->res_class[resid] : NULL))
+
+
+extern int ckrm_register_res_ctlr(struct ckrm_classtype *, struct ckrm_res_ctlr *);
+extern int ckrm_unregister_res_ctlr(struct ckrm_res_ctlr *);
+
+extern int ckrm_validate_and_grab_core(struct ckrm_core_class *core);
+extern int ckrm_init_core_class(struct ckrm_classtype *clstype,
+				struct ckrm_core_class *dcore,
+				struct ckrm_core_class *parent,
+				const char *name);
+extern int ckrm_release_core_class(struct ckrm_core_class *);	
+
+/* TODO: can disappear after cls del debugging */
+
+extern struct ckrm_res_ctlr *ckrm_resctlr_lookup(struct ckrm_classtype *type,
+						 const char *resname);
+
+extern void ckrm_lock_hier(struct ckrm_core_class *);
+extern void ckrm_unlock_hier(struct ckrm_core_class *);
+extern struct ckrm_core_class *ckrm_get_next_child(struct ckrm_core_class *,
+						   struct ckrm_core_class *);
+
+extern void child_guarantee_changed(struct ckrm_shares *, int, int);
+extern void child_maxlimit_changed(struct ckrm_shares *, int);
+extern int set_shares(struct ckrm_shares *, struct ckrm_shares *,
+		      struct ckrm_shares *);
+
+/* classtype registration and lookup */
+extern int ckrm_register_classtype(struct ckrm_classtype *clstype);
+extern int ckrm_unregister_classtype(struct ckrm_classtype *clstype);
+extern struct ckrm_classtype *ckrm_find_classtype_by_name(const char *name);
+
+/* default functions that can be used in a classtype's function table */
+extern int ckrm_class_show_shares(struct ckrm_core_class *core,
+				  struct seq_file *seq);
+extern int ckrm_class_show_stats(struct ckrm_core_class *core,
+				 struct seq_file *seq);
+extern int ckrm_class_show_config(struct ckrm_core_class *core,
+				  struct seq_file *seq);
+extern int ckrm_class_set_config(struct ckrm_core_class *core,
+				 const char *resname, const char *cfgstr);
+extern int ckrm_class_set_shares(struct ckrm_core_class *core,
+				 const char *resname,
+				 struct ckrm_shares *shares);
+extern int ckrm_class_reset_stats(struct ckrm_core_class *core,
+				  const char *resname, const char *unused);
+
+static inline void ckrm_core_grab(struct ckrm_core_class *core)
+{
+	if (core)
+		atomic_inc(&core->refcnt);
+}
+
+static inline void ckrm_core_drop(struct ckrm_core_class *core)
+{
+	/* only make definition available in this context */
+	extern void ckrm_free_core_class(struct ckrm_core_class *core);
+	if (core && (atomic_dec_and_test(&core->refcnt)))
+		ckrm_free_core_class(core);
+}
+
+static inline unsigned int ckrm_is_core_valid(struct ckrm_core_class * core)
+{
+	return (core && (core->magic == CKRM_CORE_MAGIC));
+}
+
+/*
+ * iterate through all associated resource controllers:
+ * requires following arguments (ckrm_core_class *cls,
+ *                               ckrm_res_ctlr   *rcbs,
+ *                               void            *robj,
+ *                               int              bmap)
+ */
+
+#define forall_class_resobjs(cls,rcbs,robj,bmap)			\
+       for ( bmap=((cls->classtype)->bit_res_ctlrs) ;			\
+	     ({ int rid; ((rid=ffs(bmap)-1) >= 0) &&			\
+	                 (bmap &= ~(1<<rid),				\
+				((rcbs=cls->classtype->res_ctlrs[rid])	\
+				 && (robj=cls->res_class[rid]))); });	\
+           )
+
+extern struct ckrm_classtype *ckrm_classtypes[];	
+
+/*
+ * CE Invocation interface
+ */
+
+#define ce_protect(ctype)      (atomic_inc(&((ctype)->ce_nr_users)))
+#define ce_release(ctype)      (atomic_dec(&((ctype)->ce_nr_users)))
+
+/* CE Classification callbacks with */
+
+#define CE_CLASSIFY_NORET(ctype, event, objs_to_classify...)		\
+do {									\
+	if ((ctype)->ce_cb_active					\
+	    && (test_bit(event,&(ctype)->ce_callbacks.c_interest)))	\
+		(*(ctype)->ce_callbacks.classify)(event,		\
+						  objs_to_classify);	\
+} while (0)
+
+#define CE_CLASSIFY_RET(ret, ctype, event, objs_to_classify...)		\
+do {									\
+	if ((ctype)->ce_cb_active					\
+	    && (test_bit(event,&(ctype)->ce_callbacks.c_interest)))	\
+		ret = (*(ctype)->ce_callbacks.classify)(event,		\
+							objs_to_classify);\
+} while (0)
+
+#define CE_NOTIFY(ctype, event, cls, objs_to_classify)			\
+do {									\
+	if ((ctype)->ce_cb_active					\
+	    && (test_bit(event,&(ctype)->ce_callbacks.n_interest)))	\
+		(*(ctype)->ce_callbacks.notify)(event,			\
+						cls,objs_to_classify);	\
+} while (0)
+
+/*
+ * RCFS related
+ */
+
+/* vars needed by other modules/core */
+
+extern int rcfs_mounted;
+extern int rcfs_engine_regd;
+
+#endif /* CONFIG_CKRM */
+#endif /* _LINUX_CKRM_RC_H */
Index: linux-2.6.12-rc3-ckrm5/include/linux/rcfs.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/rcfs.h	2005-05-05 09:35:04.000000000 -0700
@@ -0,0 +1,96 @@
+#ifndef _LINUX_RCFS_H
+#define _LINUX_RCFS_H
+
+#include <linux/kernel.h>
+#include <linux/fs.h>
+#include <linux/ckrm_events.h>
+#include <linux/ckrm_rc.h>
+#include <linux/ckrm_ce.h>
+
+/*
+ * The following declarations cannot be included in any of the ckrm*.h files
+ * without jumping through hoops. Remove later when rearrangements are done.
+ */
+
+#define RCFS_MAGIC	0x4feedbac
+#define RCFS_MAGF_NAMELEN 20
+extern int RCFS_IS_MAGIC;
+
+#define rcfs_is_magic(dentry)  ((dentry)->d_fsdata == &RCFS_IS_MAGIC)
+
+struct rcfs_inode_info {
+	struct ckrm_core_class *core;
+	char *name;
+	struct inode vfs_inode;
+};
+
+#define RCFS_DEFAULT_DIR_MODE	(S_IFDIR | S_IRUGO | S_IXUGO)
+#define RCFS_DEFAULT_FILE_MODE	(S_IFREG | S_IRUSR | S_IWUSR | S_IRGRP |S_IROTH)
+
+struct rcfs_magf {
+	char name[RCFS_MAGF_NAMELEN];
+	int mode;
+	struct inode_operations *i_op;
+	struct file_operations *i_fop;
+};
+
+struct rcfs_mfdesc {
+	struct rcfs_magf *rootmf;	/* Root directory and its magic files */
+	int rootmflen;			/* length of above array */
+	/*
+	 * Can have a different magf describing magic files
+	 * for non-root entries too.
+	 */
+};
+
+extern struct rcfs_mfdesc *genmfdesc[];
+
+struct rcfs_inode_info *RCFS_I(struct inode *inode);
+
+int rcfs_empty(struct dentry *);
+struct inode *rcfs_get_inode(struct super_block *, int, dev_t);
+int rcfs_mknod(struct inode *, struct dentry *, int, dev_t);
+int _rcfs_mknod(struct inode *, struct dentry *, int, dev_t);
+int rcfs_mkdir(struct inode *, struct dentry *, int);
+struct ckrm_core_class *rcfs_make_core(struct dentry *, struct ckrm_core_class *);
+struct dentry *rcfs_set_magf_byname(char *, void *);
+
+struct dentry *rcfs_create_internal(struct dentry *, struct rcfs_magf *, int);
+int rcfs_delete_internal(struct dentry *);
+int rcfs_create_magic(struct dentry *, struct rcfs_magf *, int);
+int rcfs_clear_magic(struct dentry *);
+
+extern struct super_operations rcfs_super_ops;
+extern struct address_space_operations rcfs_aops;
+
+extern struct inode_operations rcfs_dir_inode_operations;
+extern struct inode_operations rcfs_rootdir_inode_operations;
+extern struct inode_operations rcfs_file_inode_operations;
+
+extern struct file_operations target_fileops;
+extern struct file_operations shares_fileops;
+extern struct file_operations stats_fileops;
+extern struct file_operations config_fileops;
+extern struct file_operations members_fileops;
+extern struct file_operations reclassify_fileops;
+extern struct file_operations rcfs_file_operations;
+
+/* Callbacks into rcfs from ckrm */
+
+struct rcfs_functions {
+	int (*mkroot) (struct rcfs_magf *, int, struct dentry **);
+	int (*rmroot) (struct dentry *);
+	int (*register_classtype) (struct ckrm_classtype *);
+	int (*deregister_classtype) (struct ckrm_classtype *);
+};
+
+int rcfs_register_classtype(struct ckrm_classtype *);
+int rcfs_deregister_classtype(struct ckrm_classtype *);
+int rcfs_mkroot(struct rcfs_magf *, int, struct dentry **);
+int rcfs_rmroot(struct dentry *);
+
+#define RCFS_ROOT "/rcfs"  	/* TODO:  Should use the mount point */
+extern struct dentry *rcfs_rootde;
+extern struct rbce_eng_callback rcfs_eng_callbacks;
+
+#endif	/* _LINUX_RCFS_H */
Index: linux-2.6.12-rc3-ckrm5/include/linux/sched.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/sched.h	2005-05-05 09:35:02.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/linux/sched.h	2005-05-05 09:35:04.000000000 -0700
@@ -741,6 +741,11 @@ struct task_struct {
 #ifdef CONFIG_DELAY_ACCT
 	struct task_delay_info delays;
 #endif
+#ifdef CONFIG_CKRM
+	spinlock_t  ckrm_tsklock;
+	void       *ce_data;
+#endif
+
 };
 
 static inline pid_t process_group(struct task_struct *tsk)
Index: linux-2.6.12-rc3-ckrm5/init/main.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/main.c	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/main.c	2005-05-05 09:35:04.000000000 -0700
@@ -47,6 +47,7 @@
 #include <linux/rmap.h>
 #include <linux/mempolicy.h>
 #include <linux/key.h>
+#include <linux/ckrm_events.h>
 
 #include <asm/io.h>
 #include <asm/bugs.h>
@@ -465,6 +466,7 @@ asmlinkage void __init start_kernel(void
 	rcu_init();
 	init_IRQ();
 	pidhash_init();
+	ckrm_init();
 	init_timers();
 	softirq_init();
 	time_init();
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm.c	2005-05-05 09:35:04.000000000 -0700
@@ -0,0 +1,892 @@
+/* ckrm.c - Class-based Kernel Resource Management (CKRM)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003, 2004
+ *           (C) Shailabh Nagar,  IBM Corp. 2003, 2004
+ *           (C) Chandra Seetharaman,  IBM Corp. 2003
+ *	     (C) Vivek Kashyap,	IBM Corp. 2004
+ *
+ *
+ * Provides kernel API of CKRM for in-kernel,per-resource controllers
+ * (one each for cpu, memory, io, network) and callbacks for
+ * classification modules.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/config.h>
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/mm.h>
+#include <linux/string.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/ckrm_rc.h>
+#include <linux/rcfs.h>
+#include <net/sock.h>
+#include <linux/ip.h>
+
+#include <asm/uaccess.h>
+#include <asm/errno.h>
+
+rwlock_t ckrm_class_lock = RW_LOCK_UNLOCKED;	/* protects classlists */
+
+struct rcfs_functions rcfs_fn;
+EXPORT_SYMBOL_GPL(rcfs_fn);
+
+int rcfs_engine_regd;		/* rcfs state needed by another module */
+EXPORT_SYMBOL_GPL(rcfs_engine_regd);
+
+int rcfs_mounted;
+EXPORT_SYMBOL_GPL(rcfs_mounted);
+
+/*
+ * Helper Functions
+ */
+
+/*
+ * Return non-zero if the given resource is registered.
+ */
+inline unsigned int ckrm_is_res_regd(struct ckrm_classtype *clstype, int resid)
+{
+	return ((resid >= 0) && (resid < clstype->max_resid) &&
+		test_bit(resid, &clstype->bit_res_ctlrs)
+	    );
+}
+
+/*
+ * Look up a resource controller by name; returns NULL if none matches.
+ */
+struct ckrm_res_ctlr *ckrm_resctlr_lookup(struct ckrm_classtype *clstype,
+					  const char *resname)
+{
+	int resid = -1;
+
+	if (!clstype || !resname)
+		return NULL;
+	for (resid = 0; resid < clstype->max_resid; resid++) {
+		if (test_bit(resid, &clstype->bit_res_ctlrs)) {
+			struct ckrm_res_ctlr *rctrl = clstype->res_ctlrs[resid];
+			if (!strncmp(resname, rctrl->res_name,
+				     CKRM_MAX_RES_NAME))
+				return rctrl;
+		}
+	}
+	return NULL;
+}
+
+EXPORT_SYMBOL_GPL(ckrm_resctlr_lookup);
+
+/* given a classname return the class handle and its classtype*/
+void *ckrm_classobj(char *classname, int *classtype_id)
+{
+	int i;
+
+	*classtype_id = -1;
+	if (!classname || !*classname) {
+		return NULL;
+	}
+
+	read_lock(&ckrm_class_lock);
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) {
+		struct ckrm_classtype *ctype = ckrm_classtypes[i];
+		struct ckrm_core_class *core;
+
+		if (ctype == NULL)
+			continue;
+		list_for_each_entry(core, &ctype->classes, clslist) {
+			if (core->name && !strcmp(core->name, classname)) {
+				/* FIXME:   should grab reference. */
+				*classtype_id = ctype->type_id;
+				return core;
+			}
+		}
+	}
+	read_unlock(&ckrm_class_lock);
+	return NULL;
+}
+
+EXPORT_SYMBOL_GPL(ckrm_is_res_regd);
+EXPORT_SYMBOL_GPL(ckrm_classobj);
+
+/*
+ * Internal Functions/macros
+ */
+
+static inline void set_callbacks_active(struct ckrm_classtype *ctype)
+{
+	ctype->ce_cb_active = ((atomic_read(&ctype->ce_regd) > 0) &&
+			       (ctype->ce_callbacks.always_callback
+				|| (ctype->num_classes > 1)));
+}
+
+int ckrm_validate_and_grab_core(struct ckrm_core_class *core)
+{
+	int rc = 0;
+	read_lock(&ckrm_class_lock);
+	if (likely(ckrm_is_core_valid(core))) {
+		ckrm_core_grab(core);
+		rc = 1;
+	}
+	read_unlock(&ckrm_class_lock);
+	return rc;
+}
+
+/*
+ * Interfaces for classification engine
+ */
+
+/*
+ * Registering a callback structure by the classification engine.
+ *
+ * Returns the type ID of the classtype on success, -errno on failure.
+ */
+int ckrm_register_engine(const char *typename, struct ckrm_eng_callback * ecbs)
+{
+	struct ckrm_classtype *ctype;
+
+	ctype = ckrm_find_classtype_by_name(typename);
+	if (ctype == NULL)
+		return (-ENOENT);
+
+	atomic_inc(&ctype->ce_regd);
+
+	/* another engine registered or trying to register ? */
+	if (atomic_read(&ctype->ce_regd) != 1) {
+		atomic_dec(&ctype->ce_regd);
+		return (-EBUSY);
+	}
+
+	/*
+	 * One of the following must be set:
+	 * classify, class_delete (due to object reference) or
+	 * notify (case where notification supported but not classification)
+	 * The function pointer must be set the moment the mask is non-null
+	 */
+	if (!(((ecbs->classify) && (ecbs->class_delete)) || (ecbs->notify)) ||
+	    (ecbs->c_interest && ecbs->classify == NULL) ||
+	    (ecbs->n_interest && ecbs->notify == NULL)) {
+		atomic_dec(&ctype->ce_regd);
+		return -EINVAL;
+	}
+
+	ctype->ce_callbacks = *ecbs;
+	set_callbacks_active(ctype);
+
+	if (ctype->ce_callbacks.class_add) {
+		struct ckrm_core_class *core;
+
+		read_lock(&ckrm_class_lock);
+		list_for_each_entry(core, &ctype->classes, clslist) {
+			(*ctype->ce_callbacks.class_add) (core->name, core,
+							  ctype->type_id);
+		}
+		read_unlock(&ckrm_class_lock);
+	}
+	return ctype->type_id;
+}
+
+/*
+ * Unregistering a callback structure by the classification engine.
+ *
+ * Returns 0 on success, -errno on failure.
+ */
+int ckrm_unregister_engine(const char *typename)
+{
+	struct ckrm_classtype *ctype;
+
+	ctype = ckrm_find_classtype_by_name(typename);
+	if (ctype == NULL)
+		return (-ENOENT);
+
+	ctype->ce_cb_active = 0;
+	if (atomic_read(&ctype->ce_nr_users) > 1) {
+		/* Somebody is currently using the engine, cannot deregister. */
+		return (-EAGAIN);
+	}
+	atomic_set(&ctype->ce_regd, 0);
+	memset(&ctype->ce_callbacks, 0, sizeof(struct ckrm_eng_callback));
+	return 0;
+}
+
+/*
+ * Interfaces to manipulate class (core or resource) hierarchies
+ */
+
+static void
+ckrm_add_child(struct ckrm_core_class *parent, struct ckrm_core_class *child)
+{
+	struct ckrm_hnode *cnode = &child->hnode;
+
+	if (!ckrm_is_core_valid(child)) {
+		printk(KERN_ERR "Invalid child %p given in ckrm_add_child\n",
+		       child);
+		return;
+	}
+	class_lock(child);
+	INIT_LIST_HEAD(&cnode->children);
+	INIT_LIST_HEAD(&cnode->siblings);
+
+	if (parent) {
+		struct ckrm_hnode *pnode;
+
+		if (!ckrm_is_core_valid(parent)) {
+			printk(KERN_ERR
+			       "Invalid parent %p given in ckrm_add_child\n",
+			       parent);
+			parent = NULL;
+		} else {
+			pnode = &parent->hnode;
+			write_lock(&parent->hnode_rwlock);
+			list_add(&cnode->siblings, &pnode->children);
+			write_unlock(&parent->hnode_rwlock);
+		}
+	}
+	cnode->parent = parent;
+	class_unlock(child);
+	return;
+}
+
+static int ckrm_remove_child(struct ckrm_core_class *child)
+{
+	struct ckrm_hnode *cnode, *pnode;
+	struct ckrm_core_class *parent;
+
+	if (!ckrm_is_core_valid(child)) {
+		printk(KERN_ERR "Invalid child %p given"
+		       		" in ckrm_remove_child\n",
+		       	child);
+		return 0;
+	}
+
+	cnode = &child->hnode;
+	parent = cnode->parent;
+	if (!ckrm_is_core_valid(parent)) {
+		printk(KERN_ERR "Invalid parent %p in ckrm_remove_child\n",
+		       parent);
+		return 0;
+	}
+
+	pnode = &parent->hnode;
+
+	class_lock(child);
+	/* ensure that the node does not have children */
+	if (!list_empty(&cnode->children)) {
+		class_unlock(child);
+		return 0;
+	}
+	write_lock(&parent->hnode_rwlock);
+	list_del(&cnode->siblings);
+	write_unlock(&parent->hnode_rwlock);
+	cnode->parent = NULL;
+	class_unlock(child);
+	return 1;
+}
+
+void ckrm_lock_hier(struct ckrm_core_class *parent)
+{
+	if (ckrm_is_core_valid(parent)) {
+		read_lock(&parent->hnode_rwlock);
+	}
+}
+
+void ckrm_unlock_hier(struct ckrm_core_class *parent)
+{
+	if (ckrm_is_core_valid(parent)) {
+		read_unlock(&parent->hnode_rwlock);
+	}
+}
+
+/*
+ * hnode_rwlock of the parent core class must be held in read mode.
+ * External callers should have called ckrm_lock_hier before calling this
+ * function.
+ */
+#define hnode_2_core(ptr) \
+((ptr)? container_of(ptr, struct ckrm_core_class, hnode) : NULL)
+
+struct ckrm_core_class *ckrm_get_next_child(struct ckrm_core_class *parent,
+					    struct ckrm_core_class *child)
+{
+	struct list_head *cnode;
+	struct ckrm_hnode *next_cnode;
+	struct ckrm_core_class *next_childcore;
+
+	if (!ckrm_is_core_valid(parent)) {
+		printk(KERN_ERR "Invalid parent %p in ckrm_get_next_child\n",
+		       parent);
+		return NULL;
+	}
+	if (list_empty(&parent->hnode.children)) {
+		return NULL;
+	}
+	if (child) {
+		if (!ckrm_is_core_valid(child)) {
+			printk(KERN_ERR
+			       "Invalid child %p in ckrm_get_next_child\n",
+			       child);
+			return NULL;
+		}
+		cnode = child->hnode.siblings.next;
+	} else {
+		cnode = parent->hnode.children.next;
+	}
+
+	if (cnode == &parent->hnode.children) {	/* back at the anchor */
+		return NULL;
+	}
+
+	next_cnode = container_of(cnode, struct ckrm_hnode, siblings);
+	next_childcore = hnode_2_core(next_cnode);
+
+	if (!ckrm_is_core_valid(next_childcore)) {
+		printk(KERN_ERR
+		       "Invalid next child %p in ckrm_get_next_child\n",
+		       next_childcore);
+		return NULL;
+	}
+	return next_childcore;
+}
+
+EXPORT_SYMBOL_GPL(ckrm_lock_hier);
+EXPORT_SYMBOL_GPL(ckrm_unlock_hier);
+EXPORT_SYMBOL_GPL(ckrm_get_next_child);
+
+static void
+ckrm_alloc_res_class(struct ckrm_core_class *core,
+		     struct ckrm_core_class *parent, int resid)
+{
+
+	struct ckrm_classtype *clstype;
+	/*
+	 * Allocate a resource class only if the resource controller has
+	 * registered with the core and the engine requests the class.
+	 */
+	if (!ckrm_is_core_valid(core))
+		return;
+	clstype = core->classtype;
+	core->res_class[resid] = NULL;
+
+	if (test_bit(resid, &clstype->bit_res_ctlrs)) {
+		struct ckrm_res_ctlr *rcbs;
+
+		atomic_inc(&clstype->nr_resusers[resid]);
+		rcbs = clstype->res_ctlrs[resid];
+
+		if (rcbs && rcbs->res_alloc) {
+			core->res_class[resid] =
+			    (*rcbs->res_alloc) (core, parent);
+			if (core->res_class[resid])
+				return;
+			printk(KERN_ERR "Error creating res class\n");
+		}
+		atomic_dec(&clstype->nr_resusers[resid]);
+	}
+}
+
+/*
+ * Initialize a core class
+ *
+ */
+
+int
+ckrm_init_core_class(struct ckrm_classtype *clstype,
+		     struct ckrm_core_class *dcore,
+		     struct ckrm_core_class *parent, const char *name)
+{
+	/* TODO:  Should replace name with dentry or add dentry? */
+	int i;
+
+	/* TODO:  How is this used in initialization? */
+	pr_debug("name %s => %p\n", name ? name : "default", dcore);
+	if ((dcore != clstype->default_class) && (!ckrm_is_core_valid(parent))){
+		printk(KERN_NOTICE "error not a valid parent %p\n", parent);
+		return -EINVAL;
+	}
+	dcore->classtype = clstype;
+	dcore->magic = CKRM_CORE_MAGIC;
+	dcore->name = name;
+	dcore->class_lock = SPIN_LOCK_UNLOCKED;
+	dcore->hnode_rwlock = RW_LOCK_UNLOCKED;
+	dcore->delayed = 0;
+
+	atomic_set(&dcore->refcnt, 0);
+	write_lock(&ckrm_class_lock);
+
+	INIT_LIST_HEAD(&dcore->objlist);
+	list_add_tail(&dcore->clslist, &clstype->classes);
+
+	clstype->num_classes++;
+	set_callbacks_active(clstype);
+
+	write_unlock(&ckrm_class_lock);
+	ckrm_add_child(parent, dcore);
+
+	for (i = 0; i < clstype->max_resid; i++)
+		ckrm_alloc_res_class(dcore, parent, i);
+
+	/* fix for race condition seen in stress with numtasks */
+	if (parent)
+		ckrm_core_grab(parent);
+
+	ckrm_core_grab(dcore);
+	return 0;
+}
+
+static void ckrm_free_res_class(struct ckrm_core_class *core, int resid)
+{
+	/*
+	 * Free a resource class only if the resource controller has
+	 * registered with core
+	 */
+	if (core->res_class[resid]) {
+		struct ckrm_res_ctlr *rcbs;
+		struct ckrm_classtype *clstype = core->classtype;
+
+		atomic_inc(&clstype->nr_resusers[resid]);
+		rcbs = clstype->res_ctlrs[resid];
+
+		if (rcbs && rcbs->res_free) {
+			(*rcbs->res_free) (core->res_class[resid]);
+			/* compensate inc in alloc */
+			atomic_dec(&clstype->nr_resusers[resid]);
+		}
+		atomic_dec(&clstype->nr_resusers[resid]);
+	}
+	core->res_class[resid] = NULL;
+}
+
+/*
+ * Free a core class
+ *   requires that all tasks were previously reassigned to another class
+ *
+ * Returns nothing; a failed child removal is only logged.
+ */
+
+void ckrm_free_core_class(struct ckrm_core_class *core)
+{
+	int i;
+	struct ckrm_classtype *clstype = core->classtype;
+	struct ckrm_core_class *parent = core->hnode.parent;
+
+	pr_debug("core=%p:%s parent=%p:%s\n", core, core->name, parent,
+		  parent ? parent->name : "(none)");
+	if (core->delayed) {
+		/* this core was marked as late */
+		printk("class <%s> finally deleted %lu\n", core->name, jiffies);
+	}
+	if (ckrm_remove_child(core) == 0) {
+		printk("Core class removal failed. Children present\n");
+	}
+	for (i = 0; i < clstype->max_resid; i++) {
+		ckrm_free_res_class(core, i);
+	}
+
+	write_lock(&ckrm_class_lock);
+	/* Clear the magic, so we would know if this core is reused. */
+	core->magic = 0;
+#if 0				/* Dynamic not yet enabled */
+	core->res_class = NULL;
+#endif
+	/* Remove this core class from its linked list. */
+	list_del(&core->clslist);
+	clstype->num_classes--;
+	set_callbacks_active(clstype);
+	write_unlock(&ckrm_class_lock);
+
+	/* fix for race condition seen in stress with numtasks */
+	if (parent)
+		ckrm_core_drop(parent);
+
+	kfree(core);
+}
+
+int ckrm_release_core_class(struct ckrm_core_class *core)
+{
+	if (!ckrm_is_core_valid(core)) /* Invalid core */
+		return -EINVAL;
+
+	if (core == core->classtype->default_class)
+		return 0;
+
+	/* need to make sure that the class really got dropped */
+	if (atomic_read(&core->refcnt) != 1) {
+		pr_debug("class <%s> deletion delayed refcnt=%d jif=%ld\n",
+			  core->name, atomic_read(&core->refcnt), jiffies);
+		core->delayed = 1;	/* just so we have a ref point */
+	}
+	ckrm_core_drop(core);
+	return 0;
+}
+
+/*
+ * Interfaces for the resource controller
+ */
+/*
+ * Registering a callback structure by the resource controller.
+ *
+ * Returns the resource id (>= 0) on success, -errno on failure.
+ */
+static int
+ckrm_register_res_ctlr_intern(struct ckrm_classtype *clstype,
+			      struct ckrm_res_ctlr * rcbs)
+{
+	int resid, ret, i;
+
+	if (!rcbs)
+		return -EINVAL;
+
+	resid = rcbs->resid;
+
+	spin_lock(&clstype->res_ctlrs_lock);
+	printk(KERN_WARNING "resid is %d name is %s\n",
+	       resid, rcbs->res_name);
+	if (resid >= 0) {
+		if ((resid < CKRM_MAX_RES_CTLRS)
+		    && (clstype->res_ctlrs[resid] == NULL)) {
+			clstype->res_ctlrs[resid] = rcbs;
+			atomic_set(&clstype->nr_resusers[resid], 0);
+			set_bit(resid, &clstype->bit_res_ctlrs);
+			ret = resid;
+			if (resid >= clstype->max_resid) {
+				clstype->max_resid = resid + 1;
+			}
+		} else {
+			ret = -EBUSY;
+		}
+		spin_unlock(&clstype->res_ctlrs_lock);
+		return ret;
+	}
+	for (i = clstype->resid_reserved; i < clstype->max_res_ctlrs; i++) {
+		if (clstype->res_ctlrs[i] == NULL) {
+			clstype->res_ctlrs[i] = rcbs;
+			rcbs->resid = i;
+			atomic_set(&clstype->nr_resusers[i], 0);
+			set_bit(i, &clstype->bit_res_ctlrs);
+			if (i >= clstype->max_resid) {
+				clstype->max_resid = i + 1;
+			}
+			spin_unlock(&clstype->res_ctlrs_lock);
+			return i;
+		}
+	}
+	spin_unlock(&clstype->res_ctlrs_lock);
+	return (-ENOMEM);
+}
+
+int
+ckrm_register_res_ctlr(struct ckrm_classtype *clstype, struct ckrm_res_ctlr *rcbs)
+{
+	struct ckrm_core_class *core;
+	int resid;
+
+	resid = ckrm_register_res_ctlr_intern(clstype, rcbs);
+
+	if (resid >= 0) {
+		/* run through all classes and create the resource class
+		 * object and if necessary "initialize" class in context
+		 * of this resource
+		 */
+		read_lock(&ckrm_class_lock);
+		list_for_each_entry(core, &clstype->classes, clslist) {
+			printk("CKRM .. create res clsobj for resource <%s>"
+			       " class <%s> par=%p\n", rcbs->res_name,
+			       core->name, core->hnode.parent);
+			ckrm_alloc_res_class(core, core->hnode.parent, resid);
+
+			if (clstype->add_resctrl) {
+				/* FIXME: this should be mandatory */
+				(*clstype->add_resctrl) (core, resid);
+			}
+		}
+		read_unlock(&ckrm_class_lock);
+	}
+	return resid;
+}
+
+/*
+ * Unregistering a callback structure by the resource controller.
+ *
+ * Returns 0 on success -errno for failure.
+ */
+int ckrm_unregister_res_ctlr(struct ckrm_res_ctlr *rcbs)
+{
+	struct ckrm_classtype *clstype = rcbs->classtype;
+	struct ckrm_core_class *core = NULL;
+	int resid = rcbs->resid;
+
+	if ((clstype == NULL) || (resid < 0)) {
+		return -EINVAL;
+	}
+	/* TODO: probably need to also call deregistration function */
+
+	read_lock(&ckrm_class_lock);
+	/* free up this resource from all the classes */
+	list_for_each_entry(core, &clstype->classes, clslist) {
+		ckrm_free_res_class(core, resid);
+	}
+	read_unlock(&ckrm_class_lock);
+
+	if (atomic_read(&clstype->nr_resusers[resid])) {
+		return -EBUSY;
+	}
+
+	spin_lock(&clstype->res_ctlrs_lock);
+	clstype->res_ctlrs[resid] = NULL;
+	clear_bit(resid, &clstype->bit_res_ctlrs);
+	clstype->max_resid = fls(clstype->bit_res_ctlrs);
+	rcbs->resid = -1;
+	spin_unlock(&clstype->res_ctlrs_lock);
+
+	return 0;
+}
+
+/*
+ * Class Type Registration
+ */
+
+/* TODO: What locking is needed here?*/
+
+struct ckrm_classtype *ckrm_classtypes[CKRM_MAX_CLASSTYPES];
+EXPORT_SYMBOL_GPL(ckrm_classtypes);
+
+int ckrm_register_classtype(struct ckrm_classtype *clstype)
+{
+	int tid = clstype->type_id;
+
+	if (tid != -1) {
+		if ((tid < 0) || (tid >= CKRM_MAX_CLASSTYPES)
+		    || (ckrm_classtypes[tid]))
+			return -EINVAL;
+	} else {
+		int i;
+		for (i = CKRM_RESV_CLASSTYPES; i < CKRM_MAX_CLASSTYPES; i++) {
+			if (ckrm_classtypes[i] == NULL) {
+				tid = i;
+				break;
+			}
+		}
+	}
+	if (tid == -1)
+		return -EBUSY;
+	clstype->type_id = tid;
+	ckrm_classtypes[tid] = clstype;
+
+	/* TODO: Need to call the callbacks of the RCFS client */
+	if (rcfs_fn.register_classtype) {
+		(*rcfs_fn.register_classtype) (clstype);
+		/* No error return for now. */
+	}
+	return tid;
+}
+
+int ckrm_unregister_classtype(struct ckrm_classtype *clstype)
+{
+	int tid = clstype->type_id;
+
+	if ((tid < 0) || (tid >= CKRM_MAX_CLASSTYPES)
+	    || (ckrm_classtypes[tid] != clstype))
+		return -EINVAL;
+
+	if (rcfs_fn.deregister_classtype) {
+		(*rcfs_fn.deregister_classtype) (clstype);
+		/* No error return for now */
+	}
+
+	ckrm_classtypes[tid] = NULL;
+	clstype->type_id = -1;
+	return 0;
+}
+
+struct ckrm_classtype *ckrm_find_classtype_by_name(const char *name)
+{
+	int i;
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) {
+		struct ckrm_classtype *ctype = ckrm_classtypes[i];
+		if (ctype && !strncmp(ctype->name, name, CKRM_MAX_TYPENAME_LEN))
+			return ctype;
+	}
+	return NULL;
+}
+
+/*
+ *   Generic Functions that can be used as default functions
+ *   in almost all classtypes
+ *     (a) function iterator over all resource classes of a class
+ *     (b) function invoker on a named resource
+ */
+
+int ckrm_class_show_shares(struct ckrm_core_class *core, struct seq_file *seq)
+{
+	int i;
+	struct ckrm_res_ctlr *rcbs;
+	struct ckrm_classtype *clstype = core->classtype;
+	struct ckrm_shares shares;
+
+	for (i = 0; i < clstype->max_resid; i++) {
+		atomic_inc(&clstype->nr_resusers[i]);
+		rcbs = clstype->res_ctlrs[i];
+		if (rcbs && rcbs->get_share_values) {
+			(*rcbs->get_share_values) (core->res_class[i], &shares);
+			seq_printf(seq,"res=%s,guarantee=%d,limit=%d,"
+				   "total_guarantee=%d,max_limit=%d\n",
+				   rcbs->res_name, shares.my_guarantee,
+				   shares.my_limit, shares.total_guarantee,
+				   shares.max_limit);
+		}
+		atomic_dec(&clstype->nr_resusers[i]);
+	}
+	return 0;
+}
+
+int ckrm_class_show_stats(struct ckrm_core_class *core, struct seq_file *seq)
+{
+	int i;
+	struct ckrm_res_ctlr *rcbs;
+	struct ckrm_classtype *clstype = core->classtype;
+
+	for (i = 0; i < clstype->max_resid; i++) {
+		atomic_inc(&clstype->nr_resusers[i]);
+		rcbs = clstype->res_ctlrs[i];
+		if (rcbs && rcbs->get_stats)
+			(*rcbs->get_stats) (core->res_class[i], seq);
+		atomic_dec(&clstype->nr_resusers[i]);
+	}
+	return 0;
+}
+
+int ckrm_class_show_config(struct ckrm_core_class *core, struct seq_file *seq)
+{
+	int i;
+	struct ckrm_res_ctlr *rcbs;
+	struct ckrm_classtype *clstype = core->classtype;
+
+	for (i = 0; i < clstype->max_resid; i++) {
+		atomic_inc(&clstype->nr_resusers[i]);
+		rcbs = clstype->res_ctlrs[i];
+		if (rcbs && rcbs->show_config)
+			(*rcbs->show_config) (core->res_class[i], seq);
+		atomic_dec(&clstype->nr_resusers[i]);
+	}
+	return 0;
+}
+
+int ckrm_class_set_config(struct ckrm_core_class *core, const char *resname,
+			  const char *cfgstr)
+{
+	struct ckrm_classtype *clstype = core->classtype;
+	struct ckrm_res_ctlr *rcbs = ckrm_resctlr_lookup(clstype, resname);
+	int rc;
+
+	if (rcbs == NULL || rcbs->set_config == NULL)
+		return -EINVAL;
+	rc = (*rcbs->set_config) (core->res_class[rcbs->resid], cfgstr);
+	return rc;
+}
+
+#define legalshare(a)   \
+         ( ((a) >=0) \
+	   || ((a) == CKRM_SHARE_UNCHANGED) \
+	   || ((a) == CKRM_SHARE_DONTCARE) )
+
+int ckrm_class_set_shares(struct ckrm_core_class *core, const char *resname,
+			  struct ckrm_shares *shares)
+{
+	struct ckrm_classtype *clstype = core->classtype;
+	struct ckrm_res_ctlr *rcbs;
+	int rc;
+
+	/* Check for legal values */
+	if (!legalshare(shares->my_guarantee) || !legalshare(shares->my_limit)
+	    || !legalshare(shares->total_guarantee)
+	    || !legalshare(shares->max_limit))
+		return -EINVAL;
+
+	rcbs = ckrm_resctlr_lookup(clstype, resname);
+	if (rcbs == NULL || rcbs->set_share_values == NULL)
+		return -EINVAL;
+	rc = (*rcbs->set_share_values) (core->res_class[rcbs->resid], shares);
+	return rc;
+}
+
+int ckrm_class_reset_stats(struct ckrm_core_class *core, const char *resname,
+			   const char *unused)
+{
+	struct ckrm_classtype *clstype = core->classtype;
+	struct ckrm_res_ctlr *rcbs = ckrm_resctlr_lookup(clstype, resname);
+	int rc;
+
+	if (rcbs == NULL || rcbs->reset_stats == NULL)
+		return -EINVAL;
+	rc = (*rcbs->reset_stats) (core->res_class[rcbs->resid]);
+	return rc;
+}
+
+/*
+ * Initialization
+ */
+
+void __init ckrm_init(void)
+{
+	printk("CKRM Initialization\n");
+	rwlock_init(&ckrm_class_lock);
+
+	/* register/initialize the Metatypes */
+
+#ifdef CONFIG_CKRM_TYPE_TASKCLASS
+	{
+		extern void ckrm_meta_init_taskclass(void);
+		ckrm_meta_init_taskclass();
+	}
+#endif
+#ifdef CONFIG_CKRM_TYPE_SOCKETCLASS
+	{
+		extern void ckrm_meta_init_sockclass(void);
+		ckrm_meta_init_sockclass();
+	}
+#endif
+	/* prepare init_task and then rely on inheritance of properties */
+	ckrm_cb_newtask(&init_task);
+	printk("CKRM Initialization done\n");
+}
+
+EXPORT_SYMBOL_GPL(ckrm_register_engine);
+EXPORT_SYMBOL_GPL(ckrm_unregister_engine);
+
+EXPORT_SYMBOL_GPL(ckrm_register_res_ctlr);
+EXPORT_SYMBOL_GPL(ckrm_unregister_res_ctlr);
+
+EXPORT_SYMBOL_GPL(ckrm_init_core_class);
+EXPORT_SYMBOL_GPL(ckrm_free_core_class);
+EXPORT_SYMBOL_GPL(ckrm_release_core_class);
+
+EXPORT_SYMBOL_GPL(ckrm_register_classtype);
+EXPORT_SYMBOL_GPL(ckrm_unregister_classtype);
+EXPORT_SYMBOL_GPL(ckrm_find_classtype_by_name);
+
+EXPORT_SYMBOL_GPL(ckrm_core_grab);
+EXPORT_SYMBOL_GPL(ckrm_core_drop);
+EXPORT_SYMBOL_GPL(ckrm_is_core_valid);
+EXPORT_SYMBOL_GPL(ckrm_validate_and_grab_core);
+
+EXPORT_SYMBOL_GPL(ckrm_register_event_set);
+EXPORT_SYMBOL_GPL(ckrm_unregister_event_set);
+EXPORT_SYMBOL_GPL(ckrm_register_event_cb);
+EXPORT_SYMBOL_GPL(ckrm_unregister_event_cb);
+
+EXPORT_SYMBOL_GPL(ckrm_class_show_stats);
+EXPORT_SYMBOL_GPL(ckrm_class_show_config);
+EXPORT_SYMBOL_GPL(ckrm_class_show_shares);
+
+EXPORT_SYMBOL_GPL(ckrm_class_set_config);
+EXPORT_SYMBOL_GPL(ckrm_class_set_shares);
+
+EXPORT_SYMBOL_GPL(ckrm_class_reset_stats);
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrmutils.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrmutils.c	2005-05-05 09:35:04.000000000 -0700
@@ -0,0 +1,188 @@
+/*
+ * ckrmutils.c - Utility functions for CKRM
+ *
+ * Copyright (C) Chandra Seetharaman,  IBM Corp. 2003
+ *           (C) Hubertus Franke    ,  IBM Corp. 2004
+ *
+ * Provides simple utility functions for the core module, CE and resource
+ * controllers.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ *  published by the Free Software Foundation.
+ */
+
+#include <linux/mm.h>
+#include <linux/err.h>
+#include <linux/mount.h>
+#include <linux/module.h>
+#include <linux/ckrm_rc.h>
+
+int get_exe_path_name(struct task_struct *tsk, char *buf, int buflen)
+{
+	struct vm_area_struct *vma;
+	struct vfsmount *mnt;
+	struct mm_struct *mm = get_task_mm(tsk);
+	struct dentry *dentry;
+	char *lname;
+	int rc = 0;
+
+	*buf = '\0';
+	if (!mm) {
+		return -EINVAL;
+	}
+	down_read(&mm->mmap_sem);
+	vma = mm->mmap;
+	while (vma) {
+		if ((vma->vm_flags & VM_EXECUTABLE) && vma->vm_file) {
+			dentry = dget(vma->vm_file->f_dentry);
+			mnt = mntget(vma->vm_file->f_vfsmnt);
+			lname = d_path(dentry, mnt, buf, buflen);
+			if (!IS_ERR(lname)) {
+				memmove(buf, lname, strlen(lname) + 1);
+			} else {
+				rc = (int)PTR_ERR(lname);
+			}
+			mntput(mnt);
+			dput(dentry);
+			break;
+		}
+		vma = vma->vm_next;
+	}
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+	return rc;
+}
+
+/*
+ * TODO:  Use sparse to enforce cnt_lock.
+ *
+ * must be called with cnt_lock of parres held
+ * Caller is responsible for making sure that the new guarantee doesn't
+ * overflow parent's total guarantee.
+ */
+void child_guarantee_changed(struct ckrm_shares *parent, int cur, int new)
+{
+	if (new == cur || !parent) {
+		return;
+	}
+	if (new != CKRM_SHARE_DONTCARE) {
+		parent->unused_guarantee -= new;
+	}
+	if (cur != CKRM_SHARE_DONTCARE) {
+		parent->unused_guarantee += cur;
+	}
+	return;
+}
+
+/*
+ * must be called with cnt_lock of parres held
+ * Caller is responsible for making sure that the new limit is not more
+ * than parent's max_limit
+ */
+void child_maxlimit_changed(struct ckrm_shares *parent, int new_limit)
+{
+	if (parent && parent->cur_max_limit < new_limit) {
+		parent->cur_max_limit = new_limit;
+	}
+	return;
+}
+
+/*
+ * Caller is responsible for holding any lock to protect the data
+ * structures passed to this function
+ */
+int
+set_shares(struct ckrm_shares *new, struct ckrm_shares *cur,
+	   struct ckrm_shares *par)
+{
+	int rc = -EINVAL;
+	int cur_usage_guar = cur->total_guarantee - cur->unused_guarantee;
+	int increase_by = new->my_guarantee - cur->my_guarantee;
+
+	/* Check total_guarantee for correctness */
+	if (new->total_guarantee <= CKRM_SHARE_DONTCARE) {
+		goto set_share_err;
+	} else if (new->total_guarantee == CKRM_SHARE_UNCHANGED) {
+		/* do nothing */;
+	} else if (cur_usage_guar > new->total_guarantee) {
+		goto set_share_err;
+	}
+	/* Check max_limit for correctness */
+	if (new->max_limit <= CKRM_SHARE_DONTCARE) {
+		goto set_share_err;
+	} else if (new->max_limit == CKRM_SHARE_UNCHANGED) {
+		/* do nothing */;
+	} else if (cur->cur_max_limit > new->max_limit) {
+		goto set_share_err;
+	}
+	/* Check my_guarantee for correctness */
+	if (new->my_guarantee == CKRM_SHARE_UNCHANGED) {
+		/* do nothing */;
+	} else if (new->my_guarantee == CKRM_SHARE_DONTCARE) {
+		/* do nothing */;
+	} else if (par && increase_by > par->unused_guarantee) {
+		goto set_share_err;
+	}
+	/* Check my_limit for correctness */
+	if (new->my_limit == CKRM_SHARE_UNCHANGED) {
+		/* do nothing */;
+	} else if (new->my_limit == CKRM_SHARE_DONTCARE) {
+		/* do nothing */;
+	} else if (par && new->my_limit > par->max_limit) {
+		/* I can't get more limit than my parent's limit */
+		goto set_share_err;
+
+	}
+	/* make sure guarantee does not exceed limit */
+	if (new->my_limit == CKRM_SHARE_DONTCARE) {
+		/* do nothing */;
+	} else if (new->my_limit == CKRM_SHARE_UNCHANGED) {
+		if (new->my_guarantee == CKRM_SHARE_DONTCARE) {
+			/* do nothing */;
+		} else if (new->my_guarantee == CKRM_SHARE_UNCHANGED) {
+			/*
+			 * do nothing; earlier setting would have
+			 * taken care of it
+			 */;
+		} else if (new->my_guarantee > cur->my_limit) {
+			goto set_share_err;
+		}
+	} else {		/* new->my_limit has a valid value */
+		if (new->my_guarantee == CKRM_SHARE_DONTCARE) {
+			/* do nothing */;
+		} else if (new->my_guarantee == CKRM_SHARE_UNCHANGED) {
+			if (cur->my_guarantee > new->my_limit) {
+				goto set_share_err;
+			}
+		} else if (new->my_guarantee > new->my_limit) {
+			goto set_share_err;
+		}
+	}
+	if (new->my_guarantee != CKRM_SHARE_UNCHANGED) {
+		child_guarantee_changed(par, cur->my_guarantee,
+					new->my_guarantee);
+		cur->my_guarantee = new->my_guarantee;
+	}
+	if (new->my_limit != CKRM_SHARE_UNCHANGED) {
+		child_maxlimit_changed(par, new->my_limit);
+		cur->my_limit = new->my_limit;
+	}
+	if (new->total_guarantee != CKRM_SHARE_UNCHANGED) {
+		cur->unused_guarantee = new->total_guarantee - cur_usage_guar;
+		cur->total_guarantee = new->total_guarantee;
+	}
+	if (new->max_limit != CKRM_SHARE_UNCHANGED) {
+		cur->max_limit = new->max_limit;
+	}
+	rc = 0;
+set_share_err:
+	return rc;
+}
+
+EXPORT_SYMBOL_GPL(get_exe_path_name);
+EXPORT_SYMBOL_GPL(child_guarantee_changed);
+EXPORT_SYMBOL_GPL(child_maxlimit_changed);
+EXPORT_SYMBOL_GPL(set_shares);
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/Makefile	2005-05-05 09:34:55.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile	2005-05-05 09:35:04.000000000 -0700
@@ -2,4 +2,4 @@
 # Makefile for CKRM
 #
 
-obj-y := ckrm_events.o
+obj-y += ckrm_events.o ckrm.o ckrmutils.o
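
The guarantee bookkeeping done by set_shares() and
child_guarantee_changed() in ckrmutils.c can be illustrated with a small
user-space sketch.  It is not part of the patch: struct shares_sketch,
guarantee_changed() and set_guarantee() are hypothetical names, only the
guarantee fields are modeled, and the two sentinel values are assumed
rather than taken from the CKRM headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel values; the real definitions live outside this patch. */
#define SHARE_UNCHANGED (-1)
#define SHARE_DONTCARE  (-2)

/* Hypothetical, trimmed-down stand-in for struct ckrm_shares. */
struct shares_sketch {
	int my_guarantee;	/* this class's share of its parent */
	int total_guarantee;	/* total this class can hand to children */
	int unused_guarantee;	/* part of the total not yet handed out */
};

/* Same bookkeeping as child_guarantee_changed(): return the old value
 * to the parent's unused pool and take the new one out of it. */
static void guarantee_changed(struct shares_sketch *parent, int cur,
			      int new_val)
{
	if (new_val == cur || !parent)
		return;
	if (new_val != SHARE_DONTCARE)
		parent->unused_guarantee -= new_val;
	if (cur != SHARE_DONTCARE)
		parent->unused_guarantee += cur;
}

/* Guarantee-only subset of set_shares().
 * Returns 0 on success, -1 if the parent cannot cover the increase. */
static int set_guarantee(struct shares_sketch *cur,
			 struct shares_sketch *par, int new_val)
{
	assert(cur != NULL);

	if (new_val == SHARE_UNCHANGED)
		return 0;
	if (new_val != SHARE_DONTCARE && par &&
	    new_val - cur->my_guarantee > par->unused_guarantee)
		return -1;

	guarantee_changed(par, cur->my_guarantee, new_val);
	cur->my_guarantee = new_val;
	return 0;
}
```

For example, with a parent whose unused_guarantee is 100, raising a
child's guarantee from 0 to 30 leaves 70 unused; a later request for 110
is rejected because the increase of 80 exceeds the 70 still available.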

--



* [patch 04/21] CKRM: Resource Control File System (rcfs)
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (2 preceding siblings ...)
  2005-05-05 18:07 ` [patch 03/21] CKRM: Core infrastructure gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 05/21] CKRM: Classtype definitions for task class gh
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=04-diff_rcfs


Updates the CKRM Resource Control Filesystem (rcfs) to include full
directory structure support.  rcfs is the user-level API for managing
CKRM.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Hubertus Franke <frankeh@us.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@us.ibm.com>
Signed-off-by: Vivek Kashyap <vivk@us.ibm.com>
Signed-off-by: Gerrit Huizenga <gh@us.ibm.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>

 fs/Makefile          |    1 
 fs/rcfs/Makefile     |    7 
 fs/rcfs/dir.c        |  220 +++++++++++++++++++++
 fs/rcfs/inode.c      |  160 +++++++++++++++
 fs/rcfs/magic.c      |  517 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/rcfs/rootdir.c    |  198 +++++++++++++++++++
 fs/rcfs/super.c      |  291 ++++++++++++++++++++++++++++
 include/linux/rcfs.h |   20 +
 init/Kconfig         |   10 
 9 files changed, 1417 insertions(+), 7 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/fs/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/Makefile	2005-03-01 23:38:10.000000000 -0800
+++ linux-2.6.12-rc3-ckrm5/fs/Makefile	2005-05-05 09:35:06.000000000 -0700
@@ -92,6 +92,7 @@ obj-$(CONFIG_JFS_FS)		+= jfs/
 obj-$(CONFIG_XFS_FS)		+= xfs/
 obj-$(CONFIG_AFS_FS)		+= afs/
 obj-$(CONFIG_BEFS_FS)		+= befs/
+obj-$(CONFIG_RCFS_FS)		+= rcfs/
 obj-$(CONFIG_HOSTFS)		+= hostfs/
 obj-$(CONFIG_HPPFS)		+= hppfs/
 obj-$(CONFIG_DEBUG_FS)		+= debugfs/
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/dir.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/dir.c	2005-05-05 09:35:06.000000000 -0700
@@ -0,0 +1,220 @@
+/*
+ * fs/rcfs/dir.c
+ *
+ * Copyright (C) Shailabh Nagar,  IBM Corp. 2004
+ *               Vivek Kashyap,   IBM Corp. 2004
+ *
+ *
+ * Directory operations for rcfs
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/dcache.h>
+#include <linux/seq_file.h>
+#include <linux/pagemap.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/smp_lock.h>
+#include <linux/backing-dev.h>
+#include <linux/parser.h>
+#include <linux/rcfs.h>
+#include <asm/uaccess.h>
+
+#define rcfs_positive(dentry)  ((dentry)->d_inode && !d_unhashed((dentry)))
+
+int rcfs_empty(struct dentry *dentry)
+{
+	struct dentry *child;
+	int ret = 0;
+
+	spin_lock(&dcache_lock);
+	list_for_each_entry(child, &dentry->d_subdirs, d_child)
+	    if (!rcfs_is_magic(child) && rcfs_positive(child))
+		goto out;
+	ret = 1;
+out:
+	spin_unlock(&dcache_lock);
+	return ret;
+}
+
+/* Directory inode operations */
+
+int rcfs_create_coredir(struct inode *dir, struct dentry *dentry)
+{
+
+	struct rcfs_inode_info *ripar, *ridir;
+	int sz;
+
+	ripar = rcfs_get_inode_info(dir);
+	ridir = rcfs_get_inode_info(dentry->d_inode);
+	/* Inform resource controllers - do Core operations */
+	if (ckrm_is_core_valid(ripar->core)) {
+		sz = strlen(ripar->name) + strlen(dentry->d_name.name) + 2;
+		ridir->name = kmalloc(sz, GFP_KERNEL);
+		if (!ridir->name) {
+			return -ENOMEM;
+		}
+		snprintf(ridir->name, sz, "%s/%s", ripar->name,
+			 dentry->d_name.name);
+		ridir->core = (*(ripar->core->classtype->alloc))
+		    (ripar->core, ridir->name);
+	} else {
+		printk(KERN_ERR "rcfs_mkdir: Invalid parent core %p\n",
+		       ripar->core);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+int rcfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
+{
+	int retval;
+	struct ckrm_classtype *clstype;
+
+	retval = rcfs_mknod(dir, dentry, mode | S_IFDIR, 0);
+	if (retval) {
+		printk(KERN_ERR "rcfs_mkdir: error in rcfs_mknod\n");
+		return retval;
+	}
+	dir->i_nlink++;
+	/* Inherit parent's ops since rcfs_mknod assigns noperm ops. */
+	dentry->d_inode->i_op = dir->i_op;
+	dentry->d_inode->i_fop = dir->i_fop;
+	retval = rcfs_create_coredir(dir, dentry);
+	if (retval) {
+		simple_rmdir(dir, dentry);
+		return retval;
+	}
+	/* create the default set of magic files */
+	clstype = (rcfs_get_inode_info(dentry->d_inode))->core->classtype;
+	rcfs_create_magic(dentry, &(((struct rcfs_magf *)clstype->mfdesc)[1]),
+			  clstype->mfcount - 3);
+	return retval;
+}
+
+int rcfs_rmdir(struct inode *dir, struct dentry *dentry)
+{
+	struct rcfs_inode_info *ri = rcfs_get_inode_info(dentry->d_inode);
+
+	if (!rcfs_empty(dentry)) {
+		printk(KERN_ERR "rcfs_rmdir: directory not empty\n");
+		return -ENOTEMPTY;
+	}
+	/* Core class removal  */
+
+	if (ri->core == NULL) {
+		printk(KERN_ERR "rcfs_rmdir: core==NULL\n");
+		/* likely a race condition */
+		return 0;
+	}
+
+	if ((*(ri->core->classtype->free)) (ri->core)) {
+		printk(KERN_ERR "rcfs_rmdir: ckrm_free_core_class failed\n");
+		goto out;
+	}
+	ri->core = NULL;	/* just to be safe */
+
+	/* Clear magic files only after core successfully removed */
+	rcfs_clear_magic(dentry);
+
+	return simple_rmdir(dir, dentry);
+
+out:
+	return -EBUSY;
+}
+
+int rcfs_unlink(struct inode *dir, struct dentry *dentry)
+{
+	/*
+	 * -ENOENT and not -EPERM to allow rm -rf to work despite
+	 * magic files being present
+	 */
+	return -ENOENT;
+}
+
+/* rename is allowed on directories only */
+int
+rcfs_rename(struct inode *old_dir, struct dentry *old_dentry,
+	    struct inode *new_dir, struct dentry *new_dentry)
+{
+	if (S_ISDIR(old_dentry->d_inode->i_mode))
+		return simple_rename(old_dir, old_dentry, new_dir, new_dentry);
+	else
+		return -EINVAL;
+}
+
+int rcfs_create_noperm(struct inode *dir, struct dentry *dentry, int mode, struct nameidata *nd)
+{
+	return -EPERM;
+}
+
+int rcfs_symlink_noperm(struct inode *dir, struct dentry *dentry, const char *symname)
+{
+	return -EPERM;
+}
+
+int rcfs_mkdir_noperm(struct inode *dir, struct dentry *dentry, int mode)
+{
+	return -EPERM;
+}
+
+int rcfs_rmdir_noperm(struct inode *dir, struct dentry *dentry)
+{
+	return -EPERM;
+}
+
+int rcfs_link_noperm(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
+{
+	return -EPERM;
+}
+
+int rcfs_unlink_noperm(struct inode *dir, struct dentry *dentry)
+{
+	return -EPERM;
+}
+
+int rcfs_mknod_noperm(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+	return -EPERM;
+}
+
+int rcfs_rename_noperm(struct inode *old_dir, struct dentry *old_dentry, struct inode *new_dir, struct dentry *new_dentry)
+{
+	return -EPERM;
+}
+
+struct inode_operations rcfs_dir_inode_operations = {
+	.create = rcfs_create_noperm,
+	.lookup = simple_lookup,
+	.link = rcfs_link_noperm,
+	.unlink = rcfs_unlink,
+	.symlink = rcfs_symlink_noperm,
+	.mkdir = rcfs_mkdir,
+	.rmdir = rcfs_rmdir,
+	.mknod = rcfs_mknod_noperm,
+	.rename = rcfs_rename,
+};
+
+struct inode_operations rcfs_rootdir_inode_operations = {
+	.create = rcfs_create_noperm,
+	.lookup = simple_lookup,
+	.link = rcfs_link_noperm,
+	.unlink = rcfs_unlink_noperm,
+	.symlink = rcfs_symlink_noperm,
+	.mkdir = rcfs_mkdir_noperm,
+	.rmdir = rcfs_rmdir_noperm,
+	.mknod = rcfs_mknod_noperm,
+	.rename = rcfs_rename_noperm,
+};
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/inode.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/inode.c	2005-05-05 09:35:06.000000000 -0700
@@ -0,0 +1,160 @@
+/*
+ * fs/rcfs/inode.c
+ *
+ * Copyright (C) Shailabh Nagar,  IBM Corp. 2004
+ *               Vivek Kashyap,   IBM Corp. 2004
+ *
+ * Resource class filesystem (rcfs) forming the
+ * user interface to Class-based Kernel Resource Management (CKRM).
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the  GNU General Public License
+ * as published by the Free Software Foundation.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/list.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/dcache.h>
+#include <linux/seq_file.h>
+#include <linux/pagemap.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/smp_lock.h>
+#include <linux/backing-dev.h>
+#include <linux/parser.h>
+#include <linux/rcfs.h>
+#include <asm/uaccess.h>
+
+/*
+ * Address of variable used as flag to indicate a magic file,
+ * value unimportant
+ */
+int RCFS_IS_MAGIC;
+
+struct inode *rcfs_get_inode(struct super_block *sb, int mode, dev_t dev)
+{
+	struct inode *inode = new_inode(sb);
+
+	if (inode) {
+		inode->i_mode = mode;
+		inode->i_uid = current->fsuid;
+		inode->i_gid = current->fsgid;
+		inode->i_blksize = PAGE_CACHE_SIZE;
+		inode->i_blocks = 0;
+		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+		switch (mode & S_IFMT) {
+		default:
+			init_special_inode(inode, mode, dev);
+			break;
+		case S_IFREG:
+			/* Treat as default assignment */
+			inode->i_op = &rcfs_file_inode_operations;
+			break;
+		case S_IFDIR:
+			inode->i_op = &rcfs_rootdir_inode_operations;
+			inode->i_fop = &simple_dir_operations;
+
+			/*
+			 * directory inodes start off with i_nlink == 2
+			 *  (for "." entry)
+			 */
+			inode->i_nlink++;
+			break;
+		case S_IFLNK:
+			inode->i_op = &page_symlink_inode_operations;
+			break;
+		}
+	}
+	return inode;
+}
+
+int rcfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+	struct inode *inode;
+	int error = -EPERM;
+
+	if (dentry->d_inode)
+		return -EEXIST;
+	inode = rcfs_get_inode(dir->i_sb, mode, dev);
+	if (inode) {
+		if (dir->i_mode & S_ISGID) {
+			inode->i_gid = dir->i_gid;
+			if (S_ISDIR(mode))
+				inode->i_mode |= S_ISGID;
+		}
+		d_instantiate(dentry, inode);
+		dget(dentry);
+		error = 0;
+	}
+	return error;
+}
+
+struct dentry *rcfs_create_internal(struct dentry *parent,
+				    struct rcfs_magf *magf, int magic)
+{
+	struct qstr qstr;
+	struct dentry *mfdentry;
+
+	/* Get new dentry for name */
+	qstr.name = magf->name;
+	qstr.len = strlen(magf->name);
+	qstr.hash = full_name_hash(magf->name, qstr.len);
+	mfdentry = lookup_hash(&qstr, parent);
+
+	if (!IS_ERR(mfdentry)) {
+		int err;
+
+		down(&parent->d_inode->i_sem);
+		if (magic && (magf->mode & S_IFDIR))
+			err = parent->d_inode->i_op->mkdir(parent->d_inode,
+							   mfdentry,
+							   magf->mode);
+		else {
+			err = rcfs_mknod(parent->d_inode, mfdentry,
+					  magf->mode, 0);
+			/*
+			 * rcfs_mknod doesn't increment parent's link count,
+			 * i_op->mkdir does.
+			 */
+			parent->d_inode->i_nlink++;
+		}
+		up(&parent->d_inode->i_sem);
+		if (err) {
+			dput(mfdentry);
+			return mfdentry;
+		}
+	}
+	return mfdentry;
+}
+
+int rcfs_delete_internal(struct dentry *mfdentry)
+{
+	struct dentry *parent;
+
+	if (!mfdentry || !mfdentry->d_parent)
+		return -EINVAL;
+	parent = mfdentry->d_parent;
+	if (!mfdentry->d_inode) {
+		return 0;
+	}
+	down(&mfdentry->d_inode->i_sem);
+	if (S_ISDIR(mfdentry->d_inode->i_mode))
+		simple_rmdir(parent->d_inode, mfdentry);
+	else
+		simple_unlink(parent->d_inode, mfdentry);
+	up(&mfdentry->d_inode->i_sem);
+	d_delete(mfdentry);
+
+	return 0;
+}
+
+struct inode_operations rcfs_file_inode_operations = {
+	.getattr = simple_getattr,
+};
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/magic.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/magic.c	2005-05-05 09:35:06.000000000 -0700
@@ -0,0 +1,517 @@
+/*
+ * fs/rcfs/magic.c
+ *
+ * Copyright (C) Shailabh Nagar,      IBM Corp. 2004
+ *           (C) Vivek Kashyap,       IBM Corp. 2004
+ *           (C) Chandra Seetharaman, IBM Corp. 2004
+ *           (C) Hubertus Franke,     IBM Corp. 2004
+ *
+ * File operations for common magic files in rcfs,
+ * the user interface for CKRM.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/dcache.h>
+#include <linux/seq_file.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/smp_lock.h>
+#include <linux/parser.h>
+#include <linux/rcfs.h>
+#include <asm/uaccess.h>
+
+#define MAX_INPUT_SIZE	100U
+
+static void mkvalidstr(char *s)
+{
+	char *p;
+
+	for (p = s; *p != '\0'; ++p) {
+		if (*p < ' ' || *p > '~') {
+			*p = '\0';
+			return;
+		}
+	}
+}
+
+
+static int
+magic_show(struct seq_file *s, void *v)
+{
+	int rc=0;
+	ssize_t precnt;
+	struct ckrm_core_class *core;
+	struct rcfs_inode_info *rcfs_i = (struct rcfs_inode_info *)s->private;
+	int (*func) (struct ckrm_core_class *, struct seq_file *) = NULL;
+
+	core = rcfs_i->core;
+
+	if (!ckrm_is_core_valid(core))
+		return -EINVAL;
+
+	precnt = s->count;
+	if (!strcmp(rcfs_i->mfdentry->d_name.name, RCFS_CONFIG_NAME)) {
+		func = core->classtype->show_config;
+	} else if (!strcmp(rcfs_i->mfdentry->d_name.name, RCFS_MEMBERS_NAME)) {
+		func = core->classtype->show_members;
+	} else if (!strcmp(rcfs_i->mfdentry->d_name.name, RCFS_STATS_NAME)) {
+		func = core->classtype->show_stats;
+	} else if (!strcmp(rcfs_i->mfdentry->d_name.name, RCFS_SHARES_NAME)) {
+		func = core->classtype->show_shares;
+	}
+	if (func)
+		rc = func(core, s);
+
+	if (s->count == precnt)
+		seq_printf(s, "No data to display\n");
+	return rc;
+}
+
+static int
+magic_open(struct inode *inode, struct file *file)
+{
+	struct rcfs_inode_info *ri;
+	int ret=-EINVAL;
+
+	if (file->f_dentry) {
+		ri = rcfs_get_inode_info(file->f_dentry->d_inode);
+		ret = single_open(file, magic_show, (void *)ri);
+	}
+	return ret;
+}
+
+static int
+magic_close(struct inode *inode, struct file *file)
+{
+	return single_release(inode, file);
+}
+
+enum parse_token_t {
+	parse_res_type, parse_str, parse_err
+};
+
+static match_table_t parse_tokens = {
+	{parse_res_type, "res=%s"},
+	{parse_str, NULL},
+	{parse_err, NULL},
+};
+
+static int
+magic_parse(const unsigned char *fname, char *options,
+			char **resstr, char **otherstr)
+{
+	char *p;
+	*resstr = NULL;
+
+	if (!options)
+		return 0;
+
+	while ((p = strsep(&options, ",")) != NULL) {
+		substring_t args[MAX_OPT_ARGS];
+		int token;
+
+		if (!*p)
+			continue;
+
+		token = match_token(p, parse_tokens, args);
+		switch (token) {
+		case parse_res_type:
+			*resstr = match_strdup(args);
+			if (!strcmp(fname, RCFS_CONFIG_NAME)) {
+				char *str = p + strlen(p) + 1;
+				*otherstr = kmalloc(strlen(str) + 1,
+							 GFP_KERNEL);
+				if (*otherstr == NULL) {
+					kfree(*resstr);
+					*resstr = NULL;
+					return 0;
+				} else {
+					strcpy(*otherstr, str);
+					return 1;
+				}
+			}
+			break;
+		case parse_str:
+			*otherstr = match_strdup(args);
+			break;
+		default:
+			return 0;
+		}
+	}
+	return (*resstr != NULL);
+}
+
+static ssize_t
+magic_write(struct file *file, const char __user *buf,
+			   size_t count, loff_t *ppos)
+{
+	struct rcfs_inode_info *ri =
+		rcfs_get_inode_info(file->f_dentry->d_parent->d_inode);
+	char *optbuf, *otherstr=NULL, *resname=NULL;
+	int done, rc = 0;
+	struct ckrm_core_class *core ;
+	int (*func) (struct ckrm_core_class *, const char *,
+			const char *) = NULL;
+
+	core = ri->core;
+	if (!ckrm_is_core_valid(core))
+		return -EINVAL;
+
+	if (count > MAX_INPUT_SIZE)
+		return -EINVAL;
+
+	if (!access_ok(VERIFY_READ, buf, count))
+		return -EFAULT;
+
+	down(&(ri->vfs_inode.i_sem));
+
+	optbuf = kmalloc(MAX_INPUT_SIZE+1, GFP_KERNEL);
+	if (!optbuf) {
+		up(&(ri->vfs_inode.i_sem));
+		return -ENOMEM;
+	}
+	__copy_from_user(optbuf, buf, count);
+	mkvalidstr(optbuf);
+	done = magic_parse(ri->mfdentry->d_name.name,
+			optbuf, &resname, &otherstr);
+	if (!done) {
+		printk(KERN_ERR "Error parsing data written to %s\n",
+				ri->mfdentry->d_name.name);
+		goto out;
+	}
+	if (!strcmp(ri->mfdentry->d_name.name, RCFS_CONFIG_NAME)) {
+		func = core->classtype->set_config;
+	} else if (!strcmp(ri->mfdentry->d_name.name, RCFS_STATS_NAME)) {
+		func = core->classtype->reset_stats;
+	}
+	if (func) {
+		rc = func(core, resname, otherstr);
+		if (rc) {
+			printk(KERN_ERR "magic_write: %s: error\n",
+				ri->mfdentry->d_name.name);
+		}
+	}
+out:
+	up(&(ri->vfs_inode.i_sem));
+	kfree(optbuf);
+	kfree(otherstr);
+	kfree(resname);
+	return rc ? rc : count;
+}
+
+/*
+ * Shared function used by Target / Reclassify
+ */
+
+static ssize_t
+target_reclassify_write(struct file *file, const char __user * buf,
+			size_t count, loff_t * ppos, int manual)
+{
+	struct rcfs_inode_info *ri = rcfs_get_inode_info(file->f_dentry->d_inode);
+	char *optbuf;
+	int rc = -EINVAL;
+	struct ckrm_classtype *clstype;
+
+	if (count > MAX_INPUT_SIZE)
+		return -EINVAL;
+	if (!access_ok(VERIFY_READ, buf, count))
+		return -EFAULT;
+	down(&(ri->vfs_inode.i_sem));
+	optbuf = kmalloc(MAX_INPUT_SIZE, GFP_KERNEL);
+	__copy_from_user(optbuf, buf, count);
+	mkvalidstr(optbuf);
+	clstype = ri->core->classtype;
+	if (clstype->forced_reclassify)
+		rc = (*clstype->forced_reclassify) (manual ? ri->core: NULL, optbuf);
+	up(&(ri->vfs_inode.i_sem));
+	kfree(optbuf);
+	return (!rc ? count : rc);
+
+}
+
+/*
+ * Target
+ *
+ * pseudo file for manually reclassifying members to a class
+ */
+
+static ssize_t
+target_write(struct file *file, const char __user * buf,
+	     size_t count, loff_t * ppos)
+{
+	return target_reclassify_write(file, buf, count, ppos, 1);
+}
+
+struct file_operations target_fileops = {
+	.write = target_write,
+};
+
+/*
+ * Reclassify
+ *
+ * pseudo file for reclassification of an object through CE
+ */
+
+static ssize_t
+reclassify_write(struct file *file, const char __user * buf,
+		 size_t count, loff_t * ppos)
+{
+	return target_reclassify_write(file, buf, count, ppos, 0);
+}
+
+struct file_operations reclassify_fileops = {
+	.write = reclassify_write,
+};
+
+/*
+ * Config
+ *
+ * Set/get configuration parameters of a class.
+ */
+
+/*
+ * Currently there are no per-class config parameters defined.
+ * Use existing code as a template
+ */
+
+struct file_operations config_fileops = {
+	.open           = magic_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = magic_close,
+	.write          = magic_write,
+};
+
+/*
+ * Members
+ *
+ * List members of a class
+ */
+
+struct file_operations members_fileops = {
+	.open           = magic_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = magic_close,
+};
+
+/*
+ * Stats
+ *
+ * Get/reset class statistics
+ * No standard set of stats defined. Each resource controller chooses
+ * its own set of statistics to maintain and export.
+ */
+
+struct file_operations stats_fileops = {
+	.open           = magic_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = magic_close,
+	.write          = magic_write,
+};
+
+/*
+ * Shares
+ *
+ * Set/get shares of a taskclass.
+ * Share types and semantics are defined by rcfs and ckrm core
+ */
+
+#define SHARES_MAX_INPUT_SIZE	300U
+
+/*
+ * The enums for the share types should match the indices expected by
+ * array parameter to ckrm_set_resshare
+ *
+ * Note only the first NUM_SHAREVAL enums correspond to share types,
+ * the remaining ones are for token matching purposes
+ */
+
+enum share_token_t {
+	MY_GUAR, MY_LIM, TOT_GUAR, MAX_LIM, SHARE_RES_TYPE, SHARE_ERR
+};
+
+/* Token matching for parsing input to this magic file */
+static match_table_t shares_tokens = {
+	{SHARE_RES_TYPE, "res=%s"},
+	{MY_GUAR, "guarantee=%d"},
+	{MY_LIM, "limit=%d"},
+	{TOT_GUAR, "total_guarantee=%d"},
+	{MAX_LIM, "max_limit=%d"},
+	{SHARE_ERR, NULL}
+};
+
+static int
+shares_parse(char *options, char **resstr, struct ckrm_shares *shares)
+{
+	char *p;
+	int option;
+
+	if (!options)
+		return 1;
+	while ((p = strsep(&options, ",")) != NULL) {
+		substring_t args[MAX_OPT_ARGS];
+		int token;
+
+		if (!*p)
+			continue;
+		token = match_token(p, shares_tokens, args);
+		switch (token) {
+		case SHARE_RES_TYPE:
+			*resstr = match_strdup(args);
+			break;
+		case MY_GUAR:
+			if (match_int(args, &option))
+				return 0;
+			shares->my_guarantee = option;
+			break;
+		case MY_LIM:
+			if (match_int(args, &option))
+				return 0;
+			shares->my_limit = option;
+			break;
+		case TOT_GUAR:
+			if (match_int(args, &option))
+				return 0;
+			shares->total_guarantee = option;
+			break;
+		case MAX_LIM:
+			if (match_int(args, &option))
+				return 0;
+			shares->max_limit = option;
+			break;
+		default:
+			return 0;
+		}
+	}
+	return 1;
+}
+
+static ssize_t
+shares_write(struct file *file, const char __user * buf,
+	     size_t count, loff_t * ppos)
+{
+	struct inode *inode = file->f_dentry->d_inode;
+	struct rcfs_inode_info *ri;
+	char *optbuf;
+	int rc = 0;
+	struct ckrm_core_class *core;
+	int done;
+	char *resname = NULL;
+
+	struct ckrm_shares newshares = {
+		CKRM_SHARE_UNCHANGED,
+		CKRM_SHARE_UNCHANGED,
+		CKRM_SHARE_UNCHANGED,
+		CKRM_SHARE_UNCHANGED,
+		CKRM_SHARE_UNCHANGED,
+		CKRM_SHARE_UNCHANGED
+	};
+	if (count > SHARES_MAX_INPUT_SIZE)
+		return -EINVAL;
+	if (!access_ok(VERIFY_READ, buf, count))
+		return -EFAULT;
+	ri = rcfs_get_inode_info(file->f_dentry->d_parent->d_inode);
+	if (!ri || !ckrm_is_core_valid((struct ckrm_core_class *) (ri->core))) {
+		printk(KERN_ERR "shares_write: Error accessing core class\n");
+		return -EFAULT;
+	}
+	down(&inode->i_sem);
+	core = ri->core;
+	optbuf = kmalloc(SHARES_MAX_INPUT_SIZE, GFP_KERNEL);
+	if (!optbuf) {
+		up(&inode->i_sem);
+		return -ENOMEM;
+	}
+	__copy_from_user(optbuf, buf, count);
+	mkvalidstr(optbuf);
+	done = shares_parse(optbuf, &resname, &newshares);
+	if (!done) {
+		printk(KERN_ERR "Error parsing shares\n");
+		rc = -EINVAL;
+		goto write_out;
+	}
+	if (core->classtype->set_shares) {
+		rc = (*core->classtype->set_shares) (core, resname, &newshares);
+		if (rc) {
+			printk(KERN_ERR
+			       "shares_write: resctlr share set error\n");
+			goto write_out;
+		}
+	}
+	printk(KERN_ERR "Set %s shares to %d %d %d %d\n",
+	       resname,
+	       newshares.my_guarantee,
+	       newshares.my_limit,
+	       newshares.total_guarantee, newshares.max_limit);
+	rc = count;
+
+write_out:
+	up(&inode->i_sem);
+	kfree(optbuf);
+	kfree(resname);
+	return rc;
+}
+
+struct file_operations shares_fileops = {
+	.open           = magic_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = magic_close,
+	.write          = shares_write,
+};
+
+/*
+ * magic file creation/deletion
+ */
+
+int rcfs_clear_magic(struct dentry *parent)
+{
+	struct dentry *mftmp, *mfdentry;
+
+	list_for_each_entry_safe(mfdentry, mftmp, &parent->d_subdirs, d_child) {
+		if (!rcfs_is_magic(mfdentry))
+			continue;
+		if (rcfs_delete_internal(mfdentry))
+			printk(KERN_ERR
+			       "rcfs_clear_magic: error deleting one\n");
+	}
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_clear_magic);
+
+int rcfs_create_magic(struct dentry *parent, struct rcfs_magf magf[], int count)
+{
+	int i;
+	struct dentry *mfdentry;
+
+	for (i = 0; i < count; i++) {
+		mfdentry = rcfs_create_internal(parent, &magf[i], 0);
+		if (IS_ERR(mfdentry)) {
+			rcfs_clear_magic(parent);
+			return -ENOMEM;
+		}
+		rcfs_get_inode_info(mfdentry->d_inode)->core =
+			 rcfs_get_inode_info(parent->d_inode)->core;
+		rcfs_get_inode_info(mfdentry->d_inode)->mfdentry = mfdentry;
+		mfdentry->d_fsdata = &RCFS_IS_MAGIC;
+		if (magf[i].i_fop)
+			mfdentry->d_inode->i_fop = magf[i].i_fop;
+		if (magf[i].i_op)
+			mfdentry->d_inode->i_op = magf[i].i_op;
+	}
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_create_magic);
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/Makefile	2005-05-05 09:35:06.000000000 -0700
@@ -0,0 +1,7 @@
+#
+# Makefile for rcfs routines.
+#
+
+obj-$(CONFIG_RCFS_FS) += rcfs.o
+
+rcfs-y := super.o inode.o dir.o rootdir.o magic.o
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/rootdir.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/rootdir.c	2005-05-05 09:35:06.000000000 -0700
@@ -0,0 +1,198 @@
+/*
+ * fs/rcfs/rootdir.c
+ *
+ * Copyright (C)   Vivek Kashyap,   IBM Corp. 2004
+ *
+ *
+ * Functions for creating root directories and magic files
+ * for classtypes and classification engines under rcfs
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/dcache.h>
+#include <linux/seq_file.h>
+#include <linux/pagemap.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/smp_lock.h>
+#include <linux/backing-dev.h>
+#include <linux/parser.h>
+#include <linux/rcfs.h>
+#include <asm/uaccess.h>
+
+struct rbce_eng_callback rcfs_eng_callbacks = {
+	NULL, NULL
+};
+
+int rcfs_register_engine(struct rbce_eng_callback * rcbs)
+{
+	if (!rcbs->mkdir || rcfs_eng_callbacks.mkdir) {
+		return -EINVAL;
+	}
+	rcfs_eng_callbacks = *rcbs;
+	rcfs_engine_regd++;
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_register_engine);
+
+int rcfs_unregister_engine(struct rbce_eng_callback * rcbs)
+{
+	if (!rcbs->mkdir || !rcfs_eng_callbacks.mkdir ||
+	    (rcbs->mkdir != rcfs_eng_callbacks.mkdir)) {
+		return -EINVAL;
+	}
+	rcfs_eng_callbacks.mkdir = NULL;
+	rcfs_eng_callbacks.rmdir = NULL;
+	rcfs_engine_regd--;
+	return 0;
+}
+
+EXPORT_SYMBOL(rcfs_unregister_engine);
+
+/*
+ * rcfs_mkroot
+ * Create and return a "root" dentry under /rcfs.
+ * Also create associated magic files
+ *
+ * @mfdesc:  array of rcfs_magf describing root dir and its magic files
+ * @mfcount: number of entries in mfdesc
+ * @rootde:  output parameter to return the newly created root dentry
+ * Returns 0 on success, or a negative errno on failure.
+ */
+
+int rcfs_mkroot(struct rcfs_magf *mfdesc, int mfcount, struct dentry **rootde)
+{
+	int sz;
+	struct rcfs_magf *rootdesc = &mfdesc[0];
+	struct dentry *dentry;
+	struct rcfs_inode_info *rootri;
+
+	if ((mfcount < 0) || (!mfdesc))
+		return -EINVAL;
+
+	rootdesc = &mfdesc[0];
+	printk("allocating classtype root <%s>\n", rootdesc->name);
+	dentry = rcfs_create_internal(rcfs_rootde, rootdesc, 0);
+
+	if (IS_ERR(dentry)) {
+		printk(KERN_ERR "Could not create %s\n", rootdesc->name);
+		return -ENOMEM;
+	}
+	rootri = rcfs_get_inode_info(dentry->d_inode);
+	sz = strlen(rootdesc->name) + strlen(RCFS_ROOT) + 2;
+	rootri->name = kmalloc(sz, GFP_KERNEL);
+	if (!rootri->name) {
+		printk(KERN_ERR "Error allocating name for %s\n",
+		       rootdesc->name);
+		rcfs_delete_internal(dentry);
+		return -ENOMEM;
+	}
+	snprintf(rootri->name, sz, "%s/%s", RCFS_ROOT, rootdesc->name);
+	if (rootdesc->i_fop)
+		dentry->d_inode->i_fop = rootdesc->i_fop;
+	if (rootdesc->i_op)
+		dentry->d_inode->i_op = rootdesc->i_op;
+
+	/* set output parameters */
+	*rootde = dentry;
+
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_mkroot);
+
+int rcfs_rmroot(struct dentry *rootde)
+{
+	struct rcfs_inode_info *ri;
+
+	if (!rootde)
+		return -EINVAL;
+
+	rcfs_clear_magic(rootde);
+	ri = rcfs_get_inode_info(rootde->d_inode);
+	kfree(ri->name);
+	ri->name = NULL;
+	rcfs_delete_internal(rootde);
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_rmroot);
+
+int rcfs_register_classtype(struct ckrm_classtype * clstype)
+{
+	int rc;
+	struct rcfs_inode_info *rootri;
+	struct rcfs_magf *mfdesc;
+
+	if (genmfdesc[clstype->mfidx] == NULL) {
+		return -ENOMEM;
+	}
+
+	clstype->mfdesc = (void *)genmfdesc[clstype->mfidx]->rootmf;
+	clstype->mfcount = genmfdesc[clstype->mfidx]->rootmflen;
+
+	mfdesc = (struct rcfs_magf *)clstype->mfdesc;
+
+	/* rcfs root entry has the same name as the classtype */
+	strncpy(mfdesc[0].name, clstype->name, RCFS_MAGF_NAMELEN);
+
+	rc = rcfs_mkroot(mfdesc, clstype->mfcount,
+			 (struct dentry **)&(clstype->rootde));
+	if (rc)
+		return rc;
+	rootri = rcfs_get_inode_info(((struct dentry *)(clstype->rootde))->d_inode);
+	rootri->core = clstype->default_class;
+	clstype->default_class->name = rootri->name;
+	ckrm_core_grab(clstype->default_class);
+
+	/* Create magic files under root */
+	if ((rc = rcfs_create_magic(clstype->rootde, &mfdesc[1],
+				    clstype->mfcount - 1))) {
+		kfree(rootri->name);
+		rootri->name = NULL;
+		rcfs_delete_internal(clstype->rootde);
+		return rc;
+	}
+	return rc;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_register_classtype);
+
+int rcfs_deregister_classtype(struct ckrm_classtype * clstype)
+{
+	int rc;
+
+	rc = rcfs_rmroot((struct dentry *)clstype->rootde);
+	if (!rc) {
+		clstype->default_class->name = NULL;
+		ckrm_core_drop(clstype->default_class);
+	}
+	return rc;
+}
+
+EXPORT_SYMBOL_GPL(rcfs_deregister_classtype);
+
+/* Common root and magic file entries.
+ * root name, root permissions, magic file names and magic file permissions
+ * are needed by all entities (classtypes and classification engines) existing
+ * under the rcfs mount point
+ *
+ * The common sets of these attributes are listed here as a table. Individual
+ * classtypes and classification engines can simply specify the index into the
+ * table to initialize their magf entries.
+ */
+
+struct rcfs_mfdesc *genmfdesc[] = {
+	NULL,
+};
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/super.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/super.c	2005-05-05 09:35:06.000000000 -0700
@@ -0,0 +1,291 @@
+/*
+ * fs/rcfs/super.c
+ *
+ * Copyright (C) Shailabh Nagar,  IBM Corp. 2004
+ *		 Vivek Kashyap,   IBM Corp. 2004
+ *
+ * Super block operations for rcfs
+ *
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License
+ * as published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/namespace.h>
+#include <linux/dcache.h>
+#include <linux/seq_file.h>
+#include <linux/pagemap.h>
+#include <linux/highmem.h>
+#include <linux/init.h>
+#include <linux/string.h>
+#include <linux/smp_lock.h>
+#include <linux/backing-dev.h>
+#include <linux/parser.h>
+#include <linux/rcfs.h>
+#include <linux/ckrm_rc.h>
+#include <linux/ckrm_ce.h>
+#include <asm/uaccess.h>
+
+static kmem_cache_t *rcfs_inode_cachep;
+
+inline struct rcfs_inode_info *rcfs_get_inode_info(struct inode *inode)
+{
+	return container_of(inode, struct rcfs_inode_info, vfs_inode);
+}
+
+static struct inode *rcfs_alloc_inode(struct super_block *sb)
+{
+	struct rcfs_inode_info *ri;
+	ri = (struct rcfs_inode_info *)kmem_cache_alloc(rcfs_inode_cachep,
+							SLAB_KERNEL);
+	if (!ri)
+		return NULL;
+	ri->name = NULL;
+	return &ri->vfs_inode;
+}
+
+static void rcfs_destroy_inode(struct inode *inode)
+{
+	struct rcfs_inode_info *ri = rcfs_get_inode_info(inode);
+
+	kfree(ri->name);
+	kmem_cache_free(rcfs_inode_cachep, ri);
+}
+
+static void
+rcfs_init_once(void *foo, kmem_cache_t * cachep, unsigned long flags)
+{
+	struct rcfs_inode_info *ri = (struct rcfs_inode_info *)foo;
+
+	if ((flags & (SLAB_CTOR_VERIFY | SLAB_CTOR_CONSTRUCTOR)) ==
+	    SLAB_CTOR_CONSTRUCTOR)
+		inode_init_once(&ri->vfs_inode);
+}
+
+int rcfs_init_inodecache(void)
+{
+	rcfs_inode_cachep = kmem_cache_create("rcfs_inode_cache",
+					      sizeof(struct rcfs_inode_info),
+					      0,
+					      SLAB_HWCACHE_ALIGN |
+					      SLAB_RECLAIM_ACCOUNT,
+					      rcfs_init_once, NULL);
+	if (rcfs_inode_cachep == NULL)
+		return -ENOMEM;
+	return 0;
+}
+
+void rcfs_destroy_inodecache(void)
+{
+	pr_debug("destroy inodecache was called\n");
+	if (kmem_cache_destroy(rcfs_inode_cachep))
+		printk(KERN_INFO
+		       "rcfs_inode_cache: not all structures were freed\n");
+}
+
+struct super_operations rcfs_super_ops = {
+	.alloc_inode = rcfs_alloc_inode,
+	.destroy_inode = rcfs_destroy_inode,
+	.statfs = simple_statfs,
+	.drop_inode = generic_delete_inode,
+};
+
+struct dentry *rcfs_rootde;	/* redundant; can also get it from sb */
+static struct inode *rcfs_root;
+static struct rcfs_inode_info *rcfs_rootri;
+
+static int rcfs_fill_super(struct super_block *sb, void *data, int silent)
+{
+	struct inode *inode;
+	struct dentry *root;
+	struct rcfs_inode_info *rootri;
+	struct ckrm_classtype *clstype;
+	int i, rc;
+
+	sb->s_fs_info = NULL;
+	if (rcfs_mounted) {
+		return -EPERM;
+	}
+	rcfs_mounted++;
+
+	sb->s_blocksize = PAGE_CACHE_SIZE;
+	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
+	sb->s_magic = RCFS_MAGIC;
+	sb->s_op = &rcfs_super_ops;
+	inode = rcfs_get_inode(sb, S_IFDIR | 0755, 0);
+	if (!inode)
+		return -ENOMEM;
+	inode->i_op = &rcfs_rootdir_inode_operations;
+
+	root = d_alloc_root(inode);
+	if (!root) {
+		iput(inode);
+		return -ENOMEM;
+	}
+	sb->s_root = root;
+
+	/* Link inode and core class */
+	rootri = rcfs_get_inode_info(inode);
+	rootri->name = kmalloc(strlen(RCFS_ROOT) + 1, GFP_KERNEL);
+	if (!rootri->name) {
+		d_delete(root);
+		iput(inode);
+		return -ENOMEM;
+	}
+	strcpy(rootri->name, RCFS_ROOT);
+	rootri->core = NULL;
+
+	rcfs_root = inode;
+	sb->s_fs_info = rcfs_root = inode;
+	rcfs_rootde = root;
+	rcfs_rootri = rootri;
+
+	/* register metatypes */
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) {
+		clstype = ckrm_classtypes[i];
+		if (clstype == NULL)
+			continue;
+		printk("A non null classtype\n");
+
+		if ((rc = rcfs_register_classtype(clstype)))
+			continue;	/* could return with an error too */
+	}
+
+	/*
+	 * do post-mount initializations needed by CE
+	 * this is distinct from CE registration done on rcfs module load
+	 */
+	if (rcfs_engine_regd) {
+		if (rcfs_eng_callbacks.mnt)
+			if ((rc = (*rcfs_eng_callbacks.mnt) ())) {
+				printk(KERN_ERR "Error in CE mnt %d\n", rc);
+			}
+	}
+	/*
+	 * Following comment handled by code above; keep nonetheless if it
+	 * can be done better
+	 *
+	 * register CE's with rcfs
+	 * check if CE loaded
+	 * call rcfs_register_engine for each classtype
+	 * AND rcfs_mkroot (preferably subsume latter in former)
+	 */
+	return 0;
+}
+
+static struct super_block *rcfs_get_sb(struct file_system_type *fs_type,
+				       int flags, const char *dev_name,
+				       void *data)
+{
+	return get_sb_nodev(fs_type, flags, data, rcfs_fill_super);
+}
+
+void rcfs_kill_sb(struct super_block *sb)
+{
+	int i, rc;
+	struct ckrm_classtype *clstype;
+
+	if (sb->s_fs_info != rcfs_root) {
+		generic_shutdown_super(sb);
+		return;
+	}
+	rcfs_mounted--;
+
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) {
+		clstype = ckrm_classtypes[i];
+		if (clstype == NULL || clstype->rootde == NULL)
+			continue;
+
+		if ((rc = rcfs_deregister_classtype(clstype))) {
+			printk(KERN_ERR "Error removing classtype %s\n",
+			       clstype->name);
+		}
+	}
+
+	/*
+	 * do pre-umount shutdown needed by CE
+	 * this is distinct from CE deregistration done on rcfs module unload
+	 */
+	if (rcfs_engine_regd) {
+		if (rcfs_eng_callbacks.umnt)
+			if ((rc = (*rcfs_eng_callbacks.umnt) ())) {
+				printk(KERN_ERR "Error in CE umnt %d\n", rc);
+				/* TODO: return ; until error handling improves */
+			}
+	}
+	/*
+	 * Following comment handled by code above; keep nonetheless if it
+	 * can be done better
+	 *
+	 * deregister CE with rcfs
+	 * Check if loaded
+	 * if ce is in  one directory /rcfs/ce,
+	 *       rcfs_deregister_engine for all classtypes within above
+	 *             codebase
+	 *       followed by
+	 *       rcfs_rmroot here
+	 * if ce in multiple (per-classtype) directories
+	 *       call rbce_deregister_engine within ckrm_deregister_classtype
+	 *
+	 * following will automatically clear rcfs root entry including its
+	 *  rcfs_inode_info
+	 */
+
+	generic_shutdown_super(sb);
+}
+
+static struct file_system_type rcfs_fs_type = {
+	.name = "rcfs",
+	.get_sb = rcfs_get_sb,
+	.kill_sb = rcfs_kill_sb,
+};
+
+struct rcfs_functions my_rcfs_fn = {
+	.mkroot = rcfs_mkroot,
+	.rmroot = rcfs_rmroot,
+	.register_classtype = rcfs_register_classtype,
+	.deregister_classtype = rcfs_deregister_classtype,
+};
+
+extern struct rcfs_functions rcfs_fn;
+
+static int __init init_rcfs_fs(void)
+{
+	int ret;
+
+	ret = register_filesystem(&rcfs_fs_type);
+	if (ret)
+		goto init_register_err;
+	ret = rcfs_init_inodecache();
+	if (ret)
+		goto init_cache_err;
+	rcfs_fn = my_rcfs_fn;
+	/*
+	 * Due to tight coupling of this module with ckrm
+	 * do not allow this module to be removed.
+	 */
+	try_module_get(THIS_MODULE);
+	return ret;
+
+init_cache_err:
+	unregister_filesystem(&rcfs_fs_type);
+init_register_err:
+	return ret;
+}
+
+static void __exit exit_rcfs_fs(void)
+{
+	rcfs_destroy_inodecache();
+	unregister_filesystem(&rcfs_fs_type);
+}
+
+module_init(init_rcfs_fs)
+module_exit(exit_rcfs_fs)
+
+MODULE_LICENSE("GPL");
Index: linux-2.6.12-rc3-ckrm5/include/linux/rcfs.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/rcfs.h	2005-05-05 09:35:04.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/linux/rcfs.h	2005-05-05 09:35:06.000000000 -0700
@@ -16,12 +16,24 @@
 #define RCFS_MAGF_NAMELEN 20
 extern int RCFS_IS_MAGIC;
 
+/*
+ * Following strings are the names of the system defined files under
+ * the rcfs filesystem
+ */
+
+#define RCFS_CONFIG_NAME	"config"
+#define RCFS_MEMBERS_NAME	"members"
+#define RCFS_STATS_NAME		"stats"
+#define RCFS_SHARES_NAME	"shares"
+#define RCFS_RECLASSIFY_NAME	"reclassify"
+
 #define rcfs_is_magic(dentry)  ((dentry)->d_fsdata == &RCFS_IS_MAGIC)
 
 struct rcfs_inode_info {
 	struct ckrm_core_class *core;
 	char *name;
 	struct inode vfs_inode;
+ 	struct dentry *mfdentry;
 };
 
 #define RCFS_DEFAULT_DIR_MODE	(S_IFDIR | S_IRUGO | S_IXUGO)
@@ -45,16 +57,10 @@ struct rcfs_mfdesc {
 
 extern struct rcfs_mfdesc *genmfdesc[];
 
-struct rcfs_inode_info *RCFS_I(struct inode *inode);
-
-int rcfs_empty(struct dentry *);
+struct rcfs_inode_info *rcfs_get_inode_info(struct inode *inode);
 struct inode *rcfs_get_inode(struct super_block *, int, dev_t);
 int rcfs_mknod(struct inode *, struct dentry *, int, dev_t);
-int _rcfs_mknod(struct inode *, struct dentry *, int, dev_t);
-int rcfs_mkdir(struct inode *, struct dentry *, int);
-struct ckrm_core_class *rcfs_make_core(struct dentry *, struct ckrm_core_class *);
 struct dentry *rcfs_set_magf_byname(char *, void *);
-
 struct dentry *rcfs_create_internal(struct dentry *, struct rcfs_magf *, int);
 int rcfs_delete_internal(struct dentry *);
 int rcfs_create_magic(struct dentry *, struct rcfs_magf *, int);
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:35:02.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:35:06.000000000 -0700
@@ -160,6 +160,16 @@ config CKRM
 	  If you say Y here, enable the Resource Class File System and at least
 	  one of the resource controllers below. Say N if you are unsure.
 
+config RCFS_FS
+	tristate "Resource Class File System (User API)"
+	depends on CKRM
+	default m
+	help
+	  RCFS is the filesystem API for CKRM. Compiling it as a module lets
+	  users load RCFS only if they intend to use CKRM.
+
+	  Say M if unsure, Y to save on module loading. N doesn't make sense
+	  when CKRM has been configured.
 endmenu
 
 config SYSCTL
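
For readers following the shares magic file in fs/rcfs/magic.c above, the
comma-separated option format accepted by shares_parse() can be approximated
in user space. This is only a sketch under stated assumptions, not the kernel
code: struct shares_sketch, match_int_opt() and parse_shares_sketch() are
hypothetical names, and plain libc calls stand in for the kernel's strsep(),
match_token() and match_int(). The token names (res=, guarantee=, limit=,
total_guarantee=, max_limit=) are the ones in the shares_tokens table.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the fields shares_parse() fills. */
struct shares_sketch {
	int my_guarantee;
	int my_limit;
	int total_guarantee;
	int max_limit;
};

/*
 * If tok has the form "<key><decimal int>", store the value in *out and
 * return 1. Return 0 if the key does not match or the value is malformed.
 */
static int match_int_opt(const char *tok, const char *key, int *out)
{
	size_t klen = strlen(key);
	char *end;
	long v;

	if (strncmp(tok, key, klen) != 0 || tok[klen] == '\0')
		return 0;
	v = strtol(tok + klen, &end, 10);
	if (*end != '\0')
		return 0;
	*out = (int)v;
	return 1;
}

/*
 * Split opts in place on ',' and fill in res and *s.
 * Returns 1 on success and 0 on an unrecognised or malformed token,
 * following the return convention of shares_parse() in the patch.
 */
static int parse_shares_sketch(char *opts, char *res, size_t reslen,
			       struct shares_sketch *s)
{
	char *tok = opts;

	while (tok != NULL) {
		char *comma = strchr(tok, ',');
		char *next = NULL;

		if (comma != NULL) {
			*comma = '\0';
			next = comma + 1;
		}
		if (*tok == '\0') {
			/* empty token: skip it, as the kernel loop does */
		} else if (strncmp(tok, "res=", 4) == 0) {
			snprintf(res, reslen, "%s", tok + 4);
		} else if (!match_int_opt(tok, "guarantee=", &s->my_guarantee) &&
			   !match_int_opt(tok, "limit=", &s->my_limit) &&
			   !match_int_opt(tok, "total_guarantee=",
					  &s->total_guarantee) &&
			   !match_int_opt(tok, "max_limit=", &s->max_limit)) {
			return 0;
		}
		tok = next;
	}
	return 1;
}
```

Fields that do not appear in the input keep whatever value the caller put
there, which is how shares_write() uses CKRM_SHARE_UNCHANGED as the initial
value of every field. The resource name passed after res= (for example the
controller added by the numtasks patch later in this series) is purely
illustrative here.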

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 05/21] CKRM: Classtype definitions for task class
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (3 preceding siblings ...)
  2005-05-05 18:07 ` [patch 04/21] CKRM: Resource Control File System (rcfs) gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 06/21] CKRM: Classtype definitions for socket class gh
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=05-diff_taskclass


 This patch provides the CKRM extensions needed to track task classes.
 It is the basis for task-class-based resource control of CPU, memory
 and disk I/O.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Hubertus Franke <frankeh@us.ibm.com>
Signed-off-by: Shailabh Nagar <nagar@us.ibm.com>
Signed-off-by: Vivek Kashyap <vivk@us.ibm.com>
Signed-off-by: Gerrit Huizenga <gh@us.ibm.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>


 fs/rcfs/Makefile        |    1 
 fs/rcfs/rootdir.c       |   12 
 fs/rcfs/tc_magic.c      |   93 +++++
 include/linux/ckrm_tc.h |   46 ++
 include/linux/sched.h   |   11 
 init/Kconfig            |   12 
 kernel/ckrm/Makefile    |    1 
 kernel/ckrm/ckrm_tc.c   |  745 ++++++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 915 insertions(+), 6 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/rcfs/Makefile	2005-05-05 09:35:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/Makefile	2005-05-05 09:35:07.000000000 -0700
@@ -5,3 +5,4 @@
 obj-$(CONFIG_RCFS_FS) += rcfs.o
 
 rcfs-y := super.o inode.o dir.o rootdir.o magic.o
+rcfs-$(CONFIG_CKRM_TYPE_TASKCLASS) += tc_magic.o
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/rootdir.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/rcfs/rootdir.c	2005-05-05 09:35:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/rootdir.c	2005-05-05 09:35:07.000000000 -0700
@@ -58,7 +58,7 @@ int rcfs_unregister_engine(struct rbce_e
 	return 0;
 }
 
-EXPORT_SYMBOL(rcfs_unregister_engine);
+EXPORT_SYMBOL_GPL(rcfs_unregister_engine);
 
 /*
  * rcfs_mkroot
@@ -183,6 +183,10 @@ int rcfs_deregister_classtype(struct ckr
 
 EXPORT_SYMBOL_GPL(rcfs_deregister_classtype);
 
+#ifdef CONFIG_CKRM_TYPE_TASKCLASS
+extern struct rcfs_mfdesc tc_mfdesc;
+#endif
+
 /* Common root and magic file entries.
  * root name, root permissions, magic file names and magic file permissions
  * are needed by all entities (classtypes and classification engines) existing
@@ -193,6 +197,10 @@ EXPORT_SYMBOL_GPL(rcfs_deregister_classt
  * table to initialize their magf entries.
  */
 
-struct rcfs_mfdesc *genmfdesc[] = {
+struct rcfs_mfdesc *genmfdesc[CKRM_MAX_CLASSTYPES] = {
+#ifdef CONFIG_CKRM_TYPE_TASKCLASS
+	&tc_mfdesc,
+#else
 	NULL,
+#endif
 };
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/tc_magic.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/tc_magic.c	2005-05-05 09:35:07.000000000 -0700
@@ -0,0 +1,93 @@
+/*
+ * fs/rcfs/tc_magic.c
+ *
+ * Copyright (C) Shailabh Nagar,      IBM Corp. 2004
+ *           (C) Vivek Kashyap,       IBM Corp. 2004
+ *           (C) Chandra Seetharaman, IBM Corp. 2004
+ *           (C) Hubertus Franke,     IBM Corp. 2004
+ *
+ * define magic fileops for taskclass classtype
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/rcfs.h>
+#include <linux/ckrm_tc.h>
+
+/*
+ * Taskclass general
+ *
+ * Define structures for taskclass root directory and its magic files
+ * In taskclasses, there is one set of magic files, created automatically under
+ * the taskclass root (upon classtype registration) and each directory (class)
+ * created subsequently. However, classtypes can also choose to have different
+ * sets of magic files created under their root and other directories under
+ * root using their mkdir function. RCFS only provides helper functions for
+ * creating the root directory and its magic files
+ *
+ */
+
+#define TC_FILE_MODE (S_IFREG | S_IRUGO | S_IWUSR)
+
+#define NR_TCROOTMF  7
+struct rcfs_magf tc_rootdesc[NR_TCROOTMF] = {
+	/* First entry must be root */
+	{
+	/* .name = should not be set, copy from classtype name */
+	 .mode = RCFS_DEFAULT_DIR_MODE,
+	 .i_op = &rcfs_dir_inode_operations,
+	 .i_fop = &simple_dir_operations,
+	 },
+	/* Rest are root's magic files */
+	{
+	 .name = "target",
+	 .mode = TC_FILE_MODE,
+	 .i_fop = &target_fileops,
+	 .i_op = &rcfs_file_inode_operations,
+	 },
+	{
+	 .name = "members",
+	 .mode = TC_FILE_MODE,
+	 .i_fop = &members_fileops,
+	 .i_op = &rcfs_file_inode_operations,
+	 },
+	{
+	 .name = "stats",
+	 .mode = TC_FILE_MODE,
+	 .i_fop = &stats_fileops,
+	 .i_op = &rcfs_file_inode_operations,
+	 },
+	{
+	 .name = "shares",
+	 .mode = TC_FILE_MODE,
+	 .i_fop = &shares_fileops,
+	 .i_op = &rcfs_file_inode_operations,
+	 },
+	/*
+	 * Reclassify and Config should be made available only at the
+	 * root level. Make sure they are the last two entries, as
+	 * rcfs_mkdir depends on it.
+	 */
+	{
+	 .name = "reclassify",
+	 .mode = TC_FILE_MODE,
+	 .i_fop = &reclassify_fileops,
+	 .i_op = &rcfs_file_inode_operations,
+	 },
+	{
+	 .name = "config",
+	 .mode = TC_FILE_MODE,
+	 .i_fop = &config_fileops,
+	 .i_op = &rcfs_file_inode_operations,
+	 },
+};
+
+struct rcfs_mfdesc tc_mfdesc = {
+	.rootmf = tc_rootdesc,
+	.rootmflen = NR_TCROOTMF,
+};
Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_tc.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_tc.h	2005-05-05 09:35:07.000000000 -0700
@@ -0,0 +1,46 @@
+/* ckrm_tc.h - Header file to be used by task class users
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003, 2004
+ *
+ * Provides data structures, macros and kernel API for the
+ * classtype, taskclass.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#ifndef _LINUX_CKRM_TC_H_
+#define _LINUX_CKRM_TC_H_
+
+#ifdef CONFIG_CKRM_TYPE_TASKCLASS
+#include <linux/ckrm_rc.h>
+
+#define TASK_CLASS_TYPE_NAME "taskclass"
+
+struct ckrm_task_class {
+	struct ckrm_core_class core;
+};
+
+/*
+ * Index into genmfdesc array, defined in fs/rcfs/rootdir.c,
+ * which has the mfdesc entry that taskclass wants to use.
+ */
+#define TC_MF_IDX  0
+
+extern int ckrm_forced_reclassify_pid(int, struct ckrm_task_class *);
+
+#else /* CONFIG_CKRM_TYPE_TASKCLASS */
+
+#define ckrm_forced_reclassify_pid(a, b) (0)
+
+#endif
+
+#endif /* _LINUX_CKRM_TC_H_ */
Index: linux-2.6.12-rc3-ckrm5/include/linux/sched.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/sched.h	2005-05-05 09:35:04.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/linux/sched.h	2005-05-05 09:35:07.000000000 -0700
@@ -738,14 +738,17 @@ struct task_struct {
 	nodemask_t mems_allowed;
 	int cpuset_mems_generation;
 #endif
-#ifdef CONFIG_DELAY_ACCT
-	struct task_delay_info delays;
-#endif
 #ifdef CONFIG_CKRM
 	spinlock_t  ckrm_tsklock;
 	void       *ce_data;
+#ifdef CONFIG_CKRM_TYPE_TASKCLASS
+	struct ckrm_task_class *taskclass;
+	struct list_head taskclass_link;
+#endif /* CONFIG_CKRM_TYPE_TASKCLASS */
+#endif /* CONFIG_CKRM */
+#ifdef CONFIG_DELAY_ACCT
+	struct task_delay_info delays;
 #endif
-
 };
 
 static inline pid_t process_group(struct task_struct *tsk)
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:35:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:35:07.000000000 -0700
@@ -170,6 +170,18 @@ config RCFS_FS
 
 	  Say M if unsure, Y to save on module loading. N doesn't make sense
 	  when CKRM has been configured.
+
+config CKRM_TYPE_TASKCLASS
+	bool "Class Manager for Task Groups"
+	depends on CKRM && RCFS_FS
+	default y
+	help
+	  TASKCLASS provides the extensions for CKRM to track task classes.
+	  This is the base to enable task class based resource control for
+	  cpu, memory and disk I/O.
+	
+	  Say Y if unsure
+
 endmenu
 
 config SYSCTL
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_tc.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_tc.c	2005-05-05 09:35:07.000000000 -0700
@@ -0,0 +1,745 @@
+/* ckrm_tc.c - Class-based Kernel Resource Management (CKRM)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003,2004
+ *           (C) Shailabh Nagar,  IBM Corp. 2003
+ *           (C) Chandra Seetharaman,  IBM Corp. 2003
+ *	     (C) Vivek Kashyap,	IBM Corp. 2004
+ *
+ *
+ * Provides kernel API of CKRM for in-kernel, per-resource controllers
+ * (one each for cpu, memory, io, network) and callbacks for
+ * classification modules.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/config.h>
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <asm/uaccess.h>
+#include <linux/mm.h>
+#include <asm/errno.h>
+#include <linux/string.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/ckrm_rc.h>
+
+#include <linux/ckrm_tc.h>
+
+static struct ckrm_task_class taskclass_dflt_class = {
+};
+
+const char *dflt_taskclass_name = TASK_CLASS_TYPE_NAME;
+
+static struct ckrm_core_class *ckrm_alloc_task_class(struct ckrm_core_class
+						     *parent, const char *name);
+static int ckrm_free_task_class(struct ckrm_core_class *core);
+
+static int tc_forced_reclassify(struct ckrm_core_class * target,
+				const char *resname);
+static int tc_show_members(struct ckrm_core_class *core, struct seq_file *seq);
+static void tc_add_resctrl(struct ckrm_core_class *core, int resid);
+
+struct ckrm_classtype ct_taskclass = {
+	.mfidx = TC_MF_IDX,
+	.name = TASK_CLASS_TYPE_NAME,
+	.type_id = CKRM_CLASSTYPE_TASK_CLASS,
+	.maxdepth = 3,		/* starting point */
+	.resid_reserved = 4,
+	.max_res_ctlrs = CKRM_MAX_RES_CTLRS,
+	.max_resid = 0,
+	.bit_res_ctlrs = 0L,
+	.res_ctlrs_lock = SPIN_LOCK_UNLOCKED,
+	.classes = LIST_HEAD_INIT(ct_taskclass.classes),
+
+	.default_class = &taskclass_dflt_class.core,
+
+	/* private version of functions */
+	.alloc = &ckrm_alloc_task_class,
+	.free = &ckrm_free_task_class,
+	.show_members = &tc_show_members,
+	.forced_reclassify = &tc_forced_reclassify,
+
+	/* use of default functions */
+	.show_shares = &ckrm_class_show_shares,
+	.show_stats = &ckrm_class_show_stats,
+	.show_config = &ckrm_class_show_config,
+	.set_config = &ckrm_class_set_config,
+	.set_shares = &ckrm_class_set_shares,
+	.reset_stats = &ckrm_class_reset_stats,
+
+	/* mandatory private version; no default available */
+	.add_resctrl = &tc_add_resctrl,
+};
+
+/*
+ * Change the task class of the given task.
+ *
+ * Change the task's task class  to "newcls" if the task's current
+ * class (task->taskclass) is same as given "oldcls", if it is non-NULL.
+ *
+ * Caller is responsible to make sure the task structure stays put through
+ * this function.
+ *
+ * This function should be called with the following locks NOT held
+ * 	- tsk->ckrm_tsklock
+ * 	- core->ckrm_lock, if core is NULL then ckrm_dflt_class.ckrm_lock
+ * 	- tsk->taskclass->ckrm_lock
+ *
+ * Function is also called with a ckrm_core_grab on the new core, hence
+ * it needs to be dropped if no assignment takes place.
+ */
+static void
+ckrm_set_taskclass(struct task_struct *tsk, struct ckrm_task_class *newcls,
+		   struct ckrm_task_class *oldcls, enum ckrm_event event)
+{
+	int i;
+	struct ckrm_classtype *clstype;
+	struct ckrm_res_ctlr *rcbs;
+	struct ckrm_task_class *curcls;
+	void *old_res_class, *new_res_class;
+	int drop_old_cls;
+
+	spin_lock(&tsk->ckrm_tsklock);
+	curcls = tsk->taskclass;
+
+	if ((void *)-1 == curcls) {
+		/* task is disassociated from ckrm.  Don't bother it. */
+		spin_unlock(&tsk->ckrm_tsklock);
+		ckrm_core_drop(class_core(newcls));
+		return;
+	}
+
+	if ((curcls == NULL) && (newcls == (void *)-1)) {
+		/*
+		 * Task needs to be disassociated from ckrm and has no class;
+		 * just disassociate and return.
+		 */
+		tsk->taskclass = newcls;
+		spin_unlock(&tsk->ckrm_tsklock);
+		return;
+	}
+	if (oldcls && (oldcls != curcls)) {
+		spin_unlock(&tsk->ckrm_tsklock);
+		if (newcls) {
+			/* compensate for previous grab */
+			pr_debug("(%s:%d): Race-condition caught <%s> %d\n",
+				 tsk->comm, tsk->pid, class_core(newcls)->name,
+				 event);
+			ckrm_core_drop(class_core(newcls));
+		}
+		return;
+	}
+	/* Make sure we have a real destination core. */
+	if (!newcls) {
+		newcls = &taskclass_dflt_class;
+		ckrm_core_grab(class_core(newcls));
+	}
+	/* Take out of old class and drop the oldcore. */
+	if ((drop_old_cls = (curcls != NULL))) {
+		class_lock(class_core(curcls));
+		if (newcls == curcls) {
+			/*
+			 * We are already in the destination class.
+			 * we still need to drop oldcore.
+			 */
+			class_unlock(class_core(curcls));
+			spin_unlock(&tsk->ckrm_tsklock);
+			goto out;
+		}
+		list_del(&tsk->taskclass_link);
+		INIT_LIST_HEAD(&tsk->taskclass_link);
+		tsk->taskclass = NULL;
+		class_unlock(class_core(curcls));
+		if (newcls == (void *)-1) {
+			tsk->taskclass = newcls;
+			spin_unlock(&tsk->ckrm_tsklock);
+
+			/* still need to get out of old class. */
+			newcls = NULL;
+			goto rc_handling;
+		}
+	}
+	/* put into new class */
+	class_lock(class_core(newcls));
+	tsk->taskclass = newcls;
+	list_add(&tsk->taskclass_link, &class_core(newcls)->objlist);
+	class_unlock(class_core(newcls));
+
+	if (newcls == curcls) {
+		spin_unlock(&tsk->ckrm_tsklock);
+		goto out;
+	}
+
+	CE_NOTIFY(&ct_taskclass, event, newcls, tsk);
+
+	spin_unlock(&tsk->ckrm_tsklock);
+
+      rc_handling:
+	clstype = &ct_taskclass;
+	if (clstype->bit_res_ctlrs) {	
+		/* avoid running through the entire list if none are registered */
+		for (i = 0; i < clstype->max_resid; i++) {
+			if (clstype->res_ctlrs[i] == NULL)
+				continue;
+			atomic_inc(&clstype->nr_resusers[i]);
+			old_res_class =
+			    curcls ? class_core(curcls)->res_class[i] : NULL;
+			new_res_class =
+			    newcls ? class_core(newcls)->res_class[i] : NULL;
+			rcbs = clstype->res_ctlrs[i];
+			if (rcbs && rcbs->change_resclass
+			    && (old_res_class != new_res_class))
+				(*rcbs->change_resclass) (tsk, old_res_class,
+							  new_res_class);
+			atomic_dec(&clstype->nr_resusers[i]);
+		}
+	}
+
+      out:
+	if (drop_old_cls)
+		ckrm_core_drop(class_core(curcls));
+	return;
+}
+
+static void tc_add_resctrl(struct ckrm_core_class *core, int resid)
+{
+	struct task_struct *tsk;
+	struct ckrm_res_ctlr *rcbs;
+
+	if ((resid < 0) || (resid >= CKRM_MAX_RES_CTLRS)
+	    || ((rcbs = core->classtype->res_ctlrs[resid]) == NULL))
+		return;
+
+	class_lock(core);
+	list_for_each_entry(tsk, &core->objlist, taskclass_link) {
+		if (rcbs->change_resclass)
+			(*rcbs->change_resclass) (tsk, (void *)-1,
+						  core->res_class[resid]);
+	}
+	class_unlock(core);
+}
+
+/**************************************************************************
+ *                   Functions called from classification points          *
+ **************************************************************************/
+
+#define CE_CLASSIFY_TASK(event, tsk)					\
+do {									\
+	struct ckrm_task_class *newcls = NULL;				\
+ 	struct ckrm_task_class *oldcls = tsk->taskclass;		\
+									\
+	CE_CLASSIFY_RET(newcls,&ct_taskclass,event,tsk);		\
+	if (newcls) {							\
+		/* called synchronously. no need to get task struct */	\
+		ckrm_set_taskclass(tsk, newcls, oldcls, event);		\
+	}								\
+} while (0)
+
+
+#define CE_CLASSIFY_TASK_PROTECT(event, tsk)	\
+do {						\
+	ce_protect(&ct_taskclass);		\
+	CE_CLASSIFY_TASK(event,tsk);		\
+	ce_release(&ct_taskclass);              \
+} while (0)
+
+static void cb_taskclass_newtask(struct task_struct *tsk)
+{
+	tsk->taskclass = NULL;
+	INIT_LIST_HEAD(&tsk->taskclass_link);
+}
+
+static void cb_taskclass_fork(struct task_struct *tsk)
+{
+	struct ckrm_task_class *cls = NULL;
+
+	pr_debug("%p:%d:%s\n", tsk, tsk->pid, tsk->comm);
+
+	ce_protect(&ct_taskclass);
+	CE_CLASSIFY_RET(cls, &ct_taskclass, CKRM_EVENT_FORK, tsk);
+	if (cls == NULL) {
+		spin_lock(&tsk->parent->ckrm_tsklock);
+		cls = tsk->parent->taskclass;
+		ckrm_core_grab(class_core(cls));
+		spin_unlock(&tsk->parent->ckrm_tsklock);
+	}
+	if (!list_empty(&tsk->taskclass_link))
+		pr_debug("cb_taskclass_fork: BUG in cb_fork.. tsk <%s:%d> already linked\n",
+		       tsk->comm, tsk->pid);
+
+	ckrm_set_taskclass(tsk, cls, NULL, CKRM_EVENT_FORK);
+	ce_release(&ct_taskclass);
+}
+
+static void cb_taskclass_exit(struct task_struct *tsk)
+{
+	CE_CLASSIFY_NORET(&ct_taskclass, CKRM_EVENT_EXIT, tsk);
+	ckrm_set_taskclass(tsk, (void *)-1, NULL, CKRM_EVENT_EXIT);
+}
+
+static void cb_taskclass_exec(const char *filename)
+{
+	pr_debug("%p:%d:%s <%s>\n", current, current->pid, current->comm,
+		   filename);
+	CE_CLASSIFY_TASK_PROTECT(CKRM_EVENT_EXEC, current);
+}
+
+static void cb_taskclass_uid(void)
+{
+	pr_debug("%p:%d:%s\n", current, current->pid, current->comm);
+	CE_CLASSIFY_TASK_PROTECT(CKRM_EVENT_UID, current);
+}
+
+static void cb_taskclass_gid(void)
+{
+	pr_debug("%p:%d:%s\n", current, current->pid, current->comm);
+	CE_CLASSIFY_TASK_PROTECT(CKRM_EVENT_GID, current);
+}
+
+static struct ckrm_event_spec taskclass_events_callbacks[] = {
+	{CKRM_EVENT_NEWTASK, { cb_taskclass_newtask, NULL}},
+	{CKRM_EVENT_EXEC, { cb_taskclass_exec, NULL }},
+	{CKRM_EVENT_FORK, { cb_taskclass_fork, NULL }},
+	{CKRM_EVENT_EXIT, { cb_taskclass_exit, NULL }},
+	{CKRM_EVENT_UID, { cb_taskclass_uid, NULL }},
+	{CKRM_EVENT_GID, { cb_taskclass_gid, NULL }},
+	{-1, { -1, NULL }}
+};
+
+/*
+ * Asynchronous callback functions   (driven by RCFS)
+ *
+ *    Async functions force a setting of the task structure
+ *    synchronous callbacks are protected against race conditions
+ *    by using a cmpxchg on the core before setting it.
+ *    Async calls need to be serialized to ensure they can't
+ *    race against each other
+ */
+
+static DECLARE_MUTEX(ckrm_async_serializer);	/* serialize all async functions */
+
+/*
+ * Go through the task list and reclassify all tasks according to the current
+ * classification rules.
+ *
+ * We have the problem that we cannot hold any lock (including the
+ * tasklist_lock) while classifying. Two methods are possible:
+ *
+ * (a) go through entire pidrange (0..pidmax) and if a task exists at
+ *     that pid then reclassify it
+ * (b) go through the task list several times, building a bitmap for one
+ *     pid subrange per pass, since a full bitmap might need too much memory.
+ *
+ * We use a hybrid, choosing between them by the ratio pidmax/nr_threads.
+ */
+
+static int ckrm_reclassify_all_tasks(void)
+{
+	extern int pid_max;
+
+	struct task_struct *proc, *thread;
+	int i;
+	int curpidmax = pid_max;
+	int ratio;
+	int use_bitmap;
+
+	/* Check permissions */
+	if ((!capable(CAP_SYS_NICE)) && (!capable(CAP_SYS_RESOURCE))) {
+		return -EPERM;
+	}
+
+	ratio = curpidmax / nr_threads;
+	if (curpidmax <= PID_MAX_DEFAULT) {
+		use_bitmap = 1;
+	} else {
+		use_bitmap = (ratio >= 2);
+	}
+
+	ce_protect(&ct_taskclass);
+
+      retry:
+
+	if (use_bitmap == 0) {
+		/* Go through it in one walk. */
+		read_lock(&tasklist_lock);
+		for (i = 0; i < curpidmax; i++) {
+			if ((thread = find_task_by_pid(i)) == NULL)
+				continue;
+			get_task_struct(thread);
+			read_unlock(&tasklist_lock);
+			CE_CLASSIFY_TASK(CKRM_EVENT_RECLASSIFY, thread);
+			put_task_struct(thread);
+			read_lock(&tasklist_lock);
+		}
+		read_unlock(&tasklist_lock);
+	} else {
+		unsigned long *bitmap;
+		int bitmapsize;
+		int order = 0;
+		int num_loops;
+		int pid, do_next;
+
+		bitmap = (unsigned long *)__get_free_pages(GFP_KERNEL, order);
+		if (bitmap == NULL) {
+			use_bitmap = 0;
+			goto retry;
+		}
+
+		bitmapsize = 8 * (1 << (order + PAGE_SHIFT));
+		num_loops = (curpidmax + bitmapsize - 1) / bitmapsize;
+
+		do_next = 1;
+		for (i = 0; i < num_loops && do_next; i++) {
+			int pid_start = i * bitmapsize;
+			int pid_end = pid_start + bitmapsize;
+			int num_found = 0;
+			int pos;
+
+			memset(bitmap, 0, bitmapsize / 8);	/* start afresh */
+			do_next = 0;
+
+			read_lock(&tasklist_lock);
+			do_each_thread(proc, thread) {
+				pid = thread->pid;
+				if ((pid < pid_start) || (pid >= pid_end)) {
+					if (pid >= pid_end) {
+						do_next = 1;
+					}
+					continue;
+				}
+				pid -= pid_start;
+				set_bit(pid, bitmap);
+				num_found++;
+			}
+			while_each_thread(proc, thread);
+			read_unlock(&tasklist_lock);
+
+			if (num_found == 0)
+				continue;
+
+			pos = 0;
+			for (; num_found--;) {
+				pos = find_next_bit(bitmap, bitmapsize, pos);
+				pid = pos + pid_start;
+
+				read_lock(&tasklist_lock);
+				if ((thread = find_task_by_pid(pid)) != NULL) {
+					get_task_struct(thread);
+					read_unlock(&tasklist_lock);
+					CE_CLASSIFY_TASK(CKRM_EVENT_RECLASSIFY,
+							 thread);
+					put_task_struct(thread);
+				} else {
+					read_unlock(&tasklist_lock);
+				}
+				pos++;
+			}
+		}
+
+	}
+	ce_release(&ct_taskclass);
+	return 0;
+}
+
+/*
+ * Reclassify all tasks in the given core class.
+ */
+
+static void ckrm_reclassify_class_tasks(struct ckrm_task_class *cls)
+{
+	int ce_regd;
+	struct ckrm_hnode *cnode;
+	struct ckrm_task_class *parcls;
+	int num = 0;
+
+	if (!ckrm_validate_and_grab_core(&cls->core))
+		return;
+
+	down(&ckrm_async_serializer);
+	pr_debug("start %p:%s:%d:%d\n", cls, cls->core.name,
+		 atomic_read(&cls->core.refcnt),
+		 atomic_read(&cls->core.hnode.parent->refcnt));
+	/*
+	 * If no CE registered for this classtype, following will be needed
+	 * repeatedly.
+	 */
+	ce_regd = atomic_read(&class_core(cls)->classtype->ce_regd);
+	cnode = &(class_core(cls)->hnode);
+	parcls = class_type(struct ckrm_task_class, cnode->parent);
+
+      next_task:
+	class_lock(class_core(cls));
+	if (!list_empty(&class_core(cls)->objlist)) {
+		struct ckrm_task_class *newcls = NULL;
+		struct task_struct *tsk =
+		    list_entry(class_core(cls)->objlist.next,
+			       struct task_struct, taskclass_link);
+
+		get_task_struct(tsk);
+		class_unlock(class_core(cls));
+
+		if (ce_regd) {
+			CE_CLASSIFY_RET(newcls, &ct_taskclass,
+					CKRM_EVENT_RECLASSIFY, tsk);
+			if (cls == newcls) {
+				/*
+				 * Don't allow reclassifying to the same class
+				 * as we are in the process of cleaning up
+				 * this class
+				 */
+
+				/* compensate for CE's grab */
+				ckrm_core_drop(class_core(newcls));	
+				newcls = NULL;
+			}
+		}
+		if (newcls == NULL) {
+			newcls = parcls;
+			ckrm_core_grab(class_core(newcls));
+		}
+		ckrm_set_taskclass(tsk, newcls, cls, CKRM_EVENT_RECLASSIFY);
+		put_task_struct(tsk);
+		num++;
+		goto next_task;
+	}
+	pr_debug("stop  %p:%s:%d:%d   %d\n", cls, cls->core.name,
+		 atomic_read(&cls->core.refcnt),
+		 atomic_read(&cls->core.hnode.parent->refcnt), num);
+	class_unlock(class_core(cls));
+	ckrm_core_drop(class_core(cls));
+
+	up(&ckrm_async_serializer);
+
+	return;
+}
+
+/*
+ * Change the core class of the given task
+ */
+
+int ckrm_forced_reclassify_pid(pid_t pid, struct ckrm_task_class *cls)
+{
+	struct task_struct *tsk;
+
+	if (cls && !ckrm_validate_and_grab_core(class_core(cls)))
+		return -EINVAL;
+
+	read_lock(&tasklist_lock);
+	if ((tsk = find_task_by_pid(pid)) == NULL) {
+		read_unlock(&tasklist_lock);
+		if (cls)
+			ckrm_core_drop(class_core(cls));
+		return -EINVAL;
+	}
+	get_task_struct(tsk);
+	read_unlock(&tasklist_lock);
+
+	/* Check permissions */
+	if ((!capable(CAP_SYS_NICE)) &&
+	    (!capable(CAP_SYS_RESOURCE)) && (current->user != tsk->user)) {
+		if (cls)
+			ckrm_core_drop(class_core(cls));
+		put_task_struct(tsk);
+		return -EPERM;
+	}
+
+	ce_protect(&ct_taskclass);
+	if (cls == NULL)
+		CE_CLASSIFY_TASK(CKRM_EVENT_RECLASSIFY,tsk);
+	else
+		ckrm_set_taskclass(tsk, cls, NULL, CKRM_EVENT_MANUAL);
+
+	ce_release(&ct_taskclass);
+	put_task_struct(tsk);
+
+	return 0;
+}
+
+static struct ckrm_core_class *ckrm_alloc_task_class(struct ckrm_core_class
+						     *parent, const char *name)
+{
+	struct ckrm_task_class *taskcls;
+	taskcls = kmalloc(sizeof(struct ckrm_task_class), GFP_KERNEL);
+	if (taskcls == NULL)
+		return NULL;
+	memset(taskcls, 0, sizeof(struct ckrm_task_class));
+
+	ckrm_init_core_class(&ct_taskclass, class_core(taskcls), parent, name);
+
+	ce_protect(&ct_taskclass);
+	if (ct_taskclass.ce_cb_active && ct_taskclass.ce_callbacks.class_add)
+		(*ct_taskclass.ce_callbacks.class_add) (name, taskcls,
+							ct_taskclass.type_id);
+	ce_release(&ct_taskclass);
+
+	return class_core(taskcls);
+}
+
+static int ckrm_free_task_class(struct ckrm_core_class *core)
+{
+	struct ckrm_task_class *taskcls;
+
+	if (!ckrm_is_core_valid(core)) {
+		return (-EINVAL);		/* Invalid core */
+	}
+	if (core == core->classtype->default_class) {
+		/* reset the name tag */
+		core->name = dflt_taskclass_name;
+		return 0;
+	}
+
+	pr_debug("%p:%s:%d\n", core, core->name, atomic_read(&core->refcnt));
+
+	taskcls = class_type(struct ckrm_task_class, core);
+
+	ce_protect(&ct_taskclass);
+
+	if (ct_taskclass.ce_cb_active && ct_taskclass.ce_callbacks.class_delete)
+		(*ct_taskclass.ce_callbacks.class_delete) (core->name, taskcls,
+							   ct_taskclass.type_id);
+	ckrm_reclassify_class_tasks(taskcls);
+
+	ce_release(&ct_taskclass);
+
+	ckrm_release_core_class(core);	
+	return 0;
+}
+
+void __init ckrm_meta_init_taskclass(void)
+{
+	pr_debug("...... Initializing ClassType<%s> ........\n",
+	       ct_taskclass.name);
+	/* initialize the default class */
+	ckrm_init_core_class(&ct_taskclass, class_core(&taskclass_dflt_class),
+			     NULL, dflt_taskclass_name);
+
+	/* register classtype and initialize default task class */
+	ckrm_register_classtype(&ct_taskclass);
+	ckrm_register_event_set(taskclass_events_callbacks);
+
+	/*
+	 * note registration of all resource controllers will be done
+	 * later dynamically as these are specified as modules
+	 */
+}
+
+static int tc_show_members(struct ckrm_core_class *core, struct seq_file *seq)
+{
+	struct list_head *lh;
+	struct task_struct *tsk;
+
+	class_lock(core);
+	list_for_each(lh, &core->objlist) {
+		tsk = container_of(lh, struct task_struct, taskclass_link);
+		seq_printf(seq, "%ld\n", (long)tsk->pid);
+	}
+	class_unlock(core);
+
+	return 0;
+}
+
+static int tc_forced_reclassify(struct ckrm_core_class *target, const char *obj)
+{
+	pid_t pid;
+	int rc = -EINVAL;
+
+	pid = (pid_t) simple_strtol(obj, NULL, 0);
+
+	down(&ckrm_async_serializer);	/* protect against race with reclassify_class */
+	if (pid < 0) {
+		/* TBD: We could treat this as a process group. */
+		rc = -EINVAL;
+	} else if (pid == 0) {
+		rc = (target == NULL) ? ckrm_reclassify_all_tasks() : -EINVAL;
+	} else {
+		struct ckrm_task_class *cls = NULL;
+		if (target)
+			cls = class_type(struct ckrm_task_class, target);
+		rc = ckrm_forced_reclassify_pid(pid,cls);
+	}
+	up(&ckrm_async_serializer);
+	return rc;
+}
+
+#if 0
+
+/******************************************************************************
+ * Debugging Task Classes:  Utility functions
+ ******************************************************************************/
+
+void check_tasklist_sanity(struct ckrm_task_class *cls)
+{
+	struct ckrm_core_class *core = class_core(cls);
+	struct list_head *lh1, *lh2;
+	int count = 0;
+
+	if (core) {
+		class_lock(core);
+		if (list_empty(&core->objlist)) {
+			class_unlock(core);
+			pr_debug("check_tasklist_sanity: class %s empty list\n",
+			       core->name);
+			return;
+		}
+		list_for_each_safe(lh1, lh2, &core->objlist) {
+			struct task_struct *tsk =
+			    container_of(lh1, struct task_struct,
+					 taskclass_link);
+			if (count++ > 20000) {
+				pr_debug("check_tasklist_sanity: CKRM taskclass list is CORRUPTED\n");
+				break;
+			}
+			if (tsk->taskclass != cls) {
+				const char *tclsname;
+				tclsname = (tsk->taskclass) ?
+					class_core(tsk->taskclass)->name:"NULL";
+				pr_debug("sanity: task %s:%d has ckrm_core "
+				       "|%s| but in list |%s|\n", tsk->comm,
+				       tsk->pid, tclsname, core->name);
+			}
+		}
+		class_unlock(core);
+	}
+}
+
+void ckrm_debug_free_task_class(struct ckrm_task_class *tskcls)
+{
+	struct task_struct *proc, *thread;
+	int count = 0;
+
+	pr_debug("ckrm_debug_free_task_class: Analyze Error <%s> %d\n",
+	       class_core(tskcls)->name,
+	       atomic_read(&(class_core(tskcls)->refcnt)));
+
+	read_lock(&tasklist_lock);
+	class_lock(class_core(tskcls));
+	do_each_thread(proc, thread) {
+		count += (tskcls == thread->taskclass);
+		if ((thread->taskclass == tskcls) || (tskcls == NULL)) {
+			const char *tclsname;
+			tclsname = (thread->taskclass) ?
+				class_core(thread->taskclass)->name :"NULL";
+			pr_debug("ckrm_debug_free_task_class: %d thread=<%s:%d>  -> <%s> <%lx>\n", count,
+			       thread->comm, thread->pid, tclsname,
+			       thread->flags & PF_EXITING);
+		}
+	} while_each_thread(proc, thread);
+	class_unlock(class_core(tskcls));
+	read_unlock(&tasklist_lock);
+
+	pr_debug("ckrm_debug_free_task_class: End Analyze Error <%s> %d\n",
+	       class_core(tskcls)->name,
+	       atomic_read(&(class_core(tskcls)->refcnt)));
+}
+
+#endif
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/Makefile	2005-05-05 09:35:04.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile	2005-05-05 09:35:07.000000000 -0700
@@ -3,3 +3,4 @@
 #
 
 obj-y += ckrm_events.o ckrm.o ckrmutils.o
+obj-$(CONFIG_CKRM_TYPE_TASKCLASS) += ckrm_tc.o

--



* [patch 06/21] CKRM: Classtype definitions for socket class
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (4 preceding siblings ...)
  2005-05-05 18:07 ` [patch 05/21] CKRM: Classtype definitions for task class gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 07/21] CKRM: Numtasks Controller gh
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=06-diff_sockclass


This patch provides the extensions for CKRM to track per socket classes.
This is the base to enable socket based resource control for inbound
connection control, bandwidth control etc.

Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 fs/rcfs/Makefile         |    1 
 fs/rcfs/rootdir.c        |   10 
 fs/rcfs/socket_fs.c      |  280 +++++++++++++++++++++++
 include/linux/ckrm_net.h |   42 +++
 include/net/sock.h       |    3 
 include/net/tcp.h        |    4 
 init/Kconfig             |   11 
 kernel/ckrm/Makefile     |    1 
 kernel/ckrm/ckrm_sockc.c |  559 +++++++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/tcp_ipv4.c      |    4 
 10 files changed, 914 insertions(+), 1 deletion(-)

Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/rcfs/Makefile	2005-05-05 09:35:07.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/Makefile	2005-05-05 09:35:09.000000000 -0700
@@ -6,3 +6,4 @@ obj-$(CONFIG_RCFS_FS) += rcfs.o
 
 rcfs-y := super.o inode.o dir.o rootdir.o magic.o
 rcfs-$(CONFIG_CKRM_TYPE_TASKCLASS) += tc_magic.o
+rcfs-$(CONFIG_CKRM_TYPE_SOCKETCLASS) += socket_fs.o
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/rootdir.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/fs/rcfs/rootdir.c	2005-05-05 09:35:07.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/rootdir.c	2005-05-05 09:35:09.000000000 -0700
@@ -187,6 +187,10 @@ EXPORT_SYMBOL_GPL(rcfs_deregister_classt
 extern struct rcfs_mfdesc tc_mfdesc;
 #endif
 
+#ifdef CONFIG_CKRM_TYPE_SOCKETCLASS
+extern struct rcfs_mfdesc rcfs_sock_mfdesc;
+#endif
+
 /* Common root and magic file entries.
  * root name, root permissions, magic file names and magic file permissions
  * are needed by all entities (classtypes and classification engines) existing
@@ -203,4 +207,10 @@ struct rcfs_mfdesc *genmfdesc[CKRM_MAX_C
 #else
 	NULL,
 #endif
+#ifdef CONFIG_CKRM_TYPE_SOCKETCLASS
+	&rcfs_sock_mfdesc,
+#else
+	NULL,
+#endif
+
 };
Index: linux-2.6.12-rc3-ckrm5/fs/rcfs/socket_fs.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/fs/rcfs/socket_fs.c	2005-05-05 09:35:09.000000000 -0700
@@ -0,0 +1,280 @@
+/* socket_fs.c
+ *
+ * Copyright (C) Vivek Kashyap,      IBM Corp. 2004
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+/*******************************************************************************
+ *  Socket class type
+ *
+ * Defines the root structure for socket based classes. Currently only inbound
+ * connection control is supported based on prioritized accept queues.
+ ******************************************************************************/
+
+#include <linux/rcfs.h>
+#include <net/tcp.h>
+
+extern int rcfs_create_noperm(struct inode *, struct dentry *, int,
+		       struct nameidata *);
+extern int rcfs_symlink_noperm(struct inode *, struct dentry *, const char *);
+extern int rcfs_mkdir_noperm(struct inode *, struct dentry *, int);
+extern int rcfs_rmdir_noperm(struct inode *, struct dentry *);
+extern int rcfs_link_noperm(struct dentry *, struct inode *, struct dentry *);
+extern int rcfs_unlink_noperm(struct inode *, struct dentry *);
+extern int rcfs_mknod_noperm(struct inode *, struct dentry *, int mode, dev_t);
+
+extern int rcfs_rmdir(struct inode *, struct dentry *);
+extern int rcfs_unlink(struct inode *, struct dentry *);
+extern int rcfs_rename(struct inode *, struct dentry *, struct inode *,
+		       struct dentry *);
+
+extern int rcfs_create_coredir(struct inode *, struct dentry *);
+
+int rcfs_sock_mkdir(struct inode *, struct dentry *, int mode);
+int rcfs_sock_rmdir(struct inode *, struct dentry *);
+struct inode_operations my_iops;
+struct inode_operations class_iops;
+struct inode_operations sub_iops;
+
+
+struct rcfs_magf def_magf = {
+	.mode = RCFS_DEFAULT_DIR_MODE,
+	.i_op = &sub_iops,
+	.i_fop = NULL,
+};
+
+struct rcfs_magf rcfs_sock_rootdesc[] = {
+	{
+	 /* .name = should not be set, copy from classtype name, */
+	 .mode = RCFS_DEFAULT_DIR_MODE,
+	 .i_op = &my_iops,
+	 /* .i_fop   = &simple_dir_operations, */
+	 .i_fop = NULL,
+	 },
+	{
+	 .name = "members",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &members_fileops,
+	 },
+	{
+	 .name = "target",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &target_fileops,
+	 },
+	{
+	 .name = "reclassify",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &reclassify_fileops,
+	 },
+};
+
+struct rcfs_magf rcfs_sock_magf[] = {
+	{
+	 .name = "config",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &config_fileops,
+	 },
+	{
+	 .name = "members",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &members_fileops,
+	 },
+	{
+	 .name = "shares",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &shares_fileops,
+	 },
+	{
+	 .name = "stats",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &stats_fileops,
+	 },
+	{
+	 .name = "target",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &target_fileops,
+	 },
+};
+
+struct rcfs_magf sub_magf[] = {
+	{
+	 .name = "config",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &config_fileops,
+	 },
+	{
+	 .name = "shares",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &shares_fileops,
+	 },
+	{
+	 .name = "stats",
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_op = &my_iops,
+	 .i_fop = &stats_fileops,
+	 },
+};
+
+struct rcfs_mfdesc rcfs_sock_mfdesc = {
+	.rootmf = rcfs_sock_rootdesc,
+	.rootmflen = (sizeof(rcfs_sock_rootdesc) / sizeof(struct rcfs_magf)),
+};
+
+#define SOCK_MAX_MAGF (sizeof(rcfs_sock_magf)/sizeof(struct rcfs_magf))
+#define LAQ_MAX_SUBMAGF (sizeof(sub_magf)/sizeof(struct rcfs_magf))
+
+int rcfs_sock_rmdir(struct inode *p, struct dentry *me)
+{
+	struct dentry *mftmp, *mfdentry;
+	int ret = 0;
+
+	/* delete all magic sub directories */
+	list_for_each_entry_safe(mfdentry, mftmp, &me->d_subdirs, d_child) {
+		if (S_ISDIR(mfdentry->d_inode->i_mode)) {
+			ret = rcfs_rmdir(me->d_inode, mfdentry);
+			if (ret)
+				return ret;
+		}
+	}
+	/* delete ourselves */
+	ret = rcfs_rmdir(p, me);
+
+	return ret;
+}
+
+#ifdef NUM_ACCEPT_QUEUES
+#define LAQ_NUM_ACCEPT_QUEUES NUM_ACCEPT_QUEUES
+#else
+#define LAQ_NUM_ACCEPT_QUEUES 0
+#endif
+
+int rcfs_sock_mkdir(struct inode *dir, struct dentry *dentry, int mode)
+{
+	int retval = 0;
+	int i, j;
+	struct dentry *pentry, *mfdentry;
+
+	if ((retval = rcfs_mknod(dir, dentry, mode | S_IFDIR, 0))) {
+		printk(KERN_ERR "rcfs_sock_mkdir: error reaching parent\n");
+		return retval;
+	}
+	/* Needed if only rcfs_mknod is used instead of i_op->mkdir */
+	dir->i_nlink++;
+
+	retval = rcfs_create_coredir(dir, dentry);
+	if (retval)
+		goto mkdir_err;
+
+	/* create the default set of magic files */
+	for (i = 0; i < SOCK_MAX_MAGF; i++) {
+		mfdentry = rcfs_create_internal(dentry, &rcfs_sock_magf[i], 0);
+		mfdentry->d_fsdata = &RCFS_IS_MAGIC;
+		rcfs_get_inode_info(mfdentry->d_inode)->core =
+			rcfs_get_inode_info(dentry->d_inode)->core;
+		rcfs_get_inode_info(mfdentry->d_inode)->mfdentry = mfdentry;
+		if (rcfs_sock_magf[i].i_fop)
+			mfdentry->d_inode->i_fop = rcfs_sock_magf[i].i_fop;
+		if (rcfs_sock_magf[i].i_op)
+			mfdentry->d_inode->i_op = rcfs_sock_magf[i].i_op;
+	}
+
+	for (i = 1; i < LAQ_NUM_ACCEPT_QUEUES; i++) {
+		j = sprintf(def_magf.name, "%d", i);
+		def_magf.name[j] = '\0';
+
+		pentry = rcfs_create_internal(dentry, &def_magf, 0);
+		retval = rcfs_create_coredir(dentry->d_inode, pentry);
+		if (retval)
+			goto mkdir_err;
+		pentry->d_fsdata = &RCFS_IS_MAGIC;
+		for (j = 0; j < LAQ_MAX_SUBMAGF; j++) {
+			mfdentry =
+			    rcfs_create_internal(pentry, &sub_magf[j], 0);
+			mfdentry->d_fsdata = &RCFS_IS_MAGIC;
+			rcfs_get_inode_info(mfdentry->d_inode)->core =
+			    rcfs_get_inode_info(pentry->d_inode)->core;
+			rcfs_get_inode_info(mfdentry->d_inode)->mfdentry =
+				 mfdentry;
+			if (sub_magf[j].i_fop)
+				mfdentry->d_inode->i_fop = sub_magf[j].i_fop;
+			if (sub_magf[j].i_op)
+				mfdentry->d_inode->i_op = sub_magf[j].i_op;
+		}
+		pentry->d_inode->i_op = &sub_iops;
+	}
+	dentry->d_inode->i_op = &class_iops;
+	return 0;
+
+      mkdir_err:
+	/* Undo the i_nlink++ done after rcfs_mknod() above */
+	dir->i_nlink--;
+	return retval;
+}
+
+char *rcfs_sock_get_name(struct ckrm_core_class *c)
+{
+	char *p = (char *)c->name;
+
+	while (*p)
+		p++;
+	while (p != c->name && *(p - 1) != '/')
+		p--;
+
+	return p;
+}
+
+
+
+struct inode_operations my_iops = {
+	.create = rcfs_create_noperm,
+	.lookup = simple_lookup,
+	.link = rcfs_link_noperm,
+	.unlink = rcfs_unlink,
+	.symlink = rcfs_symlink_noperm,
+	.mkdir = rcfs_sock_mkdir,
+	.rmdir = rcfs_sock_rmdir,
+	.mknod = rcfs_mknod_noperm,
+	.rename = rcfs_rename,
+};
+
+struct inode_operations class_iops = {
+	.create = rcfs_create_noperm,
+	.lookup = simple_lookup,
+	.link = rcfs_link_noperm,
+	.unlink = rcfs_unlink_noperm,
+	.symlink = rcfs_symlink_noperm,
+	.mkdir = rcfs_mkdir_noperm,
+	.rmdir = rcfs_rmdir_noperm,
+	.mknod = rcfs_mknod_noperm,
+	.rename = rcfs_rename,
+};
+
+struct inode_operations sub_iops = {
+	.create = rcfs_create_noperm,
+	.lookup = simple_lookup,
+	.link = rcfs_link_noperm,
+	.unlink = rcfs_unlink_noperm,
+	.symlink = rcfs_symlink_noperm,
+	.mkdir = rcfs_mkdir_noperm,
+	.rmdir = rcfs_rmdir_noperm,
+	.mknod = rcfs_mknod_noperm,
+	.rename = rcfs_rename,
+};
+
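
Reviewer note on rcfs_sock_get_name() in socket_fs.c: the helper is meant to
return the last path component of a class name. A minimal user-space sketch
of that lookup, assuming plain NUL-terminated strings, is shown here.
last_component() is a hypothetical name used only for illustration and is not
part of rcfs. A name that contains no '/' is returned whole.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of rcfs): return a pointer to the text
 * after the last '/' in path, or path itself when it contains no '/'.
 */
static const char *last_component(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}
```

For example, last_component("socketclass/web/1") points at "1", while
last_component("socketclass") returns the full string.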
Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_net.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_net.h	2005-05-05 09:35:09.000000000 -0700
@@ -0,0 +1,42 @@
+/* ckrm_net.h - Header file to be used by Resource controllers of CKRM
+ *
+ * Copyright (C) Vivek Kashyap , IBM Corp. 2004
+ *
+ * Provides data structures, macros and kernel API of CKRM for
+ * resource controllers.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#ifndef _LINUX_CKRM_NET_H
+#define _LINUX_CKRM_NET_H
+
+struct ckrm_sock_class;
+
+struct ckrm_net_struct {
+	int ns_type;		/* type of net class */
+	struct sock *ns_sk;	/* pointer to socket */
+	pid_t ns_tgid;		/* real process id */
+	pid_t ns_pid;		/* calling thread's pid */
+	struct task_struct *ns_tsk;
+	int ns_family;		/* AF_INET || AF_INET6 */
+				/* Currently only IPV4 is supported */
+	union {
+		__u32 ns_dipv4;	/* V4 listener's address */
+	} ns_daddr;
+	__u16 ns_dport;		/* listener's port */
+	__u16 ns_sport;		/* sender's port */
+	atomic_t ns_refcnt;
+	struct ckrm_sock_class *core;
+	struct list_head ckrm_link;
+};
+
+#define ns_daddrv4     ns_daddr.ns_dipv4
+
+#endif
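
Reviewer note on ns_refcnt: the field is managed by ckrm_ns_hold() and
ckrm_ns_put() in ckrm_sockc.c, and the structure is freed when the count
drops to zero. The same pattern can be sketched in user space with C11
atomics. This is an illustration only: struct obj, obj_new(), obj_hold() and
obj_put() are hypothetical names, and malloc()/free() stand in for the
kernel allocator.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted object. */
struct obj {
	atomic_int refcnt;
};

/* Allocate an object that starts with one reference held by the caller. */
static struct obj *obj_new(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (o)
		atomic_init(&o->refcnt, 1);
	return o;
}

/* Take an additional reference. */
static void obj_hold(struct obj *o)
{
	atomic_fetch_add(&o->refcnt, 1);
}

/*
 * Drop a reference. atomic_fetch_sub() returns the previous value, so a
 * result of 1 means this caller held the last reference and frees o.
 * Returns 1 if the object was freed, 0 otherwise.
 */
static int obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
		free(o);
		return 1;
	}
	return 0;
}
```

Every obj_hold() must be paired with exactly one obj_put(); the pointer must
not be used after the put that returns 1.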
Index: linux-2.6.12-rc3-ckrm5/include/net/sock.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/net/sock.h	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/net/sock.h	2005-05-05 09:35:09.000000000 -0700
@@ -112,6 +112,8 @@ struct sock_common {
 	atomic_t		skc_refcnt;
 };
 
+struct ckrm_net_struct;
+
 /**
   *	struct sock - network layer representation of sockets
   *	@__sk_common - shared layout with tcp_tw_bucket
@@ -233,6 +235,7 @@ struct sock {
 	struct timeval		sk_stamp;
 	struct socket		*sk_socket;
 	void			*sk_user_data;
+	struct ckrm_net_struct  *sk_ckrm_ns;
 	struct page		*sk_sndmsg_page;
 	struct sk_buff		*sk_send_head;
 	__u32			sk_sndmsg_off;
Index: linux-2.6.12-rc3-ckrm5/include/net/tcp.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/net/tcp.h	2005-05-05 09:33:00.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/net/tcp.h	2005-05-05 09:35:09.000000000 -0700
@@ -800,6 +800,7 @@ extern int			tcp_rcv_established(struct 
 
 extern void			tcp_rcv_space_adjust(struct sock *sk);
 
+
 enum tcp_ack_state_t
 {
 	TCP_ACK_SCHED = 1,
@@ -930,6 +931,9 @@ extern void			tcp_unhash(struct sock *sk
 
 extern int			tcp_v4_hash_connecting(struct sock *sk);
 
+extern struct sock *		tcp_v4_lookup_listener(u32 daddr,
+						    unsigned short hnum,
+						    int dif);
 
 /* From syncookies.c */
 extern struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, 
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:35:07.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:35:09.000000000 -0700
@@ -182,6 +182,17 @@ config CKRM_TYPE_TASKCLASS
 	
 	  Say Y if unsure
 
+config CKRM_TYPE_SOCKETCLASS
+	bool "Class Manager for socket groups"
+	depends on CKRM && RCFS_FS
+	default y
+	help
+	  SOCKETCLASS provides the extensions for CKRM to track per-socket
+	  classes.  This is the base to enable socket based resource
+	  control for inbound connection control, bandwidth control etc.
+	
+	  Say Y if unsure.
+
 endmenu
 
 config SYSCTL
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_sockc.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_sockc.c	2005-05-05 09:35:09.000000000 -0700
@@ -0,0 +1,559 @@
+/* ckrm_sockc.c - Class-based Kernel Resource Management (CKRM)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003,2004
+ *           (C) Shailabh Nagar,  IBM Corp. 2003
+ *           (C) Chandra Seetharaman,  IBM Corp. 2003
+ *	     (C) Vivek Kashyap,	IBM Corp. 2004
+ *
+ *
+ * Provides kernel API of CKRM for in-kernel, per-resource controllers
+ * (one each for cpu, memory, io, network) and callbacks for
+ * classification modules.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/config.h>
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <asm/uaccess.h>
+#include <linux/mm.h>
+#include <asm/errno.h>
+#include <linux/string.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/ckrm_rc.h>
+#include <linux/parser.h>
+#include <net/tcp.h>
+
+#include <linux/ckrm_net.h>
+
+struct ckrm_sock_class {
+	struct ckrm_core_class core;
+};
+
+static struct ckrm_sock_class ckrm_sockclass_dflt_class = {
+};
+
+#define SOCKET_CLASS_TYPE_NAME  "socketclass"
+
+const char *dflt_sockclass_name = SOCKET_CLASS_TYPE_NAME;
+
+static struct ckrm_core_class *ckrm_sock_alloc_class(struct ckrm_core_class *parent,
+						const char *name);
+static int ckrm_sock_free_class(struct ckrm_core_class *core);
+
+static int ckrm_sock_forced_reclassify(struct ckrm_core_class * target,
+				  const char *resname);
+static int ckrm_sock_show_members(struct ckrm_core_class *core,
+			     struct seq_file *seq);
+static void ckrm_sock_add_resctrl(struct ckrm_core_class *core, int resid);
+static void ckrm_sock_reclassify_class(struct ckrm_sock_class *cls);
+
+struct ckrm_classtype ct_sockclass = {
+	.mfidx = 1,
+	.name = SOCKET_CLASS_TYPE_NAME,
+	.type_id = CKRM_CLASSTYPE_SOCKET_CLASS,
+	.maxdepth = 3,
+	.resid_reserved = 0,
+	.max_res_ctlrs = CKRM_MAX_RES_CTLRS,
+	.max_resid = 0,
+	.bit_res_ctlrs = 0L,
+	.res_ctlrs_lock = SPIN_LOCK_UNLOCKED,
+	.classes = LIST_HEAD_INIT(ct_sockclass.classes),
+
+	.default_class = &ckrm_sockclass_dflt_class.core,
+
+	/* private version of functions */
+	.alloc = &ckrm_sock_alloc_class,
+	.free = &ckrm_sock_free_class,
+	.show_members = &ckrm_sock_show_members,
+	.forced_reclassify = &ckrm_sock_forced_reclassify,
+
+	/* use of default functions */
+	.show_shares = &ckrm_class_show_shares,
+	.show_stats = &ckrm_class_show_stats,
+	.show_config = &ckrm_class_show_config,
+	.set_config = &ckrm_class_set_config,
+	.set_shares = &ckrm_class_set_shares,
+	.reset_stats = &ckrm_class_reset_stats,
+
+	/* Mandatory private version.  No default available */
+	.add_resctrl = &ckrm_sock_add_resctrl,
+};
+
+/* helper functions */
+
+void ckrm_ns_hold(struct ckrm_net_struct *ns)
+{
+	atomic_inc(&ns->ns_refcnt);
+	return;
+}
+
+void ckrm_ns_put(struct ckrm_net_struct *ns)
+{
+	if (atomic_dec_and_test(&ns->ns_refcnt))
+		kfree(ns);
+	return;
+}
+
+/*
+ * Change the class of a netstruct
+ *
+ * Change the net struct's socket class to "newcls".  If "newcls" is NULL
+ * or the same as "oldcls", the net struct is left in "oldcls".
+ *
+ */
+
+static void
+ckrm_sock_set_class(struct ckrm_net_struct *ns, struct ckrm_sock_class *newcls,
+	       struct ckrm_sock_class *oldcls, enum ckrm_event event)
+{
+	int i;
+	struct ckrm_res_ctlr *rcbs;
+	struct ckrm_classtype *clstype;
+	void *old_res_class, *new_res_class;
+
+	if ((newcls == oldcls) || (newcls == NULL)) {
+		ns->core = (void *)oldcls;
+		return;
+	}
+
+	class_lock(class_core(newcls));
+	ns->core = newcls;
+	list_add(&ns->ckrm_link, &class_core(newcls)->objlist);
+	class_unlock(class_core(newcls));
+
+	clstype = class_isa(newcls);
+	for (i = 0; i < clstype->max_resid; i++) {
+		atomic_inc(&clstype->nr_resusers[i]);
+		old_res_class =
+		    oldcls ? class_core(oldcls)->res_class[i] : NULL;
+		new_res_class =
+		    newcls ? class_core(newcls)->res_class[i] : NULL;
+		rcbs = clstype->res_ctlrs[i];
+		if (rcbs && rcbs->change_resclass
+		    && (old_res_class != new_res_class))
+			(*rcbs->change_resclass) (ns, old_res_class,
+						  new_res_class);
+		atomic_dec(&clstype->nr_resusers[i]);
+	}
+	return;
+}
+
+static void ckrm_sock_add_resctrl(struct ckrm_core_class *core, int resid)
+{
+	struct ckrm_net_struct *ns;
+	struct ckrm_res_ctlr *rcbs;
+
+	if ((resid < 0) || (resid >= CKRM_MAX_RES_CTLRS)
+	    || ((rcbs = core->classtype->res_ctlrs[resid]) == NULL))
+		return;
+
+	class_lock(core);
+	list_for_each_entry(ns, &core->objlist, ckrm_link) {
+		if (rcbs->change_resclass)
+			(*rcbs->change_resclass) (ns, NULL,
+						  core->res_class[resid]);
+	}
+	class_unlock(core);
+}
+
+/**************************************************************************
+ *                   Functions called from classification points          *
+ **************************************************************************/
+
+static void cb_sockclass_listen_start(struct sock *sk)
+{
+	struct ckrm_net_struct *ns = NULL;
+	struct ckrm_sock_class *newcls = NULL;
+	struct ckrm_res_ctlr *rcbs;
+	struct ckrm_classtype *clstype;
+	int i = 0;
+
+	/* XXX - TBD ipv6 */
+	if (sk->sk_family == AF_INET6)
+		return;
+
+	/* to store the socket address */
+	ns = (struct ckrm_net_struct *)
+	    kmalloc(sizeof(struct ckrm_net_struct), GFP_ATOMIC);
+	if (!ns)
+		return;
+
+	memset(ns, 0, sizeof(*ns));
+	INIT_LIST_HEAD(&ns->ckrm_link);
+	ckrm_ns_hold(ns);
+
+	ns->ns_family = sk->sk_family;
+	if (ns->ns_family == AF_INET6)	// IPv6 not supported yet.
+		return;
+
+	ns->ns_daddrv4 = inet_sk(sk)->rcv_saddr;
+	ns->ns_dport = inet_sk(sk)->num;
+
+	ns->ns_pid = current->pid;
+	ns->ns_tgid = current->tgid;
+	ns->ns_tsk = current;
+	ce_protect(&ct_sockclass);
+	CE_CLASSIFY_RET(newcls, &ct_sockclass, CKRM_EVENT_LISTEN_START, ns,
+			current);
+	ce_release(&ct_sockclass);
+
+	if (newcls == NULL) {
+		newcls = &ckrm_sockclass_dflt_class;
+		ckrm_core_grab(class_core(newcls));
+	}
+
+	class_lock(class_core(newcls));
+	list_add(&ns->ckrm_link, &class_core(newcls)->objlist);
+	ns->core = newcls;
+	class_unlock(class_core(newcls));
+
+	/*
+	 * the socket is already locked
+	 * take a reference on socket on our behalf
+	 */
+	sock_hold(sk);
+	sk->sk_ckrm_ns = (void *)ns;
+	ns->ns_sk = sk;
+
+	/* modify its shares */
+	clstype = class_isa(newcls);
+	for (i = 0; i < clstype->max_resid; i++) {
+		atomic_inc(&clstype->nr_resusers[i]);
+		rcbs = clstype->res_ctlrs[i];
+		if (rcbs && rcbs->change_resclass) {
+			(*rcbs->change_resclass) ((void *)ns,
+						  NULL,
+						  class_core(newcls)->
+						  res_class[i]);
+		}
+		atomic_dec(&clstype->nr_resusers[i]);
+	}
+	return;
+}
+
+static void cb_sockclass_listen_stop(struct sock *sk)
+{
+	struct ckrm_net_struct *ns = NULL;
+	struct ckrm_sock_class *newcls = NULL;
+
+	/* XXX - TBD ipv6 */
+	if (sk->sk_family == AF_INET6)
+		return;
+
+	ns = (struct ckrm_net_struct *)sk->sk_ckrm_ns;
+	if (!ns)     /* listen_start called before socket_aq was loaded */
+		return;
+
+	newcls = ns->core;
+	if (newcls) {
+		class_lock(class_core(newcls));
+		list_del(&ns->ckrm_link);
+		INIT_LIST_HEAD(&ns->ckrm_link);
+		class_unlock(class_core(newcls));
+		ckrm_core_drop(class_core(newcls));
+	}
+	/* the socket is already locked */
+	sk->sk_ckrm_ns = NULL;
+	sock_put(sk);
+
+	// Should be the last count and free it
+	ckrm_ns_put(ns);
+	return;
+}
+
+static struct ckrm_event_spec ckrm_sock_events_callbacks[] = {
+	{CKRM_EVENT_LISTEN_START, {cb_sockclass_listen_start, NULL}},
+	{CKRM_EVENT_LISTEN_STOP, {cb_sockclass_listen_stop, NULL}},
+	{-1, {NULL, NULL}}
+};
+
+/**************************************************************************
+ *                  Class Object Creation / Destruction
+ **************************************************************************/
+
+static struct ckrm_core_class *ckrm_sock_alloc_class(struct ckrm_core_class *parent,
+						const char *name)
+{
+	struct ckrm_sock_class *sockcls;
+	sockcls = kmalloc(sizeof(struct ckrm_sock_class), GFP_KERNEL);
+	if (sockcls == NULL)
+		return NULL;
+	memset(sockcls, 0, sizeof(struct ckrm_sock_class));
+
+	ckrm_init_core_class(&ct_sockclass, class_core(sockcls), parent, name);
+
+	ce_protect(&ct_sockclass);
+	if (ct_sockclass.ce_cb_active && ct_sockclass.ce_callbacks.class_add)
+		(*ct_sockclass.ce_callbacks.class_add) (name, sockcls,
+							ct_sockclass.type_id);
+	ce_release(&ct_sockclass);
+
+	return class_core(sockcls);
+}
+
+static int ckrm_sock_free_class(struct ckrm_core_class *core)
+{
+	struct ckrm_sock_class *sockcls;
+
+	if (!ckrm_is_core_valid(core)) {
+		/* Invalid core */
+		return (-EINVAL);
+	}
+	if (core == core->classtype->default_class) {
+		/* reset the name tag */
+		core->name = dflt_sockclass_name;
+		return 0;
+	}
+
+	sockcls = class_type(struct ckrm_sock_class, core);
+
+	ce_protect(&ct_sockclass);
+
+	if (ct_sockclass.ce_cb_active && ct_sockclass.ce_callbacks.class_delete)
+		(*ct_sockclass.ce_callbacks.class_delete) (core->name, sockcls,
+							   ct_sockclass.type_id);
+
+	ckrm_sock_reclassify_class(sockcls);
+
+	ce_release(&ct_sockclass);
+
+	ckrm_release_core_class(core);	
+	/* Could just drop the class?  Error message? */
+
+	return 0;
+}
+
+static int ckrm_sock_show_members(struct ckrm_core_class *core, struct seq_file *seq)
+{
+	struct list_head *lh;
+	struct ckrm_net_struct *ns = NULL;
+
+	class_lock(core);
+	list_for_each(lh, &core->objlist) {
+		ns = container_of(lh, struct ckrm_net_struct, ckrm_link);
+		seq_printf(seq, "%d.%d.%d.%d\\%d\n",
+			   NIPQUAD(ns->ns_daddrv4), ns->ns_dport);
+	}
+	class_unlock(core);
+
+	return 0;
+}
+
+static int
+ckrm_sock_forced_reclassify_ns(struct ckrm_net_struct *tns,
+			  struct ckrm_core_class *core)
+{
+	struct ckrm_net_struct *ns = NULL;
+	struct sock *sk = NULL;
+	struct ckrm_sock_class *oldcls, *newcls;
+	int rc = -EINVAL;
+
+	if (!ckrm_is_core_valid(core)) {
+		return rc;
+	}
+
+	newcls = class_type(struct ckrm_sock_class, core);
+	/*
+	 * lookup the listening sockets
+	 * returns with a reference count set on socket
+	 */
+	if (tns->ns_family == AF_INET6)
+		return -EOPNOTSUPP;
+
+	sk = tcp_v4_lookup_listener(tns->ns_daddrv4, tns->ns_dport, 0);
+	if (!sk) {
+		printk(KERN_INFO "No such listener 0x%x:%d\n",
+		       tns->ns_daddrv4, tns->ns_dport);
+		return rc;
+	}
+	lock_sock(sk);
+	if (!sk->sk_ckrm_ns) {
+		goto out;
+	}
+	ns = sk->sk_ckrm_ns;
+	ckrm_ns_hold(ns);
+	if (!capable(CAP_NET_ADMIN) && (ns->ns_tsk->user != current->user)) {
+		ckrm_ns_put(ns);
+		rc = -EPERM;
+		goto out;
+	}
+
+	oldcls = ns->core;
+	if ((oldcls == NULL) || (oldcls == newcls)) {
+		ckrm_ns_put(ns);
+		goto out;
+	}
+	/* remove the net_struct from the current class */
+	class_lock(class_core(oldcls));
+	list_del(&ns->ckrm_link);
+	INIT_LIST_HEAD(&ns->ckrm_link);
+	ns->core = NULL;
+	class_unlock(class_core(oldcls));
+
+	ckrm_sock_set_class(ns, newcls, oldcls, CKRM_EVENT_MANUAL);
+	ckrm_ns_put(ns);
+	rc = 0;
+      out:
+	release_sock(sk);
+	sock_put(sk);
+
+	return rc;
+
+}
+
+enum ckrm_sock_target_token {
+	IPV4, IPV6, SOCKC_TARGET_ERR
+};
+
+static match_table_t ckrm_sock_target_tokens = {
+	{IPV4, "ipv4=%s"},
+	{IPV6, "ipv6=%s"},
+	{SOCKC_TARGET_ERR, NULL},
+};
+
+char *v4toi(char *s, char c, __u32 * v)
+{
+	unsigned int k = 0, n = 0;
+
+	while (*s && (*s != c)) {
+		if (*s == '.') {
+			n <<= 8;
+			n |= k;
+			k = 0;
+		} else
+			k = k * 10 + *s - '0';
+		s++;
+	}
+
+	n <<= 8;
+	*v = n | k;
+
+	return s;
+}
+
+static int
+ckrm_sock_forced_reclassify(struct ckrm_core_class *target, const char *options)
+{
+	char *p, *p2;
+	struct ckrm_net_struct ns;
+	__u32 v4addr, tmp;
+
+	if (!options)
+		return -EINVAL;
+
+	if (target == NULL) {
+		unsigned long id = simple_strtol(options,NULL,0);
+		if (!capable(CAP_NET_ADMIN))
+			return -EPERM;
+		if (id != 0)
+			return -EINVAL;
+		printk("ckrm_sock_class: reclassify all not yet implemented\n");
+		return 0;
+	}
+
+	while ((p = strsep((char **)&options, ",")) != NULL) {
+		substring_t args[MAX_OPT_ARGS];
+		int token;
+
+		if (!*p)
+			continue;
+		token = match_token(p, ckrm_sock_target_tokens, args);
+		switch (token) {
+
+		case IPV4:
+
+			p2 = p;
+			while (*p2 && (*p2 != '='))
+				++p2;
+			p2++;
+			p2 = v4toi(p2, '\\', &(v4addr));
+			ns.ns_daddrv4 = htonl(v4addr);
+			ns.ns_family = AF_INET;
+			p2 = v4toi(++p2, ':', &tmp);
+			ns.ns_dport = (__u16) tmp;
+			if (*p2)
+				p2 = v4toi(++p2, '\0', &ns.ns_pid);
+			ckrm_sock_forced_reclassify_ns(&ns, target);
+			break;
+
+		case IPV6:
+			printk(KERN_INFO "rcfs: IPV6 not supported yet\n");
+			return -ENOSYS;
+		default:
+			return -EINVAL;
+		}
+	}
+	return -EINVAL;
+}
+
+/*
+ * Listen_aq reclassification.
+ */
+static void ckrm_sock_reclassify_class(struct ckrm_sock_class *cls)
+{
+	struct ckrm_net_struct *ns, *tns;
+	struct ckrm_core_class *core = class_core(cls);
+	LIST_HEAD(local_list);
+
+	if (!cls)
+		return;
+
+	if (!ckrm_validate_and_grab_core(core))
+		return;
+
+	class_lock(core);
+	/* we have the core refcnt */
+	if (list_empty(&core->objlist)) {
+		class_unlock(core);
+		ckrm_core_drop(core);
+		return;
+	}
+
+	INIT_LIST_HEAD(&local_list);
+	list_splice_init(&core->objlist, &local_list);
+	class_unlock(core);
+	ckrm_core_drop(core);
+
+	list_for_each_entry_safe(ns, tns, &local_list, ckrm_link) {
+		ckrm_ns_hold(ns);
+		list_del(&ns->ckrm_link);
+		if (ns->ns_sk) {
+			lock_sock(ns->ns_sk);
+			ckrm_sock_set_class(ns, &ckrm_sockclass_dflt_class, NULL,
+				       CKRM_EVENT_MANUAL);
+			release_sock(ns->ns_sk);
+		}
+		ckrm_ns_put(ns);
+	}
+	return;
+}
+
+void __init ckrm_meta_init_sockclass(void)
+{
+	printk("...... Initializing ClassType<%s> ........\n",
+	       ct_sockclass.name);
+	/* initialize the default class */
+	ckrm_init_core_class(&ct_sockclass, class_core(&ckrm_sockclass_dflt_class),
+			     NULL, dflt_sockclass_name);
+
+	/* register classtype and initialize default socket class */
+	ckrm_register_classtype(&ct_sockclass);
+	ckrm_register_event_set(ckrm_sock_events_callbacks);
+
+	/*
+	 * note registration of all resource controllers will be done
+	 * later dynamically as these are specified as modules
+	 */
+}
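
Reviewer note on the forced-reclassify target: ckrm_sock_forced_reclassify()
and v4toi() in ckrm_sockc.c accept a string of the form ipv4=a.b.c.d\port.
A minimal user-space sketch of parsing that form with range checks is shown
here, assuming exactly that layout. parse_ipv4_target() and struct v4_target
are hypothetical names, the address is produced in host byte order, and the
optional ":pid" suffix handled by the patch is left out.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result type for the sketch. */
struct v4_target {
	uint32_t addr;	/* IPv4 address, host byte order */
	uint16_t port;	/* listener port */
};

/*
 * Parse "ipv4=a.b.c.d\port". Returns 0 on success and -1 on malformed
 * or out-of-range input; *out is only written on success.
 */
static int parse_ipv4_target(const char *s, struct v4_target *out)
{
	unsigned int a, b, c, d, port;
	char tail;

	/*
	 * The trailing %c catches junk after the port: exactly five
	 * conversions means the whole string matched the expected form.
	 */
	if (sscanf(s, "ipv4=%u.%u.%u.%u\\%u%c",
		   &a, &b, &c, &d, &port, &tail) != 5)
		return -1;
	if (a > 255 || b > 255 || c > 255 || d > 255 || port > 65535)
		return -1;

	out->addr = (a << 24) | (b << 16) | (c << 8) | d;
	out->port = (uint16_t)port;
	return 0;
}
```

A caller that needs network byte order, as the patch does for ns_daddrv4,
would pass out->addr through htonl() afterwards.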
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/Makefile	2005-05-05 09:35:07.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile	2005-05-05 09:35:09.000000000 -0700
@@ -4,3 +4,4 @@
 
 obj-y += ckrm_events.o ckrm.o ckrmutils.o
 obj-$(CONFIG_CKRM_TYPE_TASKCLASS) += ckrm_tc.o
+obj-$(CONFIG_CKRM_TYPE_SOCKETCLASS) += ckrm_sockc.o
Index: linux-2.6.12-rc3-ckrm5/net/ipv4/tcp_ipv4.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/net/ipv4/tcp_ipv4.c	2005-05-05 09:33:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/net/ipv4/tcp_ipv4.c	2005-05-05 09:35:09.000000000 -0700
@@ -448,7 +448,8 @@ static struct sock *__tcp_v4_lookup_list
 }
 
 /* Optimize the common listener case. */
-static inline struct sock *tcp_v4_lookup_listener(u32 daddr,
+/* XXX:  Was inline - need to use for CKRM, fix before next release */
+struct sock *tcp_v4_lookup_listener(u32 daddr,
 		unsigned short hnum, int dif)
 {
 	struct sock *sk = NULL;
@@ -2645,6 +2646,7 @@ EXPORT_SYMBOL(tcp_prot);
 EXPORT_SYMBOL(tcp_put_port);
 EXPORT_SYMBOL(tcp_unhash);
 EXPORT_SYMBOL(tcp_v4_conn_request);
+EXPORT_SYMBOL(tcp_v4_lookup_listener);
 EXPORT_SYMBOL(tcp_v4_connect);
 EXPORT_SYMBOL(tcp_v4_do_rcv);
 EXPORT_SYMBOL(tcp_v4_rebuild_header);

--



* [patch 07/21] CKRM: Numtasks Controller
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (5 preceding siblings ...)
  2005-05-05 18:07 ` [patch 06/21] CKRM: Classtype definitions for socket class gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 08/21] CKRM: Documentation gh
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=07-diff_numtasks


This patch provides a resource controller for controlling the number
of tasks per class in CKRM.

Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>


 include/linux/ckrm_tsk.h         |   35 ++
 init/Kconfig                     |   10 
 kernel/ckrm/Makefile             |    3 
 kernel/ckrm/ckrm_numtasks.c      |  522 +++++++++++++++++++++++++++++++++++++++
 kernel/ckrm/ckrm_numtasks_stub.c |   53 +++
 kernel/fork.c                    |    6 
 6 files changed, 628 insertions(+), 1 deletion(-)

Index: linux-2.6.12-rc3-ckrm5/include/linux/ckrm_tsk.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/ckrm_tsk.h	2005-05-05 09:35:11.000000000 -0700
@@ -0,0 +1,35 @@
+/* ckrm_tsk.h - No. of tasks resource controller for CKRM
+ *
+ * Copyright (C) Chandra Seetharaman, IBM Corp. 2003
+ *
+ * Provides No. of tasks resource controller for CKRM
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#ifndef _LINUX_CKRM_TSK_H
+#define _LINUX_CKRM_TSK_H
+
+#ifdef CONFIG_CKRM_TYPE_TASKCLASS
+#include <linux/ckrm_rc.h>
+
+typedef int (*get_ref_t) (struct ckrm_core_class *, int);
+typedef void (*put_ref_t) (struct ckrm_core_class *);
+
+extern int numtasks_get_ref(struct ckrm_core_class *, int);
+extern void numtasks_put_ref(struct ckrm_core_class *);
+extern void ckrm_numtasks_register(get_ref_t, put_ref_t);
+
+#else /* CONFIG_CKRM_TYPE_TASKCLASS */
+
+#define numtasks_get_ref(core_class, ref) (1)
+#define numtasks_put_ref(core_class)  do {} while (0)
+
+#endif /* CONFIG_CKRM_TYPE_TASKCLASS */
+#endif /* _LINUX_CKRM_TSK_H */
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:35:09.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:35:11.000000000 -0700
@@ -193,6 +193,16 @@ config CKRM_TYPE_SOCKETCLASS
 	
 	  Say Y if unsure.
 
+config CKRM_RES_NUMTASKS
+	tristate "Number of Tasks Resource Manager"
+	depends on CKRM_TYPE_TASKCLASS
+	default y
+	help
+	  Provides a Resource Controller for CKRM that limits the number of
+	  tasks a task class can have.
+	
+	  Say N if unsure, Y to use the feature.
+
 endmenu
 
 config SYSCTL
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:35:11.000000000 -0700
@@ -0,0 +1,522 @@
+/* ckrm_numtasks.c - "Number of tasks" resource controller for CKRM
+ *
+ * Copyright (C) Chandra Seetharaman,  IBM Corp. 2003
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+/*
+ * CKRM Resource controller for tracking number of tasks in a class.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <asm/errno.h>
+#include <asm/div64.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/ckrm_rc.h>
+#include <linux/ckrm_tc.h>
+#include <linux/ckrm_tsk.h>
+
+#define TOTAL_NUM_TASKS (131072)	/* 128 K */
+#define NUMTASKS_DEBUG
+#define NUMTASKS_NAME "numtasks"
+
+struct ckrm_numtasks {
+	struct ckrm_core_class *core;	/* the core i am part of... */
+	struct ckrm_core_class *parent;	/* parent of the core above. */
+	struct ckrm_shares shares;
+	spinlock_t cnt_lock;	/* always grab parent's lock before child's */
+	int cnt_guarantee;	/* num_tasks guarantee in local units */
+	int cnt_unused;		/* has to borrow if more than this is needed */
+	int cnt_limit;		/* no tasks over this limit. */
+	atomic_t cnt_cur_alloc;	/* current alloc from self */
+	atomic_t cnt_borrowed;	/* borrowed from the parent */
+
+	int over_guarantee;	/* turn on/off when cur_alloc goes  */
+				/* over/under guarantee */
+
+	/* internally maintained statistics to compare with max numbers */
+	int limit_failures;	/* # failures as request was over the limit */
+	int borrow_sucesses;	/* # successful borrows */
+	int borrow_failures;	/* # borrow failures */
+
+	/* Maximum each statistic has reached. */
+	int max_limit_failures;
+	int max_borrow_sucesses;
+	int max_borrow_failures;
+
+	/* Total number of specific statistics */
+	int tot_limit_failures;
+	int tot_borrow_sucesses;
+	int tot_borrow_failures;
+};
+
+struct ckrm_res_ctlr numtasks_rcbs;
+
+/* Initialize rescls values
+ * May be called on each rcfs unmount or as part of error recovery
+ * to make share values sane.
+ * Does not traverse hierarchy reinitializing children.
+ */
+static void numtasks_res_initcls_one(struct ckrm_numtasks * res)
+{
+	res->shares.my_guarantee = CKRM_SHARE_DONTCARE;
+	res->shares.my_limit = CKRM_SHARE_DONTCARE;
+	res->shares.total_guarantee = CKRM_SHARE_DFLT_TOTAL_GUARANTEE;
+	res->shares.max_limit = CKRM_SHARE_DFLT_MAX_LIMIT;
+	res->shares.unused_guarantee = CKRM_SHARE_DFLT_TOTAL_GUARANTEE;
+	res->shares.cur_max_limit = 0;
+
+	res->cnt_guarantee = CKRM_SHARE_DONTCARE;
+	res->cnt_unused = CKRM_SHARE_DONTCARE;
+	res->cnt_limit = CKRM_SHARE_DONTCARE;
+
+	res->over_guarantee = 0;
+
+	res->limit_failures = 0;
+	res->borrow_sucesses = 0;
+	res->borrow_failures = 0;
+
+	res->max_limit_failures = 0;
+	res->max_borrow_sucesses = 0;
+	res->max_borrow_failures = 0;
+
+	res->tot_limit_failures = 0;
+	res->tot_borrow_sucesses = 0;
+	res->tot_borrow_failures = 0;
+
+	atomic_set(&res->cnt_cur_alloc, 0);
+	atomic_set(&res->cnt_borrowed, 0);
+	return;
+}
+
+#if 0
+static void numtasks_res_initcls(void *my_res)
+{
+	struct ckrm_numtasks *res = my_res;
+
+	/* Write a version which propagates values all the way down
+	   and replace rcbs callback with that version */
+
+}
+#endif
+
+static int numtasks_get_ref_local(struct ckrm_core_class *core, int force)
+{
+	int rc, resid = numtasks_rcbs.resid;
+	struct ckrm_numtasks *res;
+
+	if ((resid < 0) || (core == NULL))
+		return 1;
+
+	res = ckrm_get_res_class(core, resid, struct ckrm_numtasks);
+	if (res == NULL)
+		return 1;
+
+	atomic_inc(&res->cnt_cur_alloc);
+
+	rc = 1;
+	if (((res->parent) && (res->cnt_unused == CKRM_SHARE_DONTCARE)) ||
+	    (atomic_read(&res->cnt_cur_alloc) > res->cnt_unused)) {
+
+		rc = 0;
+		if (!force && (res->cnt_limit != CKRM_SHARE_DONTCARE) &&
+		    (atomic_read(&res->cnt_cur_alloc) > res->cnt_limit)) {
+			res->limit_failures++;
+			res->tot_limit_failures++;
+		} else if (res->parent != NULL) {
+			if ((rc =
+			     numtasks_get_ref_local(res->parent, force)) == 1) {
+				atomic_inc(&res->cnt_borrowed);
+				res->borrow_sucesses++;
+				res->tot_borrow_sucesses++;
+				res->over_guarantee = 1;
+			} else {
+				res->borrow_failures++;
+				res->tot_borrow_failures++;
+			}
+		} else {
+			rc = force;
+		}
+	} else if (res->over_guarantee) {
+		res->over_guarantee = 0;
+
+		if (res->max_limit_failures < res->limit_failures) {
+			res->max_limit_failures = res->limit_failures;
+		}
+		if (res->max_borrow_sucesses < res->borrow_sucesses) {
+			res->max_borrow_sucesses = res->borrow_sucesses;
+		}
+		if (res->max_borrow_failures < res->borrow_failures) {
+			res->max_borrow_failures = res->borrow_failures;
+		}
+		res->limit_failures = 0;
+		res->borrow_sucesses = 0;
+		res->borrow_failures = 0;
+	}
+
+	if (!rc) {
+		atomic_dec(&res->cnt_cur_alloc);
+	}
+	return rc;
+}
+
+static void numtasks_put_ref_local(struct ckrm_core_class *core)
+{
+	int resid = numtasks_rcbs.resid;
+	struct ckrm_numtasks *res;
+
+	if ((resid == -1) || (core == NULL)) {
+		return;
+	}
+
+	res = ckrm_get_res_class(core, resid, struct ckrm_numtasks);
+	if (res == NULL)
+		return;
+	if (unlikely(atomic_read(&res->cnt_cur_alloc) == 0)) {
+		printk(KERN_WARNING "numtasks_put_ref: Trying to decrement "
+					"counter below 0\n");
+		return;
+	}
+	atomic_dec(&res->cnt_cur_alloc);
+	if (atomic_read(&res->cnt_borrowed) > 0) {
+		atomic_dec(&res->cnt_borrowed);
+		numtasks_put_ref_local(res->parent);
+	}
+	return;
+}
+
+static void *numtasks_res_alloc(struct ckrm_core_class *core,
+				struct ckrm_core_class *parent)
+{
+	struct ckrm_numtasks *res;
+
+	res = kmalloc(sizeof(struct ckrm_numtasks), GFP_ATOMIC);
+
+	if (res) {
+		memset(res, 0, sizeof(struct ckrm_numtasks));
+		res->core = core;
+		res->parent = parent;
+		numtasks_res_initcls_one(res);
+		res->cnt_lock = SPIN_LOCK_UNLOCKED;
+		if (parent == NULL) {
+			/*
+			 * I am part of root class. So set the max tasks
+			 * to available default.
+			 */
+			res->cnt_guarantee = TOTAL_NUM_TASKS;
+			res->cnt_unused = TOTAL_NUM_TASKS;
+			res->cnt_limit = TOTAL_NUM_TASKS;
+		}
+		try_module_get(THIS_MODULE);
+	} else {
+		printk(KERN_ERR
+		       "numtasks_res_alloc: failed GFP_ATOMIC alloc\n");
+	}
+	return res;
+}
+
+/*
+ * No locking of this resource class object necessary as we are not
+ * supposed to be assigned (or used) when/after this function is called.
+ */
+static void numtasks_res_free(void *my_res)
+{
+	struct ckrm_numtasks *res = my_res, *parres, *childres;
+	struct ckrm_core_class *child = NULL;
+	int i, borrowed, maxlimit, resid = numtasks_rcbs.resid;
+
+	if (!res)
+		return;
+
+	/* Assuming there will be no children when this function is called */
+
+	parres = ckrm_get_res_class(res->parent, resid, struct ckrm_numtasks);
+
+	if (unlikely(atomic_read(&res->cnt_cur_alloc) < 0)) {
+		printk(KERN_WARNING "numtasks_res: counter below 0\n");
+	}
+	if (unlikely(atomic_read(&res->cnt_cur_alloc) > 0 ||
+				atomic_read(&res->cnt_borrowed) > 0)) {
+		printk(KERN_WARNING "numtasks_res_free: resource still "
+		       "alloc'd %p\n", res);
+		if ((borrowed = atomic_read(&res->cnt_borrowed)) > 0) {
+			for (i = 0; i < borrowed; i++) {
+				numtasks_put_ref_local(parres->core);
+			}
+		}
+	}
+	/* return child's limit/guarantee to parent node */
+	spin_lock(&parres->cnt_lock);
+	child_guarantee_changed(&parres->shares, res->shares.my_guarantee, 0);
+
+	/* run thru parent's children and get the new max_limit of the parent */
+	ckrm_lock_hier(parres->core);
+	maxlimit = 0;
+	while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
+		childres = ckrm_get_res_class(child, resid, struct ckrm_numtasks);
+		if (maxlimit < childres->shares.my_limit) {
+			maxlimit = childres->shares.my_limit;
+		}
+	}
+	ckrm_unlock_hier(parres->core);
+	if (parres->shares.cur_max_limit < maxlimit) {
+		parres->shares.cur_max_limit = maxlimit;
+	}
+
+	spin_unlock(&parres->cnt_lock);
+	kfree(res);
+	module_put(THIS_MODULE);
+	return;
+}
+
+/*
+ * Recalculate the guarantee and limit in real units... and propagate the
+ * same to children.
+ * Caller is responsible for protecting res and for the integrity of parres
+ */
+static void
+recalc_and_propagate(struct ckrm_numtasks * res, struct ckrm_numtasks * parres)
+{
+	struct ckrm_core_class *child = NULL;
+	struct ckrm_numtasks *childres;
+	int resid = numtasks_rcbs.resid;
+
+	if (parres) {
+		struct ckrm_shares *par = &parres->shares;
+		struct ckrm_shares *self = &res->shares;
+
+		/* calculate cnt_guarantee and cnt_limit */
+		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+			res->cnt_guarantee = CKRM_SHARE_DONTCARE;
+		} else if (par->total_guarantee) {
+			u64 temp = (u64) self->my_guarantee * parres->cnt_guarantee;
+			do_div(temp, par->total_guarantee);
+			res->cnt_guarantee = (int) temp;
+		} else {
+			res->cnt_guarantee = 0;
+		}
+
+		if (parres->cnt_limit == CKRM_SHARE_DONTCARE) {
+			res->cnt_limit = CKRM_SHARE_DONTCARE;
+		} else if (par->max_limit) {
+			u64 temp = (u64) self->my_limit * parres->cnt_limit;
+			do_div(temp, par->max_limit);
+			res->cnt_limit = (int) temp;
+		} else {
+			res->cnt_limit = 0;
+		}
+
+		/* Calculate unused units */
+		if (res->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+			res->cnt_unused = CKRM_SHARE_DONTCARE;
+		} else if (self->total_guarantee) {
+			u64 temp = (u64) self->unused_guarantee * res->cnt_guarantee;
+			do_div(temp, self->total_guarantee);
+			res->cnt_unused = (int) temp;
+		} else {
+			res->cnt_unused = 0;
+		}
+	}
+	/* propagate to children */
+	ckrm_lock_hier(res->core);
+	while ((child = ckrm_get_next_child(res->core, child)) != NULL) {
+		childres = ckrm_get_res_class(child, resid, struct ckrm_numtasks);
+
+		spin_lock(&childres->cnt_lock);
+		recalc_and_propagate(childres, res);
+		spin_unlock(&childres->cnt_lock);
+	}
+	ckrm_unlock_hier(res->core);
+	return;
+}
+
+static int numtasks_set_share_values(void *my_res, struct ckrm_shares *new)
+{
+	struct ckrm_numtasks *parres, *res = my_res;
+	struct ckrm_shares *cur = &res->shares, *par;
+	int rc = -EINVAL, resid = numtasks_rcbs.resid;
+
+	if (!res)
+		return rc;
+
+	if (res->parent) {
+		parres =
+		    ckrm_get_res_class(res->parent, resid, struct ckrm_numtasks);
+		spin_lock(&parres->cnt_lock);
+		spin_lock(&res->cnt_lock);
+		par = &parres->shares;
+	} else {
+		spin_lock(&res->cnt_lock);
+		par = NULL;
+		parres = NULL;
+	}
+
+	rc = set_shares(new, cur, par);
+
+	if ((rc == 0) && parres) {
+		/* Calculate parent's unused units */
+		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+			parres->cnt_unused = CKRM_SHARE_DONTCARE;
+		} else if (par->total_guarantee) {
+			u64 temp = (u64) par->unused_guarantee * parres->cnt_guarantee;
+			do_div(temp, par->total_guarantee);
+			parres->cnt_unused = (int) temp;
+		} else {
+			parres->cnt_unused = 0;
+		}
+		recalc_and_propagate(res, parres);
+	}
+	spin_unlock(&res->cnt_lock);
+	if (res->parent) {
+		spin_unlock(&parres->cnt_lock);
+	}
+	return rc;
+}
+
+static int numtasks_get_share_values(void *my_res, struct ckrm_shares *shares)
+{
+	struct ckrm_numtasks *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+	*shares = res->shares;
+	return 0;
+}
+
+static int numtasks_get_stats(void *my_res, struct seq_file *sfile)
+{
+	struct ckrm_numtasks *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+
+	seq_printf(sfile, "Number of tasks resource:\n");
+	seq_printf(sfile, "Total Over limit failures: %d\n",
+		   res->tot_limit_failures);
+	seq_printf(sfile, "Total Over guarantee sucesses: %d\n",
+		   res->tot_borrow_sucesses);
+	seq_printf(sfile, "Total Over guarantee failures: %d\n",
+		   res->tot_borrow_failures);
+
+	seq_printf(sfile, "Maximum Over limit failures: %d\n",
+		   res->max_limit_failures);
+	seq_printf(sfile, "Maximum Over guarantee sucesses: %d\n",
+		   res->max_borrow_sucesses);
+	seq_printf(sfile, "Maximum Over guarantee failures: %d\n",
+		   res->max_borrow_failures);
+#ifdef NUMTASKS_DEBUG
+	seq_printf(sfile,
+		   "cur_alloc %d; borrowed %d; cnt_guar %d; cnt_limit %d "
+		   "cnt_unused %d, unused_guarantee %d, cur_max_limit %d\n",
+		   atomic_read(&res->cnt_cur_alloc),
+		   atomic_read(&res->cnt_borrowed), res->cnt_guarantee,
+		   res->cnt_limit, res->cnt_unused,
+		   res->shares.unused_guarantee,
+		   res->shares.cur_max_limit);
+#endif
+
+	return 0;
+}
+
+static int numtasks_show_config(void *my_res, struct seq_file *sfile)
+{
+	struct ckrm_numtasks *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+
+	seq_printf(sfile, "res=%s,parameter=somevalue\n", NUMTASKS_NAME);
+	return 0;
+}
+
+static int numtasks_set_config(void *my_res, const char *cfgstr)
+{
+	struct ckrm_numtasks *res = my_res;
+
+	if (!res)
+		return -EINVAL;
+	printk("numtasks config='%s'\n", cfgstr);
+	return 0;
+}
+
+static void numtasks_change_resclass(void *task, void *old, void *new)
+{
+	struct ckrm_numtasks *oldres = old;
+	struct ckrm_numtasks *newres = new;
+
+	if (oldres != (void *)-1) {
+		struct task_struct *tsk = task;
+		if (!oldres) {
+			struct ckrm_core_class *old_core =
+			    &(tsk->parent->taskclass->core);
+			oldres =
+			    ckrm_get_res_class(old_core, numtasks_rcbs.resid,
+					       struct ckrm_numtasks);
+		}
+		numtasks_put_ref_local(oldres->core);
+	}
+	if (newres) {
+		(void)numtasks_get_ref_local(newres->core, 1);
+	}
+}
+
+struct ckrm_res_ctlr numtasks_rcbs = {
+	.res_name = NUMTASKS_NAME,
+	.res_hdepth = 1,
+	.resid = -1,
+	.res_alloc = numtasks_res_alloc,
+	.res_free = numtasks_res_free,
+	.set_share_values = numtasks_set_share_values,
+	.get_share_values = numtasks_get_share_values,
+	.get_stats = numtasks_get_stats,
+	.show_config = numtasks_show_config,
+	.set_config = numtasks_set_config,
+	.change_resclass = numtasks_change_resclass,
+};
+
+int __init init_ckrm_numtasks_res(void)
+{
+	struct ckrm_classtype *clstype;
+	int resid = numtasks_rcbs.resid;
+
+	clstype = ckrm_find_classtype_by_name("taskclass");
+	if (clstype == NULL) {
+		printk(KERN_INFO " Unknown ckrm classtype<taskclass>\n");
+		return -ENOENT;
+	}
+
+	if (resid == -1) {
+		resid = ckrm_register_res_ctlr(clstype, &numtasks_rcbs);
+		printk("........init_ckrm_numtasks_res -> %d\n", resid);
+		if (resid != -1) {
+			ckrm_numtasks_register(numtasks_get_ref_local,
+					       numtasks_put_ref_local);
+			numtasks_rcbs.classtype = clstype;
+		}
+	}
+	return 0;
+}
+
+void __exit exit_ckrm_numtasks_res(void)
+{
+	if (numtasks_rcbs.resid != -1) {
+		ckrm_numtasks_register(NULL, NULL);
+	}
+	ckrm_unregister_res_ctlr(&numtasks_rcbs);
+	numtasks_rcbs.resid = -1;
+}
+
+module_init(init_ckrm_numtasks_res)
+module_exit(exit_ckrm_numtasks_res)
+
+MODULE_LICENSE("GPL");
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks_stub.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks_stub.c	2005-05-05 09:35:11.000000000 -0700
@@ -0,0 +1,53 @@
+/* ckrm_numtasks_stub.c - Stub file for the ckrm_numtasks module
+ *
+ * Copyright (C) Chandra Seetharaman,  IBM Corp. 2004
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/ckrm_tsk.h>
+
+static spinlock_t stub_lock = SPIN_LOCK_UNLOCKED;
+
+static get_ref_t real_get_ref = NULL;
+static put_ref_t real_put_ref = NULL;
+
+void ckrm_numtasks_register(get_ref_t gr, put_ref_t pr)
+{
+	spin_lock(&stub_lock);
+	real_get_ref = gr;
+	real_put_ref = pr;
+	spin_unlock(&stub_lock);
+}
+
+int numtasks_get_ref(struct ckrm_core_class *arg, int force)
+{
+	int ret = 1;
+	spin_lock(&stub_lock);
+	if (real_get_ref) {
+		ret = (*real_get_ref) (arg, force);
+	}
+	spin_unlock(&stub_lock);
+	return ret;
+}
+
+void numtasks_put_ref(struct ckrm_core_class *arg)
+{
+	spin_lock(&stub_lock);
+	if (real_put_ref) {
+		(*real_put_ref) (arg);
+	}
+	spin_unlock(&stub_lock);
+}
+
+EXPORT_SYMBOL(ckrm_numtasks_register);
+EXPORT_SYMBOL(numtasks_get_ref);
+EXPORT_SYMBOL(numtasks_put_ref);
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/Makefile	2005-05-05 09:35:09.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile	2005-05-05 09:35:11.000000000 -0700
@@ -3,5 +3,6 @@
 #
 
 obj-y += ckrm_events.o ckrm.o ckrmutils.o
-obj-$(CONFIG_CKRM_TYPE_TASKCLASS) += ckrm_tc.o
+obj-$(CONFIG_CKRM_TYPE_TASKCLASS) += ckrm_tc.o ckrm_numtasks_stub.o
 obj-$(CONFIG_CKRM_TYPE_SOCKETCLASS) += ckrm_sockc.o
+obj-$(CONFIG_CKRM_RES_NUMTASKS) += ckrm_numtasks.o
Index: linux-2.6.12-rc3-ckrm5/kernel/fork.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/fork.c	2005-05-05 09:35:02.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/fork.c	2005-05-05 09:35:11.000000000 -0700
@@ -42,6 +42,8 @@
 #include <linux/rmap.h>
 #include <linux/acct.h>
 #include <linux/ckrm_events.h>
+#include <linux/ckrm_tsk.h>
+#include <linux/ckrm_tc.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -1211,6 +1213,9 @@ long do_fork(unsigned long clone_flags,
 			clone_flags |= CLONE_PTRACE;
 	}
 
+	if (numtasks_get_ref(&current->taskclass->core, 0) == 0) {
+		return -ENOMEM;
+	}
 	p = copy_process(clone_flags, stack_start, regs, stack_size, parent_tidptr, child_tidptr, pid);
 	/*
 	 * Do this prior waking up the new thread - the thread pointer
@@ -1250,6 +1255,7 @@ long do_fork(unsigned long clone_flags,
 				ptrace_notify ((PTRACE_EVENT_VFORK_DONE << 8) | SIGTRAP);
 		}
 	} else {
+		numtasks_put_ref(&current->taskclass->core);
 		free_pidmap(pid);
 		pid = PTR_ERR(p);
 	}
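
For readers following the admission logic in numtasks_get_ref_local() and
numtasks_put_ref_local() above, here is a minimal user-space sketch of the
same borrow-from-parent idea. It is only a model under stated assumptions:
struct cls, get_ref(), put_ref() and DONTCARE are hypothetical stand-ins
invented for this illustration, and the statistics counters, atomics and
locking of the real controller are left out.

```c
#include <assert.h>
#include <stddef.h>

#define DONTCARE (-2)	/* stand-in for CKRM_SHARE_DONTCARE (assumption) */

/* Hypothetical user-space stand-in for struct ckrm_numtasks. */
struct cls {
	struct cls *parent;	/* NULL for the root class */
	int unused;		/* tasks allowed without borrowing */
	int limit;		/* hard cap, or DONTCARE */
	int cur_alloc;		/* tasks currently charged to this class */
	int borrowed;		/* how many of those the parent covers */
};

/* Returns 1 if one more task is admitted, 0 if it is refused. */
static int get_ref(struct cls *c, int force)
{
	int ok = 1;

	c->cur_alloc++;
	if ((c->parent && c->unused == DONTCARE) || c->cur_alloc > c->unused) {
		ok = 0;
		if (!force && c->limit != DONTCARE && c->cur_alloc > c->limit) {
			/* over the limit: refuse without asking the parent */
		} else if (c->parent) {
			/* over the guarantee: try to borrow from the parent */
			ok = get_ref(c->parent, force);
			if (ok)
				c->borrowed++;
		} else {
			ok = force;	/* root has nobody to borrow from */
		}
	}
	if (!ok)
		c->cur_alloc--;	/* undo the optimistic increment */
	return ok;
}

/* Drops one task; hands a borrowed slot back to the parent first. */
static void put_ref(struct cls *c)
{
	assert(c->cur_alloc > 0);
	c->cur_alloc--;
	if (c->borrowed > 0) {
		c->borrowed--;
		put_ref(c->parent);
	}
}
```

In the patch itself, do_fork() turns a 0 return from numtasks_get_ref()
into -ENOMEM, and drops the reference again if copy_process() fails.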

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 08/21] CKRM: Documentation
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (6 preceding siblings ...)
  2005-05-05 18:07 ` [patch 07/21] CKRM: Numtasks Controller gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 09/21] CKRM: Add missing read_unlock gh
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=10-diff_docs


This patch adds all current documentation on CKRM.

Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 TODO         |   16 +++++++++
 ckrm_basics  |   66 +++++++++++++++++++++++++++++++++++++++
 core_usage   |   72 +++++++++++++++++++++++++++++++++++++++++++
 crbce        |   33 +++++++++++++++++++
 installation |   70 ++++++++++++++++++++++++++++++++++++++++++
 rbce_basics  |   67 ++++++++++++++++++++++++++++++++++++++++
 rbce_usage   |   98 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 422 insertions(+)
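
The ckrm_basics file added here says that share values are relative
numbers from which absolute resource amounts are calculated. As a rough
illustration of that arithmetic (modelled on recalc_and_propagate() in the
numtasks controller of patch 07), here is a small user-space sketch.
scale_share() and DONTCARE are hypothetical names chosen for this example,
and returning DONTCARE when the share itself is DONTCARE is an assumption
of the sketch, not something the controller is known to do.

```c
#include <assert.h>
#include <stdint.h>

#define DONTCARE (-2)	/* stand-in for the DONT_CARE(-2) value in the docs */

/*
 * Convert a relative share into absolute units:
 *   absolute = share * parent_absolute / parent_total
 * where parent_total is the parent's total_guarantee (for guarantees)
 * or max_limit (for limits).
 */
static int scale_share(int share, int parent_absolute, int parent_total)
{
	/* Assumption of this sketch: an unset share stays unset. */
	if (share == DONTCARE || parent_absolute == DONTCARE)
		return DONTCARE;
	if (parent_total == 0)
		return 0;

	assert(share >= 0 && parent_absolute >= 0 && parent_total > 0);

	/* 64-bit intermediate so the product cannot overflow an int. */
	return (int)((uint64_t)share * (uint64_t)parent_absolute /
		     (uint64_t)parent_total);
}
```

With the numtasks root class holding 131072 tasks and the default
total_guarantee of 100, scale_share(20, 131072, 100) gives a child with
guarantee=20 an absolute guarantee of 26214 tasks; a grandchild with
guarantee=50 under that child gets scale_share(50, 26214, 100) = 13107.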

Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/ckrm_basics
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/ckrm_basics	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,66 @@
+CKRM Basics
+-------------
+A brief review of CKRM concepts and terminology will help make installation
+and testing easier. For more details, please visit http://ckrm.sf.net.
+
+Currently there are two class types, taskclass and socketclass for grouping,
+regulating and monitoring tasks and sockets respectively.
+
+To avoid repeating instructions for each classtype, this document assumes a
+task to be the kernel object being grouped. By and large, one can replace task
+with socket and taskclass with socketclass.
+
+RCFS depicts a CKRM class as a directory. Hierarchy of classes can be
+created in which children of a class share resources allotted to
+the parent. Tasks can be classified to any class which is at any level.
+There is no correlation between parent-child relationship of tasks and
+the parent-child relationship of classes they belong to.
+
+Without a Classification Engine, a task inherits its class from its parent.
+A privileged user can reassign a task to a class as described below, after
+which all the child tasks under that task will be assigned to that class,
+unless the user reassigns any of them.
+
+A Classification Engine, if one exists, will be used by CKRM to
+classify a task to a class. The Rule based classification engine uses some
+of the attributes of the task to classify a task. When a CE is present,
+a task does not inherit its class.
+
+Characteristics of a class can be accessed/changed through the following magic
+files under the directory representing the class:
+
+shares:  views and changes the shares of the different resources managed
+         by the class
+stats:   shows the statistics associated with each resource managed
+         by the class
+target:  assigns a task to a class. If a CE is present, assigning
+         a task to a class through this interface will prevent the CE from
+         reassigning the task to any class during reclassification.
+members: shows which tasks have been assigned to a class
+config:  views and modifies configuration information of the different
+         resources in a class.
+
+Resource allocation for a class is controlled by the following parameters:
+
+guarantee: specifies how much of a resource is guaranteed to a class. The
+           special value DONT_CARE(-2) means that no specific guarantee
+	   of the resource is specified; this class may not get any
+	   of the resource if the system is running short of it.
+limit:     specifies the maximum amount of a resource that is allowed to be
+           allocated by a class. The special value DONT_CARE(-2) means that
+	   no specific limit is specified; this class can get all
+	   the resources available.
+total_guarantee: total guarantee that is allowed among the children of this
+           class. In other words, the sum of "guarantee"s of all children
+	   of this class cannot exceed this number.
+max_limit: Maximum "limit" allowed for any of this class's children. In
+	   other words, the "limit" of any child of this class cannot exceed
+	   this value.
+
+None of these parameters is absolute or has any units associated with
+it. They are just numbers (relative to the parent's) that are
+used to calculate the absolute amount of a resource available to a specific
+class.
+
+Note: The root class has an absolute number of resource units associated with it.
+
Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/core_usage
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/core_usage	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,72 @@
+Usage of CKRM without a classification engine
+-----------------------------------------------
+
+1. Create a class
+
+   # mkdir /rcfs/taskclass/c1
+   creates a taskclass named c1, while
+   # mkdir /rcfs/socketclass/s1
+   creates a socketclass named s1
+
+The newly created class directory is automatically populated by magic files
+shares, stats, members, target and config.
+
+2. View default shares
+
+   # cat /rcfs/taskclass/c1/shares
+
+   "guarantee=-2,limit=-2,total_guarantee=100,max_limit=100" is the default
+   value set for resources that have controllers registered with CKRM.
+
+3. Change shares of a <class>
+
+   One or more of the following fields can/must be specified
+       res=<res_name> #mandatory
+       guarantee=<number>
+       limit=<number>
+       total_guarantee=<number>
+       max_limit=<number>
+   e.g.
+	# echo "res=numtasks,limit=20" > /rcfs/taskclass/c1/shares
+
+   If any of these parameters are not specified, the current value will be
+   retained.
+
+4. Reclassify a task (listening socket)
+
+   write the pid of the process to the destination class' target file
+   # echo 1004 > /rcfs/taskclass/c1/target
+
+   write the "<ipaddress>\<port>" string to the destination class' target file
+   # echo "0.0.0.0\32770"  > /rcfs/socketclass/s1/target
+
+5. Get a list of tasks (sockets) assigned to a taskclass (socketclass)
+
+   # cat /rcfs/taskclass/c1/members
+   lists pids of tasks belonging to c1
+
+   # cat /rcfs/socketclass/s1/members
+   lists the ipaddress\port of all listening sockets in s1
+
+6. Get the statistics of different resources of a class
+
+   # cat /rcfs/taskclass/c1/stats
+   shows c1's statistics for each resource with a registered resource
+   controller.
+
+   # cat /rcfs/socketclass/s1/stats
+   shows s1's stats for the listenaq controller.
+
+7. View the configuration values of the resources associated with a class
+
+   # cat /rcfs/taskclass/c1/config
+   shows per-controller config values for c1.
+
+8. Change the configuration values of resources associated with a class
+   Configuration values are different for different resources. The common
+   field "res=<resname>" must always be specified.
+
+   # echo "res=numtasks,parameter=value" > /rcfs/taskclass/c1/config
+   changes the value associated with <parameter> (currently without any effect).
+
+
Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/crbce
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/crbce	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,33 @@
+CRBCE
+----------
+
+crbce is a superset of rbce. In addition to providing automatic
+classification, the crbce module
+- monitors per-process delay data that is collected by the delay
+accounting patch
+- collects data on significant kernel events where reclassification
+could occur e.g. fork/exec/setuid/setgid etc., and
+- uses relayfs to supply both these datapoints to userspace
+
+To illustrate the utility of the data gathered by crbce, we provide a
+userspace daemon called crbcedmn that prints the header info received
+from the records sent by the crbce module.
+
+0. Ensure that a CKRM-enabled kernel has been compiled with, at a
+   minimum, core, rcfs, at least one classtype, the delay-accounting
+   patch and relayfs. For testing, it is recommended that
+   all classtypes and resource controllers be compiled as modules.
+
+1. Ensure that the Makefile has BUILD_CRBCE=1 and that KDIR points to the
+   kernel of step 0, then call make.
+   This also builds the userspace daemon, crbcedmn.
+
+2..9 Same as rbce installation and testing instructions,
+     except replacing rbce.ko with crbce.ko
+
+10. Read the pseudo daemon help file
+    # ./crbcedmn -h
+
+11. Run the crbcedmn to display all records being processed
+    # ./crbcedmn
+
Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/installation
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/installation	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,70 @@
+Kernel installation
+------------------------------
+
+<kernver> = version of mainline Linux kernel
+<ckrmver> = version of CKRM
+
+Note: It is expected that CKRM versions will change fairly rapidly. Hence once
+a CKRM version has been released for some <kernver>, it will only be made
+available for future <kernver>'s until the next CKRM version is released.
+
+1. Patch
+
+    Apply ckrm/kernel/<kernver>/ckrm-<ckrmver>.patch to a mainline kernel
+    tree with version <kernver>.
+
+    If CRBCE will be used, additionally apply the following patches, in order:
+       delayacctg-<ckrmver>.patch
+       relayfs-<ckrmver>.patch
+
+
+2. Configure
+
+Select appropriate configuration options:
+
+a. for taskclasses
+
+   General Setup-->Class Based Kernel Resource Management
+
+   [*] Class Based Kernel Resource Management
+   <M> Resource Class File System (User API)
+   [*]   Class Manager for Task Groups
+   <M>     Number of Tasks Resource Manager
+
+b. To test socketclasses and the multiple accept queue controller
+
+   General Setup-->Class Based Kernel Resource Management
+   [*] Class Based Kernel Resource Management
+   <M> Resource Class File System (User API)
+   [*]   Class Manager for socket groups
+   <M>     Multiple Accept Queues Resource Manager
+
+   Device Drivers-->Networking Support-->Networking options-->
+   [*] Network packet filtering (replaces ipchains)
+   [*] IP: TCP Multiple accept queues support
+
+c. To test CRBCE later (requires 2a.)
+
+   File Systems-->Pseudo filesystems-->
+   <M> Relayfs filesystem support
+   (enable all sub fields)
+
+   General Setup-->
+   [*] Enable delay accounting
+
+
+3. Build, boot into kernel
+
+4. Enable rcfs
+
+    # insmod <patchedtree>/fs/rcfs/rcfs.ko
+    # mount -t rcfs rcfs /rcfs
+
+    This will create the directories /rcfs/taskclass and
+    /rcfs/socketclass which are the "roots" of subtrees for creating
+    taskclasses and socketclasses respectively.
+  	
+5. Load numtasks and listenaq controllers
+
+    # insmod <patchedtree>/kernel/ckrm/ckrm_numtasks.ko
+    # insmod <patchedtree>/kernel/ckrm/ckrm_listenaq.ko
Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/rbce_basics
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/rbce_basics	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,67 @@
+Rule-based Classification Engine (RBCE)
+-------------------------------------------
+
+The ckrm/rbce directory contains the sources for two classification engines
+called rbce and crbce. Both are optional, built as kernel modules and share much
+of their codebase. Only one classification engine (CE) can be loaded at a time
+in CKRM.
+
+
+With RBCE, a user can specify rules for how tasks are classified to a
+class.  Rules are specified by one or more attribute-value pairs and
+an associated class. The tasks that match all the attr-value pairs
+will get classified to the class attached to the rule.
+
+The file rbce_info under the /rcfs/ce directory describes the functionality
+of the different files available under that directory and also
+the attributes that can be used to define rules.
+
+order: When multiple rules are defined, they are evaluated in
+	   ascending order of their "order" values. Order can be
+	   specified while defining a rule. If order is not
+	   specified, the highest order will be assigned to the rule
+	   (i.e., the new rule will be evaluated only after all the
+	   previously defined rules evaluate false). The order of
+	   rules is therefore important, as it decides which class
+	   a task will get assigned to. For example, given the two
+	   rules
+	     r1: uid=1004,order=10,class=/rcfs/taskclass/c1
+	     r2: uid=1004,cmd=grep,order=20,class=/rcfs/taskclass/c2
+	   the task "grep" executed by user 1004 will always be
+	   assigned to class /rcfs/taskclass/c1, as rule r1 is
+	   evaluated before r2 and the task matches all of r1's
+	   attr-value pairs. Rule r2 will never be consulted for the
+	   command.  Note: The order in which the rules are displayed
+	   (by ls) has no correlation with the order of the rules.
+
+dependency: A rule can be defined to depend on another rule, i.e., a
+	   rule can depend on one rule and add its own attr-value
+	   pairs. The dependent rule evaluates true only if all the
+	   attr-value pairs of both rules are satisfied. For example:
+	     r1: gid=502,class=/rcfs/taskclass
+	     r2: depend=r1,cmd=grep,class=/rcfs/taskclass/c1
+	   r2 is a dependent rule that depends on r1. A task will be
+	   assigned to /rcfs/taskclass/c1 if its gid is 502 and its
+	   executable command name is "grep". If a task's gid is 502
+	   but the command name is _not_ "grep", it will be assigned
+	   to /rcfs/taskclass.
+
+	   Note: The order of a dependent rule must be _lower_ than that
+	   of the rule it depends on, so that it is evaluated _before_
+	   the base rule is evaluated. Otherwise the base rule will
+	   evaluate true and the task will be assigned to the class of
+	   that rule without the dependent rule ever getting
+	   evaluated. In the example above, the order of r2 must be
+	   lower than the order of r1.
+
+app_tag: A task can be tagged with an ASCII string, which becomes
+	   an attribute of that task; rules can then be defined on the
+	   tag value.
+
+state: states are at two levels in RBCE. The entire RBCE can be
+	   enabled or disabled by writing 1 or 0 to the file
+	   rbce_state under /rcfs/ce.  Disabling RBCE means that
+	   the rules defined in RBCE will not be used for
+	   classifying a task to a class.  A specific rule can be
+	   enabled/disabled by changing the state of that rule. Once
+	   it is disabled, the rule will not be evaluated.
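The first-match behaviour of rule ordering described in the document above can be sketched in user-space C. This is only an illustration, not the RBCE implementation: struct demo_rule and demo_classify() are hypothetical names, only the uid and cmd attributes are modelled, and the rule array is assumed to be already sorted by ascending order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified rule: only uid and cmd are modelled. */
struct demo_rule {
	int order;		/* lower order is evaluated first */
	int uid;		/* -1 means "any uid" */
	const char *cmd;	/* NULL means "any command" */
	const char *class_path;	/* class assigned when the rule matches */
};

/*
 * Walk the rules in ascending order and return the class of the first
 * rule whose attr-value pairs all match. Later rules are never
 * consulted once a rule has matched. Returns NULL if nothing matches.
 * The caller must pass rules[] sorted by ascending order.
 */
static const char *demo_classify(const struct demo_rule *rules, size_t n,
				 int uid, const char *cmd)
{
	size_t i;

	assert(rules != NULL && cmd != NULL);
	for (i = 0; i < n; i++) {
		if (rules[i].uid != -1 && rules[i].uid != uid)
			continue;
		if (rules[i].cmd != NULL && strcmp(rules[i].cmd, cmd) != 0)
			continue;
		return rules[i].class_path;
	}
	return NULL;
}
```

With r1 (uid=1004, order 10) placed ahead of r2 (uid=1004, cmd=grep, order 20), a grep run by uid 1004 lands in r1's class. Giving the more specific rule the lower order is what lets it win.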
Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/rbce_usage
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/rbce_usage	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,98 @@
+Usage of CKRM with RBCE
+--------------------------
+
+0. Ensure that a CKRM-enabled kernel with the following options
+   configured has been compiled: at a minimum, core, rcfs and at least
+   one classtype. For testing, it is recommended that all classtypes
+   and resource controllers be compiled as modules.
+
+1. Change ckrm/rbce/Makefile's KDIR to point to this compiled kernel's source
+   tree and run make.
+
+2. Load the rbce module.
+   # insmod ckrm/rbce/rbce.ko
+   Note that /rcfs has to be mounted before this.
+   Note: this command should populate the directory /rcfs/ce with files
+   rbce_reclassify, rbce_tag, rbce_info, rbce_state and a directory
+   rules.
+
+   Note2: If these are not created automatically, just create them
+   using the commands touch and mkdir (a bug that needs to be fixed).
+
+3. Defining a rule
+   Rules are defined by writing a comma-separated list of
+   attribute-value pairs to a file under the /rcfs/ce/rules
+   directory; the file is created by the write.
+
+   Note that the classes must be defined before defining rules that
+   use them.  eg: the command # echo
+   "uid=1004,class=/rcfs/taskclass/c1" > /rcfs/ce/rules/r1 will define
+   a rule r1 that classifies all tasks belonging to user id 1004 to
+   class /rcfs/taskclass/c1
+
+4. Viewing a rule
+   Read the corresponding file.
+   To read rule r1, issue the command:
+      # cat /rcfs/ce/rules/r1
+
+5. Changing a rule
+
+   Changing a rule is done the same way as defining a rule; the new
+   rule will consist of the old set of attr-value pairs overlaid with
+   the new attr-value pairs.  eg: if the current r2 is
+   uid=1004,depend=r1,class=/rcfs/taskclass/c1
+   (r1 as defined in step 3)
+
+   the command:
+     # echo gid=502 > /rcfs/ce/rules/r1
+   will change the rule to
+     r1: uid=1004,gid=502,depend=r1,class=/rcfs/taskclass/c1
+
+   the command:
+     # echo uid=1005 > /rcfs/ce/rules/r1
+   will change the rule to
+     r1: uid=1005,class=/rcfs/taskclass/c1
+
+   the command:
+     # echo class=/rcfs/taskclass/c2 > /rcfs/ce/rules/r1
+   will change the rule to
+     r1: uid=1004,depend=r1,class=/rcfs/taskclass/c2
+
+   the command:
+     # echo depend=r4 > /rcfs/ce/rules/r1
+   will change the rule to
+     r1: uid=1004,depend=r4,class=/rcfs/taskclass/c2
+
+   the command:
+     # echo +depend=r4 > /rcfs/ce/rules/r1
+   will change the rule to
+     r1: uid=1004,depend=r1,depend=r4,class=/rcfs/taskclass/c2
+
+   the command:
+     # echo -depend=r1 > /rcfs/ce/rules/r1
+   will change the rule to
+     r1: uid=1004,class=/rcfs/taskclass/c2
+
+6. Checking the state of RBCE
+   The state (enabled/disabled) of RBCE can be checked by reading the
+   file /rcfs/ce/rbce_state; it will show 1 (enabled) or 0 (disabled).
+   By default, RBCE is enabled (1).
+   ex: # cat /rcfs/ce/rbce_state
+
+7. Changing the state of RBCE
+   The state of RBCE can be changed by writing 1 (enable) or 0 (disable).
+   ex: # echo 1 > /rcfs/ce/rbce_state
+
+8. Checking the state of a rule
+   The state of a rule is displayed in the rule. A rule can be viewed
+   by reading the rule file.  ex: # cat /rcfs/ce/rules/r1
+
+9. Changing the state of a rule
+
+   The state of a rule can be changed by writing "state=1" (enable) or
+   "state=0" (disable) to the corresponding rule file. By default, the
+   rule is enabled when defined.  ex: to disable an existing rule r1,
+   issue the command
+   # echo "state=0" > /rcfs/ce/rules/r1
+
+
Index: linux-2.6.12-rc3-ckrm5/Documentation/ckrm/TODO
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/Documentation/ckrm/TODO	2005-05-05 09:35:13.000000000 -0700
@@ -0,0 +1,16 @@
+Current tasks in queue
+
+- Use __bitfield for enums
+- Add listenaq controller
+- Add RBCE/CRBCE
+- Add memory controller
+- Add I/O controller
+- Add forkrate control to numtasks controller
+- remove target file and move functionality to members file
+- init_task is not classified under any class
+- use netlink instead of relayfs for crbce
+- convert refcount usages to kref_t
+- add kerneldoc format headers to user consumable functions/macros
+- convert ckrm_init() to use standard initcalls
+- verify that the LGPL/GPL header/symbol usage is correct
+-

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 09/21] CKRM: Add missing read_unlock
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (7 preceding siblings ...)
  2005-05-05 18:07 ` [patch 08/21] CKRM: Documentation gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 10/21] CKRM: Move Callbacks from listenaq to socketclass gh
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=03a-missing_unlock


ckrm_classobj() returns without releasing the read lock in the case
where a matching class is found. This patch fixes it.
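
The pattern being fixed, releasing a lock on an early return as well as on the fall-through path, can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel code: demo_classobj(), demo_read_lock() and demo_read_unlock() are hypothetical names, and a plain counter stands in for the read lock so that a leaked lock is observable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a named class object. */
struct demo_class {
	const char *name;
	int type_id;
};

static struct demo_class demo_classes[] = {
	{ "taskclass", 0 },
	{ "socketclass", 1 },
};

/* Counter standing in for the read lock: nonzero means "held". */
static int demo_read_held;

static void demo_read_lock(void)
{
	demo_read_held++;
}

static void demo_read_unlock(void)
{
	assert(demo_read_held > 0);
	demo_read_held--;
}

/*
 * Look up a class by name while holding the read lock. The lock is
 * dropped before the early "found" return as well as before the
 * "not found" return, so no path leaves it held.
 */
static struct demo_class *demo_classobj(const char *classname, int *type_id)
{
	size_t i;

	demo_read_lock();
	for (i = 0; i < sizeof(demo_classes) / sizeof(demo_classes[0]); i++) {
		if (!strcmp(demo_classes[i].name, classname)) {
			*type_id = demo_classes[i].type_id;
			demo_read_unlock();
			return &demo_classes[i];
		}
	}
	demo_read_unlock();
	return NULL;
}
```

An alternative is a single exit label that unlocks and returns; either way, every return path has to pair with the lock taken at the top.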

Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 ckrm.c |    1 +
 1 files changed, 1 insertion(+)

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/ckrm.c	2005-05-05 09:35:04.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm.c	2005-05-05 09:35:14.000000000 -0700
@@ -106,6 +106,7 @@ void *ckrm_classobj(char *classname, int
 			if (core->name && !strcmp(core->name, classname)) {
 				/* FIXME:   should grep reference. */
 				*classtype_id = ctype->type_id;
+				read_unlock(&ckrm_class_lock);
 				return core;
 			}
 		}

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 10/21] CKRM: Move Callbacks from listenaq to socketclass
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (8 preceding siblings ...)
  2005-05-05 18:07 ` [patch 09/21] CKRM: Add missing read_unlock gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 11/21] CKRM: Change ipaddr_port syntax gh
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=06a-ckrm_net_cb


Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

The listen start/stop callbacks were not being called from the
appropriate places by the socketclass patch; the hunks adding them
were wrongly included in the listenaq controller patch. Move them
from the listenaq controller to the socketclass patch.

 tcp.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletion(-)

Index: linux-2.6.12-rc3-ckrm5/net/ipv4/tcp.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/net/ipv4/tcp.c	2005-05-05 09:33:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/net/ipv4/tcp.c	2005-05-05 09:36:26.000000000 -0700
@@ -263,6 +263,7 @@
 #include <net/xfrm.h>
 #include <net/ip.h>
 
+#include <linux/ckrm_events.h>
 
 #include <asm/uaccess.h>
 #include <asm/ioctls.h>
@@ -496,7 +497,7 @@ int tcp_listen_start(struct sock *sk)
 
 		sk_dst_reset(sk);
 		sk->sk_prot->hash(sk);
-
+		ckrm_cb_listen_start(sk);
 		return 0;
 	}
 
@@ -529,6 +530,8 @@ static void tcp_listen_stop (struct sock
 	write_unlock_bh(&tp->syn_wait_lock);
 	tp->accept_queue = tp->accept_queue_tail = NULL;
 
+	ckrm_cb_listen_stop(sk);
+
 	if (lopt->qlen) {
 		for (i = 0; i < TCP_SYNQ_HSIZE; i++) {
 			while ((req = lopt->syn_table[i]) != NULL) {

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 11/21] CKRM: Change ipaddr_port syntax
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (9 preceding siblings ...)
  2005-05-05 18:07 ` [patch 10/21] CKRM: Move Callbacks from listenaq to socketclass gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 12/21] CKRM: Check to see if my guarantee is set to DONTCARE gh
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=06b-ckrm_sockc


Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Vivek Kashyap <kashyapv@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

Change the ipaddr_port syntax from "xxx.xxx.xxx.xxx\\YY" to
"xxx.xxx.xxx.xxx:YY" to make it easy for cut-n-paste.
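
For illustration, the new "address:port" form can be parsed in user-space C roughly as follows. This is a sketch only: demo_parse_ipaddr_port() is a hypothetical helper, it uses sscanf() rather than the v4toi() helper used in ckrm_sockc.c, and it covers only the address and port, not any field that may follow them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Parse "a.b.c.d:port" into a host-order IPv4 address and a port.
 * Returns 0 on success, -1 on malformed input (including the old
 * backslash-separated form or trailing characters).
 */
static int demo_parse_ipaddr_port(const char *s, uint32_t *addr,
				  uint16_t *port)
{
	unsigned int a, b, c, d, p;
	char trailing;

	assert(s != NULL && addr != NULL && port != NULL);

	/* Exactly five conversions must succeed; a sixth means junk. */
	if (sscanf(s, "%u.%u.%u.%u:%u%c", &a, &b, &c, &d, &p, &trailing) != 5)
		return -1;
	if (a > 255 || b > 255 || c > 255 || d > 255 || p > 65535)
		return -1;

	*addr = (a << 24) | (b << 16) | (c << 8) | d;
	*port = (uint16_t)p;
	return 0;
}
```

The colon-separated form matches what common networking tools print, which is what makes cut-n-paste into the rcfs files straightforward.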


 ckrm_sockc.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_sockc.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/ckrm_sockc.c	2005-05-05 09:35:09.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_sockc.c	2005-05-05 09:36:27.000000000 -0700
@@ -343,7 +343,7 @@ static int ckrm_sock_show_members(struct
 	class_lock(core);
 	list_for_each(lh, &core->objlist) {
 		ns = container_of(lh, struct ckrm_net_struct, ckrm_link);
-		seq_printf(seq, "%d.%d.%d.%d\\%d\n",
+		seq_printf(seq, "%d.%d.%d.%d:%d\n",
 			   NIPQUAD(ns->ns_daddrv4), ns->ns_dport);
 	}
 	class_unlock(core);
@@ -459,7 +459,7 @@ ckrm_sock_forced_reclassify(struct ckrm_
 			return -EPERM;
 		if (id != 0)
 			return -EINVAL;
-		printk("ckrm_sock_class: reclassify all not net implemented\n");
+		printk("socketclass: reclassify all not implemented yet\n");
 		return 0;
 	}
 
@@ -478,15 +478,15 @@ ckrm_sock_forced_reclassify(struct ckrm_
 			while (*p2 && (*p2 != '='))
 				++p2;
 			p2++;
-			p2 = v4toi(p2, '\\', &(v4addr));
+			p2 = v4toi(p2, ':', &(v4addr));
 			ns.ns_daddrv4 = htonl(v4addr);
 			ns.ns_family = AF_INET;
-			p2 = v4toi(++p2, ':', &tmp);
+			p2 = v4toi(++p2, '/', &tmp);
 			ns.ns_dport = (__u16) tmp;
 			if (*p2)
 				p2 = v4toi(++p2, '\0', &ns.ns_pid);
 			ckrm_sock_forced_reclassify_ns(&ns, target);
-			break;
+			return 0;
 
 		case IPV6:
 			printk(KERN_INFO "rcfs: IPV6 not supported yet\n");

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 12/21] CKRM: Check to see if my guarantee is set to DONTCARE
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (10 preceding siblings ...)
  2005-05-05 18:07 ` [patch 11/21] CKRM: Change ipaddr_port syntax gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 13/21] CKRM: Minor cosmetic cleanups in numtasks controller gh
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=07a-numtasks_config


Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Vivek Kashyap <kashyapv@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

recalc_and_propagate() was not checking the class's my_guarantee and
my_limit against CKRM_SHARE_DONTCARE. This was leading to various weird
problems. This patch fixes it.
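
The shape of the corrected calculation can be sketched in user-space C. This is an illustration under assumptions, not the controller code: demo_child_units() is a hypothetical helper, and DEMO_SHARE_DONTCARE is an illustrative sentinel whose value is not taken from the CKRM headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sentinel only; not the value CKRM itself defines. */
#define DEMO_SHARE_DONTCARE (-2)

/*
 * Compute a child's absolute units from its relative share.
 * The sentinel is checked on both the parent's units and the child's
 * own share before any arithmetic, so it is never fed into the
 * proportional calculation as if it were a real share.
 */
static int demo_child_units(int parent_units, int my_share, int parent_total)
{
	uint64_t temp;

	if (parent_units == DEMO_SHARE_DONTCARE ||
	    my_share == DEMO_SHARE_DONTCARE)
		return DEMO_SHARE_DONTCARE;

	assert(parent_units >= 0 && my_share >= 0);

	if (parent_total > 0) {
		temp = (uint64_t)my_share * (uint64_t)parent_units;
		return (int)(temp / (uint64_t)parent_total);
	}
	return 0;
}
```

The same guard applies to the limit calculation, with the limit fields in place of the guarantee fields.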

 ckrm_numtasks.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:35:11.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:36:29.000000000 -0700
@@ -296,7 +296,8 @@ recalc_and_propagate(struct ckrm_numtask
 		struct ckrm_shares *self = &res->shares;
 
 		/* calculate cnt_guarantee and cnt_limit */
-		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+		if ((parres->cnt_guarantee == CKRM_SHARE_DONTCARE) ||
+				(self->my_guarantee == CKRM_SHARE_DONTCARE)) {
 			res->cnt_guarantee = CKRM_SHARE_DONTCARE;
 		} else if (par->total_guarantee) {
 			u64 temp = (u64) self->my_guarantee * parres->cnt_guarantee;
@@ -306,7 +307,8 @@ recalc_and_propagate(struct ckrm_numtask
 			res->cnt_guarantee = 0;
 		}
 
-		if (parres->cnt_limit == CKRM_SHARE_DONTCARE) {
+		if ((parres->cnt_limit == CKRM_SHARE_DONTCARE) ||
+				(self->my_limit == CKRM_SHARE_DONTCARE)) {
 			res->cnt_limit = CKRM_SHARE_DONTCARE;
 		} else if (par->max_limit) {
 			u64 temp = (u64) self->my_limit * parres->cnt_limit;
@@ -317,7 +319,8 @@ recalc_and_propagate(struct ckrm_numtask
 		}
 
 		/* Calculate unused units */
-		if (res->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+		if ((res->cnt_guarantee == CKRM_SHARE_DONTCARE) ||
+				(self->my_guarantee == CKRM_SHARE_DONTCARE)) {
 			res->cnt_unused = CKRM_SHARE_DONTCARE;
 		} else if (self->total_guarantee) {
 			u64 temp = (u64) self->unused_guarantee * res->cnt_guarantee;

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 13/21] CKRM: Minor cosmetic cleanups in numtasks controller
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (11 preceding siblings ...)
  2005-05-05 18:07 ` [patch 12/21] CKRM: Check to see if my guarantee is set to DONTCARE gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 14/21] CKRM: undo removal of check in numtasks_put_ref_local gh
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=07c-numtasks_cleanup


Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

Simple code cleanup. No functional changes.

 ckrm_numtasks.c |   98 +++++++++++++++++---------------------------------------
 1 files changed, 30 insertions(+), 68 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:36:29.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:37:41.000000000 -0700
@@ -99,17 +99,6 @@ static void numtasks_res_initcls_one(str
 	return;
 }
 
-#if 0
-static void numtasks_res_initcls(void *my_res)
-{
-	struct ckrm_numtasks *res = my_res;
-
-	/* Write a version which propagates values all the way down
-	   and replace rcbs callback with that version */
-
-}
-#endif
-
 static int numtasks_get_ref_local(struct ckrm_core_class *core, int force)
 {
 	int rc, resid = numtasks_rcbs.resid;
@@ -144,29 +133,24 @@ static int numtasks_get_ref_local(struct
 				res->borrow_failures++;
 				res->tot_borrow_failures++;
 			}
-		} else {
+		} else
 			rc = force;
-		}
 	} else if (res->over_guarantee) {
 		res->over_guarantee = 0;
 
-		if (res->max_limit_failures < res->limit_failures) {
+		if (res->max_limit_failures < res->limit_failures)
 			res->max_limit_failures = res->limit_failures;
-		}
-		if (res->max_borrow_sucesses < res->borrow_sucesses) {
+		if (res->max_borrow_sucesses < res->borrow_sucesses)
 			res->max_borrow_sucesses = res->borrow_sucesses;
-		}
-		if (res->max_borrow_failures < res->borrow_failures) {
+		if (res->max_borrow_failures < res->borrow_failures)
 			res->max_borrow_failures = res->borrow_failures;
-		}
 		res->limit_failures = 0;
 		res->borrow_sucesses = 0;
 		res->borrow_failures = 0;
 	}
 
-	if (!rc) {
+	if (!rc)
 		atomic_dec(&res->cnt_cur_alloc);
-	}
 	return rc;
 }
 
@@ -175,18 +159,12 @@ static void numtasks_put_ref_local(struc
 	int resid = numtasks_rcbs.resid;
 	struct ckrm_numtasks *res;
 
-	if ((resid == -1) || (core == NULL)) {
+	if ((resid == -1) || (core == NULL))
 		return;
-	}
 
 	res = ckrm_get_res_class(core, resid, struct ckrm_numtasks);
 	if (res == NULL)
 		return;
-	if (unlikely(atomic_read(&res->cnt_cur_alloc) == 0)) {
-		printk(KERN_WARNING "numtasks_put_ref: Trying to decrement "
-					"counter below 0\n");
-		return;
-	}
 	atomic_dec(&res->cnt_cur_alloc);
 	if (atomic_read(&res->cnt_borrowed) > 0) {
 		atomic_dec(&res->cnt_borrowed);
@@ -242,19 +220,10 @@ static void numtasks_res_free(void *my_r
 
 	parres = ckrm_get_res_class(res->parent, resid, struct ckrm_numtasks);
 
-	if (unlikely(atomic_read(&res->cnt_cur_alloc) < 0)) {
-		printk(KERN_WARNING "numtasks_res: counter below 0\n");
-	}
-	if (unlikely(atomic_read(&res->cnt_cur_alloc) > 0 ||
-				atomic_read(&res->cnt_borrowed) > 0)) {
-		printk(KERN_WARNING "numtasks_res_free: resource still "
-		       "alloc'd %p\n", res);
-		if ((borrowed = atomic_read(&res->cnt_borrowed)) > 0) {
-			for (i = 0; i < borrowed; i++) {
-				numtasks_put_ref_local(parres->core);
-			}
-		}
-	}
+	if ((borrowed = atomic_read(&res->cnt_borrowed)) > 0)
+		for (i = 0; i < borrowed; i++)
+			numtasks_put_ref_local(parres->core);
+
 	/* return child's limit/guarantee to parent node */
 	spin_lock(&parres->cnt_lock);
 	child_guarantee_changed(&parres->shares, res->shares.my_guarantee, 0);
@@ -264,14 +233,12 @@ static void numtasks_res_free(void *my_r
 	maxlimit = 0;
 	while ((child = ckrm_get_next_child(parres->core, child)) != NULL) {
 		childres = ckrm_get_res_class(child, resid, struct ckrm_numtasks);
-		if (maxlimit < childres->shares.my_limit) {
+		if (maxlimit < childres->shares.my_limit)
 			maxlimit = childres->shares.my_limit;
-		}
 	}
 	ckrm_unlock_hier(parres->core);
-	if (parres->shares.cur_max_limit < maxlimit) {
+	if (parres->shares.cur_max_limit < maxlimit)
 		parres->shares.cur_max_limit = maxlimit;
-	}
 
 	spin_unlock(&parres->cnt_lock);
 	kfree(res);
@@ -297,39 +264,37 @@ recalc_and_propagate(struct ckrm_numtask
 
 		/* calculate cnt_guarantee and cnt_limit */
 		if ((parres->cnt_guarantee == CKRM_SHARE_DONTCARE) ||
-				(self->my_guarantee == CKRM_SHARE_DONTCARE)) {
+				(self->my_guarantee == CKRM_SHARE_DONTCARE))
 			res->cnt_guarantee = CKRM_SHARE_DONTCARE;
-		} else if (par->total_guarantee) {
+		else if (par->total_guarantee) {
 			u64 temp = (u64) self->my_guarantee * parres->cnt_guarantee;
 			do_div(temp, par->total_guarantee);
 			res->cnt_guarantee = (int) temp;
-		} else {
+		} else
 			res->cnt_guarantee = 0;
-		}
 
 		if ((parres->cnt_limit == CKRM_SHARE_DONTCARE) ||
-				(self->my_limit == CKRM_SHARE_DONTCARE)) {
+				(self->my_limit == CKRM_SHARE_DONTCARE))
 			res->cnt_limit = CKRM_SHARE_DONTCARE;
-		} else if (par->max_limit) {
+		else if (par->max_limit) {
 			u64 temp = (u64) self->my_limit * parres->cnt_limit;
 			do_div(temp, par->max_limit);
 			res->cnt_limit = (int) temp;
-		} else {
+		} else
 			res->cnt_limit = 0;
-		}
 
 		/* Calculate unused units */
 		if ((res->cnt_guarantee == CKRM_SHARE_DONTCARE) ||
-				(self->my_guarantee == CKRM_SHARE_DONTCARE)) {
+				(self->my_guarantee == CKRM_SHARE_DONTCARE))
 			res->cnt_unused = CKRM_SHARE_DONTCARE;
-		} else if (self->total_guarantee) {
+		else if (self->total_guarantee) {
 			u64 temp = (u64) self->unused_guarantee * res->cnt_guarantee;
 			do_div(temp, self->total_guarantee);
 			res->cnt_unused = (int) temp;
-		} else {
+		} else
 			res->cnt_unused = 0;
-		}
 	}
+
 	/* propagate to children */
 	ckrm_lock_hier(res->core);
 	while ((child = ckrm_get_next_child(res->core, child)) != NULL) {
@@ -368,21 +333,19 @@ static int numtasks_set_share_values(voi
 
 	if ((rc == 0) && parres) {
 		/* Calculate parent's unused units */
-		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE) {
+		if (parres->cnt_guarantee == CKRM_SHARE_DONTCARE)
 			parres->cnt_unused = CKRM_SHARE_DONTCARE;
-		} else if (par->total_guarantee) {
+		else if (par->total_guarantee) {
 			u64 temp = (u64) par->unused_guarantee * parres->cnt_guarantee;
 			do_div(temp, par->total_guarantee);
 			parres->cnt_unused = (int) temp;
-		} else {
+		} else
 			parres->cnt_unused = 0;
-		}
 		recalc_and_propagate(res, parres);
 	}
 	spin_unlock(&res->cnt_lock);
-	if (res->parent) {
+	if (res->parent)
 		spin_unlock(&parres->cnt_lock);
-	}
 	return rc;
 }
 
@@ -403,7 +366,7 @@ static int numtasks_get_stats(void *my_r
 	if (!res)
 		return -EINVAL;
 
-	seq_printf(sfile, "Number of tasks resource:\n");
+	seq_printf(sfile, "---------Number of tasks stats start---------\n");
 	seq_printf(sfile, "Total Over limit failures: %d\n",
 		   res->tot_limit_failures);
 	seq_printf(sfile, "Total Over guarantee sucesses: %d\n",
@@ -417,6 +380,7 @@ static int numtasks_get_stats(void *my_r
 		   res->max_borrow_sucesses);
 	seq_printf(sfile, "Maximum Over guarantee failures: %d\n",
 		   res->max_borrow_failures);
+	seq_printf(sfile, "---------Number of tasks stats end---------\n");
 #ifdef NUMTASKS_DEBUG
 	seq_printf(sfile,
 		   "cur_alloc %d; borrowed %d; cnt_guar %d; cnt_limit %d "
@@ -468,9 +432,8 @@ static void numtasks_change_resclass(voi
 		}
 		numtasks_put_ref_local(oldres->core);
 	}
-	if (newres) {
+	if (newres)
 		(void)numtasks_get_ref_local(newres->core, 1);
-	}
 }
 
 struct ckrm_res_ctlr numtasks_rcbs = {
@@ -512,9 +475,8 @@ int __init init_ckrm_numtasks_res(void)
 
 void __exit exit_ckrm_numtasks_res(void)
 {
-	if (numtasks_rcbs.resid != -1) {
+	if (numtasks_rcbs.resid != -1)
 		ckrm_numtasks_register(NULL, NULL);
-	}
 	ckrm_unregister_res_ctlr(&numtasks_rcbs);
 	numtasks_rcbs.resid = -1;
 }

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 14/21] CKRM: undo removal of check in numtasks_put_ref_local
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (12 preceding siblings ...)
  2005-05-05 18:07 ` [patch 13/21] CKRM: Minor cosmetic cleanups in numtasks controller gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 15/21] CKRM: Rule Based Classification Engine, stub rcfs support gh
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=07c2-numtasks-undo-delete


Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

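The check that stops numtasks_put_ref_local() from decrementing
cnt_cur_alloc when it is already 0 was removed by the previous cleanup
patch without realizing it. Putting it back.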

 ckrm_numtasks.c |    3 +++
 1 files changed, 3 insertions(+)

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:37:41.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm_numtasks.c	2005-05-05 09:37:55.000000000 -0700
@@ -165,6 +165,9 @@ static void numtasks_put_ref_local(struc
 	res = ckrm_get_res_class(core, resid, struct ckrm_numtasks);
 	if (res == NULL)
 		return;
+
+	if (atomic_read(&res->cnt_cur_alloc) == 0)
+		return;
 	atomic_dec(&res->cnt_cur_alloc);
 	if (atomic_read(&res->cnt_borrowed) > 0) {
 		atomic_dec(&res->cnt_borrowed);

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 15/21] CKRM: Rule Based Classification Engine, stub rcfs support
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (13 preceding siblings ...)
  2005-05-05 18:07 ` [patch 14/21] CKRM: undo removal of check in numtasks_put_ref_local gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 16/21] CKRM: Rule Based Classification Engine, basic " gh
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=09-01-rbce_fs


Part 1 of 5 patches to support Rule Based Classification Engine for CKRM.
This patch provides the basic rcfs interface for rbce. It just provides
the interface with stub functions.

Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 include/linux/rbce.h             |   28 ++
 init/Kconfig                     |   12 +
 kernel/ckrm/Makefile             |    1 
 kernel/ckrm/rbce/Makefile        |    6 
 kernel/ckrm/rbce/rbce_fs.c       |  365 +++++++++++++++++++++++++++++++++++++++
 kernel/ckrm/rbce/rbce_internal.h |   59 ++++++
 kernel/ckrm/rbce/rbce_main.c     |   97 ++++++++++
 7 files changed, 568 insertions(+)

Index: linux-2.6.12-rc3-ckrm5/include/linux/rbce.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/rbce.h	2005-05-05 09:38:01.000000000 -0700
@@ -0,0 +1,28 @@
+/* Rule-based Classification Engine (RBCE) module
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *
+ * Module for loading of classification policies and providing
+ * a user API for Class-based Kernel Resource Management (CKRM)
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ *
+ */
+
+#ifndef _LINUX_RBCE_H
+#define _LINUX_RBCE_H
+
+#define RBCE_MOD_DESCR "Rule Based Classification Engine Module for CKRM"
+#define RBCE_MOD_NAME  "rbce"
+
+#endif	/* _LINUX_RBCE_H */
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:35:11.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:38:01.000000000 -0700
@@ -203,6 +203,18 @@ config CKRM_RES_NUMTASKS
 	
 	  Say N if unsure, Y to use the feature.
 
+
+config CKRM_RBCE
+	tristate "Vanilla Rule-based Classification Engine (RBCE)"
+	depends on CKRM && RCFS_FS
+	default m
+	help
+	  Provides an optional module to support creation of rules for automatic
+	  classification of kernel objects. Rules are created/deleted/modified
+	  through an rcfs interface. RBCE is not required for CKRM.
+
+	  If unsure, say N.
+
 endmenu
 
 config SYSCTL
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/Makefile	2005-05-05 09:35:11.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/Makefile	2005-05-05 09:38:01.000000000 -0700
@@ -6,3 +6,4 @@ obj-y += ckrm_events.o ckrm.o ckrmutils.
 obj-$(CONFIG_CKRM_TYPE_TASKCLASS) += ckrm_tc.o ckrm_numtasks_stub.o
 obj-$(CONFIG_CKRM_TYPE_SOCKETCLASS) += ckrm_sockc.o
 obj-$(CONFIG_CKRM_RES_NUMTASKS) += ckrm_numtasks.o
+obj-$(CONFIG_CKRM_RBCE) += rbce/
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:01.000000000 -0700
@@ -0,0 +1,6 @@
+#
+# Makefile for RBCE
+#
+
+obj-$(CONFIG_CKRM_RBCE)	+= rbce.o
+rbce-objs := rbce_fs.o rbce_main.o
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_fs.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_fs.c	2005-05-05 09:38:01.000000000 -0700
@@ -0,0 +1,365 @@
+/*
+ * RCFS API for Rule-based Classification Engine (RBCE)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *           (C) Vivek Kashyap, IBM Corp. 2004
+ *
+ * Module for loading of classification policies and providing
+ * a user API for Class-based Kernel Resource Management (CKRM)
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include <linux/pagemap.h>
+#include <linux/rcfs.h>
+#include "rbce_internal.h"
+
+#define CONFIG_CE_DIR		"ce"
+#define CONFIG_RULES_DIR	"rules"
+#define CONFIG_RBCE_STATE	"rbce_state"
+#define CONFIG_RBCE_TAG		"rbce_tag"
+
+static int rbce_unlink(struct inode *, struct dentry *);
+
+static ssize_t
+rbce_write(struct file *file, const char __user * buf,
+	   size_t len, loff_t * ppos)
+{
+	char *line, *ptr;
+	int rc = 0, pid;
+
+	line = kmalloc(len + 1, GFP_KERNEL);
+	if (!line) {
+		return -ENOMEM;
+	}
+	if (copy_from_user(line, buf, len)) {
+		kfree(line);
+		return -EFAULT;
+	}
+	line[len] = '\0';
+	ptr = line;
+	while (*ptr) {
+		if (*ptr == '\n') {
+			*ptr = '\0';
+			break;
+		}
+		ptr++;
+	}
+	if (!strcmp(file->f_dentry->d_name.name, CONFIG_RBCE_TAG)) {
+		pid = simple_strtol(line, &ptr, 0);
+		rc = rbce_set_tasktag(pid, ptr + 1); /* syntax "pid tag" */
+	} else if (!strcmp(file->f_dentry->d_name.name, CONFIG_RBCE_STATE))
+		rbce_enabled = line[0] - '0';
+	else
+		rc = rbce_change_rule(file->f_dentry->d_name.name, line);
+	if (rc)
+		len = rc;
+	kfree(line);
+	return len;
+}
+
+static int rbce_show(struct seq_file *seq, void *offset)
+{
+	struct file *file = (struct file *)seq->private;
+	char result[256];
+
+	memset(result, 0, 256);
+	if (!strcmp(file->f_dentry->d_name.name, CONFIG_RBCE_TAG))
+		return -EPERM;
+	if (!strcmp(file->f_dentry->d_name.name, CONFIG_RBCE_STATE))
+		seq_printf(seq, "%d\n", rbce_enabled);
+	else {
+		rbce_get_rule(file->f_dentry->d_name.name, result);
+		seq_printf(seq, "%s\n", result);
+	}
+	return 0;
+}
+
+static int rbce_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, rbce_show, file);
+}
+
+static int rbce_close(struct inode *ino, struct file *file)
+{
+	const char *name = file->f_dentry->d_name.name;
+
+	if (strcmp(name, CONFIG_RBCE_STATE) &&
+			strcmp(name, CONFIG_RBCE_TAG) &&
+			!rbce_rule_exists(name))
+		rbce_unlink(file->f_dentry->d_parent->d_inode, file->f_dentry);
+	return single_release(ino, file);
+}
+
+static struct file_operations rbce_file_operations;
+static struct inode_operations rbce_file_inode_operations;
+static struct inode_operations rbce_dir_inode_operations;
+
+static struct inode *rbce_get_inode(struct inode *dir, int mode, dev_t dev)
+{
+	struct inode *inode = new_inode(dir->i_sb);
+
+	if (inode) {
+		inode->i_mode = mode;
+		inode->i_uid = current->fsuid;
+		inode->i_gid = current->fsgid;
+		inode->i_blksize = PAGE_CACHE_SIZE;
+		inode->i_blocks = 0;
+		inode->i_mapping->a_ops = dir->i_mapping->a_ops;
+		inode->i_mapping->backing_dev_info =
+		    dir->i_mapping->backing_dev_info;
+		inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+		switch (mode & S_IFMT) {
+		default:
+			init_special_inode(inode, mode, dev);
+			break;
+		case S_IFREG:
+			/* Treat as default assignment */
+			inode->i_op = &rbce_file_inode_operations;
+			inode->i_fop = &rbce_file_operations;
+			break;
+		case S_IFDIR:
+			inode->i_op = &rbce_dir_inode_operations;
+			inode->i_fop = &simple_dir_operations;
+
+			/* directory inodes start off with i_nlink == 2
+			   (for "." entry) */
+			inode->i_nlink++;
+			break;
+		}
+	}
+	return inode;
+}
+
+/*
+ * File creation. Allocate an inode, and we're done..
+ */
+static int
+rbce_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
+{
+	struct inode *inode = rbce_get_inode(dir, mode, dev);
+	int error = -ENOSPC;
+
+	if (inode) {
+		if (dir->i_mode & S_ISGID) {
+			inode->i_gid = dir->i_gid;
+			if (S_ISDIR(mode))
+				inode->i_mode |= S_ISGID;
+		}
+		d_instantiate(dentry, inode);
+		dget(dentry);	/* Extra count - pin the dentry in core */
+		error = 0;
+
+	}
+	return error;
+}
+
+static int rbce_unlink(struct inode *dir, struct dentry *dentry)
+{
+	struct inode *inode = dentry->d_inode;
+	int rc;
+
+	rc = rbce_delete_rule(dentry->d_name.name);
+	if (rc == 0) {
+		if (dir)
+			dir->i_ctime = dir->i_mtime = CURRENT_TIME;
+		inode->i_ctime = CURRENT_TIME;
+		inode->i_nlink--;
+		dput(dentry);
+	}
+	return rc;
+}
+
+static int
+rbce_rename(struct inode *old_dir, struct dentry *old_dentry,
+	    struct inode *new_dir, struct dentry *new_dentry)
+{
+	int rc;
+	struct inode *inode = old_dentry->d_inode;
+	struct dentry *old_d = list_entry(old_dir->i_dentry.next,
+					  struct dentry, d_alias);
+	struct dentry *new_d = list_entry(new_dir->i_dentry.next,
+					  struct dentry, d_alias);
+
+	/* Do not allow renaming any directory */
+	if (S_ISDIR(old_dentry->d_inode->i_mode))
+		return -EINVAL;
+
+	/* Do not allow renaming files just under /ce */
+	if (!strcmp(old_d->d_name.name, CONFIG_CE_DIR))
+		return -EINVAL;
+
+	/* cannot move anything to /ce */
+	if (!strcmp(new_d->d_name.name, CONFIG_CE_DIR))
+		return -EINVAL;
+
+	rc = rbce_rename_rule(old_dentry->d_name.name, new_dentry->d_name.name);
+
+	if (!rc)
+		old_dir->i_ctime = old_dir->i_mtime = new_dir->i_ctime =
+		    new_dir->i_mtime = inode->i_ctime = CURRENT_TIME;
+	return rc;
+}
+
+/* CE allows only /rcfs/ce and its rules subdirectory to be created */
+int rbce_mkdir(struct inode *dir, struct dentry *dentry, int mode)
+{
+	int retval = -EINVAL;
+
+	struct dentry *pd =
+	    list_entry(dir->i_dentry.next, struct dentry, d_alias);
+
+	/* Allow only /rcfs/ce and ce/rules */
+	if ((!strcmp(pd->d_name.name, CONFIG_CE_DIR) &&
+			!strcmp(dentry->d_name.name, CONFIG_RULES_DIR)) ||
+			(!strcmp(pd->d_name.name, "/") &&
+			!strcmp(dentry->d_name.name, CONFIG_CE_DIR))) {
+		if (!strcmp(dentry->d_name.name, CONFIG_CE_DIR))
+			try_module_get(THIS_MODULE);
+		retval = rbce_mknod(dir, dentry, mode | S_IFDIR, 0);
+		if (!retval)
+			dir->i_nlink++;
+	}
+
+	return retval;
+}
+
+/* Remove a CE directory; drop the module reference taken for /ce */
+int rbce_rmdir(struct inode *dir, struct dentry *dentry)
+{
+	int rc;
+	rc = simple_rmdir(dir, dentry);
+
+	if (!rc && !strcmp(dentry->d_name.name, CONFIG_CE_DIR))
+		module_put(THIS_MODULE);
+	return rc;
+}
+
+static int
+rbce_create(struct inode *dir, struct dentry *dentry,
+	    int mode, struct nameidata *nd)
+{
+	struct dentry *pd =
+	    list_entry(dir->i_dentry.next, struct dentry, d_alias);
+
+	/* No creation allowed under ce */
+	if (!strcmp(pd->d_name.name, CONFIG_CE_DIR))
+		return -EINVAL;
+
+	return rbce_mknod(dir, dentry, mode | S_IFREG, 0);
+}
+
+static int rbce_link(struct dentry *old_d, struct inode *dir, struct dentry *d)
+{
+	return -EINVAL;
+}
+
+static int
+rbce_symlink(struct inode *dir, struct dentry *dentry, const char *symname)
+{
+	return -EINVAL;
+}
+
+/******************************* Config files  ********************/
+
+#define RBCE_NR_CONFIG 5
+struct rcfs_magf rbce_config_files[RBCE_NR_CONFIG] = {
+	{
+	 .name = CONFIG_CE_DIR,
+	 .mode = RCFS_DEFAULT_DIR_MODE,
+	 .i_op = &rbce_dir_inode_operations,
+	 },
+	{
+	 .name = CONFIG_RBCE_TAG,
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_fop = &rbce_file_operations,
+	 },
+	{
+	 .name = CONFIG_RBCE_STATE,
+	 .mode = RCFS_DEFAULT_FILE_MODE,
+	 .i_fop = &rbce_file_operations,
+	 },
+	{
+	 .name = CONFIG_RULES_DIR,
+	 .mode = (RCFS_DEFAULT_DIR_MODE | S_IWUSR),
+	 .i_fop = &simple_dir_operations,
+	 .i_op = &rbce_dir_inode_operations,
+	 }
+};
+
+static struct dentry *ce_root_dentry;
+
+int rbce_create_config(void)
+{
+	int rc;
+
+	/* Make root dentry */
+	rc = rcfs_mkroot(rbce_config_files, RBCE_NR_CONFIG, &ce_root_dentry);
+	if ((!ce_root_dentry) || rc)
+		return rc;
+
+	/* Create config files */
+	if ((rc = rcfs_create_magic(ce_root_dentry, &rbce_config_files[1],
+				    RBCE_NR_CONFIG - 1))) {
+		printk(KERN_ERR "Failed to create c/rbce config files."
+		       " Deleting c/rbce root\n");
+		rcfs_rmroot(ce_root_dentry);
+		return rc;
+	}
+
+	return rc;
+}
+
+int rbce_clear_config(void)
+{
+	int rc = 0;
+	if (ce_root_dentry)
+		rc = rcfs_rmroot(ce_root_dentry);
+	return rc;
+}
+
+/******************************* File ops ********************/
+
+static struct file_operations rbce_file_operations = {
+	.owner = THIS_MODULE,
+	.open = rbce_open,
+	.llseek = seq_lseek,
+	.read = seq_read,
+	.write = rbce_write,
+	.release = rbce_close,
+};
+
+static struct inode_operations rbce_file_inode_operations = {
+	.getattr = simple_getattr,
+};
+
+static struct inode_operations rbce_dir_inode_operations = {
+	.create = rbce_create,
+	.lookup = simple_lookup,
+	.link = rbce_link,
+	.unlink = rbce_unlink,
+	.symlink = rbce_symlink,
+	.mkdir = rbce_mkdir,
+	.rmdir = rbce_rmdir,
+	.mknod = rbce_mknod,
+	.rename = rbce_rename,
+	.getattr = simple_getattr,
+};
+
+struct rbce_eng_callback rbce_rcfs_ecbs = {
+	rbce_mkdir,
+	rbce_rmdir,
+	rbce_create_config,
+	rbce_clear_config
+};
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:01.000000000 -0700
@@ -0,0 +1,59 @@
+/*
+ * rbce internal header file.
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *           (C) Vivek Kashyap, IBM Corp. 2004
+ *
+ * Module for loading of classification policies and providing
+ * a user API for Class-based Kernel Resource Management (CKRM)
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+#ifndef _RBCE_INTERNAL_H
+#define _RBCE_INTERNAL_H
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/mount.h>
+#include <linux/proc_fs.h>
+#include <linux/limits.h>
+#include <linux/pid.h>
+#include <linux/sysctl.h>
+
+#include <linux/ckrm_rc.h>
+#include <linux/ckrm_ce.h>
+#include <linux/ckrm_net.h>
+#include <linux/rbce.h>
+
+#include <asm/io.h>
+#include <asm/uaccess.h>
+
+extern int rbce_enabled;
+extern struct rbce_eng_callback rbce_rcfs_ecbs;
+
+extern int rbce_mkdir(struct inode *, struct dentry *, int);
+extern int rbce_rmdir(struct inode *, struct dentry *);
+extern int rbce_create_config(void);
+extern int rbce_clear_config(void);
+
+extern void rbce_get_rule(const char *, char *);
+extern int rbce_rule_exists(const char *);
+extern int rbce_change_rule(const char *, char *);
+extern int rbce_delete_rule(const char *);
+extern int rbce_set_tasktag(int, char *);
+extern int rbce_rename_rule(const char *, const char *);
+
+#endif /* _RBCE_INTERNAL_H */
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:01.000000000 -0700
@@ -0,0 +1,97 @@
+/*
+ * Rule Functionality and module initialization and destruction
+ * for Rule-based Classification Engine (RBCE)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *           (C) Vivek Kashyap, IBM Corp. 2004
+ *
+ * Module for loading of classification policies and providing
+ * a user API for Class-based Kernel Resource Management (CKRM)
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include "rbce_internal.h"
+
+MODULE_DESCRIPTION(RBCE_MOD_DESCR);
+MODULE_AUTHOR("Hubertus Franke, Chandra Seetharaman (IBM)");
+MODULE_LICENSE("GPL");
+
+static char modname[] = RBCE_MOD_NAME;
+
+/* Stub routines for now */
+void rbce_get_rule(const char *a, char *b)
+{
+}
+int rbce_rule_exists(const char *a)
+{
+	return 0;
+}
+int rbce_change_rule(const char *a, char *b)
+{
+	return 1;
+}
+int rbce_delete_rule(const char *a)
+{
+	return 1;
+}
+int rbce_set_tasktag(int i, char *a)
+{
+	return 1;
+}
+int rbce_rename_rule(const char *a, const char *b)
+{
+	return 1;
+}
+
+int rbce_enabled = 1;
+/* ======================= Module definition Functions ====================== */
+
+int init_rbce(void)
+{
+	int rc, line;
+
+	printk(KERN_INFO "Installing \'%s\' module\n", modname);
+
+	rc = rcfs_register_engine(&rbce_rcfs_ecbs);
+	line = __LINE__;
+	if (rc)
+		goto out;
+
+	if (rcfs_mounted) {
+		rc = rbce_create_config();
+		line = __LINE__;
+		if (rc)	/* undo the registration on failure */
+			rcfs_unregister_engine(&rbce_rcfs_ecbs);
+	}
+
+out:
+	if (rc)	/* report only real failures */
+		printk(KERN_ERR "%s: error installing rc=%d line=%d\n",
+			__FUNCTION__, rc, line);
+	return rc;
+}
+
+void exit_rbce(void)
+{
+	printk(KERN_INFO "Removing \'%s\' module\n", modname);
+
+	if (rcfs_mounted)
+		rbce_clear_config();
+
+	rcfs_unregister_engine(&rbce_rcfs_ecbs);
+}
+
+module_init(init_rbce);
+module_exit(exit_rbce);

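For reference, the tag file handler in rbce_write() above expects its
input as "pid tag" and hands ptr + 1 to rbce_set_tasktag() without
checking that a number or a separator was actually seen. Below is a
minimal user-space sketch of a stricter split. It assumes strtol() in
place of the kernel's simple_strtol(), and parse_pid_tag() is a
hypothetical helper that is not part of this patch.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in the patch): split "pid tag" the way
 * rbce_write() does, but reject input that has no digits or no
 * separator instead of blindly using ptr + 1.
 */
static int parse_pid_tag(char *line, long *pid, char **tag)
{
	char *end;

	errno = 0;
	*pid = strtol(line, &end, 0);
	if (end == line || errno == ERANGE)
		return -EINVAL;		/* no number, or out of range */
	if (*end != ' ' || end[1] == '\0')
		return -EINVAL;		/* missing separator or empty tag */
	*tag = end + 1;			/* tag starts after the single space */
	return 0;
}
```

A caller would pass the copied-in line buffer and, on a zero return,
forward *pid and *tag to the tag-setting routine.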
--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 16/21] CKRM: Rule Based Classification Engine, basic rcfs support
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (14 preceding siblings ...)
  2005-05-05 18:07 ` [patch 15/21] CKRM: Rule Based Classification Engine, stub rcfs support gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 17/21] CKRM: Rule Based Classification Engine, bitvector support for classification info gh
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=09-02-rbce_fs-main


Part 2 of 5 patches to support Rule Based Classification Engine for CKRM.
This patch provides the functionality needed by the rcfs interface for
ce provided in patch 1. No classification functionality yet.
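
As an illustration of the rule ordering that insert_rule() in the
rbce_main.c hunk implements, here is a small user-space sketch. It is
only a model under stated assumptions: struct order_list and
order_insert() are hypothetical names, a fixed array stands in for the
kernel list, and locking and module reference counting are left out.

```c
#include <errno.h>
#include <string.h>

#define MAX_RULES	16
#define ORDER_INCR	10	/* mirrors ORDER_COUNTER_INCR in the patch */

/* Hypothetical stand-in for one rules_list[]: the sorted order values. */
struct order_list {
	int order[MAX_RULES];
	int count;
	int next;	/* plays the role of order_counter */
};

/*
 * Insert a rule with the requested order (-1 means "append").
 * Returns the order actually assigned, or -errno.
 */
static int order_insert(struct order_list *l, int order)
{
	int pos = l->count, i;

	if (l->count == MAX_RULES)
		return -ENOSPC;
	if (l->count == 0)
		l->next = 0;		/* counter restarts on an empty list */
	if (order == -1) {
		order = l->next;	/* tail insert takes the counter value */
	} else {
		for (i = 0; i < l->count; i++) {
			if (l->order[i] == order)
				return -EEXIST;	/* orders must be unique */
			if (order < l->order[i]) {
				pos = i;	/* before first larger order */
				break;
			}
		}
		if (l->next < order)
			l->next = order;
	}
	l->next += ORDER_INCR;
	memmove(&l->order[pos + 1], &l->order[pos],
		(l->count - pos) * sizeof(l->order[0]));
	l->order[pos] = order;
	l->count++;
	return order;
}
```

Appending twice and then inserting order 5 leaves the list as 0, 5, 10,
and a second rule with order 5 is refused with -EEXIST.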

Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 Makefile        |    2 
 rbce_internal.h |  161 +++++++++
 rbce_main.c     |  993 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 rbce_token.c    |  241 +++++++++++++
 4 files changed, 1383 insertions(+), 14 deletions(-)
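
The termop_2_vecidx[] table and the RBCE_TERMFLAG_* bits added below
map each kind of rule term to one of six flag bits. The sketch that
follows shows one plausible way such a per-rule flag word could be
assembled. The code that actually fills rule->termflag is not part of
the hunks shown here, so rule_termflag() is a hypothetical helper, and
the shortened constant names are local copies of the values in
rbce_internal.h.

```c
/* Values copied from the rbce_internal.h hunk of this patch. */
enum {
	RULE_CMD_PATH = 1, RULE_CMD, RULE_ARGS,
	RULE_REAL_UID, RULE_REAL_GID,
	RULE_EFFECTIVE_UID, RULE_EFFECTIVE_GID,
	RULE_APP_TAG, RULE_IPV4, RULE_IPV6,
	RULE_DEP_RULE, RULE_INVALID
};
enum { TERM_CMD, TERM_UID, TERM_GID, TERM_TAG, TERM_IPV4, TERM_IPV6 };

static const int op_to_vecidx[RULE_INVALID] = {
	[RULE_CMD_PATH] = TERM_CMD,
	[RULE_CMD] = TERM_CMD,
	[RULE_ARGS] = TERM_CMD,
	[RULE_REAL_UID] = TERM_UID,
	[RULE_REAL_GID] = TERM_GID,
	[RULE_EFFECTIVE_UID] = TERM_UID,
	[RULE_EFFECTIVE_GID] = TERM_GID,
	[RULE_APP_TAG] = TERM_TAG,
	[RULE_IPV4] = TERM_IPV4,
	[RULE_IPV6] = TERM_IPV6,
	[RULE_DEP_RULE] = -1
};

/* Hypothetical helper: OR together the flag bits for a rule's term ops. */
static unsigned int rule_termflag(const int *ops, int n)
{
	unsigned int flag = 0;
	int i, idx;

	for (i = 0; i < n; i++) {
		if (ops[i] <= 0 || ops[i] >= RULE_INVALID)
			continue;	/* not a real term */
		idx = op_to_vecidx[ops[i]];
		if (idx >= 0)		/* dependency terms map to -1: no bit */
			flag |= 1u << idx;
	}
	return flag;
}
```

The guard on idx matters because a dependency term maps to -1, and
shifting by a negative count is undefined in C.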

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:03.000000000 -0700
@@ -3,4 +3,4 @@
 #
 
 obj-$(CONFIG_CKRM_RBCE)	+= rbce.o
-rbce-objs := rbce_fs.o rbce_main.o
+rbce-objs := rbce_fs.o rbce_main.o rbce_token.o
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:03.000000000 -0700
@@ -41,8 +41,165 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 
-extern int rbce_enabled;
+/*
+ * common data structure used for identification of class and rules
+ * in the RBCE namespace
+ */
+struct named_obj_hdr {
+	struct list_head link;
+	int referenced;
+	char *name;
+};
+
+#define GET_REF(x) ((x)->obj.referenced)
+#define INC_REF(x) (GET_REF(x)++)
+#define DEC_REF(x) (--GET_REF(x))
+
+struct rbce_class {
+	struct named_obj_hdr obj;
+	int classtype;
+	void *classobj;
+};
+
+typedef int __bitwise rbce_rule_op_t;
+enum rbce_rule_op {
+	RBCE_RULE_CMD_PATH = (__force rbce_rule_op_t) 1,
+	RBCE_RULE_CMD = (__force rbce_rule_op_t) 2,
+	RBCE_RULE_ARGS = (__force rbce_rule_op_t) 3,
+	RBCE_RULE_REAL_UID = (__force rbce_rule_op_t) 4,
+	RBCE_RULE_REAL_GID = (__force rbce_rule_op_t) 5,
+	RBCE_RULE_EFFECTIVE_UID = (__force rbce_rule_op_t) 6,
+	RBCE_RULE_EFFECTIVE_GID = (__force rbce_rule_op_t) 7,
+	RBCE_RULE_APP_TAG = (__force rbce_rule_op_t) 8,
+	RBCE_RULE_IPV4 = (__force rbce_rule_op_t) 9,
+	RBCE_RULE_IPV6 = (__force rbce_rule_op_t) 10,
+	RBCE_RULE_DEP_RULE = (__force rbce_rule_op_t) 11,
+	RBCE_RULE_INVALID = (__force rbce_rule_op_t) 12,
+	RBCE_RULE_INVALID2 = (__force rbce_rule_op_t) 13,
+};
+
+typedef int __bitwise rbce_operator_t;
+enum rbce_operator {
+	RBCE_EQUAL = (__force rbce_operator_t) 1,
+	RBCE_NOT = (__force rbce_operator_t) 2,
+	RBCE_LESS_THAN = (__force rbce_operator_t) 3,
+	RBCE_GREATER_THAN = (__force rbce_operator_t) 4,
+};
+
+struct rbce_rule_term {
+	rbce_rule_op_t op;
+	rbce_operator_t operator;
+	union {
+		char *string;	/* path, cmd, arg, tag, ipv4 and ipv6 */
+		long id;	/* uid, gid, euid, egid */
+		struct rbce_rule *deprule;
+	} u;
+};
+
+struct rbce_rule {
+	struct named_obj_hdr obj;
+	struct rbce_class *target_class;
+	int classtype;
+	int num_terms;
+	int *terms;	/* vector of indices into the global term vector */
+	int index;	/* index of this rule into the global term vector */
+	int termflag;	/* which term ids would require a recalculation */
+	int do_opt;	/* do we have to consider this rule during optimize */
+	char *strtab;	/* string table to store the strings of all terms */
+	int order;	/* order of execution of this rule */
+	int state;	/* RBCE_RULE_ENABLED/RBCE_RULE_DISABLED */
+};
+
+/* rules states */
+#define RBCE_RULE_DISABLED 0
+#define RBCE_RULE_ENABLED  1
+
+/*
+ * Data structures and macros used for optimization
+ */
+#define RBCE_TERM_CMD   (0)
+#define RBCE_TERM_UID   (1)
+#define RBCE_TERM_GID   (2)
+#define RBCE_TERM_TAG   (3)
+#define RBCE_TERM_IPV4  (4)
+#define RBCE_TERM_IPV6  (5)
+
+#define NUM_TERM_MASK_VECTOR  (6)
+
+/* Rule flags. 1 bit for each type of rule term */
+#define RBCE_TERMFLAG_CMD   (1 << RBCE_TERM_CMD)
+#define RBCE_TERMFLAG_UID   (1 << RBCE_TERM_UID)
+#define RBCE_TERMFLAG_GID   (1 << RBCE_TERM_GID)
+#define RBCE_TERMFLAG_TAG   (1 << RBCE_TERM_TAG)
+#define RBCE_TERMFLAG_IPV4  (1 << RBCE_TERM_IPV4)
+#define RBCE_TERMFLAG_IPV6  (1 << RBCE_TERM_IPV6)
+#define RBCE_TERMFLAG_ALL      (RBCE_TERMFLAG_CMD | RBCE_TERMFLAG_UID |	\
+				RBCE_TERMFLAG_GID | RBCE_TERMFLAG_TAG |	\
+				RBCE_TERMFLAG_IPV4 | RBCE_TERMFLAG_IPV6)
+
+/* Token operation related data structures, functions etc., */
+typedef int __bitwise rule_token_t;
+enum rule_token {
+	TOKEN_PATH = (__force rule_token_t) 1,
+	TOKEN_CMD = (__force rule_token_t) 2,
+	TOKEN_ARGS = (__force rule_token_t) 3,
+	TOKEN_RUID_EQ = (__force rule_token_t) 4,
+	TOKEN_RUID_LT = (__force rule_token_t) 5,
+	TOKEN_RUID_GT = (__force rule_token_t) 6,
+	TOKEN_RUID_NOT = (__force rule_token_t) 7,
+	TOKEN_RGID_EQ = (__force rule_token_t) 8,
+	TOKEN_RGID_LT = (__force rule_token_t) 9,
+	TOKEN_RGID_GT = (__force rule_token_t) 10,
+	TOKEN_RGID_NOT = (__force rule_token_t) 11,
+	TOKEN_EUID_EQ = (__force rule_token_t) 12,
+	TOKEN_EUID_LT = (__force rule_token_t) 13,
+	TOKEN_EUID_GT = (__force rule_token_t) 14,
+	TOKEN_EUID_NOT = (__force rule_token_t) 15,
+	TOKEN_EGID_EQ = (__force rule_token_t) 16,
+	TOKEN_EGID_LT = (__force rule_token_t) 17,
+	TOKEN_EGID_GT = (__force rule_token_t) 18,
+	TOKEN_EGID_NOT = (__force rule_token_t) 19,
+	TOKEN_TAG = (__force rule_token_t) 20,
+	TOKEN_IPV4 = (__force rule_token_t) 21,
+	TOKEN_IPV6 = (__force rule_token_t) 22,
+	TOKEN_DEP = (__force rule_token_t) 23,
+	TOKEN_DEP_ADD = (__force rule_token_t) 24,
+	TOKEN_DEP_DEL = (__force rule_token_t) 25,
+	TOKEN_ORDER = (__force rule_token_t) 26,
+	TOKEN_CLASS = (__force rule_token_t) 27,
+	TOKEN_STATE = (__force rule_token_t) 28,
+	TOKEN_INVALID = (__force rule_token_t) 29
+};
+
+typedef int __bitwise op_token_t;
+enum op_token {
+	TOKEN_OP_EQUAL = (__force op_token_t) RBCE_EQUAL,
+	TOKEN_OP_NOT = (__force op_token_t) RBCE_NOT,
+	TOKEN_OP_LESS_THAN = (__force op_token_t) RBCE_LESS_THAN,
+	TOKEN_OP_GREATER_THAN = (__force op_token_t) RBCE_GREATER_THAN,
+	TOKEN_OP_DEP = (__force op_token_t) (TOKEN_OP_GREATER_THAN+1),
+	TOKEN_OP_DEP_ADD = (__force op_token_t) (TOKEN_OP_GREATER_THAN+2),
+	TOKEN_OP_DEP_DEL = (__force op_token_t) (TOKEN_OP_GREATER_THAN+3),
+	TOKEN_OP_ORDER = (__force op_token_t) (TOKEN_OP_GREATER_THAN+4),
+	TOKEN_OP_CLASS = (__force op_token_t) (TOKEN_OP_GREATER_THAN+5),
+	TOKEN_OP_STATE = (__force op_token_t) (TOKEN_OP_GREATER_THAN+6),
+};
+
+
+/*
+ * data structure rbce_private_data to hold the app_tag for a task.
+ * Expands later.
+ *
+ */
+struct rbce_private_data {
+	char *app_tag;
+};
+
+#define RBCE_DATA(tsk) ((struct rbce_private_data*)((tsk)->ce_data))
+#define RBCE_DATAP(tsk) ((tsk)->ce_data)
+
 extern struct rbce_eng_callback rbce_rcfs_ecbs;
+extern int rbce_enabled;
 
 extern int rbce_mkdir(struct inode *, struct dentry *, int);
 extern int rbce_rmdir(struct inode *, struct dentry *);
@@ -56,4 +213,6 @@ extern int rbce_delete_rule(const char *
 extern int rbce_set_tasktag(int, char *);
 extern int rbce_rename_rule(const char *, const char *);
 
+extern int rules_parse(char *, struct rbce_rule_term **, int *);
+
 #endif /* _RBCE_INTERNAL_H */
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:03.000000000 -0700
@@ -30,32 +30,1001 @@ MODULE_LICENSE("GPL");
 
 static char modname[] = RBCE_MOD_NAME;
 
-/* Stub routines for now */
-void rbce_get_rule(const char *a, char *b)
+/* ==================== global variables etc., ==================== */
+
+int termop_2_vecidx[RBCE_RULE_INVALID] = {
+	[RBCE_RULE_CMD_PATH] = RBCE_TERM_CMD,
+	[RBCE_RULE_CMD] = RBCE_TERM_CMD,
+	[RBCE_RULE_ARGS] = RBCE_TERM_CMD,
+	[RBCE_RULE_REAL_UID] = RBCE_TERM_UID,
+	[RBCE_RULE_REAL_GID] = RBCE_TERM_GID,
+	[RBCE_RULE_EFFECTIVE_UID] = RBCE_TERM_UID,
+	[RBCE_RULE_EFFECTIVE_GID] = RBCE_TERM_GID,
+	[RBCE_RULE_APP_TAG] = RBCE_TERM_TAG,
+	[RBCE_RULE_IPV4] = RBCE_TERM_IPV4,
+	[RBCE_RULE_IPV6] = RBCE_TERM_IPV6,
+	[RBCE_RULE_DEP_RULE] = -1
+};
+
+#define TERMOP_2_TERMFLAG(x)	(1 << termop_2_vecidx[x])
+#define TERM_2_TERMFLAG(x)		(1 << (x))
+
+#define POLICY_INC_NUMTERMS	(BITS_PER_LONG)	/* No. of terms alloc'd once */
+#define POLICY_ACTION_NEW_VERSION	0x01	/* Force reallocation */
+#define POLICY_ACTION_REDO_ALL		0x02	/* Recompute all rule flags */
+#define POLICY_ACTION_PACK_TERMS	0x04	/* Time to pack the terms */
+
+extern int errno;
+
+int rbce_enabled = 1;
+
+static LIST_HEAD(class_list);
+static struct list_head rules_list[CKRM_MAX_CLASSTYPES];
+static int gl_num_rules;
+static int gl_action, gl_num_terms;
+static int gl_allocated, gl_released;
+static struct rbce_rule_term *gl_terms;
+static int gl_rules_version;
+static rwlock_t rbce_rwlock = RW_LOCK_UNLOCKED;
+	/*
+	 * One lock to protect them all !!!
+	 * Additions and deletions of rules must
+	 * happen with this lock held in write mode.
+	 * Access (read/write) to any of the data structures must happen
+	 * with this lock held in read mode.
+	 * Since rule-related changes do not happen very often, it is ok to
+	 * have a single rwlock.
+	 */
+
+/* ======================= Helper Functions ========================= */
+
+static inline struct rbce_rule *find_rule_name(const char *name)
+{
+	struct named_obj_hdr *pos;
+	int i;
+
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++)
+		list_for_each_entry(pos, &rules_list[i], link)
+			if (!strcmp(pos->name, name))
+				return ((struct rbce_rule *)pos);
+	return NULL;
+}
+
+struct rbce_class *find_class_name(const char *name)
 {
+	struct named_obj_hdr *pos;
+
+	list_for_each_entry(pos, &class_list, link)
+		if (!strcmp(pos->name, name))
+			return (struct rbce_class *)pos;
+	return NULL;
 }
-int rbce_rule_exists(const char *a)
+
+/* Type of Rule insertion */
+#define INSERT		0	/* new rule */
+#define REINSERT	1	/* existing rule */
+
+/*
+ * Insert the given rule at the specified order
+ * 		order = -1 ==> insert at the tail.
+ *		type == INSERT - insert the rule
+ *		type == REINSERT - remove the rule from its current
+ *				 position and reinsert according to order.
+ *
+ * Caller must hold rbce_rwlock in write mode.
+ */
+static int insert_rule(struct rbce_rule *rule, int order, int type)
 {
+#define ORDER_COUNTER_INCR 10
+	static int order_counter;
+	int old_counter;
+	struct list_head *head = &rules_list[rule->classtype];
+	struct list_head *insert = head;
+	struct rbce_rule *tmp;
+
+	if (gl_num_rules == 0)
+		order_counter = 0;
+
+	switch (order) {
+	case -1:
+		rule->order = order_counter;
+		/* FIXME: order_counter overflow/wraparound!! */
+		order_counter += ORDER_COUNTER_INCR;
+		break;
+	default:
+		old_counter = order_counter;
+		if (order_counter < order)
+			order_counter = order;
+		rule->order = order;
+		order_counter += ORDER_COUNTER_INCR;
+		list_for_each_entry(tmp, head, obj.link) {
+			if (rule == tmp)
+				continue;
+			if (rule->order == tmp->order) {
+				order_counter = old_counter;
+				return -EEXIST;
+			}
+			if (rule->order < tmp->order) {
+				insert = &tmp->obj.link;
+				break;
+			}
+		}
+	}
+	if (type == REINSERT)
+		list_del(&rule->obj.link);
+	else {
+		/* protect the module from being removed while a rule exists */
+		try_module_get(THIS_MODULE);
+		gl_num_rules++;
+	}
+	gl_rules_version++;
+	list_add_tail(&rule->obj.link, insert);
 	return 0;
 }
-int rbce_change_rule(const char *a, char *b)
+
+/*
+ * Get a reference to the class, create one if it doesn't exist
+ *
+ * Caller needs to hold rbce_rwlock in write mode.
+ */
+
+struct rbce_class *create_rbce_class(const char *classname,
+					    int classtype, void *classobj)
 {
-	return 1;
+	struct rbce_class *cls;
+
+	if (classtype >= CKRM_MAX_CLASSTYPES) {
+		printk(KERN_ERR
+		       "ckrm_classobj returned %d as classtype which cannot "
+		       "be handled by RBCE\n", classtype);
+		return NULL;
+	}
+
+	cls = kmalloc(sizeof(struct rbce_class), GFP_ATOMIC);
+	if (!cls)
+		return NULL;
+	cls->obj.name = kmalloc(strlen(classname) + 1, GFP_ATOMIC);
+	if (cls->obj.name) {
+		GET_REF(cls) = 1;
+		cls->classobj = classobj;
+		strcpy(cls->obj.name, classname);
+		list_add_tail(&cls->obj.link, &class_list);
+		cls->classtype = classtype;
+	} else {
+		kfree(cls);
+		cls = NULL;
+	}
+	return cls;
 }
-int rbce_delete_rule(const char *a)
+
+static struct rbce_class *get_class(char *classname, int *classtype)
 {
-	return 1;
+	struct rbce_class *cls;
+	void *classobj;
+
+	if (!classname)
+		return NULL;
+	cls = find_class_name(classname);
+	if (cls) {
+		if (cls->classobj) {
+			INC_REF(cls);
+			*classtype = cls->classtype;
+			return cls;
+		}
+		return NULL;
+	}
+	classobj = ckrm_classobj(classname, classtype);
+	if (!classobj)
+		return NULL;
+	return create_rbce_class(classname, *classtype, classobj);
 }
-int rbce_set_tasktag(int i, char *a)
+
+/*
+ * Drop a reference to the class, free it when the last reference is gone
+ *
+ * Caller needs to hold rbce_rwlock in write mode.
+ */
+void put_class(struct rbce_class *cls)
 {
-	return 1;
+	if (cls && (DEC_REF(cls) <= 0)) {
+		list_del(&cls->obj.link);
+		kfree(cls->obj.name);
+		kfree(cls);
+	}
+	return;
 }
-int rbce_rename_rule(const char *a, const char *b)
+
+/*
+ * Allocate an index in the global term vector
+ * On success, returns the index. On failure returns -errno.
+ * Caller must hold the rbce_rwlock in write mode as global data is
+ * written onto.
+ */
+static int alloc_term_index(void)
 {
-	return 1;
+	int size = gl_allocated;
+
+	if (gl_num_terms >= size) {
+		int i;
+		struct rbce_rule_term *oldv, *newv;
+		int newsize = size + POLICY_INC_NUMTERMS;
+
+		oldv = gl_terms;
+		newv =
+		    kmalloc(newsize * sizeof(struct rbce_rule_term),
+			    GFP_ATOMIC);
+		if (!newv)
+			return -ENOMEM;
+		memcpy(newv, oldv, size * sizeof(struct rbce_rule_term));
+		for (i = size; i < newsize; i++)
+			newv[i].op = -1;
+		gl_terms = newv;
+		gl_allocated = newsize;
+		kfree(oldv);
+
+		gl_action |= POLICY_ACTION_NEW_VERSION;
+	}
+	return gl_num_terms++;
+}
+
+/*
+ * Release an index in the global term vector
+ *
+ * Caller must hold the rbce_rwlock in write mode as the global data
+ * is written onto.
+ */
+static void release_term_index(int idx)
+{
+	if ((idx < 0) || (idx >= gl_num_terms))
+		return;
+
+	gl_terms[idx].op = -1;
+	gl_released++;
+	if ((gl_released > POLICY_INC_NUMTERMS) &&
+	    (gl_allocated >
+	     (gl_num_terms - gl_released + POLICY_INC_NUMTERMS))) {
+		gl_action |= POLICY_ACTION_PACK_TERMS;
+	}
+	return;
+}
+
+/*
+ * Release the indices, string memory, and terms associated with the given
+ * rule.
+ *
+ * Caller should be holding rbce_rwlock
+ */
+static void __release_rule(struct rbce_rule *rule)
+{
+	int i, *terms = rule->terms;
+
+	/* remove memory and references from other rules */
+	for (i = rule->num_terms; --i >= 0;) {
+		struct rbce_rule_term *term = &gl_terms[terms[i]];
+
+		if (term->op == RBCE_RULE_DEP_RULE)
+			DEC_REF(term->u.deprule);
+		release_term_index(terms[i]);
+	}
+	rule->num_terms = 0;
+	if (rule->strtab) {
+		kfree(rule->strtab);
+		rule->strtab = NULL;
+	}
+	if (rule->terms) {
+		kfree(rule->terms);
+		rule->terms = NULL;
+	}
+	return;
+}
+
+/*
+ * delete the given rule and all memory associated with it.
+ *
+ * Caller is responsible for protecting the global data
+ */
+static inline int __delete_rule(struct rbce_rule *rule)
+{
+	/* make sure we are not referenced by other rules */
+	if (GET_REF(rule))
+		return -EBUSY;
+	__release_rule(rule);
+	put_class(rule->target_class);
+	release_term_index(rule->index);
+	list_del(&rule->obj.link);
+	gl_num_rules--;
+	gl_rules_version++;
+	module_put(THIS_MODULE);
+	kfree(rule->obj.name);
+	kfree(rule);
+	return 0;
+}
+
+/* ======================= Rule related Functions ========================= */
+
+/*
+ * This function takes terms as input and digests the valid rbce terms
+ * and fills the newrule appropriately.
+ *	 Valid terms have op != RBCE_RULE_INVALID
+ * This function returns the number of valid terms found.
+ * In case of error it returns -errno
+ */
+static inline int
+digest_terms(struct rbce_rule *newrule,
+	struct rbce_rule_term *terms, int nterms)
+{
+	char *strtab = NULL;
+	struct rbce_rule *deprule;
+	int i, j, real_nterms = 0, strtablen = 0;
+
+	for (i = 0; i < nterms; i++) {
+		if (terms[i].op == RBCE_RULE_INVALID)
+			continue;
+		real_nterms++;
+
+		switch (terms[i].op) {
+		case RBCE_RULE_DEP_RULE:
+			/* check if the depend rule is valid */
+			deprule = find_rule_name(terms[i].u.string);
+			if (!deprule || deprule == newrule) {
+				real_nterms = -EINVAL;
+				goto out;
+			} else {
+				/* make sure _a_ depend rule */
+				/* appears in only one term. */
+				for (j = 0; j < i; j++) {
+					if (terms[j].op ==
+					    RBCE_RULE_DEP_RULE
+					    && terms[j].u.deprule ==
+					    deprule) {
+						real_nterms = -EINVAL;
+						goto out;
+					}
+				}
+				terms[i].u.deprule = deprule;
+			}
+
+			/* +depend is acceptable and -depend is not */
+			if (terms[i].operator != TOKEN_OP_DEP_DEL)
+				terms[i].operator = RBCE_EQUAL;
+			else {
+				real_nterms = -EINVAL;
+				goto out;
+			}
+			break;
+
+		case RBCE_RULE_CMD_PATH:
+		case RBCE_RULE_CMD:
+		case RBCE_RULE_ARGS:
+		case RBCE_RULE_APP_TAG:
+		case RBCE_RULE_IPV4:
+		case RBCE_RULE_IPV6:
+			/* sum up the string length */
+			strtablen += strlen(terms[i].u.string) + 1;
+			break;
+		default:
+			break;
+
+		}
+	}
+	if (strtablen) {
+		strtab = kmalloc(strtablen, GFP_ATOMIC);
+		if (!strtab)
+			real_nterms = -ENOMEM;
+		else {
+			if (newrule->strtab)
+				kfree(newrule->strtab);
+			newrule->strtab = strtab;
+		}
+	}
+out:
+	return real_nterms;
+}
+
+/*
+ * This function takes terms as input and digests the non rbce terms
+ * and fills newrule appropriately.
+ * non rbce terms are used to get attribute/value that are not part
+ * of the rule terms.
+ *	 non rbce terms have op == RBCE_RULE_INVALID
+ * This function returns 0 on success and -errno in case of error
+ */
+static inline int
+digest_nonterms(struct rbce_rule *newrule,
+	struct rbce_rule_term *terms, int nterms)
+{
+	char *class = NULL;
+	int state = -1, order = -1, rc = 0, i;
+
+	for (i = 0; i < nterms; i++) {
+		if (terms[i].op != RBCE_RULE_INVALID)
+			continue;
+		switch (terms[i].operator) {
+		case TOKEN_OP_ORDER:
+			order = terms[i].u.id;
+			if (order < 0) {
+				rc = -EINVAL;
+				goto out;
+			}
+			break;
+		case TOKEN_OP_STATE:
+			state = terms[i].u.id != 0;
+			break;
+		case TOKEN_OP_CLASS:
+			class = terms[i].u.string;
+			break;
+		default:
+			break;
+		}
+	}
+
+	/* Check if class was specified */
+	if (class != NULL) {
+		int classtype;
+		struct rbce_class *targetcls;
+		if ((targetcls = get_class(class, &classtype)) == NULL) {
+			rc = -EINVAL;
+			goto out;
+		}
+		if (newrule->target_class)
+			put_class(newrule->target_class);
+
+		newrule->target_class = targetcls;
+		newrule->classtype = classtype;
+	}
+	if (!newrule->target_class) {
+		rc = -EINVAL;
+		goto out;
+	}
+	if (state != -1)
+		newrule->state = state;
+	if (order != -1)
+		newrule->order = order;
+out:
+	return rc;
+}
+
+/*
+ * Allocate and fill the term vectors of the newrule from the terms array.
+ * Only handle the real terms and ignore the nonterms.
+ */
+static inline int
+fill_term_vector(struct rbce_rule *newrule,
+	struct rbce_rule_term *terms, int real_nterms)
+{
+	int i, rc = 0, strtablen = 0, j, ii;
+	struct rbce_rule_term *term;
+
+	newrule->terms = kmalloc(real_nterms * sizeof(int), GFP_ATOMIC);
+	if (!newrule->terms) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	for (i = 0, j = 0; j < real_nterms; i++) {
+		if (terms[i].op == RBCE_RULE_INVALID)
+			continue;	/* nonterm: skip without consuming a slot */
+
+		newrule->terms[j] = alloc_term_index();
+		if (newrule->terms[j] < 0) {
+			for (ii = 0; ii < j; ii++)
+				release_term_index(newrule->terms[ii]);
+			rc = -ENOMEM;
+			goto out;
+		}
+		term = &gl_terms[newrule->terms[j++]];
+		term->op = terms[i].op;
+		term->operator = terms[i].operator;
+		switch (terms[i].op) {
+		case RBCE_RULE_CMD_PATH:
+		case RBCE_RULE_CMD:
+		case RBCE_RULE_ARGS:
+		case RBCE_RULE_APP_TAG:
+		case RBCE_RULE_IPV4:
+		case RBCE_RULE_IPV6:
+			term->u.string = &newrule->strtab[strtablen];
+			strcpy(term->u.string, terms[i].u.string);
+			strtablen += strlen(term->u.string) + 1;
+			break;
+
+		case RBCE_RULE_REAL_UID:
+		case RBCE_RULE_REAL_GID:
+		case RBCE_RULE_EFFECTIVE_UID:
+		case RBCE_RULE_EFFECTIVE_GID:
+			term->u.id = terms[i].u.id;
+			break;
+
+		case RBCE_RULE_DEP_RULE:
+			term->u.deprule = terms[i].u.deprule;
+			INC_REF(term->u.deprule);
+			break;
+		default:
+			break;
+		}
+	}
+out:
+	if (rc) {
+		kfree(newrule->terms);
+		newrule->terms = NULL;
+	}
+	return rc;
+
+}
+
+/*
+ * Caller needs to hold rbce_rwlock in write mode.
+ */
+static int
+fill_rule(struct rbce_rule *newrule, struct rbce_rule_term *terms, int nterms)
+{
+	int real_nterms, index = -1, rc = 0;
+	struct rbce_rule_term *term = NULL;
+
+	if (!newrule)
+		return -EINVAL;
+	newrule->num_terms = 0;
+	newrule->termflag = 0;
+
+	/* Digest filled terms. */
+	real_nterms = digest_terms(newrule, terms, nterms);
+	if (real_nterms < 0)
+		return real_nterms;
+	rc = digest_nonterms(newrule, terms, nterms);
+	if (rc < 0)
+		goto out;
+
+	if (newrule->index == -1) {
+		index = alloc_term_index();
+		if (index < 0) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		newrule->index = index;
+		term = &gl_terms[newrule->index];
+		term->op = RBCE_RULE_DEP_RULE;
+		term->u.deprule = newrule;
+	}
+
+	rc = fill_term_vector(newrule, terms, real_nterms);
+out:
+	if (rc) {
+		if (newrule->target_class) {
+			put_class(newrule->target_class);
+			newrule->target_class = NULL;
+		}
+		if (index >= 0) {
+			release_term_index(index);
+			newrule->index = -1;
+		}
+		kfree(newrule->terms);
+		newrule->terms = NULL;
+		kfree(newrule->strtab);
+		newrule->strtab = NULL;
+		newrule->num_terms = 0;
+	} else
+		newrule->num_terms = real_nterms;
+	return rc;
+}
+
+static inline int
+rbce_create_rule(struct rbce_rule_term *new_terms,
+	int nterms, const char *rname)
+{
+	struct rbce_rule *rule;
+	int rc = -ENOMEM;
+
+	rule = kmalloc (sizeof(struct rbce_rule), GFP_ATOMIC);
+	if (rule) {
+		rule->obj.name = kmalloc(strlen(rname) + 1, GFP_ATOMIC);
+		if (rule->obj.name) {
+			strcpy(rule->obj.name, rname);
+			GET_REF(rule) = 0;
+			rule->order = -1;
+			rule->index = -1;
+			rule->num_terms = 0;
+			rule->state = RBCE_RULE_ENABLED;
+			rule->target_class = NULL;
+			rule->strtab = NULL;
+			rule->classtype = -1;
+			rule->terms = NULL;
+			rule->do_opt = 1;
+			INIT_LIST_HEAD(&rule->obj.link);
+			rc = fill_rule(rule, new_terms, nterms);
+			if (rc) {
+				kfree(rule->obj.name);
+				kfree(rule);
+			} else if ((rc = insert_rule(rule,
+					rule->order, INSERT)) != 0)
+				__delete_rule(rule);
+		} else
+			kfree(rule);
+	}
+	return rc;
+}
+
+static inline struct rbce_rule_term *
+merge_terms(struct rbce_rule *rule, struct rbce_rule_term *new_terms,
+	int nterms, int new_term_mask, int *merged_nterms)
+{
+	struct rbce_rule_term *terms, *term;
+	struct rbce_rule *deprule;
+	int i, j, k, oterms, tot_terms, strlocal_len;
+	char *strlocal;
+
+	oterms = rule->num_terms;
+	tot_terms = nterms + oterms;
+	*merged_nterms = 0;
+
+	terms = kmalloc(tot_terms * sizeof(struct rbce_rule_term), GFP_ATOMIC);
+
+	if (!terms) {
+		return NULL;
+	}
+
+	/* Assume we are going to copy all strings from the original rule. */
+	strlocal_len = 0;
+	for (i = 0; i < oterms; i++) {
+		term = &gl_terms[rule->terms[i]];
+		switch (term->op) {
+		case RBCE_RULE_CMD_PATH:
+		case RBCE_RULE_CMD:
+		case RBCE_RULE_ARGS:
+		case RBCE_RULE_APP_TAG:
+		case RBCE_RULE_IPV4:
+		case RBCE_RULE_IPV6:
+			strlocal_len += strlen(term->u.string) + 1;
+			break;
+		default:
+			break;
+		}
+	}
+
+	strlocal = kmalloc(strlocal_len, GFP_ATOMIC);
+	if (!strlocal) {
+		kfree(terms);
+		return NULL;
+	}
+
+	strlocal_len = 0;
+	new_term_mask &= ~(1 << RBCE_RULE_DEP_RULE);
+	/* ignore the new deprule terms for the first iteration. */
+	/* taken care of later. */
+	for (i = 0; i < oterms; i++) {
+		term = &gl_terms[rule->terms[i]];	/* old term */
+
+		if ((1 << term->op) & new_term_mask) {
+			/* newrule has this attr/value */
+			for (j = 0; j < nterms; j++)
+				if (term->op == new_terms[j].op) {
+					terms[i].op = new_terms[j].op;
+					terms[i].operator = new_terms[j].
+					    operator;
+					terms[i].u.string =
+					    new_terms[j].u.string;
+					new_terms[j].op = RBCE_RULE_INVALID2;
+					break;
+				}
+		} else {
+			terms[i].op = term->op;
+			terms[i].operator = term->operator;
+			switch (term->op) {
+			case RBCE_RULE_CMD_PATH:
+			case RBCE_RULE_CMD:
+			case RBCE_RULE_ARGS:
+			case RBCE_RULE_APP_TAG:
+			case RBCE_RULE_IPV4:
+			case RBCE_RULE_IPV6:
+				terms[i].u.string = &strlocal[strlocal_len];
+				strcpy(terms[i].u.string, term->u.string);
+				strlocal_len += strlen(terms[i].u.string) + 1;
+				break;
+			default:
+				terms[i].u.string = term->u.string;
+				break;
+			}
+		}
+	}
+
+	i = oterms;		/* for readability */
+
+	for (j = 0; j < nterms; j++) {
+		/* handled in the previous iteration */
+		if (new_terms[j].op == RBCE_RULE_INVALID2)
+			continue;
+		if (new_terms[j].op == RBCE_RULE_DEP_RULE) {
+			if (new_terms[j].operator == TOKEN_OP_DEP) {
+				/*
+				 * "depend=rule" deletes all depends in the
+				 * original rule so, delete all depend rule
+				 * terms in the original rule
+				 */
+				for (k = 0; k < oterms; k++)
+					if (terms[k].op == RBCE_RULE_DEP_RULE)
+						terms[k].op = RBCE_RULE_INVALID;
+				/* must copy the new deprule term */
+			} else {
+				/*
+				 * delete the depend rule term if it was defined
+				 * in the original rule for both +depend
+				 * and -depend
+				 */
+				deprule = find_rule_name(new_terms[j].u.string);
+				if (deprule)
+					for (k = 0; k < oterms; k++) {
+						if (terms[k].op ==
+						    RBCE_RULE_DEP_RULE
+						    && terms[k].u.deprule ==
+						    deprule) {
+							terms[k].op =
+							    RBCE_RULE_INVALID;
+							break;
+						}
+					}
+				if (new_terms[j].operator == TOKEN_OP_DEP_DEL)
+					/* No need to copy the new deprule */
+					continue;
+			}
+		}
+		terms[i].op = new_terms[j].op;
+		terms[i].operator = new_terms[j].operator;
+		terms[i].u.string = new_terms[j].u.string;
+		i++;
+		new_terms[j].op = RBCE_RULE_INVALID2;
+	}
+
+	tot_terms = i;
+
+	/* convert old deprule pointers to name pointers. */
+	for (i = 0; i < oterms; i++) {
+		if (terms[i].op != RBCE_RULE_DEP_RULE)
+			continue;
+		terms[i].u.string = terms[i].u.deprule->obj.name;
+	}
+
+	*merged_nterms = tot_terms;
+	return terms;
+}
+
+int rbce_change_rule(const char *rname, char *rdefn)
+{
+	struct rbce_rule *rule;
+	struct rbce_rule_term *new_terms = NULL, *merged_terms;
+	int nterms, new_term_mask = 0, merged_nterms, rc = -ENOMEM;
+
+	if ((nterms = rules_parse(rdefn, &new_terms, &new_term_mask)) <= 0)
+		return !nterms ? -EINVAL : nterms;
+
+	write_lock(&rbce_rwlock);
+	rule = find_rule_name(rname);
+	if (rule == NULL) {
+		rc = rbce_create_rule(new_terms, nterms, rname);
+		kfree(new_terms);
+		write_unlock(&rbce_rwlock);
+		return rc;
+	}
+
+	merged_terms = merge_terms(rule, new_terms, nterms, new_term_mask,
+			&merged_nterms);
+
+	if (merged_terms) {
+		/* release the rule */
+		__release_rule(rule);
+
+		rule->do_opt = 1;
+		rc = fill_rule(rule, merged_terms, merged_nterms);
+		if (rc == 0)
+			rc = insert_rule(rule, rule->order, REINSERT);
+		if (rc != 0)	/* rule creation/insertion failed */
+			__delete_rule(rule);
+	}
+
+	write_unlock(&rbce_rwlock);
+	kfree(merged_terms);
+	kfree(new_terms);
+	return rc;
+}
+
+/*
+ * Delete the specified rule.
+ *
+ */
+int rbce_delete_rule(const char *rname)
+{
+	int rc = 0;
+	struct rbce_rule *rule;
+
+	write_lock(&rbce_rwlock);
+
+	if ((rule = find_rule_name(rname)) != NULL)
+		rc = __delete_rule(rule);
+	write_unlock(&rbce_rwlock);
+	return rc;
+}
+
+/*
+ * Copy the rule specified by rname to the given result string.
+ *
+ */
+void rbce_get_rule(const char *rname, char *result)
+{
+	int i;
+	struct rbce_rule *rule;
+	struct rbce_rule_term *term;
+	char *cp = result, oper, idtype[3], str[5];
+
+	read_lock(&rbce_rwlock);
+
+	rule = find_rule_name(rname);
+	if (rule != NULL) {
+		for (i = 0; i < rule->num_terms; i++) {
+			term = gl_terms + rule->terms[i];
+			switch (term->op) {
+			case RBCE_RULE_REAL_UID:
+				strcpy(idtype, "u");
+				goto handleid;
+			case RBCE_RULE_REAL_GID:
+				strcpy(idtype, "g");
+				goto handleid;
+			case RBCE_RULE_EFFECTIVE_UID:
+				strcpy(idtype, "eu");
+				goto handleid;
+			case RBCE_RULE_EFFECTIVE_GID:
+				strcpy(idtype, "eg");
+			handleid:
+				switch (term->operator) {
+				case RBCE_LESS_THAN:
+					oper = '<';
+					break;
+				case RBCE_GREATER_THAN:
+					oper = '>';
+					break;
+				case RBCE_NOT:
+					oper = '!';
+					break;
+				default:
+					oper = '=';
+					break;
+				}
+				cp +=
+				    sprintf(cp, "%sid%c%ld,", idtype, oper,
+					    term->u.id);
+				break;
+			case RBCE_RULE_CMD_PATH:
+				strcpy(str, "path");
+				goto handle_str;
+			case RBCE_RULE_CMD:
+				strcpy(str, "cmd");
+				goto handle_str;
+			case RBCE_RULE_ARGS:
+				strcpy(str, "args");
+				goto handle_str;
+			case RBCE_RULE_APP_TAG:
+				strcpy(str, "tag");
+				goto handle_str;
+			case RBCE_RULE_IPV4:
+				strcpy(str, "ipv4");
+				goto handle_str;
+			case RBCE_RULE_IPV6:
+				strcpy(str, "ipv6");
+			      handle_str:
+				cp +=
+				    sprintf(cp, "%s=%s,", str, term->u.string);
+				break;
+			case RBCE_RULE_DEP_RULE:
+				cp +=
+				    sprintf(cp, "depend=%s,",
+					    term->u.deprule->obj.name);
+				break;
+			default:
+				break;
+			}
+		}
+
+		cp +=
+		    sprintf(cp, "order=%d,state=%d,", rule->order, rule->state);
+		cp +=
+		    sprintf(cp, "class=%s",
+			    rule->target_class ? rule->target_class->obj.
+			    name : "*****REMOVED*****");
+		*cp = '\0';
+	} else
+		sprintf(result, "***** Rule %s doesn't exist *****", rname);
+
+	read_unlock(&rbce_rwlock);
+	return;
+}
+
+/*
+ * Change the name of the given rule "from_rname" to "to_rname"
+ *
+ */
+int rbce_rename_rule(const char *from_rname, const char *to_rname)
+{
+	struct rbce_rule *rule;
+	int nlen, rc = -EINVAL;
+
+	if (!to_rname || !*to_rname)
+		return rc;
+	write_lock(&rbce_rwlock);
+
+	rule = find_rule_name(from_rname);
+	if (rule != NULL) {
+		char *name = rule->obj.name;
+
+		if ((nlen = strlen(to_rname)) > strlen(name))
+			name = kmalloc(nlen + 1, GFP_ATOMIC);
+		rc = name ? 0 : -ENOMEM;	/* unlock below either way */
+		if (name && name != rule->obj.name)
+			kfree(rule->obj.name);
+		if (name)
+			rule->obj.name = strcpy(name, to_rname);
+	}
+	write_unlock(&rbce_rwlock);
+	return rc;
+}
+
+/*
+ * Return TRUE if the given rule exists, FALSE otherwise
+ *
+ */
+int rbce_rule_exists(const char *rname)
+{
+	struct rbce_rule *rule;
+
+	read_lock(&rbce_rwlock);
+	rule = find_rule_name(rname);
+	read_unlock(&rbce_rwlock);
+	return rule != NULL;
+}
+
+/*====================== Magic file handling =======================*/
+struct rbce_private_data *create_private_data(struct rbce_private_data *a,
+						     int b)
+{
+	return NULL;
+}
+
+int rbce_set_tasktag(int pid, char *tag)
+{
+	char *tp;
+	int rc = 0;
+	struct task_struct *tsk;
+	struct rbce_private_data *pdata;
+	int len;
+
+	if (!tag)
+		return -EINVAL;
+	len = strlen(tag) + 1;
+	tp = kmalloc(len, GFP_ATOMIC);
+	if (!tp)
+		return -ENOMEM;
+	strncpy(tp, tag, len);
+
+	read_lock(&tasklist_lock);
+	if ((tsk = find_task_by_pid(pid)) == NULL) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	if (!RBCE_DATA(tsk)) {
+		RBCE_DATAP(tsk) = create_private_data(NULL, 0);
+		if (!RBCE_DATA(tsk)) {
+			rc = -ENOMEM;
+			goto out;
+		}
+	}
+	pdata = RBCE_DATA(tsk);
+	if (pdata->app_tag)
+		kfree(pdata->app_tag);
+	pdata->app_tag = tp;
+
+out:
+	read_unlock(&tasklist_lock);
+	if (rc != 0)
+		kfree(tp);
+	return rc;
 }
 
-int rbce_enabled = 1;
 /* ======================= Module definition Functions ====================== */
 
 int init_rbce(void)
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_token.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_token.c	2005-05-05 09:38:03.000000000 -0700
@@ -0,0 +1,241 @@
+/* Tokens for Rule-based Classification Engine (RBCE) and
+ * Consolidated RBCE module code (combined)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *           (C) Vivek Kashyap, IBM Corp. 2004
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ *
+ */
+
+#include <linux/parser.h>
+#include <linux/ctype.h>
+#include "rbce_internal.h"
+
+int token_to_ruleop[TOKEN_INVALID + 1] = {
+	[TOKEN_PATH] = RBCE_RULE_CMD_PATH,
+	[TOKEN_CMD] = RBCE_RULE_CMD,
+	[TOKEN_ARGS] = RBCE_RULE_ARGS,
+	[TOKEN_RUID_EQ] = RBCE_RULE_REAL_UID,
+	[TOKEN_RUID_LT] = RBCE_RULE_REAL_UID,
+	[TOKEN_RUID_GT] = RBCE_RULE_REAL_UID,
+	[TOKEN_RUID_NOT] = RBCE_RULE_REAL_UID,
+	[TOKEN_RGID_EQ] = RBCE_RULE_REAL_GID,
+	[TOKEN_RGID_LT] = RBCE_RULE_REAL_GID,
+	[TOKEN_RGID_GT] = RBCE_RULE_REAL_GID,
+	[TOKEN_RGID_NOT] = RBCE_RULE_REAL_GID,
+	[TOKEN_EUID_EQ] = RBCE_RULE_EFFECTIVE_UID,
+	[TOKEN_EUID_LT] = RBCE_RULE_EFFECTIVE_UID,
+	[TOKEN_EUID_GT] = RBCE_RULE_EFFECTIVE_UID,
+	[TOKEN_EUID_NOT] = RBCE_RULE_EFFECTIVE_UID,
+	[TOKEN_EGID_EQ] = RBCE_RULE_EFFECTIVE_GID,
+	[TOKEN_EGID_LT] = RBCE_RULE_EFFECTIVE_GID,
+	[TOKEN_EGID_GT] = RBCE_RULE_EFFECTIVE_GID,
+	[TOKEN_EGID_NOT] = RBCE_RULE_EFFECTIVE_GID,
+	[TOKEN_TAG] = RBCE_RULE_APP_TAG,
+	[TOKEN_IPV4] = RBCE_RULE_IPV4,
+	[TOKEN_IPV6] = RBCE_RULE_IPV6,
+	[TOKEN_DEP] = RBCE_RULE_DEP_RULE,
+	[TOKEN_DEP_ADD] = RBCE_RULE_DEP_RULE,
+	[TOKEN_DEP_DEL] = RBCE_RULE_DEP_RULE,
+	[TOKEN_ORDER] = RBCE_RULE_INVALID,
+	[TOKEN_CLASS] = RBCE_RULE_INVALID,
+	[TOKEN_STATE] = RBCE_RULE_INVALID,
+};
+
+enum op_token token_to_operator[TOKEN_INVALID + 1] = {
+	[TOKEN_PATH] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_CMD] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_ARGS] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_RUID_EQ] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_RUID_LT] = (__force op_token_t) TOKEN_OP_LESS_THAN,
+	[TOKEN_RUID_GT] = (__force op_token_t) TOKEN_OP_GREATER_THAN,
+	[TOKEN_RUID_NOT] = (__force op_token_t) TOKEN_OP_NOT,
+	[TOKEN_RGID_EQ] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_RGID_LT] = (__force op_token_t) TOKEN_OP_LESS_THAN,
+	[TOKEN_RGID_GT] = (__force op_token_t) TOKEN_OP_GREATER_THAN,
+	[TOKEN_RGID_NOT] = (__force op_token_t) TOKEN_OP_NOT,
+	[TOKEN_EUID_EQ] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_EUID_LT] = (__force op_token_t) TOKEN_OP_LESS_THAN,
+	[TOKEN_EUID_GT] = (__force op_token_t) TOKEN_OP_GREATER_THAN,
+	[TOKEN_EUID_NOT] = (__force op_token_t) TOKEN_OP_NOT,
+	[TOKEN_EGID_EQ] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_EGID_LT] = (__force op_token_t) TOKEN_OP_LESS_THAN,
+	[TOKEN_EGID_GT] = (__force op_token_t) TOKEN_OP_GREATER_THAN,
+	[TOKEN_EGID_NOT] = (__force op_token_t) TOKEN_OP_NOT,
+	[TOKEN_TAG] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_IPV4] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_IPV6] = (__force op_token_t) TOKEN_OP_EQUAL,
+	[TOKEN_DEP] = (__force op_token_t) TOKEN_OP_DEP,
+	[TOKEN_DEP_ADD] = (__force op_token_t) TOKEN_OP_DEP_ADD,
+	[TOKEN_DEP_DEL] = (__force op_token_t) TOKEN_OP_DEP_DEL,
+	[TOKEN_ORDER] = (__force op_token_t) TOKEN_OP_ORDER,
+	[TOKEN_CLASS] = (__force op_token_t) TOKEN_OP_CLASS,
+	[TOKEN_STATE] = (__force op_token_t) TOKEN_OP_STATE
+};
+
+static match_table_t tokens = {
+	{TOKEN_PATH, "path=%s"},
+	{TOKEN_CMD, "cmd=%s"},
+	{TOKEN_ARGS, "args=%s"},
+	{TOKEN_RUID_EQ, "uid=%d"},
+	{TOKEN_RUID_LT, "uid<%d"},
+	{TOKEN_RUID_GT, "uid>%d"},
+	{TOKEN_RUID_NOT, "uid!%d"},
+	{TOKEN_RGID_EQ, "gid=%d"},
+	{TOKEN_RGID_LT, "gid<%d"},
+	{TOKEN_RGID_GT, "gid>%d"},
+	{TOKEN_RGID_NOT, "gid!%d"},
+	{TOKEN_EUID_EQ, "euid=%d"},
+	{TOKEN_EUID_LT, "euid<%d"},
+	{TOKEN_EUID_GT, "euid>%d"},
+	{TOKEN_EUID_NOT, "euid!%d"},
+	{TOKEN_EGID_EQ, "egid=%d"},
+	{TOKEN_EGID_LT, "egid<%d"},
+	{TOKEN_EGID_GT, "egid>%d"},
+	{TOKEN_EGID_NOT, "egid!%d"},
+	{TOKEN_TAG, "tag=%s"},
+	{TOKEN_IPV4, "ipv4=%s"},
+	{TOKEN_IPV6, "ipv6=%s"},
+	{TOKEN_DEP, "depend=%s"},
+	{TOKEN_DEP_ADD, "+depend=%s"},
+	{TOKEN_DEP_DEL, "-depend=%s"},
+	{TOKEN_ORDER, "order=%d"},
+	{TOKEN_CLASS, "class=%s"},
+	{TOKEN_STATE, "state=%d"},
+	{TOKEN_INVALID, NULL}
+};
+
+/*
+ * returns -EINVAL or -ENOMEM on failure
+ * returns the number of terms on success.
+ * never returns 0.
+ */
+
+int
+rules_parse(char *rule_defn, struct rbce_rule_term **rterms, int *term_mask)
+{
+	char *p, *rp = rule_defn;
+	int option, i = 0, nterms;
+	struct rbce_rule_term *terms;
+
+	*rterms = NULL;
+	*term_mask = 0;
+	if (!rule_defn)
+		return -EINVAL;
+
+	nterms = 0;
+	while (*rp++)
+		if (*rp == '>' || *rp == '<' || *rp == '=' || *rp == '!')
+			nterms++;
+
+	if (!nterms)
+		return -EINVAL;
+
+	terms = kmalloc(nterms * sizeof(struct rbce_rule_term), GFP_KERNEL);
+	if (!terms)
+		return -ENOMEM;
+
+	while ((p = strsep(&rule_defn, ",")) != NULL) {
+
+		substring_t args[MAX_OPT_ARGS];
+		int token;
+
+		while (*p && isspace(*p))
+			p++;
+		if (!*p)
+			continue;
+
+		token = match_token(p, tokens, args);
+
+		terms[i].op = token_to_ruleop[token];
+		terms[i].operator = token_to_operator[token];
+		switch (token) {
+
+		case TOKEN_PATH:
+		case TOKEN_CMD:
+		case TOKEN_ARGS:
+		case TOKEN_TAG:
+		case TOKEN_IPV4:
+		case TOKEN_IPV6:
+			/* all these tokens can be specified only once */
+			if (*term_mask & (1 << terms[i].op)) {
+				nterms = -EINVAL;
+				goto out;
+			}
+		/*FALLTHRU*/
+		case TOKEN_CLASS:
+		case TOKEN_DEP:
+		case TOKEN_DEP_ADD:
+		case TOKEN_DEP_DEL:
+			terms[i].u.string = args->from;
+			break;
+
+		case TOKEN_RUID_EQ:
+		case TOKEN_RUID_LT:
+		case TOKEN_RUID_GT:
+		case TOKEN_RUID_NOT:
+		case TOKEN_RGID_EQ:
+		case TOKEN_RGID_LT:
+		case TOKEN_RGID_GT:
+		case TOKEN_RGID_NOT:
+		case TOKEN_EUID_EQ:
+		case TOKEN_EUID_LT:
+		case TOKEN_EUID_GT:
+		case TOKEN_EUID_NOT:
+		case TOKEN_EGID_EQ:
+		case TOKEN_EGID_LT:
+		case TOKEN_EGID_GT:
+		case TOKEN_EGID_NOT:
+			/* all these tokens can be specified only once */
+			if (*term_mask & (1 << terms[i].op)) {
+				nterms = -EINVAL;
+				goto out;
+			}
+		/*FALLTHRU*/
+		case TOKEN_ORDER:
+		case TOKEN_STATE:
+			if (match_int(args, &option)) {
+				nterms = -EINVAL;
+				goto out;
+			}
+			terms[i].u.id = option;
+			break;
+		default:
+			nterms = -EINVAL;
+			goto out;
+		}
+		/* Check range of term value */
+		switch(token){
+		case TOKEN_STATE:
+			if ((terms[i].u.id != 0) && (terms[i].u.id != 1)){
+				nterms = -EINVAL;
+				goto out;
+			}
+			break;
+		default: /* Non-numerical value */
+			break;
+		}
+		*term_mask |= (1 << terms[i].op);
+		i++;
+	}
+	*rterms = terms;
+	nterms = i;	/* count parsed terms, not operator characters */
+      out:
+	if (nterms < 0) {
+		kfree(terms);
+		*term_mask = 0;
+	}
+	return nterms;
+}
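
For readers following the rule grammar accepted by rules_parse() above
(comma-separated terms such as "uid>100" or "cmd=bash"), here is a minimal
userspace sketch of the same splitting idea. It is not the patch's code: the
kernel version relies on match_token() from <linux/parser.h>, while this
sketch simply splits each term at its first operator character. All sk_*
names, the fixed buffer sizes and the sample rule strings are assumptions
made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these names come from the patch. */
enum sk_op { SK_EQ, SK_LT, SK_GT, SK_NOT };

struct sk_term {
	char attr[16];		/* text before the operator, e.g. "uid" */
	enum sk_op op;
	char value[64];		/* raw text after the operator */
};

/* Split one term at its first operator character. Returns 0 or -1. */
static int sk_parse_term(const char *s, struct sk_term *t)
{
	size_t n = strcspn(s, "=<>!");

	if (n == 0 || n >= sizeof(t->attr) || s[n] == '\0')
		return -1;	/* no attribute name, or no operator */
	if (strlen(s + n + 1) >= sizeof(t->value))
		return -1;
	memcpy(t->attr, s, n);
	t->attr[n] = '\0';
	switch (s[n]) {
	case '=':
		t->op = SK_EQ;
		break;
	case '<':
		t->op = SK_LT;
		break;
	case '>':
		t->op = SK_GT;
		break;
	default:		/* '!' is the only remaining possibility */
		t->op = SK_NOT;
		break;
	}
	strcpy(t->value, s + n + 1);
	return 0;
}

/*
 * Parse a comma-separated definition into at most max terms.
 * Returns the number of terms actually stored, or -1 on any bad term.
 */
static int sk_parse_rule(const char *defn, struct sk_term *terms, int max)
{
	char buf[256], *p, *next;
	int n = 0;

	assert(terms != NULL);
	if (strlen(defn) >= sizeof(buf))
		return -1;
	strcpy(buf, defn);
	for (p = buf; p != NULL; p = next) {
		next = strchr(p, ',');
		if (next)
			*next++ = '\0';
		while (*p == ' ')
			p++;
		if (*p == '\0')
			continue;	/* empty term, as in "a=b,,c=d" */
		if (n == max || sk_parse_term(p, &terms[n]) < 0)
			return -1;
		n++;
	}
	return n;
}
```

The sketch returns the number of terms it actually stored rather than a
count of operator characters, so a value that itself contains '=' or '>'
cannot inflate the result. Every store is also bounds-checked against max
before it happens.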

--



* [patch 17/21] CKRM: Rule Based Classification Engine, bitvector support for classification info
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (15 preceding siblings ...)
  2005-05-05 18:07 ` [patch 16/21] CKRM: Rule Based Classification Engine, basic " gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 18/21] CKRM: Rule Based Classification Engine, full CE gh
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=09-03-rbce_main-opt


Part 3 of 5 patches to support the Rule Based Classification Engine for
CKRM. This patch adds an optimization: classification information is
maintained in bit vectors. It adds no classification functionality yet.

Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 rbce_bitvector.h |  146 +++++++++++++++++++++++++++++++++++++++++++++++
 rbce_internal.h  |   19 +++++-
 rbce_main.c      |  168 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 330 insertions(+), 3 deletions(-)
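
As a rough illustration of the bookkeeping this patch introduces (one bit
per rule and per term, an 'eval' vector recording what has been evaluated,
and per-context mask vectors used to invalidate part of it), here is a small
userspace sketch. It is an analogue written under assumptions, not the
code in rbce_bitvector.h: the sk_* names are hypothetical, calloc() stands
in for kmalloc(), and storage is rounded up to whole longs.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical userspace analogue of bitvector_t; names are illustrative. */
#define SK_BPL		(sizeof(unsigned long) * CHAR_BIT)
#define SK_LONGS(n)	(((n) + SK_BPL - 1) / SK_BPL)

struct sk_bitvec {
	size_t nbits;		/* size in bits */
	unsigned long bits[];	/* storage rounded up to whole longs */
};

static struct sk_bitvec *sk_alloc(size_t nbits)
{
	struct sk_bitvec *v;

	v = calloc(1, sizeof(*v) + SK_LONGS(nbits) * sizeof(unsigned long));
	if (v)
		v->nbits = nbits;
	return v;
}

static void sk_set(struct sk_bitvec *v, size_t i)
{
	assert(i < v->nbits);
	v->bits[i / SK_BPL] |= 1UL << (i % SK_BPL);
}

static int sk_test(const struct sk_bitvec *v, size_t i)
{
	assert(i < v->nbits);
	return (int)((v->bits[i / SK_BPL] >> (i % SK_BPL)) & 1UL);
}

/* res = a & ~b, word by word. Returns 1 on success, 0 on size mismatch. */
static int sk_and_not(struct sk_bitvec *res, const struct sk_bitvec *a,
		      const struct sk_bitvec *b)
{
	size_t i, n = SK_LONGS(res->nbits);

	if (res->nbits != a->nbits || res->nbits != b->nbits)
		return 0;
	for (i = 0; i < n; i++)
		res->bits[i] = a->bits[i] & ~b->bits[i];
	return 1;
}
```

Clearing the masked bits from an 'eval' vector with sk_and_not(eval, eval,
mask) mirrors what reset_evaluation() does with bitvector_and_not(): the
terms that belong to one context are marked as not yet evaluated, so the
next classification event recomputes only those.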

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_bitvector.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_bitvector.h	2005-05-05 09:38:05.000000000 -0700
@@ -0,0 +1,146 @@
+/*
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *
+ * Bitvector package
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ */
+
+#ifndef _RBCE_BITVECTOR_H
+#define _RBCE_BITVECTOR_H
+
+typedef struct {
+	int size;		/* size in bits */
+	unsigned long bits[0];	/* bit vector */
+} bitvector_t;
+
+#define BITS_2_LONGS(sz)  (((sz)+BITS_PER_LONG-1)/BITS_PER_LONG)
+#define BITS_2_BYTES(sz)  (((sz)+7)/8)
+
+#if 0
+#define CHECK_VEC(vec) (vec)	/* check against NULL */
+#else
+#define CHECK_VEC(vec) (1)	/* assume no problem */
+#endif
+
+#define CHECK_VEC_VOID(vec)   do { if (!CHECK_VEC(vec)) return; } while(0)
+#define CHECK_VEC_RC(vec, val) \
+do { if (!CHECK_VEC(vec)) return (val); } while(0)
+
+inline static void bitvector_zero(bitvector_t * bitvec)
+{
+	int sz;
+
+	CHECK_VEC_VOID(bitvec);
+	sz = BITS_2_BYTES(bitvec->size);
+	memset(bitvec->bits, 0, sz);
+	return;
+}
+
+inline static unsigned long bitvector_bytes(unsigned long size)
+{
+	return sizeof(bitvector_t) + BITS_2_BYTES(size);
+}
+
+inline static void bitvector_init(bitvector_t * bitvec, unsigned long size)
+{
+	bitvec->size = size;
+	bitvector_zero(bitvec);
+	return;
+}
+
+inline static bitvector_t *bitvector_alloc(unsigned long size)
+{
+	bitvector_t *vec =
+	    (bitvector_t *) kmalloc(bitvector_bytes(size), GFP_KERNEL);
+	if (vec) {
+		vec->size = size;
+		bitvector_zero(vec);
+	}
+	return vec;
+}
+
+inline static void bitvector_free(bitvector_t * bitvec)
+{
+	CHECK_VEC_VOID(bitvec);
+	kfree(bitvec);
+	return;
+}
+
+#define def_bitvec_op(name,mod1,op,mod2) 			\
+inline static int name(bitvector_t *res, bitvector_t *op1,	\
+		       bitvector_t *op2)			\
+{								\
+	unsigned int i, size; 					\
+								\
+	CHECK_VEC_RC(res, 0); 					\
+	CHECK_VEC_RC(op1, 0); 					\
+	CHECK_VEC_RC(op2, 0); 					\
+	size = res->size; 					\
+	if (((size != (op1)->size) || (size != (op2)->size))) { \
+		return 0;					\
+	}							\
+	size = BITS_2_LONGS(size);				\
+	for (i = 0; i < size; i++) {				\
+		(res)->bits[i] = (mod1 (op1)->bits[i]) op 	\
+					(mod2 (op2)->bits[i]);	\
+	}							\
+	return 1;						\
+}
+
+def_bitvec_op(bitvector_or,, |,);
+def_bitvec_op(bitvector_and,, &,);
+def_bitvec_op(bitvector_xor,, ^,);
+def_bitvec_op(bitvector_or_not,, |, ~);
+def_bitvec_op(bitvector_not_or, ~, |,);
+def_bitvec_op(bitvector_and_not,, &, ~);
+def_bitvec_op(bitvector_not_and, ~, &,);
+
+inline static void bitvector_set(int idx, bitvector_t * vec)
+{
+	set_bit(idx, vec->bits);
+	return;
+}
+
+inline static void bitvector_clear(int idx, bitvector_t * vec)
+{
+	clear_bit(idx, vec->bits);
+	return;
+}
+
+inline static int bitvector_test(int idx, bitvector_t * vec)
+{
+	return test_bit(idx, vec->bits);
+}
+
+#ifdef DEBUG
+inline static void bitvector_print(int flag, bitvector_t * vec)
+{
+	unsigned int i;
+	int sz;
+	extern int rbcedebug;
+
+	if ((rbcedebug & flag) == 0) {
+		return;
+	}
+	if (vec == NULL) {
+		printk("v<0>-NULL\n");
+		return;
+	}
+	printk("v<%d>-", sz = vec->size);
+	for (i = 0; i < sz; i++) {
+		printk("%c", test_bit(i, vec->bits) ? '1' : '0');
+	}
+	return;
+}
+#else
+#define bitvector_print(x, y)
+#endif
+
+#endif				/* _RBCE_BITVECTOR_H */
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:03.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:05.000000000 -0700
@@ -51,6 +51,8 @@ struct named_obj_hdr {
 	char *name;
 };
 
+#include "rbce_bitvector.h"
+
 #define GET_REF(x) ((x)->obj.referenced)
 #define INC_REF(x) (GET_REF(x)++)
 #define DEC_REF(x) (--GET_REF(x))
@@ -185,14 +187,25 @@ enum op_token {
 	TOKEN_OP_STATE = (__force op_token_t) (TOKEN_OP_GREATER_THAN+6),
 };
 
-
 /*
- * data structure rbce_private_data to hold the app_tag for a task.
- * Expands later.
+ * data structure rbce_private_data holds the bit vector 'eval' which
+ * specifies if rules and terms of rules are evaluated against the task
+ * and if they were evaluated, bit vector 'true' holds the result of that
+ * evaluation.
+ *
+ * This data structure is maintained in a task, and the bitvectors are
+ * updated only when needed.
+ *
+ * Each rule and each term of a rule has a corresponding bit in the vector.
  *
  */
 struct rbce_private_data {
+	int evaluate;		/* whether to evaluate rules or not ? */
+	int rules_version;	/* rules_version at last evaluation */
+	bitvector_t *eval;
+	bitvector_t *true;
 	char *app_tag;
+	char data[0];		/* bitvectors eval and true */
 };
 
 #define RBCE_DATA(tsk) ((struct rbce_private_data*)((tsk)->ce_data))
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:03.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:05.000000000 -0700
@@ -23,6 +23,7 @@
  */
 
 #include "rbce_internal.h"
+#include "rbce_bitvector.h"
 
 MODULE_DESCRIPTION(RBCE_MOD_DESCR);
 MODULE_AUTHOR("Hubertus Franke, Chandra Seetharaman (IBM)");
@@ -57,14 +58,17 @@ int termop_2_vecidx[RBCE_RULE_INVALID] =
 extern int errno;
 
 int rbce_enabled = 1;
+const int use_persistent_state = 1;
 
 static LIST_HEAD(class_list);
 static struct list_head rules_list[CKRM_MAX_CLASSTYPES];
 static int gl_num_rules;
 static int gl_action, gl_num_terms;
+static int gl_bitmap_version;
 static int gl_allocated, gl_released;
 static struct rbce_rule_term *gl_terms;
 static int gl_rules_version;
+static bitvector_t *gl_mask_vecs[NUM_TERM_MASK_VECTOR];
 static rwlock_t rbce_rwlock = RW_LOCK_UNLOCKED;
 	/*
 	 * One lock to protect them all !!!
@@ -78,6 +82,8 @@ static rwlock_t rbce_rwlock = RW_LOCK_UN
 
 /* ======================= Helper Functions ========================= */
 
+static void optimize_policy(void);
+
 static inline struct rbce_rule *find_rule_name(const char *name)
 {
 	struct named_obj_hdr *pos;
@@ -159,6 +165,8 @@ static int insert_rule(struct rbce_rule 
 	}
 	gl_rules_version++;
 	list_add_tail(&rule->obj.link, insert);
+	if (rbce_enabled)
+		optimize_policy();
 	return 0;
 }
 
@@ -337,9 +345,150 @@ static inline int __delete_rule(struct r
 	module_put(THIS_MODULE);
 	kfree(rule->obj.name);
 	kfree(rule);
+	if (rbce_enabled && (gl_action & POLICY_ACTION_PACK_TERMS))
+		optimize_policy();
 	return 0;
 }
 
+/*
+ * Optimize the rule evaluation logic
+ *
+ * Caller must hold rbce_rwlock in write mode.
+ */
+static void optimize_policy(void)
+{
+	int i, ii;
+	struct rbce_rule *rule;
+	struct rbce_rule_term *terms;
+	int num_terms;
+	int bsize;
+	bitvector_t **mask_vecs;
+	int pack_terms = 0;
+	int redoall;
+
+	/*
+	 * Because rules are added and deleted dynamically, the term
+	 * vector can get sparse. As a result the bitvectors grow as we don't
+	 * reuse returned indices. If it becomes sparse enough we pack them
+	 * closer.
+	 */
+
+	pack_terms = (gl_action & POLICY_ACTION_PACK_TERMS);
+
+	if (pack_terms) {
+		int nsz = ALIGN((gl_num_terms - gl_released),
+				POLICY_INC_NUMTERMS);
+		int newidx = 0;
+		struct rbce_rule_term *newterms;
+
+		terms = gl_terms;
+		newterms =
+		    kmalloc(nsz * sizeof(struct rbce_rule_term), GFP_ATOMIC);
+		if (newterms) {
+			for (ii = 0; ii < CKRM_MAX_CLASSTYPES; ii++) {
+				/* FIXME: check only for task class types */
+				list_for_each_entry_reverse(rule,
+							    &rules_list[ii],
+							    obj.link) {
+					rule->index = newidx++;
+					for (i = rule->num_terms; --i >= 0;) {
+						int idx = rule->terms[i];
+						newterms[newidx] = terms[idx];
+						rule->terms[i] = newidx++;
+					}
+				}
+			}
+			kfree(terms);
+			gl_allocated = nsz;
+			gl_released = 0;
+			gl_num_terms = newidx;
+			gl_terms = newterms;
+
+			gl_action &= ~POLICY_ACTION_PACK_TERMS;
+			gl_action |= POLICY_ACTION_NEW_VERSION;
+		}
+	}
+
+	num_terms = gl_num_terms;
+	bsize = gl_allocated / 8 + sizeof(bitvector_t);
+	mask_vecs = gl_mask_vecs;
+	terms = gl_terms;
+
+	if (gl_action & POLICY_ACTION_NEW_VERSION) {
+		/* allocate new mask vectors */
+		char *temp = kmalloc(NUM_TERM_MASK_VECTOR * bsize, GFP_ATOMIC);
+
+		if (!temp) {
+			return;
+		}
+		if (mask_vecs[0]) {/* index 0 has the alloc returned address */
+			kfree(mask_vecs[0]);
+		}
+		for (i = 0; i < NUM_TERM_MASK_VECTOR; i++) {
+			mask_vecs[i] = (bitvector_t *) (temp + i * bsize);
+			bitvector_init(mask_vecs[i], gl_allocated);
+		}
+		gl_action &= ~POLICY_ACTION_NEW_VERSION;
+		gl_action |= POLICY_ACTION_REDO_ALL;
+		gl_bitmap_version++;
+	}
+
+	/* We do two things here at once
+	 * 1) recompute the rulemask for each required rule
+	 *      we guarantee proper dependency order during creation time and
+	 *      by reversely running through this list.
+	 * 2) recompute the mask for each term and rule, if required
+	 */
+
+	redoall = gl_action & POLICY_ACTION_REDO_ALL;
+	gl_action &= ~POLICY_ACTION_REDO_ALL;
+
+	for (ii = 0; ii < CKRM_MAX_CLASSTYPES; ii++) {
+		/* FIXME: check only for task class types */
+		list_for_each_entry_reverse(rule, &rules_list[ii], obj.link) {
+			unsigned long termflag;
+
+			if (!redoall && !rule->do_opt)
+				continue;
+			termflag = 0;
+			for (i = rule->num_terms; --i >= 0;) {
+				int j, idx = rule->terms[i];
+				struct rbce_rule_term *term = &terms[idx];
+				int vecidx = termop_2_vecidx[term->op];
+
+				if (vecidx == -1) {
+					termflag |= term->u.deprule->termflag;
+					/* mark this term belonging to all
+					   contexts of deprule */
+					for (j = 0; j < NUM_TERM_MASK_VECTOR;
+					     j++) {
+						if (term->u.deprule->termflag
+						    & (1 << j)) {
+							bitvector_set(idx,
+								      mask_vecs
+								      [j]);
+						}
+					}
+				} else {
+					termflag |= TERM_2_TERMFLAG(vecidx);
+					/* mark this term belonging to
+					   a particular context */
+					bitvector_set(idx, mask_vecs[vecidx]);
+				}
+			}
+			for (i = 0; i < NUM_TERM_MASK_VECTOR; i++) {
+				if (termflag & (1 << i)) {
+					bitvector_set(rule->index,
+						      mask_vecs[i]);
+				}
+			}
+			rule->termflag = termflag;
+			rule->do_opt = 0;
+		}
+	}
+	return;
+}
+
 /* ======================= Rule related Functions ========================= */
 
 /*
@@ -984,6 +1133,25 @@ struct rbce_private_data *create_private
 	return NULL;
 }
 
+static inline
+void reset_evaluation(struct rbce_private_data *pdata, int termflag)
+{
+	/* reset TAG ruleterm evaluation results to pick up
+	 * on next classification event
+	 */
+	if (termflag < 0 || termflag >= NUM_TERM_MASK_VECTOR) {
+		printk(KERN_ERR "rbce:reset_evaluation: trying to access "
+			"past valid address\n");
+		return;
+	}
+	if (use_persistent_state && gl_mask_vecs[termflag]) {
+		bitvector_and_not(pdata->eval, pdata->eval,
+				  gl_mask_vecs[termflag]);
+		bitvector_and_not(pdata->true, pdata->true,
+				  gl_mask_vecs[termflag]);
+	}
+}
+
 int rbce_set_tasktag(int pid, char *tag)
 {
 	char *tp;
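
Note on reset_evaluation() above: it drops the cached results for one
term context by clearing that context's mask bits from both the
"evaluated" and the "true" vectors, so those terms are recomputed on the
next classification event. A minimal userspace sketch of the same bit
operation follows. It assumes a single 64-bit word in place of
bitvector_t; struct eval_state and invalidate_context() are hypothetical
names used for illustration only and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-task eval/true bitvectors. */
struct eval_state {
	uint64_t eval;	/* bit i set: entry i has a cached result */
	uint64_t truth;	/* bit i set: cached result of entry i is true */
};

/*
 * Forget the cached result of every entry whose bit is set in mask.
 * This mirrors the two bitvector_and_not() calls: x = x & ~mask.
 */
static void invalidate_context(struct eval_state *s, uint64_t mask)
{
	s->eval &= ~mask;
	s->truth &= ~mask;
}
```

Entries outside the mask keep their cached results, which is what makes
an event-specific reclassification cheaper than a full one.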

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 18/21] CKRM: Rule Based Classification Engine, full CE
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (16 preceding siblings ...)
  2005-05-05 18:07 ` [patch 17/21] CKRM: Rule Based Classification Engine, bitvector support for classification info gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 19/21] CKRM: Rule Based Classification Engine, more advanced classification engine gh
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=09-04-rbce_opt-core


Part 4 of 5 patches to support Rule Based Classification Engine for CKRM.
This patch connects RBCE with the CKRM core. The full functionality of RBCE
is achieved with this patch.

Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 Makefile        |    2 
 rbce_core.c     |  890 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 rbce_internal.h |   23 +
 rbce_main.c     |   67 ++--
 4 files changed, 961 insertions(+), 21 deletions(-)
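
Note for readers of rbce_core.c: evaluate_rule() and __evaluate_rule()
memoize results in two bitvectors. A bit in vec_eval records that a rule
or term has already been evaluated, and the matching bit in vec_true
holds the result. The sketch below shows that pattern in userspace C
under simplifying assumptions: a 64-bit word replaces bitvector_t (so
idx must be in 0..63), and a callback replaces the real term evaluation.
All names in it are hypothetical. Unlike the patch, which relies on a
prior "true &= eval" step, the sketch clears the result bit explicitly
on a false outcome.

```c
#include <stdint.h>

/* Hypothetical predicate type standing in for __evaluate_rule(). */
typedef int (*rule_pred_fn)(int idx, void *ctx);

struct eval_cache {
	uint64_t eval;	/* bit i set: entry i has been evaluated */
	uint64_t truth;	/* bit i set: entry i evaluated to true  */
};

/* Return the result for entry idx, calling pred at most once per entry. */
static int cached_evaluate(struct eval_cache *c, int idx,
			   rule_pred_fn pred, void *ctx)
{
	uint64_t bit = UINT64_C(1) << idx;

	if (!(c->eval & bit)) {
		if (pred(idx, ctx))
			c->truth |= bit;
		else
			c->truth &= ~bit;
		c->eval |= bit;
	}
	return (c->truth & bit) != 0;
}

/* Example predicate: counts its invocations, true for even indices. */
static int count_and_test_even(int idx, void *ctx)
{
	(*(int *)ctx)++;
	return (idx % 2) == 0;
}
```

Calling cached_evaluate() twice for the same index invokes the predicate
only once; the second call is answered from the two bit words.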

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:03.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:06.000000000 -0700
@@ -3,4 +3,4 @@
 #
 
 obj-$(CONFIG_CKRM_RBCE)	+= rbce.o
-rbce-objs := rbce_fs.o rbce_main.o rbce_token.o
+rbce-objs := rbce_fs.o rbce_main.o rbce_token.o rbce_core.o
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_core.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_core.c	2005-05-05 09:38:06.000000000 -0700
@@ -0,0 +1,890 @@
+/* Rule-based Classification Engine (RBCE) and
+ * Consolidated RBCE module code (combined)
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *           (C) Chandra Seetharaman, IBM Corp. 2003
+ *           (C) Vivek Kashyap, IBM Corp. 2004
+ *
+ * Module for loading of classification policies and providing
+ * a user API for Class-based Kernel Resource Management (CKRM)
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+#include "rbce_internal.h"
+
+/*
+ * Callback from core when a class is deleted.
+ */
+static void
+rbce_class_deletecb(const char *classname, void *classobj, int classtype)
+{
+	static struct rbce_class *cls;
+	struct named_obj_hdr *pos;
+	struct rbce_rule *rule;
+
+	write_lock(&rbce_rwlock);
+	cls = find_class_name(classname);
+	if (cls) {
+		if (cls->classobj != classobj) {
+			printk(KERN_ERR "rbce: class %s changed identity\n",
+			       classname);
+		}
+		cls->classobj = NULL;
+		list_for_each_entry(pos, &rules_list[cls->classtype], link) {
+			rule = (struct rbce_rule *)pos;
+			if (rule->target_class) {
+				if (!strcmp
+				    (rule->target_class->obj.name, classname)) {
+					put_class(cls);
+					rule->target_class = NULL;
+					rule->classtype = -1;
+				}
+			}
+		}
+		if ((cls = find_class_name(classname)) != NULL) {
+			printk(KERN_ERR
+			       "rbce ERROR: class %s exists in rbce after "
+			       "removal in core\n", classname);
+		}
+	}
+	write_unlock(&rbce_rwlock);
+	return;
+}
+
+/*====================== Classification Functions =======================*/
+
+/*
+ * Match the given full path name with the command expression.
+ * This function treats the following 2 characters as special if seen in
+ * cmd_exp, all other characters are compared as is:
+ *		? - compares to any one single character
+ *		* - compares to one or more single characters
+ *
+ * If fullpath is 1, tsk_comm is compared in full. otherwise only the command
+ * name (basename(tsk_comm)) is compared.
+ */
+static inline int
+match_cmd(const char *tsk_comm, const char *cmd_exp, int fullpath)
+{
+	const char *c, *t, *last_ast, *cmd = tsk_comm;
+	char next_c;
+
+	/* get the command name if we don't have to match the fullpath */
+	if (!fullpath && ((c = strrchr(tsk_comm, '/')) != NULL)) {
+		cmd = c + 1;
+	}
+
+	/* now faithfully assume the entire pathname is in cmd */
+
+	/* we now have to effectively implement a regular expression
+	 * for now assume
+	 *    '?'   any single character
+	 *    '*'   one or more '?'
+	 *    rest  must match
+	 */
+
+	c = cmd_exp;
+	t = cmd;
+	if (t == NULL || c == NULL) {
+		return 0;
+	}
+
+	last_ast = NULL;
+	next_c = '\0';
+
+	while (*c && *t) {
+		switch (*c) {
+		case '?':
+			if (*t == '/') {
+				return 0;
+			}
+			c++;
+			t++;
+			continue;
+		case '*':
+			if (*t == '/') {
+				return 0;
+			}
+			/* eat up all '*' in c */
+			while (*(c + 1) == '*')
+				c++;
+			next_c = '\0';
+			last_ast = c;
+			/*t++; 	Add this for matching '*' with "one"
+				or more chars. */
+			while (*t && (*t != *(c + 1)) && *t != '/')
+				t++;
+			if (*t == *(c + 1)) {
+				c++;
+				if (*c != '/') {
+					if (*c == '?') {
+						if (*t == '/') {
+							return 0;
+						}
+						t++;
+						c++;
+					}
+					next_c = *c;
+					if (*c) {
+						if (*t == '/') {
+							return 0;
+						}
+						t++;
+						c++;
+						if (!*c && *t)
+							c = last_ast;
+					}
+				} else {
+					last_ast = NULL;
+				}
+				continue;
+			}
+			return 0;
+		case '/':
+			next_c = '\0';
+		 /*FALLTHRU*/ default:
+			if (*t == *c && next_c != *t) {
+				c++, t++;
+				continue;
+			} else {
+				/* reset to last asterisk and
+				   continue from there */
+				if (last_ast) {
+					c = last_ast;
+				} else {
+					return 0;
+				}
+			}
+		}
+	}
+
+	/* check for trailing "*" */
+	while (*c == '*')
+		c++;
+
+	return (!*c && !*t);
+}
+
+static inline void reverse(char *str, int n)
+{
+	char s;
+	int i, j = n - 1;
+
+	for (i = 0; i < j; i++, j--) {
+		s = str[i];
+		str[i] = str[j];
+		str[j] = s;
+	}
+}
+
+static inline int itoa(int n, char *str)
+{
+	int i = 0, sz = 0;
+
+	do {
+		str[i++] = n % 10 + '0';
+		sz++;
+		n = n / 10;
+	} while (n > 0);
+
+	(void)reverse(str, sz);
+	return sz;
+}
+
+static inline int v4toa(__u32 y, char *a)
+{
+	int i;
+	int size = 0;
+
+	for (i = 0; i < 4; i++) {
+		size += itoa(y & 0xff, &a[size]);
+		a[size++] = '.';
+		y >>= 8;
+	}
+	return --size;
+}
+
+static inline int match_ipv4(struct ckrm_net_struct *ns, char **string)
+{
+	char *ptr = *string;
+	int size;
+	char a4[16];
+
+	size = v4toa(ns->ns_daddrv4, a4);
+
+	*string += size;
+	return !strncmp(a4, ptr, size);
+}
+
+static inline int match_port(struct ckrm_net_struct *ns, char *ptr)
+{
+	char a[5];
+	int size = itoa(ns->ns_dport, a);
+
+	return !strncmp(a, ptr, size);
+}
+
+static int __evaluate_rule(struct task_struct *tsk, struct ckrm_net_struct *ns,
+			   struct rbce_rule *rule, bitvector_t * vec_eval,
+			   bitvector_t * vec_true, char **filename);
+/*
+ * evaluate the given task against the given rule with the vec_eval and
+ * vec_true in context. Return 1 if the task satisfies the given rule, 0
+ * otherwise.
+ *
+ * If the bit corresponding to the rule is set in the vec_eval, then the
+ * corresponding bit in vec_true is the result. If it is not set, evaluate
+ * the rule and set the bits in both the vectors accordingly.
+ *
+ * On return, filename will have the pointer to the pathname of the task's
+ * executable, if the rule had any command related terms.
+ *
+ * Caller must hold the rbce_rwlock at least in read mode.
+ */
+static inline int
+evaluate_rule(struct task_struct *tsk, struct ckrm_net_struct *ns,
+	      struct rbce_rule *rule, bitvector_t * vec_eval,
+	      bitvector_t * vec_true, char **filename)
+{
+	int tidx = rule->index;
+
+	if (!bitvector_test(tidx, vec_eval)) {
+		if (__evaluate_rule
+		    (tsk, ns, rule, vec_eval, vec_true, filename)) {
+			bitvector_set(tidx, vec_true);
+		}
+		bitvector_set(tidx, vec_eval);
+	}
+	return bitvector_test(tidx, vec_true);
+}
+
+/*
+ * evaluate the given task against every term in the given rule with
+ * vec_eval and vec_true in context.
+ *
+ * If the bit corresponding to a rule term is set in the vec_eval, then the
+ * corresponding bit in vec_true is the result for that particular term. If it
+ * is not set, evaluate the rule term and set the bits in both the vectors
+ * accordingly.
+ *
+ * This function returns true only if all terms in the rule evaluate true.
+ *
+ * On return, filename will have the pointer to the pathname of the task's
+ * executable, if the rule had any command related terms.
+ *
+ * Caller must hold the rbce_rwlock at least in read mode.
+ */
+static int
+__evaluate_rule(struct task_struct *tsk, struct ckrm_net_struct *ns,
+		struct rbce_rule *rule, bitvector_t * vec_eval,
+		bitvector_t * vec_true, char **filename)
+{
+	int i;
+	int no_ip = 1;
+
+	for (i = rule->num_terms; --i >= 0;) {
+		int rc = 1, tidx = rule->terms[i];
+
+		if (!bitvector_test(tidx, vec_eval)) {
+			struct rbce_rule_term *term = &gl_terms[tidx];
+
+			switch (term->op) {
+
+			case RBCE_RULE_CMD_PATH:
+			case RBCE_RULE_CMD:
+#if __NOT_YET__
+				if (!*filename) {	/* get this once */
+					if (((*filename =
+					      kmalloc(NAME_MAX,
+						      GFP_ATOMIC)) == NULL)
+					    ||
+					    (get_exe_path_name
+					     (tsk, *filename, NAME_MAX) < 0)) {
+						rc = 0;
+						break;
+					}
+				}
+				rc = match_cmd(*filename, term->u.string,
+					       (term->op ==
+						RBCE_RULE_CMD_PATH));
+#else
+				rc = match_cmd(tsk->comm, term->u.string,
+					       (term->op ==
+						RBCE_RULE_CMD_PATH));
+#endif
+				break;
+			case RBCE_RULE_REAL_UID:
+				if (term->operator == RBCE_LESS_THAN) {
+					rc = (tsk->uid < term->u.id);
+				} else if (term->operator == RBCE_GREATER_THAN){
+					rc = (tsk->uid > term->u.id);
+				} else if (term->operator == RBCE_NOT) {
+					rc = (tsk->uid != term->u.id);
+				} else {
+					rc = (tsk->uid == term->u.id);
+				}
+				break;
+			case RBCE_RULE_REAL_GID:
+				if (term->operator == RBCE_LESS_THAN) {
+					rc = (tsk->gid < term->u.id);
+				} else if (term->operator == RBCE_GREATER_THAN){
+					rc = (tsk->gid > term->u.id);
+				} else if (term->operator == RBCE_NOT) {
+					rc = (tsk->gid != term->u.id);
+				} else {
+					rc = (tsk->gid == term->u.id);
+				}
+				break;
+			case RBCE_RULE_EFFECTIVE_UID:
+				if (term->operator == RBCE_LESS_THAN) {
+					rc = (tsk->euid < term->u.id);
+				} else if (term->operator == RBCE_GREATER_THAN){
+					rc = (tsk->euid > term->u.id);
+				} else if (term->operator == RBCE_NOT) {
+					rc = (tsk->euid != term->u.id);
+				} else {
+					rc = (tsk->euid == term->u.id);
+				}
+				break;
+			case RBCE_RULE_EFFECTIVE_GID:
+				if (term->operator == RBCE_LESS_THAN) {
+					rc = (tsk->egid < term->u.id);
+				} else if (term->operator == RBCE_GREATER_THAN){
+					rc = (tsk->egid > term->u.id);
+				} else if (term->operator == RBCE_NOT) {
+					rc = (tsk->egid != term->u.id);
+				} else {
+					rc = (tsk->egid == term->u.id);
+				}
+				break;
+			case RBCE_RULE_APP_TAG:
+				rc = (RBCE_DATA(tsk)
+				      && RBCE_DATA(tsk)->
+				      app_tag) ? !strcmp(RBCE_DATA(tsk)->
+							 app_tag,
+							 term->u.string) : 0;
+				break;
+			case RBCE_RULE_DEP_RULE:
+				rc = evaluate_rule(tsk, NULL, term->u.deprule,
+						   vec_eval, vec_true,
+						   filename);
+				break;
+
+			case RBCE_RULE_IPV4:
+				/* TBD: add NOT_EQUAL match. At present */
+				/* rbce recognises EQUAL matches only.  */
+				if (ns && term->operator == RBCE_EQUAL) {
+					int ma = 0;
+					int mp = 0;
+					char *ptr = term->u.string;
+
+					if (term->u.string[0] == '*')
+						ma = 1;
+					else
+						ma = match_ipv4(ns, &ptr);
+
+					if (*ptr != '\\') {
+						rc = 0;
+						break;
+					} else {
+						++ptr;
+						if (*ptr == '*')
+							mp = 1;
+						else
+							mp = match_port(ns,
+									ptr);
+					}
+					rc = mp && ma;
+				} else
+					rc = 0;
+				no_ip = 0;
+				break;
+
+			case RBCE_RULE_IPV6:	/* no support yet */
+				rc = 0;
+				no_ip = 0;
+				break;
+
+			default:
+				rc = 0;
+				printk(KERN_ERR "Error evaluate term op=%d\n",
+				       term->op);
+				break;
+			}
+			if (!rc && no_ip) {
+				bitvector_clear(tidx, vec_true);
+			} else {
+				bitvector_set(tidx, vec_true);
+			}
+			bitvector_set(tidx, vec_eval);
+		} else {
+			rc = bitvector_test(tidx, vec_true);
+		}
+		if (!rc) {
+			return 0;
+		}
+	}
+	return 1;
+}
+
+/*
+ * This is some old debug code which needs to be trimmed.
+ */
+
+#define valid_pdata(pdata) (1)
+#define store_pdata(pdata)
+#define unstore_pdata(pdata)
+
+struct rbce_private_data *create_private_data(struct rbce_private_data
+						     *src, int copy_sample)
+{
+	int vsize = 0, psize, bsize = 0;
+	struct rbce_private_data *pdata;
+
+	if (use_persistent_state) {
+		vsize = gl_allocated;
+		bsize = vsize / 8 + sizeof(bitvector_t);
+		psize = sizeof(struct rbce_private_data) + 2 * bsize;
+	} else {
+		psize = sizeof(struct rbce_private_data);
+	}
+
+	pdata = kmalloc(psize, GFP_ATOMIC);
+	if (pdata != NULL) {
+		if (use_persistent_state) {
+			pdata->bitmap_version = gl_bitmap_version;
+			pdata->eval = (bitvector_t *) & pdata->data[0];
+			pdata->true = (bitvector_t *) & pdata->data[bsize];
+			if (src && (src->bitmap_version == gl_bitmap_version)) {
+				memcpy(pdata->data, src->data, 2 * bsize);
+			} else {
+				bitvector_init(pdata->eval, vsize);
+				bitvector_init(pdata->true, vsize);
+			}
+		}
+		pdata->evaluate = 1;
+		pdata->rules_version = src ? src->rules_version : 0;
+		pdata->app_tag = NULL;
+	}
+	store_pdata(pdata);
+	return pdata;
+}
+
+static inline void free_private_data(struct rbce_private_data *pdata)
+{
+	if (valid_pdata(pdata)) {
+		unstore_pdata(pdata);
+		kfree(pdata);
+	}
+}
+
+void free_all_private_data(void)
+{
+	struct task_struct *proc, *thread;
+
+	read_lock(&tasklist_lock);
+	do_each_thread(proc, thread) {
+		struct rbce_private_data *pdata;
+
+		pdata = RBCE_DATA(thread);
+		RBCE_DATAP(thread) = NULL;
+		free_private_data(pdata);
+	} while_each_thread(proc, thread);
+	read_unlock(&tasklist_lock);
+	return;
+}
+
+/*
+ * reclassify function, which is called by all the callback functions.
+ *
+ * Takes the task to be reclassified and the termflag that indicates the
+ * attributes that caused this reclassification request.
+ *
+ * On success, returns the core class pointer to which the given task should
+ * belong.
+ */
+static struct ckrm_core_class *rbce_classify(struct task_struct *tsk,
+					     struct ckrm_net_struct *ns,
+					     unsigned long termflag,
+					     int classtype)
+{
+	int i;
+	struct rbce_rule *rule;
+	bitvector_t *vec_true = NULL, *vec_eval = NULL;
+	struct rbce_class *tgt = NULL;
+	struct ckrm_core_class *cls = NULL;
+	char *filename = NULL;
+
+	if (!valid_pdata(RBCE_DATA(tsk))) {
+		return NULL;
+	}
+	if (classtype >= CKRM_MAX_CLASSTYPES) {
+		/* can't handle more than CKRM_MAX_CLASSTYPES */
+		return NULL;
+	}
+	/* fast path to avoid locking in case CE is not enabled or */
+	/* if no rules are defined or if no evaluation is needed.  */
+	if (!rbce_enabled || !gl_num_rules ||
+	    (RBCE_DATA(tsk) && !RBCE_DATA(tsk)->evaluate)) {
+		return NULL;
+	}
+	/* FIXME: optimize_policy should be called from here if      */
+	/* gl_action is non-zero. Also, it has to be called with the */
+	/* rbce_rwlock held in write mode.                           */
+
+	read_lock(&rbce_rwlock);
+
+	vec_eval = vec_true = NULL;
+	if (use_persistent_state) {
+		struct rbce_private_data *pdata = RBCE_DATA(tsk);
+
+		if (!pdata
+		    || (pdata
+			&& (gl_bitmap_version != pdata->bitmap_version))) {
+			struct rbce_private_data *new_pdata =
+			    create_private_data(pdata, 1);
+
+			if (new_pdata) {
+				if (pdata) {
+					new_pdata->rules_version =
+					    pdata->rules_version;
+					new_pdata->evaluate = pdata->evaluate;
+					new_pdata->app_tag = pdata->app_tag;
+					free_private_data(pdata);
+				}
+				pdata = RBCE_DATAP(tsk) = new_pdata;
+				termflag = RBCE_TERMFLAG_ALL;
+				/* need to evaluate them all */
+			} else {
+				/*
+				 * we shouldn't free the pdata as it has more
+				 * details than the vectors. But, this
+				 * reclassification should go thru
+				 */
+				pdata = NULL;
+			}
+		}
+		if (!pdata) {
+			goto cls_determined;
+		}
+		vec_eval = pdata->eval;
+		vec_true = pdata->true;
+	} else {
+		int bsize = gl_allocated;
+
+		vec_eval = bitvector_alloc(bsize);
+		vec_true = bitvector_alloc(bsize);
+
+		if (vec_eval == NULL || vec_true == NULL) {
+			goto cls_determined;
+		}
+		termflag = RBCE_TERMFLAG_ALL;
+		/* need to evaluate all of them now */
+	}
+
+	/*
+	 * using bit ops invalidate all terms related to this termflag
+	 * context (only in per task vec)
+	 */
+
+	if (termflag == RBCE_TERMFLAG_ALL) {
+		bitvector_zero(vec_eval);
+	} else {
+		for (i = 0; i < NUM_TERM_MASK_VECTOR; i++) {
+			if (test_bit(i, &termflag)) {
+				bitvector_t *maskvec = get_gl_mask_vecs(i);
+
+				bitvector_and_not(vec_eval, vec_eval, maskvec);
+			}
+		}
+	}
+	bitvector_and(vec_true, vec_true, vec_eval);
+
+	/* run through the rules in order and see what needs evaluation */
+	list_for_each_entry(rule, &rules_list[classtype], obj.link) {
+		if (rule->state == RBCE_RULE_ENABLED &&
+		    rule->target_class &&
+		    rule->target_class->classobj &&
+		    evaluate_rule(tsk, ns, rule, vec_eval, vec_true,
+				  &filename)) {
+			tgt = rule->target_class;
+			cls = rule->target_class->classobj;
+			break;
+		}
+	}
+
+      cls_determined:
+	if (!use_persistent_state) {
+		if (vec_eval) {
+			bitvector_free(vec_eval);
+		}
+		if (vec_true) {
+			bitvector_free(vec_true);
+		}
+	}
+	ckrm_core_grab(cls);
+	read_unlock(&rbce_rwlock);
+	if (filename) {
+		kfree(filename);
+	}
+	if (RBCE_DATA(tsk)) {
+		RBCE_DATA(tsk)->rules_version = gl_rules_version;
+	}
+	return cls;
+}
+
+/*****************************************************************************
+ *
+ * Module specific utilization of core RBCE functionality
+ *
+ * Includes support for the various classtypes
+ * New classtypes will require extensions here
+ *
+ *****************************************************************************/
+
+/* helper functions that are required in the extended version */
+
+static inline void rbce_tc_manual(struct task_struct *tsk)
+{
+	read_lock(&rbce_rwlock);
+
+	if (!RBCE_DATA(tsk)) {
+		RBCE_DATAP(tsk) =
+		    (void *)create_private_data(RBCE_DATA(tsk->parent), 0);
+	}
+	if (RBCE_DATA(tsk)) {
+		RBCE_DATA(tsk)->evaluate = 0;
+	}
+	read_unlock(&rbce_rwlock);
+	return;
+}
+
+/*****************************************************************************
+ *    VARIOUS CLASSTYPES
+ *****************************************************************************/
+
+/* to enable type coercion of the function pointers */
+
+/*============================================================================
+ *    TASKCLASS CLASSTYPE
+ *============================================================================*/
+
+int tc_classtype = -1;
+
+/*
+ * fork callback to be registered with core module.
+ */
+static inline void *rbce_tc_forkcb(struct task_struct *tsk)
+{
+	int rule_version_changed = 1;
+	struct ckrm_core_class *cls;
+	read_lock(&rbce_rwlock);
+	/* dup ce_data */
+	RBCE_DATAP(tsk) =
+	    (void *)create_private_data(RBCE_DATA(tsk->parent), 0);
+	read_unlock(&rbce_rwlock);
+
+	if (RBCE_DATA(tsk->parent)) {
+		rule_version_changed =
+		    (RBCE_DATA(tsk->parent)->rules_version != gl_rules_version);
+	}
+	cls = rule_version_changed ?
+	    rbce_classify(tsk, NULL, RBCE_TERMFLAG_ALL, tc_classtype) : NULL;
+
+	/*
+	 * note the fork notification to any user client will be sent through
+	 * the guaranteed fork-reclassification
+	 */
+	return cls;
+}
+
+/*
+ * exit callback to be registered with core module.
+ */
+static void rbce_tc_exitcb(struct task_struct *tsk)
+{
+	struct rbce_private_data *pdata;
+
+	pdata = RBCE_DATA(tsk);
+	RBCE_DATAP(tsk) = NULL;
+	if (pdata) {
+		if (pdata->app_tag) {
+			kfree(pdata->app_tag);
+		}
+		free_private_data(pdata);
+	}
+	return;
+}
+
+static void *rbce_tc_classify(enum ckrm_event event, ...)
+{
+	va_list args;
+	void *cls = NULL;
+	struct task_struct *tsk;
+	struct rbce_private_data *pdata;
+
+	va_start(args, event);
+	tsk = va_arg(args, struct task_struct *);
+	va_end(args);
+
+	/* we only have to deal with events between
+	 * [ CKRM_LATCHABLE_EVENTS .. CKRM_NONLATCHABLE_EVENTS )
+	 */
+	switch (event) {
+
+	case CKRM_EVENT_FORK:
+		cls = rbce_tc_forkcb(tsk);
+		break;
+
+	case CKRM_EVENT_EXIT:
+		rbce_tc_exitcb(tsk);
+		break;
+
+	case CKRM_EVENT_EXEC:
+		cls = rbce_classify(tsk, NULL, RBCE_TERMFLAG_CMD |
+				    RBCE_TERMFLAG_UID | RBCE_TERMFLAG_GID,
+				    tc_classtype);
+		break;
+
+	case CKRM_EVENT_UID:
+		cls = rbce_classify(tsk, NULL, RBCE_TERMFLAG_UID, tc_classtype);
+		break;
+
+	case CKRM_EVENT_GID:
+		cls = rbce_classify(tsk, NULL, RBCE_TERMFLAG_GID, tc_classtype);
+		break;
+
+	case CKRM_EVENT_LOGIN:
+	case CKRM_EVENT_USERADD:
+	case CKRM_EVENT_USERDEL:
+	case CKRM_EVENT_LISTEN_START:
+	case CKRM_EVENT_LISTEN_STOP:
+	case CKRM_EVENT_APPTAG:
+		/* no interest in these events .. */
+		break;
+
+	default:
+		/* catch all */
+		break;
+
+	case CKRM_EVENT_RECLASSIFY:
+		if ((pdata = (RBCE_DATA(tsk)))) {
+			pdata->evaluate = 1;
+		}
+		cls = rbce_classify(tsk, NULL, RBCE_TERMFLAG_ALL, tc_classtype);
+		break;
+
+	}
+
+	return cls;
+}
+
+static void rbce_tc_notify(int event, void *core, struct task_struct *tsk)
+{
+	if (event != CKRM_EVENT_MANUAL)
+		return;
+	rbce_tc_manual(tsk);
+}
+
+static struct ckrm_eng_callback rbce_taskclass_ecbs = {
+	.c_interest = (unsigned long)(-1),	/* set whole bitmap */
+	.classify = (ce_classify_fct) rbce_tc_classify,
+	.class_delete = rbce_class_deletecb,
+	.n_interest = (1 << CKRM_EVENT_MANUAL),
+	.notify = (ce_notify_fct) rbce_tc_notify,
+	.always_callback = 0,
+};
+
+/*============================================================================
+ *    ACCEPTQ CLASSTYPE
+ *============================================================================*/
+
+int sc_classtype = -1;
+
+static void *rbce_sc_classify(enum ckrm_event event, ...)
+{
+	/* no special consideration */
+	void *result;
+	va_list args;
+	struct task_struct *tsk;
+	struct ckrm_net_struct *ns;
+
+	va_start(args, event);
+	ns = va_arg(args, struct ckrm_net_struct *);
+	tsk = va_arg(args, struct task_struct *);
+	va_end(args);
+
+	result = rbce_classify(tsk, ns, RBCE_TERMFLAG_ALL, sc_classtype);
+
+	return result;
+}
+
+static struct ckrm_eng_callback rbce_acceptQclass_ecbs = {
+	.c_interest = (unsigned long)(-1),
+	.always_callback = 0,	/* enable during debugging only */
+	.classify = (ce_classify_fct) & rbce_sc_classify,
+	.class_delete = rbce_class_deletecb,
+};
+
+/*============================================================================
+ *    Module Initialization ...
+ *============================================================================*/
+
+#define TASKCLASS_NAME  "taskclass"
+#define SOCKCLASS_NAME  "socket_class"
+
+struct ce_regtable_struct {
+	const char *name;
+	struct ckrm_eng_callback *cbs;
+	int *clsvar;
+};
+
+struct ce_regtable_struct ce_regtable[] = {
+	{TASKCLASS_NAME, &rbce_taskclass_ecbs, &tc_classtype},
+	{SOCKCLASS_NAME, &rbce_acceptQclass_ecbs, &sc_classtype},
+	{NULL}
+};
+
+void unregister_classtype_engines(void)
+{
+	int rc;
+	struct ce_regtable_struct *ceptr = ce_regtable;
+
+	while (ceptr->name) {
+		if (*ceptr->clsvar >= 0) {
+			while ((rc = ckrm_unregister_engine(ceptr->name)) == -EAGAIN)
+				;
+			*ceptr->clsvar = -1;
+		}
+		ceptr++;
+	}
+}
+
+int register_classtype_engines(void)
+{
+	int rc;
+	struct ce_regtable_struct *ceptr = ce_regtable;
+
+	while (ceptr->name) {
+		rc = ckrm_register_engine(ceptr->name, ceptr->cbs);
+		if ((rc < 0) && (rc != -ENOENT)) {
+			unregister_classtype_engines();
+			return (rc);
+		}
+		if (rc != -ENOENT)
+			*ceptr->clsvar = rc;
+		ceptr++;
+	}
+	return 0;
+}
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:05.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:06.000000000 -0700
@@ -41,6 +41,8 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 
+#include "rbce_bitvector.h"
+
 /*
  * comman data structure used for identification of class and rules
  * in the RBCE namespace
@@ -205,14 +207,25 @@ struct rbce_private_data {
 	bitvector_t *eval;
 	bitvector_t *true;
 	char *app_tag;
+	unsigned long bitmap_version;
   	char data[0];		/* bitvectors eval and true */
 };
 
 #define RBCE_DATA(tsk) ((struct rbce_private_data*)((tsk)->ce_data))
 #define RBCE_DATAP(tsk) ((tsk)->ce_data)
 
+/* Other rbce global stuff. */
+
 extern struct rbce_eng_callback rbce_rcfs_ecbs;
 extern int rbce_enabled;
+extern struct list_head rules_list[];
+extern const int use_persistent_state;
+extern int gl_num_rules;
+extern int gl_bitmap_version;
+extern int gl_allocated;
+extern struct rbce_rule_term *gl_terms;
+extern int gl_rules_version;
+extern rwlock_t rbce_rwlock;
 
 extern int rbce_mkdir(struct inode *, struct dentry *, int);
 extern int rbce_rmdir(struct inode *, struct dentry *);
@@ -228,4 +241,14 @@ extern int rbce_rename_rule(const char *
 
 extern int rules_parse(char *, struct rbce_rule_term **, int *);
 
+extern struct rbce_private_data *create_private_data(struct rbce_private_data
+						     *, int);
+extern bitvector_t *get_gl_mask_vecs(int);
+extern struct rbce_class *find_class_name(const char *);
+extern void put_class(struct rbce_class *);
+extern void free_all_private_data(void);
+
+extern void unregister_classtype_engines(void);
+extern int register_classtype_engines(void);
+
 #endif /* _RBCE_INTERNAL_H */
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:05.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:06.000000000 -0700
@@ -23,7 +23,6 @@
  */
 
 #include "rbce_internal.h"
-#include "rbce_bitvector.h"
 
 MODULE_DESCRIPTION(RBCE_MOD_DESCR);
 MODULE_AUTHOR("Hubertus Franke, Chandra Seetharaman (IBM)");
@@ -57,19 +56,20 @@ int termop_2_vecidx[RBCE_RULE_INVALID] =
 
 extern int errno;
 
-int rbce_enabled = 1;
-const int use_persistent_state = 1;
-
 static LIST_HEAD(class_list);
-static struct list_head rules_list[CKRM_MAX_CLASSTYPES];
-static int gl_num_rules;
+struct list_head rules_list[CKRM_MAX_CLASSTYPES];
 static int gl_action, gl_num_terms;
-static int gl_bitmap_version, gl_action, gl_num_terms;
-static int gl_allocated, gl_released;
-static struct rbce_rule_term *gl_terms;
-static int gl_rules_version;
+static int gl_released;
 static bitvector_t *gl_mask_vecs[NUM_TERM_MASK_VECTOR];
-static rwlock_t rbce_rwlock = RW_LOCK_UNLOCKED;
+
+int rbce_enabled = 1;
+const int use_persistent_state = 1;
+int gl_num_rules;
+int gl_bitmap_version;
+int gl_allocated;
+struct rbce_rule_term *gl_terms;
+int gl_rules_version;
+rwlock_t rbce_rwlock = RW_LOCK_UNLOCKED;
 	/*
 	 * One lock to protect them all !!!
 	 * Additions, deletions to rules must
@@ -84,6 +84,12 @@ static rwlock_t rbce_rwlock = RW_LOCK_UN
 
 static void optimize_policy(void);
 
+bitvector_t *
+get_gl_mask_vecs(int index)
+{
+	return gl_mask_vecs[index];
+}
+
 static inline struct rbce_rule *find_rule_name(const char *name)
 {
 	struct named_obj_hdr *pos;
@@ -353,7 +359,7 @@ static inline int __delete_rule(struct r
 /*
  * Optimize the rule evaluation logic
  *
- * Caller must hold global_rwlock in write mode.
+ * Caller must hold rbce_rwlock in write mode.
  */
 static void optimize_policy(void)
 {
@@ -1127,12 +1133,6 @@ int rbce_rule_exists(const char *rname)
 }
 
 /*====================== Magic file handling =======================*/
-struct rbce_private_data *create_private_data(struct rbce_private_data *a,
-						     int b)
-{
-	return NULL;
-}
-
 static inline
 void reset_evaluation(struct rbce_private_data *pdata,int termflag)
 {
@@ -1197,15 +1197,25 @@ out:
 
 int init_rbce(void)
 {
-	int rc, line;
+	int rc, i, line;
 
 	printk(KERN_INFO "Installing \'%s\' module\n", modname);
 
-	rc = rcfs_register_engine(&rbce_rcfs_ecbs);
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) {
+		INIT_LIST_HEAD(&rules_list[i]);
+	}
+
+	rc = register_classtype_engines();
 	line = __LINE__;
 	if (rc)
 		goto out;
 
+	/* register any other class type engine here */
+  	rc = rcfs_register_engine(&rbce_rcfs_ecbs);
+  	line = __LINE__;
+  	if (rc)
+		goto out_unreg_classtype;
+
 	if (rcfs_mounted) {
 		rc = rbce_create_config();
 		line = __LINE__;
@@ -1214,6 +1224,8 @@ int init_rbce(void)
 	}
 
 	rcfs_unregister_engine(&rbce_rcfs_ecbs);
+out_unreg_classtype:
+ 	unregister_classtype_engines();
 out:
 	printk(KERN_ERR "%s: error installing rc=%d line=%d\n",
 		__FUNCTION__, rc, line);
@@ -1222,12 +1234,27 @@ out:
 
 void exit_rbce(void)
 {
+	int i;
 	printk(KERN_INFO "Removing \'%s\' module\n", modname);
 
+	/* Print warnings if lists are not empty, which is a bug */
+	if (!list_empty(&class_list)) {
+		printk(KERN_WARNING "exit_rbce: Class list is not empty\n");
+	}
+
+	for (i = 0; i < CKRM_MAX_CLASSTYPES; i++) {
+		if (!list_empty(&rules_list[i])) {
+			printk(KERN_WARNING "exit_rbce: Rules list for "
+				"classtype %d is not empty\n", i);
+		}
+	}
+
 	if (rcfs_mounted)
 		rbce_clear_config();
 
 	rcfs_unregister_engine(&rbce_rcfs_ecbs);
+	unregister_classtype_engines();
+	free_all_private_data();
 }
 
 module_init(init_rbce);
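
Note on the wildcard rules documented for match_cmd() in this patch: '?'
stands for any single character and '*' for a run of characters, and
neither is allowed to match '/'. The sketch below is a simplified
recursive matcher with those rules, written for userspace. It is not the
algorithm used in match_cmd(), which is iterative and backtracks to the
last asterisk. It also assumes '*' may match zero characters, as the
commented-out t++ in match_cmd() suggests. glob_match() is a
hypothetical name.

```c
#include <stddef.h>

/*
 * Return 1 if txt matches pat, 0 otherwise.
 *   '?'  matches exactly one character other than '/'
 *   '*'  matches zero or more characters other than '/' (assumption)
 *   any other character must match literally
 */
static int glob_match(const char *pat, const char *txt)
{
	if (pat == NULL || txt == NULL)
		return 0;

	for (; *pat; pat++, txt++) {
		if (*pat == '*') {
			/* collapse a run of '*' into one */
			while (*(pat + 1) == '*')
				pat++;
			/* try every possible length for the '*' run */
			for (;;) {
				if (glob_match(pat + 1, txt))
					return 1;
				if (*txt == '\0' || *txt == '/')
					return 0;
				txt++;
			}
		}
		if (*txt == '\0')
			return 0;
		if (*pat == '?') {
			if (*txt == '/')
				return 0;
		} else if (*pat != *txt) {
			return 0;
		}
	}
	return *txt == '\0';
}
```

The recursion keeps the sketch short but can take exponential time on
patterns with many asterisks, so it is meant as an illustration of the
matching rules rather than as kernel code.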

--


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [patch 19/21] CKRM: Rule Based Classification Engine, more advanced classification engine
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (17 preceding siblings ...)
  2005-05-05 18:07 ` [patch 18/21] CKRM: Rule Based Classification Engine, full CE gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 20/21] CKRM: Clean up typo in printk message gh
  2005-05-05 18:07 ` [patch 21/21] CKRM: Fix for compiler warnings gh
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=09-05-rbce_core-crbce


Part 5 of 5 patches to support Rule Based Classification Engine for CKRM.
This patch provides the enhanced RBCE, CRBCE. CRBCE adds per-process
delay data and additional user-level monitoring support.

Signed-Off-By: Hubertus Franke <frankeh@us.ibm.com>
Signed-Off-By: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-Off-By: Shailabh Nagar <nagar@us.ibm.com>
Signed-Off-By: Vivek Kashyap <vivk@us.ibm.com>
Signed-Off-By: Gerrit Huizenga <gh@us.ibm.com>

 include/linux/crbce.h            |  164 +++++++++++
 include/linux/netlink.h          |    1 
 include/linux/rbce.h             |   50 +++
 init/Kconfig                     |   12 
 kernel/ckrm/rbce/Makefile        |    6 
 kernel/ckrm/rbce/crbce_ext.c     |  580 +++++++++++++++++++++++++++++++++++++++
 kernel/ckrm/rbce/crbce_main.c    |    2 
 kernel/ckrm/rbce/rbce_core.c     |   45 ++-
 kernel/ckrm/rbce/rbce_internal.h |   10 
 kernel/ckrm/rbce/rbce_main.c     |   24 +
 10 files changed, 886 insertions(+), 8 deletions(-)

Index: linux-2.6.12-rc3-ckrm5/include/linux/crbce.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/include/linux/crbce.h	2005-05-05 09:38:07.000000000 -0700
@@ -0,0 +1,164 @@
+/*
+ * crbce.h
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ * Copyright (C) Chandra Seetharaman, IBM Corp. 2004
+ *
+ * This file contains the type definition of the record
+ * created by the CRBCE CKRM classification engine
+ *
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of version 2.1 of the GNU Lesser General Public License
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ *
+ */
+
+#ifndef _LINUX_CRBCE_H
+#define _LINUX_CRBCE_H
+
+#ifdef __KERNEL__
+#include <linux/autoconf.h>
+#else
+#define  CONFIG_CKRM
+#define  CONFIG_CRBCE
+#define  CONFIG_DELAY_ACCT
+#endif
+
+#include <linux/types.h>
+#include <linux/ckrm_events.h>
+#include <linux/ckrm_ce.h>
+
+#define CRBCE_MAX_CLASS_NAME_LEN  256
+
+/****************************************************************
+ *
+ *  CRBCE EVENT SET is an extension to the standard CKRM_EVENTS
+ *
+ ****************************************************************/
+
+typedef int __bitwise crbce_event_t;
+enum crbce_event {
+
+	/* we use the standard CKRM_EVENT_<..>
+	 * to identify reclassification cause actions
+	 * and extend by additional ones we need
+	 */
+
+	/* up event flow */
+
+	CRBCE_REC_EXIT = (__force crbce_event_t) (CKRM_NUM_EVENTS+1),
+	CRBCE_REC_DATA_DELIMITER = (__force crbce_event_t) (CRBCE_REC_EXIT+2),
+	CRBCE_REC_SAMPLE = (__force crbce_event_t) (CRBCE_REC_EXIT+3),
+	CRBCE_REC_TASKINFO = (__force crbce_event_t) (CRBCE_REC_EXIT+4),
+	CRBCE_REC_SYS_INFO = (__force crbce_event_t) (CRBCE_REC_EXIT+5),
+	CRBCE_REC_CLASS_INFO = (__force crbce_event_t) (CRBCE_REC_EXIT+6),
+	CRBCE_REC_KERNEL_CMD_DONE = (__force crbce_event_t) (CRBCE_REC_EXIT+7),
+	CRBCE_REC_UKCC_FULL = (__force crbce_event_t) (CRBCE_REC_EXIT+8),
+
+	/* down command issuance */
+	CRBCE_REC_KERNEL_CMD = (__force crbce_event_t) (CRBCE_REC_EXIT+9),
+
+	CRBCE_NUM_EVENTS = (__force crbce_event_t) (CRBCE_REC_EXIT+10)
+};
+
+struct task_sample_info {
+	uint32_t cpu_running;
+	uint32_t cpu_waiting;
+	uint32_t io_delayed;
+	uint32_t memio_delayed;
+};
+
+/*********************************************
+ *          KERNEL -> USER  records          *
+ *********************************************/
+
+/* we have records with either a time stamp or not */
+struct crbce_hdr {
+	int type;
+	pid_t pid;
+};
+
+struct crbce_hdr_ts {
+	int type;
+	pid_t pid;
+	uint32_t jiffies;
+	uint64_t cls;
+};
+
+/* individual records */
+
+struct crbce_rec_fork {
+	struct crbce_hdr_ts hdr;
+	pid_t ppid;
+};
+
+struct crbce_rec_data_delim {
+	struct crbce_hdr_ts hdr;
+	int is_stop;		/* 0 start, 1 stop */
+};
+
+struct crbce_rec_task_data {
+	struct crbce_hdr_ts hdr;
+	struct task_sample_info sample;
+	struct task_delay_info delay;
+};
+
+struct crbce_ukcc_full {
+	struct crbce_hdr_ts hdr;
+};
+
+struct crbce_class_info {
+	struct crbce_hdr_ts hdr;
+	int action;
+	int namelen;
+	char name[CRBCE_MAX_CLASS_NAME_LEN];
+};
+
+/*********************************************
+ *           USER -> KERNEL records          *
+ *********************************************/
+
+typedef int __bitwise crbce_kernel_cmd_t;
+enum crbce_kernel_cmd {
+	CRBCE_CMD_START = (__force crbce_kernel_cmd_t) 1,
+	CRBCE_CMD_STOP = (__force crbce_kernel_cmd_t) 2,
+	CRBCE_CMD_SET_TIMER = (__force crbce_kernel_cmd_t) 3,
+	CRBCE_CMD_SEND_DATA = (__force crbce_kernel_cmd_t) 4,
+};
+
+struct crbce_command {
+	int type;		/* we need this for the K->U reflection */
+	int cmd;
+	uint32_t len;	/* added in the kernel for reflection */
+};
+
+#define set_cmd_hdr(rec,tok) \
+	((rec).hdr.type=CRBCE_REC_KERNEL_CMD,(rec).hdr.cmd=(tok))
+
+struct crbce_cmd_done {
+	struct crbce_command hdr;
+	int rc;
+};
+
+struct crbce_cmd {
+	struct crbce_command hdr;
+};
+
+struct crbce_cmd_send_data {
+	struct crbce_command hdr;
+	int delta_mode;
+};
+
+struct crbce_cmd_settimer {
+	struct crbce_command hdr;
+	uint32_t interval;	/* in msec .. 0 means stop */
+};
+#endif
Index: linux-2.6.12-rc3-ckrm5/include/linux/netlink.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/netlink.h	2005-03-01 23:38:25.000000000 -0800
+++ linux-2.6.12-rc3-ckrm5/include/linux/netlink.h	2005-05-05 09:38:07.000000000 -0700
@@ -14,6 +14,7 @@
 #define NETLINK_SELINUX		7	/* SELinux event notifications */
 #define NETLINK_ARPD		8
 #define NETLINK_AUDIT		9	/* auditing */
+#define NETLINK_CKRM		10	/* CKRM */
 #define NETLINK_ROUTE6		11	/* af_inet6 route comm channel */
 #define NETLINK_IP6_FW		13
 #define NETLINK_DNRTMSG		14	/* DECnet routing messages */
Index: linux-2.6.12-rc3-ckrm5/init/Kconfig
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/init/Kconfig	2005-05-05 09:38:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/init/Kconfig	2005-05-05 09:38:07.000000000 -0700
@@ -215,6 +215,18 @@ config CKRM_RBCE
 
 	  If unsure, say N.
 
+config CKRM_CRBCE
+	tristate "Enhanced Rule-based Classification Engine (CRBCE)"
+	depends on CKRM && RCFS_FS && DELAY_ACCT
+	default m
+	help
+	  Provides an optional module to support creation of rules for automatic
+	  classification of kernel objects, just like RBCE above. In addition,
+	  CRBCE provides per-process delay data (requires DELAY_ACCT to be
+	  enabled) and makes information on significant kernel events available
+	  to userspace tools through a netlink socket (NETLINK_CKRM).
+
+	  If unsure, say N.
 endmenu
 
 config SYSCTL
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/Makefile	2005-05-05 09:38:07.000000000 -0700
@@ -4,3 +4,9 @@
 
 obj-$(CONFIG_CKRM_RBCE)	+= rbce.o
 rbce-objs := rbce_fs.o rbce_main.o rbce_token.o rbce_core.o
+
+obj-$(CONFIG_CKRM_CRBCE)	+= crbce.o
+crbce-objs := rbce_fs.o crbce_main.o rbce_token.o rbce_core.o crbce_ext.o
+
+CFLAGS_crbce_main.o += -DCRBCE_EXTENSION
+CFLAGS_crbce_ext.o += -DCRBCE_EXTENSION
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/crbce_ext.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/crbce_ext.c	2005-05-05 09:38:07.000000000 -0700
@@ -0,0 +1,580 @@
+/* Data Collection Extension to Rule-based Classification Engine (RBCE) module
+ *
+ * Copyright (C) Hubertus Franke, IBM Corp. 2003
+ *
+ * Extension to be included into RBCE to collect delay and sample information
+ * Requires user daemon e.g. crbcedmn to activate.
+ *
+ * Latest version, more details at http://ckrm.sf.net
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it would be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ *
+ */
+
+/*
+ *   User-Kernel Communication Channel (UKCC)
+ *   Protocol and communication handling based on netlink
+ */
+
+#include <net/sock.h>
+#include <linux/netlink.h>
+#include "rbce_internal.h"
+
+#define PSAMPLE(pdata)    (&((pdata)->ext_data.sample))
+#define NETLINK_GFP_MASK	GFP_ATOMIC
+
+typedef int __bitwise ukcc_state_t;
+enum ukcc_state {
+	UKCC_OK = (__force ukcc_state_t) 0,
+	UKCC_STANDBY = (__force ukcc_state_t) 1,
+	UKCC_FULL =  (__force ukcc_state_t) 2
+};
+
+int ukcc_channel = -1;
+static enum ukcc_state chan_state = UKCC_STANDBY;
+static void ukcc_cmd_deliver(int rchan_id, char *from, u32 len);
+static struct sock *ukcc_sock = NULL;
+
+static inline int ukcc_ok(void)
+{
+	return (chan_state == UKCC_OK);
+}
+
+static inline int ukcc_send(void *data, int len)
+{
+	int rc;
+	char *saddr;
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh = NULL;
+	int nlmessage_len = NLMSG_LENGTH(sizeof(struct nlmsghdr) + len);
+
+	skb = alloc_skb(nlmessage_len, NETLINK_GFP_MASK);
+	if (!skb)
+		return -ENOMEM;
+	nlh = NLMSG_PUT(skb, 0, 0, NETLINK_CKRM, len); /* pid, seq, type, len */
+	saddr = NLMSG_DATA(nlh);
+	memcpy(saddr, data, len);
+
+	rc = netlink_broadcast(ukcc_sock, skb, 0, 1, NETLINK_GFP_MASK);
+	return (rc < 0) ? rc : 1;
+
+nlmsg_failure:
+	return -1;
+}
+
+static void ukcc_netlink_recv(struct sock *sk, int len)
+{
+	int skblen;
+	struct sk_buff *skb;
+	struct nlmsghdr *nlh;
+
+	while ((skb = skb_dequeue(&sk->sk_receive_queue)) != NULL) {
+		skblen = skb->len;
+		nlh = (struct nlmsghdr *)skb->data;
+		if (!NLMSG_OK(nlh, skblen))
+			return;
+		if (nlh->nlmsg_type == NLMSG_NOOP)
+			return;
+		if (nlh->nlmsg_flags & MSG_TRUNC) {
+			netlink_ack(skb, nlh, -ECOMM);
+			return;
+		}
+		if (nlh->nlmsg_flags & NLM_F_ACK)
+			netlink_ack(skb, nlh, 0);
+
+		/* write the command back to the sender
+		 * this can be probably done better through setup in
+		 * userspace, but we have no guarantee of ordering then..
+		 */
+		(void) ukcc_send(NLMSG_DATA(nlh), skblen - NLMSG_LENGTH(0));
+		ukcc_cmd_deliver(0, NLMSG_DATA(nlh), skblen - NLMSG_LENGTH(0));
+		kfree_skb(skb);
+	}
+}
+
+static int create_ukcc_channel(void)
+{
+	ukcc_sock = netlink_kernel_create(NETLINK_CKRM, ukcc_netlink_recv);
+
+	if (!ukcc_sock) {
+		printk(KERN_ERR "crbce: "
+			"unable to create netlink socket; aborting\n");
+		return -ENODEV;
+	}
+	ukcc_channel = 0;
+	return ukcc_channel;
+}
+
+static inline void close_ukcc_channel(void)
+{
+	if (ukcc_channel >= 0) {
+		sock_release(ukcc_sock->sk_socket);
+		ukcc_channel = -1;
+		chan_state = UKCC_STANDBY;
+	}
+}
+
+#define rec_set_hdr(r,t,p)      ((r)->hdr.type = (t), (r)->hdr.pid = (p))
+#define rec_set_timehdr(r,t,p,c)  (rec_set_hdr(r,t,p), \
+	(r)->hdr.jiffies = jiffies, (r)->hdr.cls=(unsigned long)(c) )
+
+#if CHANNEL_AUTO_CONT
+
+/* we only provide this for debugging.. it allows us to send records
+ * based on availability in the channel when the UKCC stalls rather
+ * than going through the UKCC recovery protocol
+ */
+static inline void rec_send_len(void *r, size_t l)
+{
+	int chan_wasok = (chan_state == UKCC_OK);
+	int chan_isok = (ukcc_send(r, l) > 0);
+
+	chan_state = chan_isok ? UKCC_OK : UKCC_STANDBY;
+	if (chan_wasok && !chan_isok)
+		printk(KERN_WARNING "Channel stalled\n");
+	else if (!chan_wasok && chan_isok)
+		printk(KERN_INFO "Channel continues\n");
+}
+
+#else
+
+/* Default UKCC channel protocol.
+ * Though a UKCC buffer overflow should not happen ever, it is possible iff
+ * the user daemon stops reading for some reason. Hence we provide a simple
+ * protocol based on 3 states
+ *     UKCC_OK      :=	channel is active and properly working. When a
+ *			channel write fails we move to state UKCC_FULL.
+ *     UKCC_FULL    :=	channel is active, but the last send_rec has failed.
+ *			As a result we will try to send an indication to
+ *			the daemon that this has happened. When that
+ *			succeeds, we move to state UKCC_STANDBY.
+ *     UKCC_STANDBY := 	we are waiting to be restarted by the user daemon
+ *
+ */
+
+static void ukcc_full(void)
+{
+	static spinlock_t ukcc_state_lock = SPIN_LOCK_UNLOCKED;
+	/*
+	 * protect transition from OK -> FULL to ensure only one record
+	 * is sent, rest we do not need to protect, protocol implies
+	 * that.
+	*/
+	int send = 0;
+	spin_lock(&ukcc_state_lock);
+	if ((send = (chan_state != UKCC_STANDBY)))
+		chan_state = UKCC_STANDBY;	/* assume we can send */
+	spin_unlock(&ukcc_state_lock);
+
+	if (send) {
+		struct crbce_ukcc_full rec;
+		rec_set_timehdr(&rec, CRBCE_REC_UKCC_FULL, 0, 0);
+		if (ukcc_send(&rec, sizeof(rec)) < 0)
+			/* channel remains full .. retry with the next record */
+			chan_state = UKCC_FULL;
+	}
+}
+
+static inline void rec_send_len(void *r, size_t l)
+{
+	switch (chan_state) {
+	case UKCC_OK:
+		if (ukcc_send(r, l) > 0)
+			break;
+		/*FALLTHRU*/
+	case UKCC_FULL:
+		ukcc_full();
+		break;
+	default:
+		break;
+	}
+}
+
+#endif
+
+#define rec_send(r)	rec_send_len(r, sizeof(*(r)))
+
+/****************************************************************************
+ *
+ *  Callbacks for the CKRM engine.
+ *    In each we do the necessary classification and event record generation
+ *    We generate 3 kinds of records in the callback
+ *    (a) FORK              	send the pid, the class and the ppid
+ *    (b) RECLASSIFICATION  	send the pid, the class and < sample data +
+ *				delay data >
+ *    (c) EXIT              	send the pid
+ *
+*****************************************************************************/
+
+int delta_mode = 0;
+
+static inline void copy_delay(struct task_delay_info *delay,
+			      struct task_struct *tsk)
+{
+	*delay = tsk->delays;
+}
+
+static inline void zero_delay(struct task_delay_info *delay)
+{
+	memset(delay, 0, sizeof(struct task_delay_info));
+	/* we need to think about doing this 64-bit atomic */
+}
+
+static inline void zero_sample(struct task_sample_info *sample)
+{
+	memset(sample, 0, sizeof(struct task_sample_info));
+	/* we need to think about doing this 64-bit atomic */
+}
+
+static inline int check_zero(void *ptr, int len)
+{
+	int iszero = 1;
+	int i;
+	unsigned long *uptr = (unsigned long *)ptr;
+
+	for (i = len / sizeof(unsigned long); i-- && iszero; uptr++)
+		/* assume it's rounded */
+		iszero &= (*uptr == 0);
+	return iszero;
+}
+
+static inline int check_not_zero(void *ptr, int len)
+{
+	int i;
+	unsigned long *uptr = (unsigned long *)ptr;
+
+	for (i = len / sizeof(unsigned long); i--; uptr++)
+		/* assume it's rounded */
+		if (*uptr)
+			return 1;
+	return 0;
+}
+
+static inline int sample_changed(struct task_sample_info *s)
+{
+	return check_not_zero(s, sizeof(*s));
+}
+static inline int delay_changed(struct task_delay_info *d)
+{
+	return check_not_zero(d, sizeof(*d));
+}
+
+static inline int
+send_task_record(struct task_struct *tsk, int event,
+		 struct ckrm_core_class *core, int send_forced)
+{
+	struct crbce_rec_task_data rec;
+	struct rbce_private_data *pdata;
+	int send = 0;
+
+	if (!ukcc_ok())
+		return 0;
+	pdata = RBCE_DATA(tsk);
+	if (pdata == NULL)
+		return 0;
+	if (send_forced || (delta_mode == 0)
+	    || sample_changed(PSAMPLE(RBCE_DATA(tsk)))
+	    || delay_changed(&tsk->delays)) {
+		rec_set_timehdr(&rec, event, tsk->pid,
+				core ? core : (struct ckrm_core_class *)tsk->
+				taskclass);
+		rec.sample = *PSAMPLE(RBCE_DATA(tsk));
+		copy_delay(&rec.delay, tsk);
+		rec_send(&rec);
+		if (delta_mode || send_forced) {
+			/* on reclassify or delta mode reset the counters */
+			zero_sample(PSAMPLE(RBCE_DATA(tsk)));
+			zero_delay(&tsk->delays);
+		}
+		send = 1;
+	}
+	return send;
+}
+
+void send_exit_notification(struct task_struct *tsk)
+{
+	send_task_record(tsk, CRBCE_REC_EXIT, NULL, 1);
+}
+
+void
+rbce_tc_ext_notify(int event, void *core, struct task_struct *tsk)
+{
+	struct crbce_rec_fork rec;
+
+	switch (event) {
+	case CKRM_EVENT_FORK:
+		if (ukcc_ok()) {
+			rec.ppid = tsk->parent->pid;
+			rec_set_timehdr(&rec, CKRM_EVENT_FORK, tsk->pid, core);
+			rec_send(&rec);
+		}
+		break;
+	case CKRM_EVENT_MANUAL:
+		rbce_tc_manual(tsk);
+		/*FALLTHRU*/
+	default:
+		send_task_record(tsk, event, (struct ckrm_core_class *)core, 1);
+		break;
+	}
+}
+
+/*====================== end classification engine =======================*/
+
+static void sample_task_data(unsigned long unused);
+
+struct timer_list sample_timer = {.expires = 0,.function = sample_task_data };
+unsigned long timer_interval_length = 250;	/* in msec */
+
+inline void stop_sample_timer(void)
+{
+	if (sample_timer.expires > 0) {
+		del_timer_sync(&sample_timer);
+		sample_timer.expires = 0;
+	}
+}
+
+inline void start_sample_timer(void)
+{
+	if (timer_interval_length > 0) {
+		sample_timer.expires =
+		    jiffies + (timer_interval_length * HZ) / 1000;
+		add_timer(&sample_timer);
+	}
+}
+
+static void send_task_data(void)
+{
+	struct crbce_rec_data_delim limrec;
+	struct task_struct *proc, *thread;
+	int sendcnt = 0;
+	int taskcnt = 0;
+	limrec.is_stop = 0;
+	rec_set_timehdr(&limrec, CRBCE_REC_DATA_DELIMITER, 0, 0);
+	rec_send(&limrec);
+
+	read_lock(&tasklist_lock);
+	do_each_thread(proc, thread) {
+		taskcnt++;
+		task_lock(thread);
+		sendcnt += send_task_record(thread, CRBCE_REC_SAMPLE, NULL, 0);
+		task_unlock(thread);
+	} while_each_thread(proc, thread);
+	read_unlock(&tasklist_lock);
+
+	limrec.is_stop = 1;
+	rec_set_timehdr(&limrec, CRBCE_REC_DATA_DELIMITER, 0, 0);
+	rec_send(&limrec);
+
+}
+
+void notify_class_action(struct rbce_class *cls, int action)
+{
+	struct crbce_class_info cinfo;
+	int len;
+
+	rec_set_timehdr(&cinfo, CRBCE_REC_CLASS_INFO, 0, cls->classobj);
+	cinfo.action = action;
+	len = strnlen(cls->obj.name, CRBCE_MAX_CLASS_NAME_LEN - 1);
+	memcpy(&cinfo.name, cls->obj.name, len);
+	cinfo.name[len] = '\0';
+	len++;
+	cinfo.namelen = len;
+
+	len += sizeof(cinfo) - CRBCE_MAX_CLASS_NAME_LEN;
+	rec_send_len(&cinfo, len);
+}
+
+static void send_classlist(void)
+{
+	struct rbce_class *cls;
+
+	read_lock(&rbce_rwlock);
+	list_for_each_entry(cls, &class_list, obj.link)
+		notify_class_action(cls, 1);
+	read_unlock(&rbce_rwlock);
+}
+
+/*
+ *  resend_task_info
+ *
+ *  This function resends all essential task information to the client.
+ */
+static void resend_task_info(void)
+{
+	struct crbce_rec_data_delim limrec;
+	struct crbce_rec_fork rec;
+	struct task_struct *proc, *thread;
+
+	send_classlist();	/* first send available class information */
+
+	limrec.is_stop = 2;
+	rec_set_timehdr(&limrec, CRBCE_REC_DATA_DELIMITER, 0, 0);
+	rec_send(&limrec);
+
+	write_lock(&tasklist_lock);	/* avoid any mods during this phase */
+	do_each_thread(proc, thread)
+		if (ukcc_ok()) {
+			rec.ppid = thread->parent->pid;
+			rec_set_timehdr(&rec, CRBCE_REC_TASKINFO, thread->pid,
+					thread->taskclass);
+			rec_send(&rec);
+		}
+	while_each_thread(proc, thread);
+	write_unlock(&tasklist_lock);
+
+	limrec.is_stop = 3;
+	rec_set_timehdr(&limrec, CRBCE_REC_DATA_DELIMITER, 0, 0);
+	rec_send(&limrec);
+}
+
+extern int task_running_sys(struct task_struct *);
+
+static void add_all_private_data(void)
+{
+	struct task_struct *proc, *thread;
+
+	write_lock(&tasklist_lock);
+	do_each_thread(proc, thread)
+		if (RBCE_DATA(thread) == NULL)
+			RBCE_DATAP(thread) = create_private_data(NULL, 0);
+	while_each_thread(proc, thread);
+	write_unlock(&tasklist_lock);
+}
+
+static void sample_task_data(unsigned long unused)
+{
+	struct task_struct *proc, *thread;
+
+	int run = 0;
+	int wait = 0;
+	read_lock(&tasklist_lock);
+	do_each_thread(proc, thread) {
+		struct rbce_private_data *pdata = RBCE_DATA(thread);
+
+		if (pdata == NULL)
+			/* some weird race condition .. simply ignore */
+			continue;
+		if (thread->state == TASK_RUNNING) {
+			if (task_running_sys(thread)) {
+				atomic_inc((atomic_t *) &
+					   (PSAMPLE(pdata)->cpu_running));
+				run++;
+			} else {
+				atomic_inc((atomic_t *) &
+					   (PSAMPLE(pdata)->cpu_waiting));
+				wait++;
+			}
+		}
+		/* update IO state */
+		if (thread->flags & PF_IOWAIT) {
+			if (thread->flags & PF_MEMIO)
+				atomic_inc((atomic_t *) &
+					   (PSAMPLE(pdata)->memio_delayed));
+			else
+				atomic_inc((atomic_t *) &
+					   (PSAMPLE(pdata)->io_delayed));
+		}
+	}
+	while_each_thread(proc, thread);
+	read_unlock(&tasklist_lock);
+	start_sample_timer();
+}
+
+static void ukcc_cmd_deliver(int rchan_id, char *from, u32 len)
+{
+	struct crbce_command *cmdrec = (struct crbce_command *)from;
+	struct crbce_cmd_done cmdret;
+	int rc = 0;
+
+	cmdrec->len = len;	/*
+				 * add this to reflection so the user doesn't
+				 * accidentally write the wrong length and the
+				 * protocol is getting screwed up
+				 */
+
+	if (cmdrec->type != CRBCE_REC_KERNEL_CMD) {
+		rc = EINVAL;
+		goto out;
+	}
+
+	switch (cmdrec->cmd) {
+	case CRBCE_CMD_SET_TIMER:
+		{
+			struct crbce_cmd_settimer *cptr =
+			    (struct crbce_cmd_settimer *)cmdrec;
+			if (len != sizeof(*cptr)) {
+				rc = EINVAL;
+				break;
+			}
+			stop_sample_timer();
+			timer_interval_length = cptr->interval;
+			if ((timer_interval_length > 0)
+			    && (timer_interval_length < 10))
+				timer_interval_length = 10;
+				/* anything finer can create problems */
+			printk(KERN_INFO "CRBCE set sample collect timer %lu\n",
+			       timer_interval_length);
+			start_sample_timer();
+			break;
+		}
+	case CRBCE_CMD_SEND_DATA:
+		{
+			struct crbce_cmd_send_data *cptr =
+			    (struct crbce_cmd_send_data *)cmdrec;
+			if (len != sizeof(*cptr)) {
+				rc = EINVAL;
+				break;
+			}
+			delta_mode = cptr->delta_mode;
+			send_task_data();
+			break;
+		}
+	case CRBCE_CMD_START:
+		add_all_private_data();
+		chan_state = UKCC_OK;
+		resend_task_info();
+		break;
+
+	case CRBCE_CMD_STOP:
+		chan_state = UKCC_STANDBY;
+		free_all_private_data();
+		break;
+
+	default:
+		rc = EINVAL;
+		break;
+	}
+
+out:
+	cmdret.hdr.type = CRBCE_REC_KERNEL_CMD_DONE;
+	cmdret.hdr.cmd = cmdrec->cmd;
+	cmdret.rc = rc;
+	rec_send(&cmdret);
+}
+
+int init_rbce_ext_pre(void)
+{
+	int rc;
+
+	rc = create_ukcc_channel();
+	return ((rc < 0) ? rc : 0);
+}
+
+int init_rbce_ext_post(void)
+{
+	init_timer(&sample_timer);
+	return 0;
+}
+
+void exit_rbce_ext(void)
+{
+	stop_sample_timer();
+	close_ukcc_channel();
+}
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/crbce_main.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/crbce_main.c	2005-05-05 09:38:07.000000000 -0700
@@ -0,0 +1,2 @@
+/* Easiest way to transmit a symbolic link as a patch */
+#include "rbce_main.c"
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_core.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_core.c	2005-05-05 09:38:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_core.c	2005-05-05 09:38:07.000000000 -0700
@@ -23,6 +23,23 @@
 
 #include "rbce_internal.h"
 
+/* Callback from core when a class is created */
+static void rbce_class_addcb(const char *classname, void *clsobj, int classtype)
+{
+	struct rbce_class *cls;
+
+	write_lock(&rbce_rwlock);
+	cls = find_class_name((char *)classname);
+	if (cls)
+		cls->classobj = clsobj;
+	else
+		cls = create_rbce_class(classname, classtype, clsobj);
+	if (cls)
+		notify_class_action(cls, 1);
+	write_unlock(&rbce_rwlock);
+	return;
+}
+
 /*
  * Callback from core when a class is deleted.
  */
@@ -40,6 +57,7 @@ rbce_class_deletecb(const char *classnam
 			printk(KERN_ERR "rbce: class %s changed identity\n",
 			       classname);
 		}
+		notify_class_action(cls, 0);
 		cls->classobj = NULL;
 		list_for_each_entry(pos, &rules_list[cls->classtype], link) {
 			rule = (struct rbce_rule *)pos;
@@ -53,9 +71,7 @@ rbce_class_deletecb(const char *classnam
 			}
 		}
 		if ((cls = find_class_name(classname)) != NULL) {
-			printk(KERN_ERR
-			       "rbce ERROR: class %s exists in rbce after "
-			       "removal in core\n", classname);
+			put_class(cls); /* probably created through addcb */
 		}
 	}
 	write_unlock(&rbce_rwlock);
@@ -446,6 +462,16 @@ __evaluate_rule(struct task_struct *tsk,
 #define store_pdata(pdata)
 #define unstore_pdata(pdata)
 
+static inline void
+copy_ext_private_data(struct rbce_private_data *src,
+		      struct rbce_private_data *dst)
+{
+	if (src)
+		dst->ext_data = src->ext_data;
+	else
+		memset(&dst->ext_data, 0, sizeof(dst->ext_data));
+}
+
 struct rbce_private_data *create_private_data(struct rbce_private_data
 						     *src, int copy_sample)
 {
@@ -473,6 +499,7 @@ struct rbce_private_data *create_private
 				bitvector_init(pdata->true, vsize);
 			}
 		}
+		copy_ext_private_data(src, pdata);
 		pdata->evaluate = 1;
 		pdata->rules_version = src ? src->rules_version : 0;
 		pdata->app_tag = NULL;
@@ -655,7 +682,7 @@ static struct ckrm_core_class *rbce_clas
 
 /* helper functions that are required in the extended version */
 
-static inline void rbce_tc_manual(struct task_struct *tsk)
+void rbce_tc_manual(struct task_struct *tsk)
 {
 	read_lock(&rbce_rwlock);
 
@@ -716,6 +743,7 @@ static void rbce_tc_exitcb(struct task_s
 {
 	struct rbce_private_data *pdata;
 
+	send_exit_notification(tsk);
 	pdata = RBCE_DATA(tsk);
 	RBCE_DATAP(tsk) = NULL;
 	if (pdata) {
@@ -790,20 +818,29 @@ static void *rbce_tc_classify(enum ckrm_
 	return cls;
 }
 
+#ifndef CRBCE_EXTENSION
 static void rbce_tc_notify(int event, void *core, struct task_struct *tsk)
 {
 	if (event != CKRM_EVENT_MANUAL)
 		return;
 	rbce_tc_manual(tsk);
 }
+#endif
 
 static struct ckrm_eng_callback rbce_taskclass_ecbs = {
 	.c_interest = (unsigned long)(-1),	/* set whole bitmap */
 	.classify = (ce_classify_fct) rbce_tc_classify,
 	.class_delete = rbce_class_deletecb,
+	.class_add = rbce_class_addcb,
+#ifndef CRBCE_EXTENSION
 	.n_interest = (1 << CKRM_EVENT_MANUAL),
 	.notify = (ce_notify_fct) rbce_tc_notify,
 	.always_callback = 0,
+#else
+	.n_interest = (unsigned long)(-1),      /* set whole bitmap */
+	.notify = (ce_notify_fct) rbce_tc_ext_notify,
+	.always_callback = 1,
+#endif
 };
 
 /*============================================================================
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_internal.h	2005-05-05 09:38:07.000000000 -0700
@@ -202,6 +202,7 @@ enum op_token {
  *
  */
 struct rbce_private_data {
+	struct rbce_ext_private_data ext_data;
   	int evaluate;		/* whether to evaluate rules or not ? */
   	int rules_version;	/* rules_version at last evaluation */
 	bitvector_t *eval;
@@ -219,6 +220,7 @@ struct rbce_private_data {
 extern struct rbce_eng_callback rbce_rcfs_ecbs;
 extern int rbce_enabled;
 extern struct list_head rules_list[];
+extern struct list_head class_list;
 extern const int use_persistent_state;
 extern int gl_num_rules;
 extern int gl_bitmap_version;
@@ -251,4 +253,12 @@ extern void free_all_private_data(void);
 extern void unregister_classtype_engines(void);
 extern int register_classtype_engines(void);
 
+extern struct rbce_class *create_rbce_class(const char *, int, void *);
+extern void rbce_tc_manual(struct task_struct *);
+extern void notify_class_action(struct rbce_class *, int);
+extern void send_exit_notification(struct task_struct *);
+extern void rbce_tc_ext_notify(int, void *, struct task_struct *);
+extern int init_rbce_ext_pre(void);
+extern int init_rbce_ext_post(void);
+extern void exit_rbce_ext(void);
 #endif /* _RBCE_INTERNAL_H */
Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:06.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/rbce/rbce_main.c	2005-05-05 09:38:07.000000000 -0700
@@ -56,7 +56,7 @@ int termop_2_vecidx[RBCE_RULE_INVALID] =
 
 extern int errno;
 
-static LIST_HEAD(class_list);
+LIST_HEAD(class_list);
 struct list_head rules_list[CKRM_MAX_CLASSTYPES];
 static int gl_action, gl_num_terms;
 static int gl_released;
@@ -1205,11 +1205,16 @@ int init_rbce(void)
 		INIT_LIST_HEAD(&rules_list[i]);
 	}
 
-	rc = register_classtype_engines();
+	rc = init_rbce_ext_pre();
 	line = __LINE__;
 	if (rc)
 		goto out;
 
+	rc = register_classtype_engines();
+	line = __LINE__;
+	if (rc)
+		goto out_unreg_ckrm;
+
 	/* register any other class type engine here */
   	rc = rcfs_register_engine(&rbce_rcfs_ecbs);
   	line = __LINE__;
@@ -1219,13 +1224,22 @@ int init_rbce(void)
 	if (rcfs_mounted) {
 		rc = rbce_create_config();
 		line = __LINE__;
-		if (!rc)
-			goto out;
+		if (rc)
+			goto out_unreg_rcfs;
 	}
 
+	rc = init_rbce_ext_post();
+	line = __LINE__;
+	if (rc)
+		goto out_unreg_rcfs;
+	return 0;
+
+out_unreg_rcfs:
 	rcfs_unregister_engine(&rbce_rcfs_ecbs);
 out_unreg_classtype:
  	unregister_classtype_engines();
+out_unreg_ckrm:
+	exit_rbce_ext();
 out:
 	printk(KERN_ERR "%s: error installing rc=%d line=%d\n",
 		__FUNCTION__, rc, line);
@@ -1237,6 +1251,8 @@ void exit_rbce(void)
 	int i;
 	printk(KERN_INFO "Removing \'%s\' module\n", modname);
 
+	exit_rbce_ext();
+
 	/* Print warnings if lists are not empty, which is a bug */
 	if (!list_empty(&class_list)) {
 		printk(KERN_WARNING "exit_rbce: Class list is not empty\n");
Index: linux-2.6.12-rc3-ckrm5/include/linux/rbce.h
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/include/linux/rbce.h	2005-05-05 09:38:01.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/include/linux/rbce.h	2005-05-05 09:38:07.000000000 -0700
@@ -22,7 +22,57 @@
 #ifndef _LINUX_RBCE_H
 #define _LINUX_RBCE_H
 
+struct rbce_class;
+#ifndef CRBCE_EXTENSION
+
+/****************************************************************************
+ *
+ *   RBCE STANDALONE VERSION, NO CHOICE FOR DATA COLLECTION
+ *
+ ****************************************************************************/
+
 #define RBCE_MOD_DESCR "Rule Based Classification Engine Module for CKRM"
 #define RBCE_MOD_NAME  "rbce"
 
+struct rbce_ext_private_data {
+	/* no data */
+};
+static inline void send_exit_notification(struct task_struct *tsk)
+{
+}
+static inline void notify_class_action(struct rbce_class *cls, int action)
+{
+}
+/* extension initialization and destruction at module init and exit */
+static inline int init_rbce_ext_pre(void)
+{
+	return 0;
+}
+static inline int init_rbce_ext_post(void)
+{
+	return 0;
+}
+static inline void exit_rbce_ext(void)
+{
+}
+#else /* CRBCE_EXTENSION */
+
+/***************************************************************************
+ *
+ *   RBCE with User Level Notification
+ *
+ ***************************************************************************/
+
+#define RBCE_MOD_DESCR 	"Rule Based Classification Engine Module " \
+			"with Data Sampling/Delivery for CKRM"
+#define RBCE_MOD_NAME 	"crbce"
+
+#include <linux/crbce.h>
+
+struct rbce_ext_private_data {
+	struct task_sample_info sample;
+};
+
+#endif	/* CRBCE_EXTENSION */
+
 #endif	/* _LINUX_RBCE_H */
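
The default UKCC flow-control protocol documented in crbce_ext.c (states
UKCC_OK, UKCC_FULL and UKCC_STANDBY) can be modelled in user space. This is
a minimal sketch, assuming a hypothetical channel with a finite record
capacity in place of the netlink socket. All sketch_* and SK_* names are
illustrative and do not appear in the patch; locking is omitted because the
model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_state { SK_OK, SK_STANDBY, SK_FULL };

struct sketch_chan {
	enum sketch_state state;
	int capacity;		/* hypothetical: records the channel still accepts */
	int full_notices;	/* "channel full" notices that got through */
	int delivered;		/* data records that got through */
};

/* Stand-in for ukcc_send(): > 0 on success, < 0 when there is no room. */
static int sketch_send(struct sketch_chan *c)
{
	if (c->capacity <= 0)
		return -1;
	c->capacity--;
	return 1;
}

/* Mirrors ukcc_full(): try to tell the daemon the channel overflowed.
 * On success we wait in STANDBY; on failure we stay FULL and retry on
 * the next record. */
static void sketch_full(struct sketch_chan *c)
{
	if (c->state == SK_STANDBY)
		return;
	c->state = SK_STANDBY;		/* assume the notice can be sent */
	if (sketch_send(c) < 0)
		c->state = SK_FULL;
	else
		c->full_notices++;
}

/* Mirrors rec_send_len(): returns 1 if the data record was delivered,
 * 0 if it was dropped. */
static int sketch_rec_send(struct sketch_chan *c)
{
	switch (c->state) {
	case SK_OK:
		if (sketch_send(c) > 0) {
			c->delivered++;
			return 1;
		}
		/* fall through */
	case SK_FULL:
		sketch_full(c);
		break;
	default:
		break;		/* STANDBY: drop until the daemon restarts us */
	}
	return 0;
}

/* Mirrors the CRBCE_CMD_START handling: the daemon re-arms the channel. */
static void sketch_restart(struct sketch_chan *c)
{
	c->state = SK_OK;
}
```

In this model a failed write in SK_OK drops the record and attempts a single
overflow notice. If the notice also fails, the channel stays in SK_FULL and
retries the notice on the next record. Once the notice is delivered, records
are dropped silently until sketch_restart() is called, which corresponds to
the user daemon sending CRBCE_CMD_START.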

--



* [patch 20/21] CKRM: Clean up typo in printk message
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (18 preceding siblings ...)
  2005-05-05 18:07 ` [patch 19/21] CKRM: Rule Based Classification Engine, more advanced classification engine gh
@ 2005-05-05 18:07 ` gh
  2005-05-05 18:07 ` [patch 21/21] CKRM: Fix for compiler warnings gh
  20 siblings, 0 replies; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=ckrm-printf-cleanup


Description: Simple typo, but makes code look incomplete.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Gerrit Huizenga <gh@us.ibm.com>

 ckrm.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm.c
===================================================================
--- linux-2.6.12-rc3-ckrm5.orig/kernel/ckrm/ckrm.c	2005-05-05 09:35:14.000000000 -0700
+++ linux-2.6.12-rc3-ckrm5/kernel/ckrm/ckrm.c	2005-05-05 09:43:47.000000000 -0700
@@ -598,7 +598,7 @@ ckrm_register_res_ctlr(struct ckrm_class
 		 */
 		read_lock(&ckrm_class_lock);
 		list_for_each_entry(core, &clstype->classes, clslist) {
-			printk("CKRM .. create res clsobj for resouce <%s>"
+			printk(KERN_NOTICE "CKRM .. create res clsobj for resource <%s> "
 			       "class <%s> par=%p\n", rcbs->res_name,
 			       core->name, core->hnode.parent);
 			ckrm_alloc_res_class(core, core->hnode.parent, resid);

--



* [patch 21/21] CKRM: Fix for compiler warnings
  2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
                   ` (19 preceding siblings ...)
  2005-05-05 18:07 ` [patch 20/21] CKRM: Clean up typo in printk message gh
@ 2005-05-05 18:07 ` gh
  2005-05-08 12:49   ` Domen Puncer
  20 siblings, 1 reply; 23+ messages in thread
From: gh @ 2005-05-05 18:07 UTC (permalink / raw)
  To: linux-kernel, ckrm-tech

--
Content-Disposition: inline; filename=compiler-warning-fix


Signed-off-by: Vivek Kashyap <kashyapv@us.ibm.com>
Signed-off-by: Gerrit Huizenga <gh@us.ibm.com>

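This patch fixes the compiler warnings seen when the event callback
tables are initialized. The callbacks now take a void * argument so that
they match the function pointer type stored in the table, and the table
terminator uses NULL instead of -1 for the callback pointer.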
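
For illustration, a minimal userspace sketch of the pattern the callbacks
are being converted to: every entry in the table shares one void *
signature, each callback recovers its own argument type, and the table
ends with a sentinel entry. All demo_* names are hypothetical stand-ins;
the struct layout is only guessed from the initializers in the hunks
below and is not the real CKRM definition.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the CKRM event types. */
typedef void (*demo_event_fn)(void *arg);

struct demo_event_cb {
	demo_event_fn fn;
	struct demo_event_cb *next;
};

struct demo_event_spec {
	int ev;
	struct demo_event_cb cb;
};

struct demo_task {
	int pid;
	int forks_seen;
};

enum { DEMO_EVENT_FORK, DEMO_EVENT_UID };

static int demo_uid_events;

/* Callback that uses its argument: recover the real type from void *. */
static void demo_cb_fork(void *arg)
{
	struct demo_task *tsk = arg;

	tsk->forks_seen++;
}

/* Callback that ignores its argument but keeps the common signature. */
static void demo_cb_uid(void *arg)
{
	(void)arg;
	demo_uid_events++;
}

/* Every initializer now has the same pointer type, so no warnings. */
static struct demo_event_spec demo_table[] = {
	{ DEMO_EVENT_FORK, { demo_cb_fork, NULL } },
	{ DEMO_EVENT_UID,  { demo_cb_uid,  NULL } },
	{ -1,              { NULL,         NULL } }	/* sentinel */
};

/* Walk the table up to the sentinel; return how many callbacks ran. */
static int demo_dispatch(int ev, void *arg)
{
	struct demo_event_spec *s;
	int calls = 0;

	for (s = demo_table; s->ev != -1; s++) {
		if (s->ev == ev && s->cb.fn) {
			s->cb.fn(arg);
			calls++;
		}
	}
	return calls;
}
```

The trade-off in this style is that the compiler can no longer check the
argument type at the call site; each callback has to know what it is
handed.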

 kernel/ckrm/ckrm_sockc.c |    6 ++++--
 kernel/ckrm/ckrm_tc.c    |   21 +++++++++++++--------
 2 files changed, 17 insertions(+), 10 deletions(-)

Index: linux-2.6.12-rc3-ckrm6/kernel/ckrm/ckrm_tc.c
===================================================================
--- linux-2.6.12-rc3-ckrm6.orig/kernel/ckrm/ckrm_tc.c	2005-05-05 10:50:23.000000000 -0700
+++ linux-2.6.12-rc3-ckrm6/kernel/ckrm/ckrm_tc.c	2005-05-05 10:50:58.000000000 -0700
@@ -253,14 +253,17 @@ do {						\
 	ce_release(&ct_taskclass);              \
 } while (0)
 
-static void cb_taskclass_newtask(struct task_struct *tsk)
+static void cb_taskclass_newtask(void *tsk1)
 {
+	struct task_struct *tsk = (struct task_struct *)tsk1;
+
 	tsk->taskclass = NULL;
 	INIT_LIST_HEAD(&tsk->taskclass_link);
 }
 
-static void cb_taskclass_fork(struct task_struct *tsk)
+static void cb_taskclass_fork(void *tsk1)
 {
+	struct task_struct *tsk = (struct task_struct *)tsk1;
 	struct ckrm_task_class *cls = NULL;
 
 	pr_debug("%p:%d:%s\n", tsk, tsk->pid, tsk->comm);
@@ -281,26 +284,28 @@ static void cb_taskclass_fork(struct tas
 	ce_release(&ct_taskclass);
 }
 
-static void cb_taskclass_exit(struct task_struct *tsk)
+static void cb_taskclass_exit(void *tsk1)
 {
+	struct task_struct *tsk = (struct task_struct *)tsk1;
+
 	CE_CLASSIFY_NORET(&ct_taskclass, CKRM_EVENT_EXIT, tsk);
 	ckrm_set_taskclass(tsk, (void *)-1, NULL, CKRM_EVENT_EXIT);
 }
 
-static void cb_taskclass_exec(const char *filename)
+static void cb_taskclass_exec(void *filename)
 {
 	pr_debug("%p:%d:%s <%s>\n", current, current->pid, current->comm,
-		   filename);
+		   (const char *)filename);
 	CE_CLASSIFY_TASK_PROTECT(CKRM_EVENT_EXEC, current);
 }
 
-static void cb_taskclass_uid(void)
+static void cb_taskclass_uid(void *arg)
 {
 	pr_debug("%p:%d:%s\n", current, current->pid, current->comm);
 	CE_CLASSIFY_TASK_PROTECT(CKRM_EVENT_UID, current);
 }
 
-static void cb_taskclass_gid(void)
+static void cb_taskclass_gid(void *arg)
 {
 	pr_debug("%p:%d:%s\n", current, current->pid, current->comm);
 	CE_CLASSIFY_TASK_PROTECT(CKRM_EVENT_GID, current);
@@ -313,7 +318,7 @@ static struct ckrm_event_spec taskclass_
 	{CKRM_EVENT_EXIT, { cb_taskclass_exit, NULL }},
 	{CKRM_EVENT_UID, { cb_taskclass_uid, NULL }},
 	{CKRM_EVENT_GID, { cb_taskclass_gid, NULL }},
-	{-1, { -1, NULL }}
+	{-1, { NULL, NULL }}
 };
 
 /*
Index: linux-2.6.12-rc3-ckrm6/kernel/ckrm/ckrm_sockc.c
===================================================================
--- linux-2.6.12-rc3-ckrm6.orig/kernel/ckrm/ckrm_sockc.c	2005-05-05 10:50:23.000000000 -0700
+++ linux-2.6.12-rc3-ckrm6/kernel/ckrm/ckrm_sockc.c	2005-05-05 10:50:58.000000000 -0700
@@ -172,8 +172,9 @@ static void ckrm_sock_add_resctrl(struct
  *                   Functions called from classification points          *
  **************************************************************************/
 
-static void cb_sockclass_listen_start(struct sock *sk)
+static void cb_sockclass_listen_start(void *sk1)
 {
+	struct sock *sk = (struct sock *)sk1;
 	struct ckrm_net_struct *ns = NULL;
 	struct ckrm_sock_class *newcls = NULL;
 	struct ckrm_res_ctlr *rcbs;
@@ -243,8 +244,9 @@ static void cb_sockclass_listen_start(st
 	return;
 }
 
-static void cb_sockclass_listen_stop(struct sock *sk)
+static void cb_sockclass_listen_stop(void *sk1)
 {
+	struct sock *sk = (struct sock *)sk1;
 	struct ckrm_net_struct *ns = NULL;
 	struct ckrm_sock_class *newcls = NULL;
 

--



* Re: [patch 21/21] CKRM: Fix for compiler warnings
  2005-05-05 18:07 ` [patch 21/21] CKRM: Fix for compiler warnings gh
@ 2005-05-08 12:49   ` Domen Puncer
  0 siblings, 0 replies; 23+ messages in thread
From: Domen Puncer @ 2005-05-08 12:49 UTC (permalink / raw)
  To: gh; +Cc: linux-kernel, ckrm-tech

On 05/05/05 11:07 -0700, gh@us.ibm.com wrote:
> -static void cb_taskclass_newtask(struct task_struct *tsk)
> +static void cb_taskclass_newtask(void *tsk1)
>  {
> +	struct task_struct *tsk = (struct task_struct *)tsk1;
> +

I see this often in this code, so I'll mention it:
There's no need to cast void pointers to other pointers.
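
A small sketch of the point, using a hypothetical struct demo_task in
place of struct task_struct: in C, a void * converts implicitly to any
other object pointer type, so both functions below behave the same. This
holds for C only; C++ would require the cast.

```c
struct demo_task {
	int pid;
};

/* With the explicit cast, as in the quoted hunk. */
static int demo_pid_cast(void *arg)
{
	struct demo_task *tsk = (struct demo_task *)arg;

	return tsk->pid;
}

/* Without the cast: the conversion from void * is implicit in C. */
static int demo_pid_nocast(void *arg)
{
	struct demo_task *tsk = arg;

	return tsk->pid;
}
```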


	Domen



Thread overview: 23+ messages
2005-05-05 18:07 [patch 00/21] CKRM: Core patch set with Classification Engine, basic controllers gh
2005-05-05 18:07 ` [patch 01/21] CKRM: Core CKRM Event Callbacks gh
2005-05-05 18:07 ` [patch 02/21] CKRM: Processor Delay Accounting gh
2005-05-05 18:07 ` [patch 03/21] CKRM: Core infrastructure gh
2005-05-05 18:07 ` [patch 04/21] CKRM: Resource Control File System (rcfs) gh
2005-05-05 18:07 ` [patch 05/21] CKRM: Classtype definitions for task class gh
2005-05-05 18:07 ` [patch 06/21] CKRM: Classtype definitions for socket class gh
2005-05-05 18:07 ` [patch 07/21] CKRM: Numtasks Controller gh
2005-05-05 18:07 ` [patch 08/21] CKRM: Documentation gh
2005-05-05 18:07 ` [patch 09/21] CKRM: Add missing read_unlock gh
2005-05-05 18:07 ` [patch 10/21] CKRM: Move Callbacks from listenaq to socketclass gh
2005-05-05 18:07 ` [patch 11/21] CKRM: Change ipaddr_port syntax gh
2005-05-05 18:07 ` [patch 12/21] CKRM: Check to see if my guarantee is set to DONTCARE gh
2005-05-05 18:07 ` [patch 13/21] CKRM: Minor cosmetic cleanups in numtasks controller gh
2005-05-05 18:07 ` [patch 14/21] CKRM: undo removal of check in numtasks_put_ref_local gh
2005-05-05 18:07 ` [patch 15/21] CKRM: Rule Based Classification Engine, stub rcfs support gh
2005-05-05 18:07 ` [patch 16/21] CKRM: Rule Based Classification Engine, basic " gh
2005-05-05 18:07 ` [patch 17/21] CKRM: Rule Based Classification Engine, bitvector support for classification info gh
2005-05-05 18:07 ` [patch 18/21] CKRM: Rule Based Classification Engine, full CE gh
2005-05-05 18:07 ` [patch 19/21] CKRM: Rule Based Classification Engine, more advanced classification engine gh
2005-05-05 18:07 ` [patch 20/21] CKRM: Clean up typo in printk message gh
2005-05-05 18:07 ` [patch 21/21] CKRM: Fix for compiler warnings gh
2005-05-08 12:49   ` Domen Puncer
