All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nadia.Derbey@bull.net
To: akpm@osdl.org, randy.dunlap@oracle.com
Cc: linux-kernel@vger.kernel.org, Nadia Derbey <Nadia.Derbey@bull.net>
Subject: [PATCH 1/6] AKT - Tunable structure and registration routines
Date: Tue, 30 Jan 2007 11:11:44 +0100	[thread overview]
Message-ID: <20070130102908.781457000@bull.net> (raw)
In-Reply-To: 20070130101143.296619000@bull.net

[-- Attachment #1: tunables_registration.patch --]
[-- Type: text/plain, Size: 32230 bytes --]

[PATCH 01/06]

Defines the auto_tune structure: this is the structure that contains the
information needed by the adjustment routine for a given tunable.
Also defines the registration routines.

The fork kernel component defines a tunable structure for the threads-max
tunable and registers it.


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 Documentation/00-INDEX      |    2 
 Documentation/auto_tune.txt |  333 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/akt.h         |  166 +++++++++++++++++++++
 include/linux/akt_ops.h     |  109 ++++++++++++++
 init/Kconfig                |    2 
 init/main.c                 |    2 
 kernel/Makefile             |    1 
 kernel/autotune/Kconfig     |   26 +++
 kernel/autotune/Makefile    |    7 
 kernel/autotune/akt.c       |  119 +++++++++++++++
 kernel/fork.c               |   18 ++
 11 files changed, 785 insertions(+)

Index: linux-2.6.20-rc4/Documentation/auto_tune.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/Documentation/auto_tune.txt	2007-01-29 12:54:09.000000000 +0100
@@ -0,0 +1,333 @@
+			Automatic Kernel Tunables
+                        =========================
+
+		   Nadia Derbey (Nadia.Derbey@bull.net)
+
+
+
+This feature aims at making the kernel automatically change the tunables
+values as it sees resources running out.
+
+The AKT framework is made of 2 parts:
+
+1) Kernel part:
+Interfaces are provided to the kernel subsystems, to (un)register the
+tunables that might be automatically tuned in the future.
+
+Registering a tunable consists of the following steps:
+- a structure is declared and filled by the kernel subsystem for the
+registered tunable
+- that tunable structure is registered into sysfs
+
+Registration should be done during the kernel subsystem initialization step.
+
+Unregistering a tunable is the reverse operation. It should not be necessary
+for the kernel subsystems: it is only useful when unloading modules that would
+have registered a tunable during their loading step.
+
+The routines interfaces are the following:
+
+1.1) Declaring a tunable:
+
+A tunable structure should be declared and defined by the kernel subsystems as
+follows:
+
+DEFINE_TUNABLE(structure_name, threshold, min, max,
+		tunable_variable_ptr, checked_variable_ptr,
+		tunable_variable_type);
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+- threshold: percentage to apply to the tunable value to detect if adjustment
+is needed
+
+- min: minimum value the tunable can ever reach (needed when adjusting down
+the tunable)
+
+- max: maximum value the tunable can ever reach (needed when adjusting up the
+tunable)
+
+- tunable_variable_ptr: address of the tunable that will be adjusted if
+needed.
+(ex: in kernel/fork.c it is max_threads's address)
+
+- checked_variable_ptr: address of the variable that is controlled by the
+tunable. This is the calling subsystem's object counter.
+(ex: in kernel/fork.c it is nr_threads's address: nr_threads should
+always remain < max_threads)
+
+- tunable_variable_type: this type is important since it helps choosing the
+appropriate automatic tuning routine.
+Presently, it can be one of int / size_t and this can easily be enhanced.
+
+The automatic tuning routine (i.e. the routine that should be called when
+automatic tuning is activated) is set to the default one:
+default_auto_tuning_<type>().
+<type> is chosen according to the tunable_variable_type parameters.
+All the previously listed parameters are useful to this routine.
+Refer to the description of the automatic adjustment routine to see how
+these parameters are actually used.
+
+Refer to "Updating the auto-tuning function pointer" to know how to set
+this routine to another one.
+
+
+1.2) Updating a tunable's characteristics
+
+1.2.1) Updating min / max values:
+
+Sometimes, when calling DEFINE_TUNABLE(), the min and max values are not
+exactly known, yet. In that case, the following routine should be called
+once these values are known:
+
+set_tunable_min_max(structure_name, new_min, new_max)
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+- new_min: minimum value the tunable can ever reach
+
+- new_max: maximum value the tunable can ever reach
+
+1.2.2) Updating the auto-tuning function pointer:
+
+If the default auto-tuning routine doesn't fit your needs, you can define
+another one and associate it to the tunable using the following routine:
+
+set_autotuning_routine(structure_name, auto_tune)
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+- auto_tune: routine that should be called when automatic tuning is activated.
+If this parameter is not NULL, it should be set to a function pointer defined
+by the kernel subsystem caller. See 1.5) for the routine prototype. See also
+maxfiles_auto_tuning() in fs/file_table.c for an example.
+
+
+1.3) Registering a tunable:
+
+Once declared and its min / max / auto_tuning routine updated, the tunable
+structure should be registered using the following routine:
+
+int register_tunable(struct auto_tune *tunable_addr);
+
+Parameters:
+- tunable_addr: address of the tunable structure previsouly declared.
+
+Return value:
+- 0 : successful
+- < 0 : failure
+
+
+Registering a tunable makes it potentially automatically adjustable:
+the tunable is viewed as a kobject with 3 attributes (i.e. 3 files at sysfs
+level):
+- autotune (rw): enables to (de)activate the auto tuning for that tunable
+- min (rw): enables to play with the min tunable value
+- max (rw): enables to play with the max tunable value
+
+The only way to make a registered tunable automatically adjustable is through
+sysfs (see the sysfs part for more details).
+
+
+
+1.4) Unregistering a tunable:
+
+int unregister_tunable(struct auto_tune *reg_tun_addr);
+
+Parameters:
+- reg_tun_addr: address of the tunable structure to unregister
+
+
+This routine is only useful for modules: when unloading, they should
+unregister any previously registered tunable.
+
+
+
+1.5) Automatic tuning routine:
+
+The 2nd main service provided by the kernel part is a function pointer
+(auto_tune_func): it points to the routine that actually automatically
+adjusts the tunable passed in as a parameter.
+
+This is accomplished by one of the following:
+- if an automatic tuning routine has been provided during the tunable
+declaration, that routine will actually be called.
+- if no automatic tuning routine has been provided, the default one is called.
+NOTE: it can process one of the following types, depending on the type used
+	when declaring the tunable (see DEFINE_TUNABLE above): int, size_t.
+
+
+If the automatic tuning routine is provided by the kernel subsystem caller,
+it should be declared as follows:
+
+int <routine_name>(int cmd, struct auto_tune *params);
+
+Parameters:
+- cmd: tuning direction
+	. AKT_UP: the tunable will be adjusted upwards (i.e. its value is
+		increased if needed)
+	. AKT_DOWN: the tunable is adjusted downwards (i.e. its value is
+		decreased if needed)
+- params: pointer to the previously registered tunable structure
+
+
+Any kernel subsystem that has registered a tunable should call
+auto_tune_func() as follows:
+
++-------------------------+--------------------------------------------+
+| Step                    | Routine to call                            |
++-------------------------+--------------------------------------------+
+| Declaration phase       | DEFINE_TUNABLE(name, values...);           |
++-------------------------+--------------------------------------------+
+| Initialization routine  | set_tunable_min_max(name, min, max);       |
+|                         | set_autotuning_routine(name, routine);     |
+|                         | register_tunable(&name);                   |
+| Note: the 1st 2 calls   |                                            |
+|       are optional      |                                            |
++-------------------------+--------------------------------------------+
+| Alloc                   | activate_auto_tuning(AKT_UP, &name);       |
++-------------------------+--------------------------------------------+
+| Free                    | activate_auto_tuning(AKT_DOWN, &name);     |
++-------------------------+--------------------------------------------+
+| module_exit() routine   | unregister_tunable(&name);                 |
++-------------------------+--------------------------------------------+
+
+activate_auto_tuning is a static inline defined in akt.h, that does the
+following:
+. if <tunable is registered> and <auto tuning is allowed for tunable>
+.   call the routine stored in tunable->auto_tune
+
+
+The effect of the default automatic tuning routine is the following:
+
+           +----------------------------------------------------------------+
+           |                 Tunable automatically adjustable               |
+           +---------------+------------------------------------------------+
+           |      NO       |                      YES                       |
++----------+---------------+------------------------------------------------+
+| AKT_UP   | No effect     | If the tunable value exceeds the specified     |
+|          |               | threshold, that value is increased up to a     |
+|          |               | maximum value.                                 |
+|          |               | The maximum value is specified during the      |
+|          |               | tunable declaration and can be changed at any  |
+|          |               | time through sysfs                             |
++----------+---------------+------------------------------------------------+
+| AKT_DOWN | No effect     | If the tunable value falls under the specified |
+|          |               | threshold, that value is decreased down to a   |
+|          |               | minimum value.                                 |
+|          |               | The minimum value is specified during the      |
+|          |               | tunable declaration and can be changed at any  |
+|          |               | time through sysfs                             |
++----------+---------------+------------------------------------------------+
+
+
+1.6. Default automatic adjustment routine
+
+The last service provided by AKT at the kernel level is the default automatic
+adjustment routine. As seen, above, this routine supports various tunables
+types. It works as follows (only the AKT_UP direction is described here -
+AKT_DOWN does the reverse operation):
+
+The 2nd parameter passed in to this routine is a pointer to a previously
+registered tunable structure. That structure contains the following fields
+(see 1.1 for the detailed description):
+- threshold
+- key
+- min
+- max
+- tunable
+- checked
+
+When this routine is entered, it does the following:
+1. <*checked> is compared to <*tunable> * threshold
+2. if <*checked> is greater, <*tunable> is set to:
+	<*tunable> + (<*tunable> * (100 - threshold) / 100)
+
+
+
+1.6) akt and sysfs:
+
+AKT uses sysfs to enable the tunables management from the user world (mainly
+making them automatic or manual).
+
+akt uses sysfs in the following way:
+- a tunables subsystem (tunables_subsys) is declared and registered during akt
+initialization.
+- registering a tunable is equivalent to registering the corresponding kobject
+within that subsystem.
+- each tunable kobject has 3 associated attributes, all with a RW mode (i.e.
+the show() and store() methods are provided for them):
+	. autotune: enables to (de)activate automatic tuning for the tunable
+	. max: enables to set a new maximum value for the tunable
+	. min: enables to set a new minimum value for the tunable
+
+
+1.7) tunables that are namespace dependent
+
+In this paragraph, the particular case of tunables that are namespace
+dependent is presented.
+
+1.7.1) Declaring a tunable:
+
+The tunable structure for such tunables should be declared in the namespace
+structure that contains the associated tunable (ex: the tunable structure for
+msg_ctlmni should be declared in the ipc_namespace structure).
+
+The tunable structure should be declared as follows:
+
+DECLARE_TUNABLE(structure_name);
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+1.7.2) Initializing the tunable structure
+
+Then the tunable structure should be initialized by calling the following
+routine:
+
+init_tunable_ipcns(namespace_ptr, structure_name, threshold, min, max,
+		tunable_variable_ptr, checked_variable_ptr,
+		tunable_variable_type);
+
+Parameters:
+- namespace_ptr: pointer to the namespace the tunable belongs to.
+
+See DEFINE_TUNABLE for the other parameters.
+
+1.7.3) Registering the tunable structure
+
+register_tunable should be called, giving it the tunable structure address
+that belongs to the init namespace.
+
+This applies to activate_auto_tuning too.
+
+All the routines that show/store attributes or that do the auto tuning are
+namespace dependent.
+
+
+2) User part:
+
+As seen above, the only way to activate automatic tuning is from user side:
+- the directory /sys/tunables is created during the init phase.
+- each time a tunable is registered by a kernel subsystem, a directory is
+created for it under /sys/tunables.
+- This directory contains 1 file for each tunable kobject attribute:
++-----------+---------------+-------------------+--------------------------+
+| attribute | default value | how to set it     | effect                   |
++-----------+---------------+-------------------+--------------------------+
+| autotune  | 0             | echo 1 > autotune | makes the tunable        |
+|           |               |                   | automatic                |
+|           |               | echo 0 > autotune | makes the tunable manual |
++-----------+---------------+-------------------+--------------------------+
+| max       | max value set | echo <M> > max    | sets the tunable max     |
+|           | during tunable|                   | value to <M>             |
+|           | definition    |                   |                          |
++-----------+---------------+-------------------+--------------------------+
+| min       | min value set | echo <m> > min    | sets the tunable min     |
+|           | during tunable|                   | value to <m>             |
+|           | definition    |                   |                          |
++-----------+---------------+-------------------+--------------------------+
+
Index: linux-2.6.20-rc4/Documentation/00-INDEX
===================================================================
--- linux-2.6.20-rc4.orig/Documentation/00-INDEX	2007-01-29 12:39:29.000000000 +0100
+++ linux-2.6.20-rc4/Documentation/00-INDEX	2007-01-29 12:55:49.000000000 +0100
@@ -52,6 +52,8 @@ applying-patches.txt
 	- description of various trees and how to apply their patches.
 arm/
 	- directory with info about Linux on the ARM architecture.
+auto_tune.txt
+	- info on the Automatic Kernel Tunables (AKT) feature.
 basic_profiling.txt
 	- basic instructions for those who wants to profile Linux kernel.
 binfmt_misc.txt
Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 14:59:38.000000000 +0100
@@ -0,0 +1,166 @@
+/*
+ * linux/include/akt.h
+ *
+ * Automatic Kernel Tunables support for Linux.
+ * This file contains structures definitions and prototypes needed for AKT
+ * support.
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef AKT_H
+#define AKT_H
+
+#include <linux/types.h>
+#include <linux/kobject.h>
+
+
+
+/*
+ * First parameter passed to the adjustment routine
+ */
+#define AKT_UP   0   /* adjustment "up" */
+#define AKT_DOWN 1   /* adjustment "down" */
+
+
+struct auto_tune;
+/*
+ * Automatic adjustment routine.
+ * Returns 0, if the tunable value has not been changed, 1 else
+ */
+typedef int (*auto_tune_fn)(int, struct auto_tune *);
+
+
+/*
+ * Structure used to describe the min / max values for a tunable inside the
+ * auto_tune structure.
+ */
+struct tunable_limit {
+	ulong value;
+};
+
+
+
+/*
+ * This is the structure that describes a tunable. One of these structures is
+ * allocated for each registered tunable, and the associated kobject exported
+ * via sysfs.
+ *
+ * The structure lock (tunable_lck) protects
+ * against concurrent accesses to tunable and checked pointers
+ *
+ * A pointer to this structure is passed in to  the automatic adjustment
+ * routine.
+ * automatic adjustment principle is the following:
+ *    AKT_UP:
+ *       1. *checked is compared to *tunable * threshold
+ *       2. if *checked is greater, the tunable is adjusted up
+ *    AKT_DOWN: reverse operation
+ */
+struct auto_tune {
+	spinlock_t tunable_lck; /* serializes access to the stucture fields */
+	auto_tune_fn auto_tune; /* auto tuning routine registered by the */
+				/* calling kernel susbsystem. If NULL, the */
+				/* auto tuning routine that will be called */
+				/* is the default one that processes uints */
+	const char *name;
+	unsigned char flags;	/* Only 2 bits are meaningful: */
+				/* bit 0: 1 if the associated tunable can */
+				/*        be automatically adjusted */
+				/* bits 1: 1 if the tunable has been */
+				/*         registered */
+				/* bits 2-7: unused */
+	char threshold;	/* threshold to enable the adjustment expressed as */
+			/* a %age */
+	struct tunable_limit min;	/* min value the tunable can ever */
+					/* reach */
+	struct tunable_limit max;	/* max value the tunable can ever */
+					/* reach */
+	void *tunable;	/* address of the tunable to adjust */
+	void *checked;	/* address of the variable that is controlled by */
+			/* the tunable. This is the calling subsystem's */
+			/* object counter */
+};
+
+
+/*
+ * Flags for a registered tunable
+ */
+#define TUNABLE_REGISTERED  0x02
+
+
+/*
+ * When calling this routine the tunable lock should be held
+ */
+static inline int is_tunable_registered(struct auto_tune *tunable)
+{
+	return (tunable->flags & TUNABLE_REGISTERED) == TUNABLE_REGISTERED;
+}
+
+
+#ifdef CONFIG_AKT
+
+
+
+#define TUNABLE_INIT(_name, _thresh, _min, _max, _tun, _chk, type)	\
+	{								\
+		.tunable_lck	= SPIN_LOCK_UNLOCKED,			\
+		.auto_tune	= default_auto_tuning_##type,		\
+		.name		= (_name),				\
+		.flags		= 0,					\
+		.threshold	= (_thresh),				\
+		.min	= {						\
+			.value		= (_min),			\
+		},							\
+		.max	= {						\
+			.value		= (_max),			\
+		},							\
+		.tunable	= (_tun),				\
+		.checked	= (_chk),				\
+	}
+
+
+#define DEFINE_TUNABLE(s, thr, min, max, tun, chk, type)		\
+	struct auto_tune s = TUNABLE_INIT(#s, thr, min, max, tun, chk, type)
+
+#define set_tunable_min_max(s, _min, _max)	\
+	do {					\
+		(s).min.value = _min;		\
+		(s).max.value = _max;		\
+	} while (0)
+
+
+extern int register_tunable(struct auto_tune *);
+extern int unregister_tunable(struct auto_tune *);
+
+
+#else	/* CONFIG_AKT */
+
+
+#define DEFINE_TUNABLE(s, thresh, min, max, tun, chk, type)
+#define set_tunable_min_max(s, min, max)         do { } while (0)
+
+#define register_tunable(a)                 0
+#define unregister_tunable(a)               0
+
+#endif	/* CONFIG_AKT */
+
+extern void fork_late_init(void);
+
+#endif /* AKT_H */
Index: linux-2.6.20-rc4/include/linux/akt_ops.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/include/linux/akt_ops.h	2007-01-29 13:34:45.000000000 +0100
@@ -0,0 +1,109 @@
+/*
+ * linux/include/akt_ops.h
+ *
+ * Automatic Kernel Tunables support for Linux.
+ * This file contains the definitions for the type dependent routines
+ * needed for AKT support.
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef AKT_OPS_H
+#define AKT_OPS_H
+
+
+#ifdef CONFIG_AKT
+
+/**
+ * default_auto_tuning - Default automatic tuning routine
+ * @direction:	controls the adjustment direction (up / down)
+ * @p:		registered tunable structure
+ *
+ * This is the routine called to accomplish auto tuning if none has been
+ * specified for a tunable.
+ * It can be called by any kernel subsystem that is allocating or freeing an
+ * object whose maximum value is controlled by a tunable.
+ * ex: max # of semaphore ids is controlled by sc_semmni
+ * ==> this routine might be called by sys_semget() to "adjust up"
+ *     and by semctl_down() to "adjust down"
+ *
+ * Upwards adjustment:
+ *	Adjustment is needed if the checked variable has reached
+ *	(threshold / 100 * tunable)
+ *	In that case, tunable is set to
+ *	(tunable + tunable * (100 - threshold) / 100)
+ *
+ * Downards adjustment:
+ *	Adjustment is needed if the checked variable has fallen under
+ *	(threshold / 100 * tunable previous value)
+ *	In that case tunable is set back to its previous value, i.e. to
+ *	(tunable * 100 / (200 - threshold))
+ *
+ * NOTES:
+ *	1. This routine should be called with the p->tunable_lck lock held
+ *	2. Type independent - can be one of int / size_t
+ *	   This list of types can easily be enhanced as needed.
+ *
+ * Returns:	1 - the tunable has been adjusted
+ *		0 - else
+ */
+#define __default_auto_tuning(direction, p, type)			\
+( {									\
+	int __rc;							\
+	ulong _chk = (ulong) *((type *) p->checked);			\
+	ulong _tun = (ulong) *((type *) p->tunable);			\
+	ulong _thr = p->threshold;					\
+	ulong _min = p->min.value;					\
+	ulong _max = p->max.value;					\
+									\
+	if (direction == AKT_UP) {					\
+		if ((_chk >= (_tun * _thr) / 100) && (_tun < _max)) {	\
+			ulong ___x = (_tun * (200 - _thr)) / 100;	\
+			*((type *) p->tunable) = min((type) _max,	\
+							(type) ___x);	\
+			__rc = 1;					\
+		} else							\
+			__rc = 0;					\
+	} else {							\
+		if ((_chk < (_tun * _thr) / (200 - _thr)) && (_tun>_min)) { \
+			ulong ___x = (_tun * 100) / (200 - _thr);	\
+			*((type *) p->tunable) = max((type) _min,	\
+							(type) ___x);	\
+			__rc = 1;					\
+		} else							\
+			__rc = 0;					\
+	}								\
+	__rc;								\
+} )
+
+static inline int default_auto_tuning_int(int dir, struct auto_tune *p)
+{
+	return __default_auto_tuning(dir, p, int);
+}
+
+static inline int default_auto_tuning_size_t(int dir, struct auto_tune *p)
+{
+	return __default_auto_tuning(dir, p, size_t);
+}
+
+
+
+#endif /* CONFIG_AKT */
+
+#endif /* AKT_OPS_H */
Index: linux-2.6.20-rc4/init/Kconfig
===================================================================
--- linux-2.6.20-rc4.orig/init/Kconfig	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/init/Kconfig	2007-01-29 13:35:53.000000000 +0100
@@ -466,6 +466,8 @@ config VM_EVENT_COUNTERS
 	  on EMBEDDED systems.  /proc/vmstat will only show page counts
 	  if VM event counters are disabled.
 
+source "kernel/autotune/Kconfig"
+
 endmenu		# General setup
 
 config RT_MUTEXES
Index: linux-2.6.20-rc4/init/main.c
===================================================================
--- linux-2.6.20-rc4.orig/init/main.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/init/main.c	2007-01-29 13:36:41.000000000 +0100
@@ -54,6 +54,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/compile.h>
 #include <linux/device.h>
+#include <linux/akt.h>
 
 #include <asm/io.h>
 #include <asm/bugs.h>
@@ -613,6 +614,7 @@ asmlinkage void __init start_kernel(void
 	signals_init();
 	/* rootfs populating might need page-writeback */
 	page_writeback_init();
+	fork_late_init();
 #ifdef CONFIG_PROC_FS
 	proc_root_init();
 #endif
Index: linux-2.6.20-rc4/kernel/Makefile
===================================================================
--- linux-2.6.20-rc4.orig/kernel/Makefile	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/kernel/Makefile	2007-01-29 13:37:39.000000000 +0100
@@ -50,6 +50,7 @@ obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o
 obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o
+obj-$(CONFIG_AKT) += autotune/
 
 ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
Index: linux-2.6.20-rc4/kernel/autotune/Kconfig
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/Kconfig	2007-01-29 13:38:19.000000000 +0100
@@ -0,0 +1,26 @@
+#
+# Automatic Kernel Tunables
+#
+
+config AKT
+	bool "Automatic kernel tunables support (AKT)"
+	depends on PROC_FS && SYSFS
+	help
+	  This is a functionality that enables automatic adjustment of kernel
+	  tunables: when this feature is enabled the kernel can automatically
+	  change the tunables values as it sees resources running out.
+
+	  The list of kernel tunables that can potentially be automatically
+	  adjusted can found under /sys/tunables.
+
+	  In order to make a tunable actually automatic, issue the following
+	  command:
+	  echo 1 > /sys/tunables/<tunable_name>/autotune
+
+	  In order to make it manual, issue the following command:
+	  echo 0 > /sys/tunables/<tunable_name>/autotune
+
+	  See Documentation/auto_tune.txt for more details.
+
+	  If unsure, say N.
+
Index: linux-2.6.20-rc4/kernel/autotune/akt.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/akt.c	2007-01-29 14:03:16.000000000 +0100
@@ -0,0 +1,119 @@
+/*
+ * linux/kernel/autotune/akt.c
+ *
+ * Automatic Kernel Tunables for Linux - Kernel support
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/*
+ *   FUNCTIONS:
+ *              register_tunable           (exported)
+ *              unregister_tunable         (exported)
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/akt.h>
+
+
+/**
+ * register_tunable - Inserts a tunable structure into sysfs
+ * @tun:	tunable structure to be registered
+ *
+ * Checks the tunable structure fields and inserts it into sysfs.
+ * This routine is called by any kernel subsystem that wants to use akt
+ * services (automatic tunables adjustment) in the future
+ *
+ * NOTE: when calling this routine, the tunable structure should have already
+ *       been filled by defining it with DEFINE_TUNABLE()
+ *
+ * Returns:	0 - successful
+ *		<0 - failure
+ */
+int register_tunable(struct auto_tune *tun)
+{
+	if (tun == NULL) {
+		printk(KERN_ERR
+			"AKT: Bad tunable structure pointer (NULL)\n");
+		return -EINVAL;
+	}
+
+	if (tun->threshold <= 0 || tun->threshold >= 100) {
+		printk(KERN_ERR "AKT: Bad threshold (%d) value - should be in"
+			" the [1-99] interval\n", tun->threshold);
+		return -EINVAL;
+	}
+
+	if (tun->tunable == NULL) {
+		printk(KERN_ERR "AKT: Bad tunable pointer (NULL)\n");
+		return -EINVAL;
+	}
+
+	if (tun->checked == NULL) {
+		printk(KERN_ERR "AKT: Bad checked value pointer (NULL)\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Check the min / max value
+	 */
+	if (tun->min.value > tun->max.value) {
+		printk(KERN_ERR "AKT: Bad min / max values\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(register_tunable);
+
+
+/**
+ * unregister_tunable - Removes a tunable structure from sysfs
+ * @reg_tun:	registered tunable structure to be removed
+ *
+ * This routine is called by any kernel subsystem that doesn't need the akt
+ * services anymore
+ *
+ * NOTE: @reg_tun should point to a previously registered tunable
+ *
+ * Returns:	0 - successful
+ *		<0 - failure
+ */
+int unregister_tunable(struct auto_tune *reg_tun)
+{
+	if (reg_tun == NULL) {
+		printk(KERN_ERR "AKT: Bad tunable address (NULL)\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&reg_tun->tunable_lck);
+
+	BUG_ON(!is_tunable_registered(reg_tun));
+
+	reg_tun->flags = 0;
+
+	spin_unlock(&reg_tun->tunable_lck);
+
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(unregister_tunable);
Index: linux-2.6.20-rc4/kernel/autotune/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/Makefile	2007-01-29 13:40:06.000000000 +0100
@@ -0,0 +1,7 @@
+#
+# Makefile for akt
+#
+
+obj-y := akt.o
+
+
Index: linux-2.6.20-rc4/kernel/fork.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/fork.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/kernel/fork.c	2007-01-29 13:41:44.000000000 +0100
@@ -49,6 +49,8 @@
 #include <linux/delayacct.h>
 #include <linux/taskstats_kern.h>
 #include <linux/random.h>
+#include <linux/akt.h>
+#include <linux/akt_ops.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -65,6 +67,13 @@ int nr_threads; 		/* The idle threads do
 
 int max_threads;		/* tunable limit on nr_threads */
 
+#define THREADTHRESH 80
+/*
+ * The actual values for min and max will be known during fork_init
+ */
+DEFINE_TUNABLE(max_threads_akt, THREADTHRESH, 0, 0, &max_threads,
+		&nr_threads, int);
+
 DEFINE_PER_CPU(unsigned long, process_counts) = 0;
 
 __cacheline_aligned DEFINE_RWLOCK(tasklist_lock);  /* outer */
@@ -152,12 +161,21 @@ void __init fork_init(unsigned long memp
 	if(max_threads < 20)
 		max_threads = 20;
 
+	set_tunable_min_max(max_threads_akt, max_threads, mempages / 2);
+
 	init_task.signal->rlim[RLIMIT_NPROC].rlim_cur = max_threads/2;
 	init_task.signal->rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
 	init_task.signal->rlim[RLIMIT_SIGPENDING] =
 		init_task.signal->rlim[RLIMIT_NPROC];
 }
 
+void __init fork_late_init(void)
+{
+	if (register_tunable(&max_threads_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable max_threads\n");
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;

--

  reply	other threads:[~2007-01-30 10:26 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
2007-01-30 10:11 ` Nadia.Derbey [this message]
2007-02-12 15:07   ` [PATCH 1/6] AKT - Tunable structure and registration routines Andi Kleen
2007-02-13 10:18     ` Nadia Derbey
2007-02-13 10:51       ` Andi Kleen
2007-01-30 10:11 ` [PATCH 2/6] AKT - auto_tuning activation Nadia.Derbey
2007-01-30 10:11 ` [PATCH 3/6] AKT - tunables associated kobjects Nadia.Derbey
2007-01-30 10:11 ` [PATCH 4/6] AKT - min and max kobjects Nadia.Derbey
2007-01-30 10:11 ` [PATCH 5/6] AKT - per namespace tunables Nadia.Derbey
2007-01-30 10:11 ` [PATCH 6/6] AKT - automatic tuning applied to some kernel components Nadia.Derbey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070130102908.781457000@bull.net \
    --to=nadia.derbey@bull.net \
    --cc=akpm@osdl.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=randy.dunlap@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.