* [PATCH 0/6] AKT - Automatic Kernel Tunables
@ 2007-01-30 10:11 Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 1/6] AKT - Tunable structure and registration routines Nadia.Derbey
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC
  To: akpm, randy.dunlap; +Cc: linux-kernel

Re-sending the series of patches for the automatic kernel tunables feature:
I have made some fixes following the remarks sent back by Andrew and Randy.

1) All the type-independent macros have been removed, except for the automatic
tuning routine: it manages pointers to the tunable and to the value to be
checked against that tunable, so it should remain type independent IMHO.
I have only kept the auto-tuning routines for types int and size_t, since these
are the types of the tunables the framework is applied to.
It will be easy to add other types as needed in the future.
This makes the code much lighter.

2) CONFIG_AKT has been moved from the FS menu to the "general setup" one.

+ all the other minor changes.


--- Reminder

This series of patches introduces a feature that makes the kernel
automatically adjust tunable values as it sees resources running out.

The AKT framework is made of 2 parts:

1) Kernel part:
Interfaces are provided to the kernel subsystems, to (un)register the
tunables that might be automatically tuned in the future.

Registering a tunable consists of the following steps:
- a structure is declared and filled by the kernel subsystem for the
registered tunable
- that tunable structure is registered into sysfs

Registration should be done during the kernel subsystem initialization step.
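
As an illustration of what registration checks, here is a minimal user-space
sketch in C. It mirrors the sanity checks register_tunable() performs in
kernel/autotune/akt.c in patch 1/6 (threshold within [1-99], non-NULL
pointers, min <= max). The names struct tunable_sketch and
validate_tunable_sketch() are hypothetical and exist only for this example;
the real code operates on struct auto_tune and also logs with printk().

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical user-space stand-in for the kernel's struct auto_tune.
 * Only the fields that registration validates are modelled here.
 */
struct tunable_sketch {
	const char *name;
	int threshold;		/* percentage, must be within [1-99] */
	unsigned long min;	/* lowest value the tunable may reach */
	unsigned long max;	/* highest value the tunable may reach */
	void *tunable;		/* address of the tunable to adjust */
	void *checked;		/* address of the subsystem's object counter */
};

/* Returns 0 if the structure would be accepted, -EINVAL otherwise. */
static int validate_tunable_sketch(const struct tunable_sketch *t)
{
	if (t == NULL)
		return -EINVAL;
	if (t->threshold <= 0 || t->threshold >= 100)
		return -EINVAL;
	if (t->tunable == NULL || t->checked == NULL)
		return -EINVAL;
	if (t->min > t->max)
		return -EINVAL;
	return 0;
}
```

A subsystem init routine would treat a non-zero return value as a failed
registration and leave the tunable manual.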


Another interface is provided to the kernel subsystems, to activate the
automatic tuning for a registered tunable. It can be called during resource
allocation to tune up, and during resource freeing to tune down the registered
tunable. The automatic tuning routine is called only if automatic tuning has
been enabled for the tunable in sysfs.
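
The arithmetic behind tuning up and down can be sketched in user space as
follows. This models the default routine from include/linux/akt_ops.h in
patch 1/6 for an int tunable; the function name auto_tune_sketch() and its
flattened parameter list are assumptions made for the example (the kernel
routine takes a struct auto_tune pointer and expects its lock to be held).

```c
#define AKT_UP   0	/* adjustment "up" */
#define AKT_DOWN 1	/* adjustment "down" */

/*
 * Hypothetical user-space model of the default auto-tuning routine.
 * Returns 1 if *tunable has been adjusted, 0 otherwise.
 */
static int auto_tune_sketch(int direction, int *tunable, const int *checked,
			    unsigned long threshold,
			    unsigned long min, unsigned long max)
{
	unsigned long chk = (unsigned long)*checked;
	unsigned long tun = (unsigned long)*tunable;
	unsigned long next;

	if (direction == AKT_UP) {
		/* Nothing to do below the threshold or once max is reached. */
		if (chk < (tun * threshold) / 100 || tun >= max)
			return 0;
		/* tunable + tunable * (100 - threshold) / 100, capped at max */
		next = (tun * (200 - threshold)) / 100;
		*tunable = (int)(next < max ? next : max);
		return 1;
	}

	/* AKT_DOWN: undo one upward step once usage has dropped far enough. */
	if (chk >= (tun * threshold) / (200 - threshold) || tun <= min)
		return 0;
	next = (tun * 100) / (200 - threshold);
	*tunable = (int)(next > min ? next : min);
	return 1;
}
```

With a threshold of 80, a tunable of 1000 and a checked value of 800, an
AKT_UP call raises the tunable to 1200; a later AKT_DOWN call with the checked
value at 500 brings it back to 1000.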

2) User part:

AKT uses sysfs to let user space manage the tunables (mainly switching them
between automatic and manual).

AKT uses sysfs in the following way:
- a tunables subsystem (tunables_subsys) is declared and registered during akt
initialization.
- registering a tunable is equivalent to registering the corresponding kobject
within that subsystem.
- each tunable kobject has 3 associated attributes, all with a RW mode (i.e.
the show() and store() methods are provided for them):
        . autotune: activates or deactivates automatic tuning for the tunable
        . max: sets a new maximum value for the tunable
        . min: sets a new minimum value for the tunable

The only way to activate automatic tuning is from the user side:
- the directory /sys/tunables is created during the init phase.
- each time a tunable is registered by a kernel subsystem, a directory is
created for it under /sys/tunables.
- This directory contains 1 file for each tunable kobject attribute.
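
For a user-space tool, the layout above boils down to writing a value into
/sys/tunables/<tunable_name>/<attribute>. A minimal C sketch, assuming that
layout; the helper names akt_attr_path() and akt_write_attr() are
hypothetical and are not part of the patch series:

```c
#include <stdio.h>

/*
 * Builds "/sys/tunables/<tunable>/<attr>" into buf.
 * Returns the path length on success, -1 if it does not fit in buf.
 */
static int akt_attr_path(char *buf, size_t len,
			 const char *tunable, const char *attr)
{
	int n = snprintf(buf, len, "/sys/tunables/%s/%s", tunable, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Writes value into the attribute file, as "echo value > file" would.
 * Returns 0 on success, -1 on any error.
 */
static int akt_write_attr(const char *tunable, const char *attr,
			  const char *value)
{
	char path[256];
	FILE *f;
	int rc;

	if (akt_attr_path(path, sizeof(path), tunable, attr) < 0)
		return -1;
	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	rc = (fputs(value, f) == EOF) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

Calling akt_write_attr(name, "autotune", "1") would make the tunable called
name automatic, and writing "0" would make it manual again; the same helper
covers the min and max attributes.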



These patches should be applied to 2.6.20-rc4, in the following order:

[PATCH 1/6]: tunables_registration.patch
[PATCH 2/6]: auto_tuning_activation.patch
[PATCH 3/6]: auto_tuning_kobjects.patch
[PATCH 4/6]: tunable_min_max_kobjects.patch
[PATCH 5/6]: per_namespace_tunables.patch
[PATCH 6/6]: auto_tune_applied.patch


--


* [PATCH 1/6] AKT - Tunable structure and registration routines
  2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
@ 2007-01-30 10:11 ` Nadia.Derbey
  2007-02-12 15:07   ` Andi Kleen
  2007-01-30 10:11 ` [PATCH 2/6] AKT - auto_tuning activation Nadia.Derbey
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC
  To: akpm, randy.dunlap; +Cc: linux-kernel, Nadia Derbey

[-- Attachment #1: tunables_registration.patch --]
[-- Type: text/plain, Size: 32230 bytes --]

[PATCH 01/06]

Defines the auto_tune structure: this is the structure that contains the
information needed by the adjustment routine for a given tunable.
Also defines the registration routines.

The fork kernel component defines a tunable structure for the threads-max
tunable and registers it.


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 Documentation/00-INDEX      |    2 
 Documentation/auto_tune.txt |  333 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/akt.h         |  166 +++++++++++++++++++++
 include/linux/akt_ops.h     |  109 ++++++++++++++
 init/Kconfig                |    2 
 init/main.c                 |    2 
 kernel/Makefile             |    1 
 kernel/autotune/Kconfig     |   26 +++
 kernel/autotune/Makefile    |    7 
 kernel/autotune/akt.c       |  119 +++++++++++++++
 kernel/fork.c               |   18 ++
 11 files changed, 785 insertions(+)

Index: linux-2.6.20-rc4/Documentation/auto_tune.txt
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/Documentation/auto_tune.txt	2007-01-29 12:54:09.000000000 +0100
@@ -0,0 +1,333 @@
+			Automatic Kernel Tunables
+                        =========================
+
+		   Nadia Derbey (Nadia.Derbey@bull.net)
+
+
+
+This feature aims at making the kernel automatically change the tunables
+values as it sees resources running out.
+
+The AKT framework is made of 2 parts:
+
+1) Kernel part:
+Interfaces are provided to the kernel subsystems, to (un)register the
+tunables that might be automatically tuned in the future.
+
+Registering a tunable consists of the following steps:
+- a structure is declared and filled by the kernel subsystem for the
+registered tunable
+- that tunable structure is registered into sysfs
+
+Registration should be done during the kernel subsystem initialization step.
+
+Unregistering a tunable is the reverse operation. It should not be necessary
+for the kernel subsystems: it is only useful when unloading modules that would
+have registered a tunable during their loading step.
+
+The routine interfaces are the following:
+
+1.1) Declaring a tunable:
+
+A tunable structure should be declared and defined by the kernel subsystems as
+follows:
+
+DEFINE_TUNABLE(structure_name, threshold, min, max,
+		tunable_variable_ptr, checked_variable_ptr,
+		tunable_variable_type);
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+- threshold: percentage to apply to the tunable value to detect if adjustment
+is needed
+
+- min: minimum value the tunable can ever reach (needed when adjusting down
+the tunable)
+
+- max: maximum value the tunable can ever reach (needed when adjusting up the
+tunable)
+
+- tunable_variable_ptr: address of the tunable that will be adjusted if
+needed.
+(ex: in kernel/fork.c it is max_threads's address)
+
+- checked_variable_ptr: address of the variable that is controlled by the
+tunable. This is the calling subsystem's object counter.
+(ex: in kernel/fork.c it is nr_threads's address: nr_threads should
+always remain < max_threads)
+
+- tunable_variable_type: this type is important since it helps choose the
+appropriate automatic tuning routine.
+Presently, it can be one of int / size_t, and this can easily be extended.
+
+The automatic tuning routine (i.e. the routine that should be called when
+automatic tuning is activated) is set to the default one:
+default_auto_tuning_<type>().
+<type> is chosen according to the tunable_variable_type parameter.
+All the previously listed parameters are useful to this routine.
+Refer to the description of the automatic adjustment routine to see how
+these parameters are actually used.
+
+Refer to "Updating the auto-tuning function pointer" to know how to set
+this routine to another one.
+
+
+1.2) Updating a tunable's characteristics
+
+1.2.1) Updating min / max values:
+
+Sometimes, when calling DEFINE_TUNABLE(), the min and max values are not
+exactly known yet. In that case, the following routine should be called
+once these values are known:
+
+set_tunable_min_max(structure_name, new_min, new_max)
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+- new_min: minimum value the tunable can ever reach
+
+- new_max: maximum value the tunable can ever reach
+
+1.2.2) Updating the auto-tuning function pointer:
+
+If the default auto-tuning routine doesn't fit your needs, you can define
+another one and associate it to the tunable using the following routine:
+
+set_autotuning_routine(structure_name, auto_tune)
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+- auto_tune: routine that should be called when automatic tuning is activated.
+If this parameter is not NULL, it should be set to a function pointer defined
+by the kernel subsystem caller. See 1.5) for the routine prototype. See also
+maxfiles_auto_tuning() in fs/file_table.c for an example.
+
+
+1.3) Registering a tunable:
+
+Once declared and its min / max / auto_tuning routine updated, the tunable
+structure should be registered using the following routine:
+
+int register_tunable(struct auto_tune *tunable_addr);
+
+Parameters:
+- tunable_addr: address of the tunable structure previously declared.
+
+Return value:
+- 0 : successful
+- < 0 : failure
+
+
+Registering a tunable makes it potentially automatically adjustable:
+the tunable is viewed as a kobject with 3 attributes (i.e. 3 files at sysfs
+level):
+- autotune (rw): activates or deactivates auto tuning for that tunable
+- min (rw): sets the min tunable value
+- max (rw): sets the max tunable value
+
+The only way to make a registered tunable automatically adjustable is through
+sysfs (see the sysfs part for more details).
+
+
+
+1.4) Unregistering a tunable:
+
+int unregister_tunable(struct auto_tune *reg_tun_addr);
+
+Parameters:
+- reg_tun_addr: address of the tunable structure to unregister
+
+
+This routine is only useful for modules: when unloading, they should
+unregister any previously registered tunable.
+
+
+
+1.5) Automatic tuning routine:
+
+The 2nd main service provided by the kernel part is a function pointer
+(auto_tune_func): it points to the routine that actually automatically
+adjusts the tunable passed in as a parameter.
+
+This is accomplished by one of the following:
+- if an automatic tuning routine has been provided during the tunable
+declaration, that routine will actually be called.
+- if no automatic tuning routine has been provided, the default one is called.
+NOTE: it can process one of the following types, depending on the type used
+	when declaring the tunable (see DEFINE_TUNABLE above): int, size_t.
+
+
+If the automatic tuning routine is provided by the kernel subsystem caller,
+it should be declared as follows:
+
+int <routine_name>(int cmd, struct auto_tune *params);
+
+Parameters:
+- cmd: tuning direction
+	. AKT_UP: the tunable will be adjusted upwards (i.e. its value is
+		increased if needed)
+	. AKT_DOWN: the tunable is adjusted downwards (i.e. its value is
+		decreased if needed)
+- params: pointer to the previously registered tunable structure
+
+
+Any kernel subsystem that has registered a tunable should call
+auto_tune_func() as follows:
+
++-------------------------+--------------------------------------------+
+| Step                    | Routine to call                            |
++-------------------------+--------------------------------------------+
+| Declaration phase       | DEFINE_TUNABLE(name, values...);           |
++-------------------------+--------------------------------------------+
+| Initialization routine  | set_tunable_min_max(name, min, max);       |
+|                         | set_autotuning_routine(name, routine);     |
+|                         | register_tunable(&name);                   |
+| Note: the 1st 2 calls   |                                            |
+|       are optional      |                                            |
++-------------------------+--------------------------------------------+
+| Alloc                   | activate_auto_tuning(AKT_UP, &name);       |
++-------------------------+--------------------------------------------+
+| Free                    | activate_auto_tuning(AKT_DOWN, &name);     |
++-------------------------+--------------------------------------------+
+| module_exit() routine   | unregister_tunable(&name);                 |
++-------------------------+--------------------------------------------+
+
+activate_auto_tuning is a static inline defined in akt.h, that does the
+following:
+. if <tunable is registered> and <auto tuning is allowed for tunable>
+.   call the routine stored in tunable->auto_tune
+
+
+The effect of the default automatic tuning routine is the following:
+
+           +----------------------------------------------------------------+
+           |                 Tunable automatically adjustable               |
+           +---------------+------------------------------------------------+
+           |      NO       |                      YES                       |
++----------+---------------+------------------------------------------------+
+| AKT_UP   | No effect     | If the tunable value exceeds the specified     |
+|          |               | threshold, that value is increased up to a     |
+|          |               | maximum value.                                 |
+|          |               | The maximum value is specified during the      |
+|          |               | tunable declaration and can be changed at any  |
+|          |               | time through sysfs                             |
++----------+---------------+------------------------------------------------+
+| AKT_DOWN | No effect     | If the tunable value falls under the specified |
+|          |               | threshold, that value is decreased down to a   |
+|          |               | minimum value.                                 |
+|          |               | The minimum value is specified during the      |
+|          |               | tunable declaration and can be changed at any  |
+|          |               | time through sysfs                             |
++----------+---------------+------------------------------------------------+
+
+
+1.5.1) Default automatic adjustment routine
+
+The last service provided by AKT at the kernel level is the default automatic
+adjustment routine. As seen above, this routine supports several tunable
+types. It works as follows (only the AKT_UP direction is described here;
+AKT_DOWN does the reverse operation):
+
+The 2nd parameter passed in to this routine is a pointer to a previously
+registered tunable structure. That structure contains the following fields
+(see 1.1 for the detailed description):
+- threshold
+- key
+- min
+- max
+- tunable
+- checked
+
+When this routine is entered, it does the following:
+1. <*checked> is compared to <*tunable> * threshold / 100
+2. if <*checked> is greater or equal, <*tunable> is set to:
+	<*tunable> + (<*tunable> * (100 - threshold) / 100), capped at max
+
+
+
+1.6) akt and sysfs:
+
+AKT uses sysfs to let user space manage the tunables (mainly switching them
+between automatic and manual).
+
+AKT uses sysfs in the following way:
+- a tunables subsystem (tunables_subsys) is declared and registered during akt
+initialization.
+- registering a tunable is equivalent to registering the corresponding kobject
+within that subsystem.
+- each tunable kobject has 3 associated attributes, all with a RW mode (i.e.
+the show() and store() methods are provided for them):
+	. autotune: activates or deactivates automatic tuning for the tunable
+	. max: sets a new maximum value for the tunable
+	. min: sets a new minimum value for the tunable
+
+
+1.7) tunables that are namespace dependent
+
+In this paragraph, the particular case of tunables that are namespace
+dependent is presented.
+
+1.7.1) Declaring a tunable:
+
+The tunable structure for such tunables should be declared in the namespace
+structure that contains the associated tunable (ex: the tunable structure for
+msg_ctlmni should be declared in the ipc_namespace structure).
+
+The tunable structure should be declared as follows:
+
+DECLARE_TUNABLE(structure_name);
+
+Parameters:
+- structure_name: this is the name of the tunable structure
+
+1.7.2) Initializing the tunable structure
+
+Then the tunable structure should be initialized by calling the following
+routine:
+
+init_tunable_ipcns(namespace_ptr, structure_name, threshold, min, max,
+		tunable_variable_ptr, checked_variable_ptr,
+		tunable_variable_type);
+
+Parameters:
+- namespace_ptr: pointer to the namespace the tunable belongs to.
+
+See DEFINE_TUNABLE for the other parameters.
+
+1.7.3) Registering the tunable structure
+
+register_tunable should be called, giving it the tunable structure address
+that belongs to the init namespace.
+
+This applies to activate_auto_tuning too.
+
+All the routines that show/store attributes or that do the auto tuning are
+namespace dependent.
+
+
+2) User part:
+
+As seen above, the only way to activate automatic tuning is from the user side:
+- the directory /sys/tunables is created during the init phase.
+- each time a tunable is registered by a kernel subsystem, a directory is
+created for it under /sys/tunables.
+- This directory contains 1 file for each tunable kobject attribute:
++-----------+---------------+-------------------+--------------------------+
+| attribute | default value | how to set it     | effect                   |
++-----------+---------------+-------------------+--------------------------+
+| autotune  | 0             | echo 1 > autotune | makes the tunable        |
+|           |               |                   | automatic                |
+|           |               | echo 0 > autotune | makes the tunable manual |
++-----------+---------------+-------------------+--------------------------+
+| max       | max value set | echo <M> > max    | sets the tunable max     |
+|           | during tunable|                   | value to <M>             |
+|           | definition    |                   |                          |
++-----------+---------------+-------------------+--------------------------+
+| min       | min value set | echo <m> > min    | sets the tunable min     |
+|           | during tunable|                   | value to <m>             |
+|           | definition    |                   |                          |
++-----------+---------------+-------------------+--------------------------+
+
Index: linux-2.6.20-rc4/Documentation/00-INDEX
===================================================================
--- linux-2.6.20-rc4.orig/Documentation/00-INDEX	2007-01-29 12:39:29.000000000 +0100
+++ linux-2.6.20-rc4/Documentation/00-INDEX	2007-01-29 12:55:49.000000000 +0100
@@ -52,6 +52,8 @@ applying-patches.txt
 	- description of various trees and how to apply their patches.
 arm/
 	- directory with info about Linux on the ARM architecture.
+auto_tune.txt
+	- info on the Automatic Kernel Tunables (AKT) feature.
 basic_profiling.txt
 	- basic instructions for those who wants to profile Linux kernel.
 binfmt_misc.txt
Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 14:59:38.000000000 +0100
@@ -0,0 +1,166 @@
+/*
+ * linux/include/linux/akt.h
+ *
+ * Automatic Kernel Tunables support for Linux.
+ * This file contains structures definitions and prototypes needed for AKT
+ * support.
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef AKT_H
+#define AKT_H
+
+#include <linux/types.h>
+#include <linux/kobject.h>
+
+
+
+/*
+ * First parameter passed to the adjustment routine
+ */
+#define AKT_UP   0   /* adjustment "up" */
+#define AKT_DOWN 1   /* adjustment "down" */
+
+
+struct auto_tune;
+/*
+ * Automatic adjustment routine.
+ * Returns 0 if the tunable value has not been changed, 1 otherwise
+ */
+typedef int (*auto_tune_fn)(int, struct auto_tune *);
+
+
+/*
+ * Structure used to describe the min / max values for a tunable inside the
+ * auto_tune structure.
+ */
+struct tunable_limit {
+	ulong value;
+};
+
+
+
+/*
+ * This is the structure that describes a tunable. One of these structures is
+ * allocated for each registered tunable, and the associated kobject exported
+ * via sysfs.
+ *
+ * The structure lock (tunable_lck) protects
+ * against concurrent accesses to tunable and checked pointers
+ *
+ * A pointer to this structure is passed in to the automatic adjustment
+ * routine.
+ * automatic adjustment principle is the following:
+ *    AKT_UP:
+ *       1. *checked is compared to *tunable * threshold
+ *       2. if *checked is greater, the tunable is adjusted up
+ *    AKT_DOWN: reverse operation
+ */
+struct auto_tune {
+	spinlock_t tunable_lck; /* serializes access to the structure fields */
+	auto_tune_fn auto_tune; /* auto tuning routine registered by the */
+				/* calling kernel subsystem. If NULL, the */
+				/* auto tuning routine that will be called */
+				/* is the default one that processes uints */
+	const char *name;
+	unsigned char flags;	/* Only 2 bits are meaningful: */
+				/* bit 0: 1 if the associated tunable can */
+				/*        be automatically adjusted */
+				/* bit 1: 1 if the tunable has been */
+				/*         registered */
+				/* bits 2-7: unused */
+	char threshold;	/* threshold to enable the adjustment expressed as */
+			/* a %age */
+	struct tunable_limit min;	/* min value the tunable can ever */
+					/* reach */
+	struct tunable_limit max;	/* max value the tunable can ever */
+					/* reach */
+	void *tunable;	/* address of the tunable to adjust */
+	void *checked;	/* address of the variable that is controlled by */
+			/* the tunable. This is the calling subsystem's */
+			/* object counter */
+};
+
+
+/*
+ * Flags for a registered tunable
+ */
+#define TUNABLE_REGISTERED  0x02
+
+
+/*
+ * When calling this routine the tunable lock should be held
+ */
+static inline int is_tunable_registered(struct auto_tune *tunable)
+{
+	return (tunable->flags & TUNABLE_REGISTERED) == TUNABLE_REGISTERED;
+}
+
+
+#ifdef CONFIG_AKT
+
+
+
+#define TUNABLE_INIT(_name, _thresh, _min, _max, _tun, _chk, type)	\
+	{								\
+		.tunable_lck	= SPIN_LOCK_UNLOCKED,			\
+		.auto_tune	= default_auto_tuning_##type,		\
+		.name		= (_name),				\
+		.flags		= 0,					\
+		.threshold	= (_thresh),				\
+		.min	= {						\
+			.value		= (_min),			\
+		},							\
+		.max	= {						\
+			.value		= (_max),			\
+		},							\
+		.tunable	= (_tun),				\
+		.checked	= (_chk),				\
+	}
+
+
+#define DEFINE_TUNABLE(s, thr, min, max, tun, chk, type)		\
+	struct auto_tune s = TUNABLE_INIT(#s, thr, min, max, tun, chk, type)
+
+#define set_tunable_min_max(s, _min, _max)	\
+	do {					\
+		(s).min.value = _min;		\
+		(s).max.value = _max;		\
+	} while (0)
+
+
+extern int register_tunable(struct auto_tune *);
+extern int unregister_tunable(struct auto_tune *);
+
+
+#else	/* CONFIG_AKT */
+
+
+#define DEFINE_TUNABLE(s, thresh, min, max, tun, chk, type)
+#define set_tunable_min_max(s, min, max)         do { } while (0)
+
+#define register_tunable(a)                 0
+#define unregister_tunable(a)               0
+
+#endif	/* CONFIG_AKT */
+
+extern void fork_late_init(void);
+
+#endif /* AKT_H */
Index: linux-2.6.20-rc4/include/linux/akt_ops.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/include/linux/akt_ops.h	2007-01-29 13:34:45.000000000 +0100
@@ -0,0 +1,109 @@
+/*
+ * linux/include/linux/akt_ops.h
+ *
+ * Automatic Kernel Tunables support for Linux.
+ * This file contains the definitions for the type dependent routines
+ * needed for AKT support.
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#ifndef AKT_OPS_H
+#define AKT_OPS_H
+
+
+#ifdef CONFIG_AKT
+
+/**
+ * default_auto_tuning - Default automatic tuning routine
+ * @direction:	controls the adjustment direction (up / down)
+ * @p:		registered tunable structure
+ *
+ * This is the routine called to accomplish auto tuning if none has been
+ * specified for a tunable.
+ * It can be called by any kernel subsystem that is allocating or freeing an
+ * object whose maximum value is controlled by a tunable.
+ * ex: max # of semaphore ids is controlled by sc_semmni
+ * ==> this routine might be called by sys_semget() to "adjust up"
+ *     and by semctl_down() to "adjust down"
+ *
+ * Upwards adjustment:
+ *	Adjustment is needed if the checked variable has reached
+ *	(threshold / 100 * tunable)
+ *	In that case, tunable is set to
+ *	(tunable + tunable * (100 - threshold) / 100)
+ *
+ * Downwards adjustment:
+ *	Adjustment is needed if the checked variable has fallen under
+ *	(threshold / 100 * tunable previous value)
+ *	In that case tunable is set back to its previous value, i.e. to
+ *	(tunable * 100 / (200 - threshold))
+ *
+ * NOTES:
+ *	1. This routine should be called with the p->tunable_lck lock held
+ *	2. Type independent - can be one of int / size_t
+ *	   This list of types can easily be enhanced as needed.
+ *
+ * Returns:	1 - the tunable has been adjusted
+ *		0 - else
+ */
+#define __default_auto_tuning(direction, p, type)			\
+( {									\
+	int __rc;							\
+	ulong _chk = (ulong) *((type *) p->checked);			\
+	ulong _tun = (ulong) *((type *) p->tunable);			\
+	ulong _thr = p->threshold;					\
+	ulong _min = p->min.value;					\
+	ulong _max = p->max.value;					\
+									\
+	if (direction == AKT_UP) {					\
+		if ((_chk >= (_tun * _thr) / 100) && (_tun < _max)) {	\
+			ulong ___x = (_tun * (200 - _thr)) / 100;	\
+			*((type *) p->tunable) = min((type) _max,	\
+							(type) ___x);	\
+			__rc = 1;					\
+		} else							\
+			__rc = 0;					\
+	} else {							\
+		if ((_chk < (_tun * _thr) / (200 - _thr)) && (_tun>_min)) { \
+			ulong ___x = (_tun * 100) / (200 - _thr);	\
+			*((type *) p->tunable) = max((type) _min,	\
+							(type) ___x);	\
+			__rc = 1;					\
+		} else							\
+			__rc = 0;					\
+	}								\
+	__rc;								\
+} )
+
+static inline int default_auto_tuning_int(int dir, struct auto_tune *p)
+{
+	return __default_auto_tuning(dir, p, int);
+}
+
+static inline int default_auto_tuning_size_t(int dir, struct auto_tune *p)
+{
+	return __default_auto_tuning(dir, p, size_t);
+}
+
+
+
+#endif /* CONFIG_AKT */
+
+#endif /* AKT_OPS_H */
Index: linux-2.6.20-rc4/init/Kconfig
===================================================================
--- linux-2.6.20-rc4.orig/init/Kconfig	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/init/Kconfig	2007-01-29 13:35:53.000000000 +0100
@@ -466,6 +466,8 @@ config VM_EVENT_COUNTERS
 	  on EMBEDDED systems.  /proc/vmstat will only show page counts
 	  if VM event counters are disabled.
 
+source "kernel/autotune/Kconfig"
+
 endmenu		# General setup
 
 config RT_MUTEXES
Index: linux-2.6.20-rc4/init/main.c
===================================================================
--- linux-2.6.20-rc4.orig/init/main.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/init/main.c	2007-01-29 13:36:41.000000000 +0100
@@ -54,6 +54,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/compile.h>
 #include <linux/device.h>
+#include <linux/akt.h>
 
 #include <asm/io.h>
 #include <asm/bugs.h>
@@ -613,6 +614,7 @@ asmlinkage void __init start_kernel(void
 	signals_init();
 	/* rootfs populating might need page-writeback */
 	page_writeback_init();
+	fork_late_init();
 #ifdef CONFIG_PROC_FS
 	proc_root_init();
 #endif
Index: linux-2.6.20-rc4/kernel/Makefile
===================================================================
--- linux-2.6.20-rc4.orig/kernel/Makefile	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/kernel/Makefile	2007-01-29 13:37:39.000000000 +0100
@@ -50,6 +50,7 @@ obj-$(CONFIG_RELAY) += relay.o
 obj-$(CONFIG_UTS_NS) += utsname.o
 obj-$(CONFIG_TASK_DELAY_ACCT) += delayacct.o
 obj-$(CONFIG_TASKSTATS) += taskstats.o tsacct.o
+obj-$(CONFIG_AKT) += autotune/
 
 ifneq ($(CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER),y)
 # According to Alan Modra <alan@linuxcare.com.au>, the -fno-omit-frame-pointer is
Index: linux-2.6.20-rc4/kernel/autotune/Kconfig
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/Kconfig	2007-01-29 13:38:19.000000000 +0100
@@ -0,0 +1,26 @@
+#
+# Automatic Kernel Tunables
+#
+
+config AKT
+	bool "Automatic kernel tunables support (AKT)"
+	depends on PROC_FS && SYSFS
+	help
+	  This is a functionality that enables automatic adjustment of kernel
+	  tunables: when this feature is enabled the kernel can automatically
+	  change the tunables values as it sees resources running out.
+
+	  The list of kernel tunables that can potentially be automatically
+	  adjusted can be found under /sys/tunables.
+
+	  In order to make a tunable actually automatic, issue the following
+	  command:
+	  echo 1 > /sys/tunables/<tunable_name>/autotune
+
+	  In order to make it manual, issue the following command:
+	  echo 0 > /sys/tunables/<tunable_name>/autotune
+
+	  See Documentation/auto_tune.txt for more details.
+
+	  If unsure, say N.
+
Index: linux-2.6.20-rc4/kernel/autotune/akt.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/akt.c	2007-01-29 14:03:16.000000000 +0100
@@ -0,0 +1,119 @@
+/*
+ * linux/kernel/autotune/akt.c
+ *
+ * Automatic Kernel Tunables for Linux - Kernel support
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/*
+ *   FUNCTIONS:
+ *              register_tunable           (exported)
+ *              unregister_tunable         (exported)
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/akt.h>
+
+
+/**
+ * register_tunable - Inserts a tunable structure into sysfs
+ * @tun:	tunable structure to be registered
+ *
+ * Checks the tunable structure fields and inserts it into sysfs.
+ * This routine is called by any kernel subsystem that wants to use akt
+ * services (automatic tunables adjustment) in the future
+ *
+ * NOTE: when calling this routine, the tunable structure should have already
+ *       been filled by defining it with DEFINE_TUNABLE()
+ *
+ * Returns:	0 - successful
+ *		<0 - failure
+ */
+int register_tunable(struct auto_tune *tun)
+{
+	if (tun == NULL) {
+		printk(KERN_ERR
+			"AKT: Bad tunable structure pointer (NULL)\n");
+		return -EINVAL;
+	}
+
+	if (tun->threshold <= 0 || tun->threshold >= 100) {
+		printk(KERN_ERR "AKT: Bad threshold (%d) value - should be in"
+			" the [1-99] interval\n", tun->threshold);
+		return -EINVAL;
+	}
+
+	if (tun->tunable == NULL) {
+		printk(KERN_ERR "AKT: Bad tunable pointer (NULL)\n");
+		return -EINVAL;
+	}
+
+	if (tun->checked == NULL) {
+		printk(KERN_ERR "AKT: Bad checked value pointer (NULL)\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Check the min / max value
+	 */
+	if (tun->min.value > tun->max.value) {
+		printk(KERN_ERR "AKT: Bad min / max values\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(register_tunable);
+
+
+/**
+ * unregister_tunable - Removes a tunable structure from sysfs
+ * @reg_tun:	registered tunable structure to be removed
+ *
+ * This routine is called by any kernel subsystem that doesn't need the akt
+ * services anymore
+ *
+ * NOTE: @reg_tun should point to a previously registered tunable
+ *
+ * Returns:	0 - successful
+ *		<0 - failure
+ */
+int unregister_tunable(struct auto_tune *reg_tun)
+{
+	if (reg_tun == NULL) {
+		printk(KERN_ERR "AKT: Bad tunable address (NULL)\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&reg_tun->tunable_lck);
+
+	BUG_ON(!is_tunable_registered(reg_tun));
+
+	reg_tun->flags = 0;
+
+	spin_unlock(&reg_tun->tunable_lck);
+
+	return 0;
+}
+
+EXPORT_SYMBOL_GPL(unregister_tunable);
Index: linux-2.6.20-rc4/kernel/autotune/Makefile
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/Makefile	2007-01-29 13:40:06.000000000 +0100
@@ -0,0 +1,7 @@
+#
+# Makefile for akt
+#
+
+obj-y := akt.o
+
+
Index: linux-2.6.20-rc4/kernel/fork.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/fork.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/kernel/fork.c	2007-01-29 13:41:44.000000000 +0100
@@ -49,6 +49,8 @@
 #include <linux/delayacct.h>
 #include <linux/taskstats_kern.h>
 #include <linux/random.h>
+#include <linux/akt.h>
+#include <linux/akt_ops.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -65,6 +67,13 @@ int nr_threads; 		/* The idle threads do
 
 int max_threads;		/* tunable limit on nr_threads */
 
+#define THREADTHRESH 80
+/*
+ * The actual values for min and max will be known during fork_init
+ */
+DEFINE_TUNABLE(max_threads_akt, THREADTHRESH, 0, 0, &max_threads,
+		&nr_threads, int);
+
 DEFINE_PER_CPU(unsigned long, process_counts) = 0;
 
 __cacheline_aligned DEFINE_RWLOCK(tasklist_lock);  /* outer */
@@ -152,12 +161,21 @@ void __init fork_init(unsigned long memp
 	if(max_threads < 20)
 		max_threads = 20;
 
+	set_tunable_min_max(max_threads_akt, max_threads, mempages / 2);
+
 	init_task.signal->rlim[RLIMIT_NPROC].rlim_cur = max_threads/2;
 	init_task.signal->rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
 	init_task.signal->rlim[RLIMIT_SIGPENDING] =
 		init_task.signal->rlim[RLIMIT_NPROC];
 }
 
+void __init fork_late_init(void)
+{
+	if (register_tunable(&max_threads_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable max_threads\n");
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;

--


* [PATCH 2/6] AKT - auto_tuning activation
  2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 1/6] AKT - Tunable structure and registration routines Nadia.Derbey
@ 2007-01-30 10:11 ` Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 3/6] AKT - tunables associated kobjects Nadia.Derbey
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC (permalink / raw)
  To: akpm, randy.dunlap; +Cc: linux-kernel, Nadia Derbey

[-- Attachment #1: auto_tuning_activation.patch --]
[-- Type: text/plain, Size: 4226 bytes --]

[PATCH 02/06]

Introduces the auto-tuning activation routine

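The auto-tuning routine is called by the fork kernel component: from
copy_process() in kernel/fork.c to tune max_threads up, and from
kernel/exit.c to tune it down.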
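
The gating done by activate_auto_tuning() can be modelled in userspace as
shown below. This is only a sketch under stated assumptions: locking is left
out, the AKT_UP / AKT_DOWN values are placeholders (they are defined outside
this patch), and sketch_tune_int() is a hypothetical doubling / halving
policy, not the routine shipped in the series.

```c
#include <stddef.h>

/* Flag values as defined in include/linux/akt.h by this patch. */
#define AUTO_TUNE_ENABLE   0x01
#define TUNABLE_REGISTERED 0x02

/* Assumption: placeholder values, the real ones are not part of this patch. */
#define AKT_UP   1
#define AKT_DOWN 0

struct auto_tune;
typedef int (*auto_tune_fn)(int, struct auto_tune *);

/* Reduced model of struct auto_tune: no lock, no kobject, int limits. */
struct auto_tune {
	int flags;
	int threshold;		/* percentage, expected in [1-99] */
	int min;
	int max;
	int *tunable;		/* the limit being adjusted */
	int *checked;		/* the counter controlled by the limit */
	auto_tune_fn auto_tune;
};

/*
 * Hypothetical policy: double the tunable once usage reaches the threshold,
 * halve it once usage would stay under the threshold of the halved value.
 * Returns 1 if the tunable was changed, 0 otherwise.
 */
static int sketch_tune_int(int direction, struct auto_tune *t)
{
	int limit = *t->tunable;
	int used = *t->checked;
	int next = limit;

	if (direction == AKT_UP &&
	    (long)used * 100 >= (long)limit * t->threshold) {
		next = limit * 2;
		if (next > t->max)
			next = t->max;
	} else if (direction == AKT_DOWN &&
		   (long)used * 100 < (long)(limit / 2) * t->threshold) {
		next = limit / 2;
		if (next < t->min)
			next = t->min;
	}

	if (next == limit)
		return 0;

	*t->tunable = next;
	return 1;
}

/* Same checks as activate_auto_tuning(), minus the spinlock. */
static int sketch_activate(int direction, struct auto_tune *t)
{
	if (t == NULL)
		return 0;

	if (!(t->flags & AUTO_TUNE_ENABLE) || !(t->flags & TUNABLE_REGISTERED))
		return 0;

	return t->auto_tune(direction, t);
}
```

With a tunable that is registered but not enabled, sketch_activate() leaves
the limit alone; once AUTO_TUNE_ENABLE is set, the same call adjusts it.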


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 include/linux/akt.h |   51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 kernel/exit.c       |   11 +++++++++++
 kernel/fork.c       |    2 ++
 3 files changed, 64 insertions(+)

Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- linux-2.6.20-rc4.orig/include/linux/akt.h	2007-01-29 14:59:38.000000000 +0100
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 15:07:54.000000000 +0100
@@ -102,12 +102,22 @@ struct auto_tune {
 /*
  * Flags for a registered tunable
  */
+#define AUTO_TUNE_ENABLE  0x01
 #define TUNABLE_REGISTERED  0x02
 
 
 /*
  * When calling this routine the tunable lock should be held
  */
+static inline int is_auto_tune_enabled(struct auto_tune *tunable)
+{
+	return (tunable->flags & AUTO_TUNE_ENABLE) == AUTO_TUNE_ENABLE;
+}
+
+
+/*
+ * When calling this routine the tunable lock should be held
+ */
 static inline int is_tunable_registered(struct auto_tune *tunable)
 {
 	return (tunable->flags & TUNABLE_REGISTERED) == TUNABLE_REGISTERED;
@@ -146,6 +156,44 @@ static inline int is_tunable_registered(
 	} while (0)
 
 
+static inline void set_autotuning_routine(struct auto_tune *tunable,
+					auto_tune_fn fn)
+{
+	if (fn != NULL)
+		tunable->auto_tune = fn;
+}
+
+
+/*
+ * direction may be one of:
+ *    AKT_UP: adjust up (i.e. increase tunable value when needed)
+ *    AKT_DOWN: adjust down (i.e. decrease tunable value when needed)
+ */
+static inline int activate_auto_tuning(int direction,
+					struct auto_tune *tunable)
+{
+	int ret = 0;
+
+	BUG_ON(direction != AKT_UP && direction != AKT_DOWN);
+
+	if (tunable == NULL)
+		return 0;
+
+	spin_lock(&tunable->tunable_lck);
+
+	if (!is_auto_tune_enabled(tunable) ||
+					!is_tunable_registered(tunable)) {
+		spin_unlock(&tunable->tunable_lck);
+		return 0;
+	}
+
+	ret = tunable->auto_tune(direction, tunable);
+
+	spin_unlock(&tunable->tunable_lck);
+	return ret;
+}
+
+
 extern int register_tunable(struct auto_tune *);
 extern int unregister_tunable(struct auto_tune *);
 
@@ -155,6 +203,9 @@ extern int unregister_tunable(struct aut
 
 #define DEFINE_TUNABLE(s, thresh, min, max, tun, chk, type)
 #define set_tunable_min_max(s, min, max)         do { } while (0)
+#define set_autotuning_routine(s, fn)            do { } while (0)
+
+#define activate_auto_tuning(direction, tunable) ( { 0; } )
 
 #define register_tunable(a)                 0
 #define unregister_tunable(a)               0
Index: linux-2.6.20-rc4/kernel/exit.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/exit.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/kernel/exit.c	2007-01-29 15:10:22.000000000 +0100
@@ -42,12 +42,15 @@
 #include <linux/audit.h> /* for audit_free() */
 #include <linux/resource.h>
 #include <linux/blkdev.h>
+#include <linux/akt.h>
 
 #include <asm/uaccess.h>
 #include <asm/unistd.h>
 #include <asm/pgtable.h>
 #include <asm/mmu_context.h>
 
+extern struct auto_tune max_threads_akt;
+
 extern void sem_exit (void);
 
 static void exit_mm(struct task_struct * tsk);
@@ -172,6 +175,14 @@ repeat:
 
 	sched_exit(p);
 	write_unlock_irq(&tasklist_lock);
+
+	/*
+	 * nr_threads has been decremented in __unhash_process: adjust
+	 * max_threads down if needed
+	 * We do it here to avoid calling activate_auto_tuning under lock
+	 */
+	activate_auto_tuning(AKT_DOWN, &max_threads_akt);
+
 	proc_flush_task(p);
 	release_thread(p);
 	call_rcu(&p->rcu, delayed_put_task_struct);
Index: linux-2.6.20-rc4/kernel/fork.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/fork.c	2007-01-29 13:41:44.000000000 +0100
+++ linux-2.6.20-rc4/kernel/fork.c	2007-01-29 15:11:07.000000000 +0100
@@ -995,6 +995,8 @@ static struct task_struct *copy_process(
 	if ((clone_flags & CLONE_SIGHAND) && !(clone_flags & CLONE_VM))
 		return ERR_PTR(-EINVAL);
 
+	activate_auto_tuning(AKT_UP, &max_threads_akt);
+
 	retval = security_task_create(clone_flags);
 	if (retval)
 		goto fork_out;

--


* [PATCH 3/6] AKT - tunables associated kobjects
  2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 1/6] AKT - Tunable structure and registration routines Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 2/6] AKT - auto_tuning activation Nadia.Derbey
@ 2007-01-30 10:11 ` Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 4/6] AKT - min and max kobjects Nadia.Derbey
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC (permalink / raw)
  To: akpm, randy.dunlap; +Cc: linux-kernel, Nadia Derbey

[-- Attachment #1: auto_tuning_kobjects.patch --]
[-- Type: text/plain, Size: 13109 bytes --]

[PATCH 03/06]


Introduces the kobjects associated with each tunable and the sysfs registration
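
The show / store methods get back from the kobject to the owning tunable
through two container_of() steps. A minimal userspace sketch of that lookup
follows; the *_stub types and sketch_show_mode() are hypothetical stand-ins
for the kernel structures, and container_of() is redefined locally.

```c
#include <stddef.h>
#include <stdio.h>

/* Local copy of the usual container_of() idiom. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define AUTO_TUNE_ENABLE 0x01

/* Hypothetical stand-ins for struct kobject and the AKT structures. */
struct kobject_stub {
	const char *name;
};

struct auto_tune_stub;

struct tunable_kobject_stub {
	struct kobject_stub kobj;
	struct auto_tune_stub *tun;
};

struct auto_tune_stub {
	int flags;
	struct tunable_kobject_stub tun_kobj;
};

/*
 * Mirrors tunable_attr_show() followed by show_tuning_mode(): walk from the
 * embedded kobject up to the tunable, then print "1" or "0".
 * snprintf() returns the string length without the trailing '\0'.
 */
static int sketch_show_mode(struct kobject_stub *kobj, char *buf, size_t len)
{
	struct tunable_kobject_stub *tkobj =
		container_of(kobj, struct tunable_kobject_stub, kobj);
	struct auto_tune_stub *tunable =
		container_of(tkobj, struct auto_tune_stub, tun_kobj);

	return snprintf(buf, len, "%d\n",
			(tunable->flags & AUTO_TUNE_ENABLE) == AUTO_TUNE_ENABLE);
}
```

Since tun_kobj is embedded in the tunable structure, the lookup does not
need the tun back pointer at all; the sketch only keeps that field to match
the layout introduced by the patch.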


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 include/linux/akt.h         |   25 ++++-
 init/main.c                 |    1 
 kernel/autotune/Makefile    |    2 
 kernel/autotune/akt.c       |   91 ++++++++++++++++++
 kernel/autotune/akt_sysfs.c |  214 ++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 330 insertions(+), 3 deletions(-)

Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- linux-2.6.20-rc4.orig/include/linux/akt.h	2007-01-29 15:07:54.000000000 +0100
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 15:32:48.000000000 +0100
@@ -48,6 +48,16 @@ typedef int (*auto_tune_fn)(int, struct 
 
 
 /*
+ * for sysfs support
+ */
+struct tunable_kobject {
+	struct kobject kobj;
+	struct auto_tune *tun;
+};
+
+
+
+/*
  * Structure used to describe the min / max values for a tunable inside the
  * auto_tune structure.
  */
@@ -62,7 +72,12 @@ struct tunable_limit {
  * allocated for each registered tunable, and the associated kobject exported
  * via sysfs.
  *
- * The structure lock (tunable_lck) protects
+ * This structure may be accessed in 2 ways:
+ *   . directly from inside the kernel subsystem that uses it (during tunable
+ *     automatic adjustment)
+ *   . from sysfs, while updating the kobject attributes
+ *
+ * In both cases, the structure lock (tunable_lck) is taken: it protects
  * against concurrent accesses to tunable and checked pointers
  *
  * A pointer to this structure is passed in to  the automatic adjustment
@@ -92,6 +107,7 @@ struct auto_tune {
 					/* reach */
 	struct tunable_limit max;	/* max value the tunable can ever */
 					/* reach */
+	struct tunable_kobject    tun_kobj;	/* used for sysfs support */
 	void *tunable;	/* address of the tunable to adjust */
 	void *checked;	/* address of the variable that is controlled by */
 			/* the tunable. This is the calling subsystem's */
@@ -141,6 +157,7 @@ static inline int is_tunable_registered(
 		.max	= {						\
 			.value		= (_max),			\
 		},							\
+		.tun_kobj	= { .tun = NULL, },			\
 		.tunable	= (_tun),				\
 		.checked	= (_chk),				\
 	}
@@ -194,8 +211,12 @@ static inline int activate_auto_tuning(i
 }
 
 
+extern void init_auto_tuning(void);
 extern int register_tunable(struct auto_tune *);
 extern int unregister_tunable(struct auto_tune *);
+extern int tunable_sysfs_setup(struct auto_tune *);
+extern ssize_t show_tuning_mode(struct auto_tune *, char *);
+extern ssize_t store_tuning_mode(struct auto_tune *, const char *, size_t);
 
 
 #else	/* CONFIG_AKT */
@@ -210,6 +231,8 @@ extern int unregister_tunable(struct aut
 #define register_tunable(a)                 0
 #define unregister_tunable(a)               0
 
+static inline void init_auto_tuning(void)   { }
+
 #endif	/* CONFIG_AKT */
 
 extern void fork_late_init(void);
Index: linux-2.6.20-rc4/init/main.c
===================================================================
--- linux-2.6.20-rc4.orig/init/main.c	2007-01-29 13:36:41.000000000 +0100
+++ linux-2.6.20-rc4/init/main.c	2007-01-29 15:33:43.000000000 +0100
@@ -614,6 +614,7 @@ asmlinkage void __init start_kernel(void
 	signals_init();
 	/* rootfs populating might need page-writeback */
 	page_writeback_init();
+	init_auto_tuning();
 	fork_late_init();
 #ifdef CONFIG_PROC_FS
 	proc_root_init();
Index: linux-2.6.20-rc4/kernel/autotune/Makefile
===================================================================
--- linux-2.6.20-rc4.orig/kernel/autotune/Makefile	2007-01-29 13:40:06.000000000 +0100
+++ linux-2.6.20-rc4/kernel/autotune/Makefile	2007-01-29 15:34:30.000000000 +0100
@@ -2,6 +2,6 @@
 # Makefile for akt
 #
 
-obj-y := akt.o
+obj-y := akt.o akt_sysfs.o
 
 
Index: linux-2.6.20-rc4/kernel/autotune/akt.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/autotune/akt.c	2007-01-29 14:03:16.000000000 +0100
+++ linux-2.6.20-rc4/kernel/autotune/akt.c	2007-01-29 15:42:55.000000000 +0100
@@ -26,6 +26,8 @@
  *   FUNCTIONS:
  *              register_tunable           (exported)
  *              unregister_tunable         (exported)
+ *              show_tuning_mode           (exported)
+ *              store_tuning_mode          (exported)
  */
 
 #include <linux/init.h>
@@ -34,6 +36,9 @@
 #include <linux/akt.h>
 
 
+#define AKT_AUTO   1
+#define AKT_MANUAL 0
+
 /**
  * register_tunable - Inserts a tunable structure into sysfs
  * @tun:	tunable structure to be registered
@@ -50,6 +55,8 @@
  */
 int register_tunable(struct auto_tune *tun)
 {
+	int rc = 0;
+
 	if (tun == NULL) {
 		printk(KERN_ERR
 			"AKT: Bad tunable structure pointer (NULL)\n");
@@ -80,7 +87,10 @@ int register_tunable(struct auto_tune *t
 		return -EINVAL;
 	}
 
-	return 0;
+	if (!(rc = tunable_sysfs_setup(tun)))
+		tun->flags |= TUNABLE_REGISTERED;
+
+	return rc;
 }
 
 EXPORT_SYMBOL_GPL(register_tunable);
@@ -117,3 +127,82 @@ int unregister_tunable(struct auto_tune 
 }
 
 EXPORT_SYMBOL_GPL(unregister_tunable);
+
+
+/**
+ * show_tuning_mode - Outputs the tuning mode of a given tunable
+ * @tun_addr:	registered tunable structure to check
+ * @buf:	output buffer
+ *
+ * This is the get operation called by tunable_attr_show (i.e. when the file
+ * /sys/tunables/<tunable>/autotune is displayed).
+ * Outputs "1" if the corresponding tunable is automatically adjustable,
+ * "0" else
+ *
+ * Returns:	>0 - output string length (not including the '\0')
+ *		<0 - failure
+ */
+ssize_t show_tuning_mode(struct auto_tune *tun_addr, char *buf)
+{
+	int valid;
+
+	if (tun_addr == NULL) {
+		printk(KERN_ERR "AKT: tunable address is invalid\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&tun_addr->tunable_lck);
+
+	valid = is_auto_tune_enabled(tun_addr);
+
+	spin_unlock(&tun_addr->tunable_lck);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", valid);
+}
+
+
+/**
+ * store_tuning_mode - Sets the tuning mode of a given tunable
+ * @tun_addr:	registered tunable structure to set
+ * @buf:	input buffer
+ * @count:	input buffer length (including the '\0')
+ *
+ * This is the set operation called by tunable_attr_store (i.e. when a string
+ * is stored into /sys/tunables/<tunable>/autotune).
+ * "1" makes the corresponding tunable automatically adjustable
+ * "0" makes the corresponding tunable manually adjustable
+ *
+ * Returns:	>0 - number of characters used from the input buffer
+ *		<0 - failure
+ */
+ssize_t store_tuning_mode(struct auto_tune *tun_addr, const char *buffer,
+			size_t count)
+{
+	int new_value;
+
+	if (sscanf(buffer, "%d", &new_value) != 1)
+		return -EINVAL;
+
+	if (new_value != AKT_AUTO && new_value != AKT_MANUAL)
+		return -EINVAL;
+
+	if (tun_addr == NULL) {
+		printk(KERN_ERR "AKT: NULL pointer  passed in\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&tun_addr->tunable_lck);
+
+	switch (new_value) {
+	case AKT_AUTO:
+		tun_addr->flags |= AUTO_TUNE_ENABLE;
+		break;
+	case AKT_MANUAL:
+		tun_addr->flags &= ~AUTO_TUNE_ENABLE;
+		break;
+	}
+
+	spin_unlock(&tun_addr->tunable_lck);
+
+	return strnlen(buffer, PAGE_SIZE);
+}
Index: linux-2.6.20-rc4/kernel/autotune/akt_sysfs.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.20-rc4/kernel/autotune/akt_sysfs.c	2007-01-29 15:39:05.000000000 +0100
@@ -0,0 +1,214 @@
+/*
+ * linux/kernel/autotune/akt_sysfs.c
+ *
+ * Automatic Kernel Tunables for Linux
+ * sysfs bindings for AKT
+ *
+ * Copyright (C) 2006 Bull S.A.S
+ *
+ * Author: Nadia Derbey <Nadia.Derbey@bull.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+/*
+ * FUNCTIONS:
+ *            tunable_attr_show      (static)
+ *            tunable_attr_store     (static)
+ *            tunable_sysfs_setup
+ *            add_tunable_attrs      (static)
+ *            init_auto_tuning
+ */
+
+
+#include <linux/init.h>
+#include <linux/stat.h>
+#include <linux/module.h>
+#include <linux/akt.h>
+
+
+
+
+struct tunable_attribute {
+	struct attribute attr;
+	ssize_t (*show)(struct auto_tune *, char *);
+	ssize_t (*store)(struct auto_tune *, const char *, size_t);
+};
+
+#define TUNABLE_ATTR(_name, _mode, _show, _store)	\
+struct tunable_attribute tun_attr_##_name = __ATTR(_name, _mode, _show, _store)
+
+
+static TUNABLE_ATTR(autotune, S_IWUSR | S_IRUGO, show_tuning_mode,
+		store_tuning_mode);
+
+static struct tunable_attribute *tunable_sysfs_attrs[] = {
+	&tun_attr_autotune,	/* to (de)activate auto tuning */
+	NULL,
+};
+
+
+
+#define to_tunable_kobj(obj)  container_of(obj, struct tunable_kobject, kobj)
+#define to_tunable(obj)       container_of(obj, struct auto_tune, tun_kobj)
+#define to_tunable_attr(_attr)	\
+	container_of(_attr, struct tunable_attribute, attr)
+
+
+static int add_tunable_attrs(struct auto_tune *);
+
+
+/**
+ * tunable_attr_show - Show method for the tunables subsystem
+ * @kobj:	tunable associated kobject
+ * @attr:	tunable attribute to read. Can be one of:
+ *			tun_attr_autotune
+ * @buf:	output buffer
+ *
+ * Forwards any read call to the show method of the owning attribute
+ *
+ * Returns:	>0 - output string length (not including the '\0')
+ *		<0 - failure
+ */
+static ssize_t tunable_attr_show(struct kobject *kobj,
+				struct attribute *attr,
+				char *buf)
+{
+	struct tunable_attribute *tun_attr = to_tunable_attr(attr);
+	struct tunable_kobject *tkobj = to_tunable_kobj(kobj);
+	struct auto_tune *tunable = to_tunable(tkobj);
+	ssize_t count = -EIO;
+
+	if (tun_attr->show)
+		count = tun_attr->show(tunable, buf);
+	return count;
+}
+
+
+/**
+ * tunable_attr_store - Store method for the tunables subsystem
+ * @kobj:	tunable associated kobject
+ * @attr:	tunable attribute to update. Can be one of:
+ *			tun_attr_autotune
+ * @buf:	input buffer
+ * @count:	input buffer length (including the '\0')
+ *
+ * Forwards any write call to the store method of the owning attribute
+ *
+ * Returns:	>0 - number of characters used from the input buffer
+ *		<0 - failure
+ */
+static ssize_t tunable_attr_store(struct kobject *kobj,
+				struct attribute *attr,
+				const char *buf,
+				size_t count)
+{
+	struct tunable_attribute *tun_attr = to_tunable_attr(attr);
+	struct tunable_kobject *tkobj = to_tunable_kobj(kobj);
+	struct auto_tune *tunable = to_tunable(tkobj);
+	ssize_t ret = -EIO;
+
+	if (tun_attr->store)
+		ret = tun_attr->store(tunable, buf, count);
+	return ret;
+}
+
+
+static struct sysfs_ops tunables_sysfs_ops = {
+	.show	= tunable_attr_show,
+	.store	= tunable_attr_store,
+};
+
+
+static struct kobj_type tunables_ktype = {
+	.sysfs_ops	= &tunables_sysfs_ops,
+};
+
+
+decl_subsys(tunables, &tunables_ktype, NULL);
+
+
+/**
+ * tunable_sysfs_setup - Registers one tunable into sysfs
+ * @tunable:	tunable structure to be registered
+ *
+ * Called by register_tunable()
+ * The tunable is a kobject with 1 attributes:
+ *	autotune (rw): enables to (de)activate the auto tuning for the tunable
+ *
+ * Returns:	0 - successful
+ *		<0 - failure
+ */
+
+#define tunable_kobj(t) t->tun_kobj.kobj
+
+int tunable_sysfs_setup(struct auto_tune *tunable)
+{
+	int err = 0;
+
+	memset(&(tunable_kobj(tunable)), 0, sizeof(tunable_kobj(tunable)));
+	if ((err = kobject_set_name(&(tunable_kobj(tunable)), "%s",
+							tunable->name)))
+		return err;
+
+	kobj_set_kset_s(&(tunable->tun_kobj), tunables_subsys);
+	tunable->tun_kobj.tun = tunable;
+
+	if ((err = kobject_register(&(tunable_kobj(tunable)))))
+		return err;
+
+	if ((err = add_tunable_attrs(tunable)))
+		kobject_unregister(&(tunable_kobj(tunable)));
+
+	return err;
+}
+
+
+/**
+ * add_tunable_attrs - Creates the attributes for a tunable
+ * @tunable:	tunable structure being registered
+ *
+ * Called by tunable_sysfs_setup()
+ * Adds the set of predefined attributes for a tunable being registered
+ *
+ * Returns:	0 - successful
+ *		<0 - failure
+ */
+static int add_tunable_attrs(struct auto_tune *tunable)
+{
+	struct tunable_attribute *attr;
+	int error = 0;
+	int i;
+
+	for (i = 0; (attr = tunable_sysfs_attrs[i]) && !error; i++) {
+		error = sysfs_create_file(&(tunable_kobj(tunable)),
+			&(attr->attr));
+	}
+
+	return error;
+}
+
+
+/**
+ * init_auto_tuning - registers the tunables subsystem in sysfs
+ */
+void __init init_auto_tuning(void)
+{
+	int error = subsystem_register(&tunables_subsys);
+
+	if (error)
+		printk(KERN_ERR
+			"AKT: Failed registering tunables subsystem\n");
+}

--


* [PATCH 4/6] AKT - min and max kobjects
  2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
                   ` (2 preceding siblings ...)
  2007-01-30 10:11 ` [PATCH 3/6] AKT - tunables associated kobjects Nadia.Derbey
@ 2007-01-30 10:11 ` Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 5/6] AKT - per namespace tunables Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 6/6] AKT - automatic tuning applied to some kernel components Nadia.Derbey
  5 siblings, 0 replies; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC (permalink / raw)
  To: akpm, randy.dunlap; +Cc: linux-kernel, Nadia Derbey

[-- Attachment #1: tunable_min_max_kobjects.patch --]
[-- Type: text/plain, Size: 8582 bytes --]

[PATCH 04/06]


Introduces the min and max sysfs attributes attached to each tunable's kobject
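
The bounds checks applied when min or max is written through sysfs can be
sketched in userspace as below. struct limit_stub and the sketch_store_*()
helpers are hypothetical names; only the comparisons are taken from
store_tunable_min() and store_tunable_max(), and locking and string parsing
are left out.

```c
#include <errno.h>

/* Hypothetical stand-in for struct tunable_limit. */
struct limit_stub {
	unsigned long value;		/* current limit, writable via sysfs */
	unsigned long abs_value;	/* very first limit, never changes */
};

/* New min must not fall under the first min and must stay below max. */
static int sketch_store_min(struct limit_stub *min,
			    const struct limit_stub *max,
			    unsigned long new_value)
{
	if (new_value >= min->abs_value && new_value < max->value) {
		min->value = new_value;
		return 0;
	}
	return -EINVAL;
}

/* New max must not exceed the first max and must stay above min. */
static int sketch_store_max(const struct limit_stub *min,
			    struct limit_stub *max,
			    unsigned long new_value)
{
	if (new_value <= max->abs_value && new_value > min->value) {
		max->value = new_value;
		return 0;
	}
	return -EINVAL;
}
```

The effect is that userspace can only narrow the [min, max] window set at
registration time, or widen it back up to the original bounds.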


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 include/linux/akt.h         |   14 ++++
 kernel/autotune/akt.c       |  148 ++++++++++++++++++++++++++++++++++++++++++++
 kernel/autotune/akt_sysfs.c |   16 ++++
 3 files changed, 177 insertions(+), 1 deletion(-)

Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- linux-2.6.20-rc4.orig/include/linux/akt.h	2007-01-29 15:32:48.000000000 +0100
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 15:47:40.000000000 +0100
@@ -60,9 +60,15 @@ struct tunable_kobject {
 /*
  * Structure used to describe the min / max values for a tunable inside the
  * auto_tune structure.
+ * The abs_value field is used to check that we are not:
+ *   . falling under the very 1st min value when updating the min value
+ *     through sysfs
+ *   . going over the very 1st max value when updating the max value
+ *     through sysfs
  */
 struct tunable_limit {
 	ulong value;
+	ulong abs_value;
 };
 
 
@@ -153,9 +159,11 @@ static inline int is_tunable_registered(
 		.threshold	= (_thresh),				\
 		.min	= {						\
 			.value		= (_min),			\
+			.abs_value	= (_min),			\
 		},							\
 		.max	= {						\
 			.value		= (_max),			\
+			.abs_value	= (_max),			\
 		},							\
 		.tun_kobj	= { .tun = NULL, },			\
 		.tunable	= (_tun),				\
@@ -169,7 +177,9 @@ static inline int is_tunable_registered(
 #define set_tunable_min_max(s, _min, _max)	\
 	do {					\
 		(s).min.value = _min;		\
+		(s).min.abs_value = _min;	\
 		(s).max.value = _max;		\
+		(s).max.abs_value = _max;	\
 	} while (0)
 
 
@@ -217,6 +227,10 @@ extern int unregister_tunable(struct aut
 extern int tunable_sysfs_setup(struct auto_tune *);
 extern ssize_t show_tuning_mode(struct auto_tune *, char *);
 extern ssize_t store_tuning_mode(struct auto_tune *, const char *, size_t);
+extern ssize_t show_tunable_min(struct auto_tune *, char *);
+extern ssize_t store_tunable_min(struct auto_tune *, const char *, size_t);
+extern ssize_t show_tunable_max(struct auto_tune *, char *);
+extern ssize_t store_tunable_max(struct auto_tune *, const char *, size_t);
 
 
 #else	/* CONFIG_AKT */
Index: linux-2.6.20-rc4/kernel/autotune/akt.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/autotune/akt.c	2007-01-29 15:42:55.000000000 +0100
+++ linux-2.6.20-rc4/kernel/autotune/akt.c	2007-01-29 15:50:31.000000000 +0100
@@ -28,6 +28,10 @@
  *              unregister_tunable         (exported)
  *              show_tuning_mode           (exported)
  *              store_tuning_mode          (exported)
+ *              show_tunable_min           (exported)
+ *              store_tunable_min          (exported)
+ *              show_tunable_max           (exported)
+ *              store_tunable_max          (exported)
  */
 
 #include <linux/init.h>
@@ -206,3 +210,147 @@ ssize_t store_tuning_mode(struct auto_tu
 
 	return strnlen(buffer, PAGE_SIZE);
 }
+
+
+/**
+ * show_tunable_min - Outputs the minimum value of a given tunable
+ * @tun_addr:	registered tunable structure to check
+ * @buf:	output buffer
+ *
+ * This is the get operation called by tunable_attr_show (i.e. when the file
+ * /sys/tunables/<tunable>/min is displayed).
+ * Outputs the current tunable minimum value
+ *
+ * Returns:	>0 - output string length (not including the '\0')
+ *		<0 - failure
+ */
+ssize_t show_tunable_min(struct auto_tune *tun_addr, char *buf)
+{
+	ssize_t rc;
+
+	if (tun_addr == NULL) {
+		printk(KERN_ERR "AKT: tunable address is invalid\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&tun_addr->tunable_lck);
+
+	rc = snprintf(buf, PAGE_SIZE, "%lu\n", tun_addr->min.value);
+
+	spin_unlock(&tun_addr->tunable_lck);
+
+	return rc;
+}
+
+
+/**
+ * store_tunable_min - Sets the minimum value of a given tunable
+ * @tun_addr:	registered tunable structure to set
+ * @buf:	input buffer
+ * @count:	input buffer length (including the '\0')
+ *
+ * This is the set operation called by tunable_attr_store (i.e. when a string
+ * is stored into /sys/tunables/<tunable>/min).
+ *
+ * Returns:	>0 - number of characters used from the input buffer
+ *		<0 - failure
+ */
+ssize_t store_tunable_min(struct auto_tune *tun_addr, const char *buf,
+			size_t count)
+{
+	ssize_t rc;
+	ulong new_value;
+
+	if (sscanf(buf, "%lu", &new_value) != 1)
+		return -EINVAL;
+
+	if (tun_addr == NULL) {
+		printk(KERN_ERR "AKT: tunable address is invalid\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&tun_addr->tunable_lck);
+
+	if (new_value >= tun_addr->min.abs_value &&
+					new_value < tun_addr->max.value) {
+		tun_addr->min.value = new_value;
+		rc = strnlen(buf, PAGE_SIZE);
+	} else
+		rc = -EINVAL;
+
+	spin_unlock(&tun_addr->tunable_lck);
+
+	return rc;
+}
+
+
+/**
+ * show_tunable_max - Outputs the maximum value of a given tunable
+ * @tun_addr:	registered tunable structure to check
+ * @buf:	output buffer
+ *
+ * This is the get operation called by tunable_attr_show (i.e. when the file
+ * /sys/tunables/<tunable>/max is displayed).
+ * Outputs the current tunable maximum value
+ *
+ * Returns:	>0 - output string length (not including the '\0')
+ *		<0 - failure
+ */
+ssize_t show_tunable_max(struct auto_tune *tun_addr, char *buf)
+{
+	ssize_t rc;
+
+	if (tun_addr == NULL) {
+		printk(KERN_ERR "AKT: tunable address is invalid\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&tun_addr->tunable_lck);
+
+	rc = snprintf(buf, PAGE_SIZE, "%lu\n", tun_addr->max.value);
+
+	spin_unlock(&tun_addr->tunable_lck);
+
+	return rc;
+}
+
+
+/**
+ * store_tunable_max - Sets the maximum value of a given tunable
+ * @tun_addr:	registered tunable structure to set
+ * @buf:	input buffer
+ * @count:	input buffer length (including the '\0')
+ *
+ * This is the set operation called by tunable_attr_store (i.e. when a string
+ * is stored into /sys/tunables/<tunable>/max).
+ *
+ * Returns:	>0 - number of characters used from the input buffer
+ *		<0 - failure
+ */
+ssize_t store_tunable_max(struct auto_tune *tun_addr, const char *buf,
+			size_t count)
+{
+	ssize_t rc;
+	ulong new_value;
+
+	if (sscanf(buf, "%lu", &new_value) != 1)
+		return -EINVAL;
+
+	if (tun_addr == NULL) {
+		printk(KERN_ERR "AKT: tunable address is invalid\n");
+		return -EINVAL;
+	}
+
+	spin_lock(&tun_addr->tunable_lck);
+
+	if (new_value <= tun_addr->max.abs_value &&
+					new_value > tun_addr->min.value) {
+		tun_addr->max.value = new_value;
+		rc = strnlen(buf, PAGE_SIZE);
+	} else
+		rc = -EINVAL;
+
+	spin_unlock(&tun_addr->tunable_lck);
+
+	return rc;
+}
Index: linux-2.6.20-rc4/kernel/autotune/akt_sysfs.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/autotune/akt_sysfs.c	2007-01-29 15:39:05.000000000 +0100
+++ linux-2.6.20-rc4/kernel/autotune/akt_sysfs.c	2007-01-29 15:52:09.000000000 +0100
@@ -54,8 +54,16 @@ struct tunable_attribute tun_attr_##_nam
 static TUNABLE_ATTR(autotune, S_IWUSR | S_IRUGO, show_tuning_mode,
 		store_tuning_mode);
 
+static TUNABLE_ATTR(min, S_IWUSR | S_IRUGO, show_tunable_min,
+		store_tunable_min);
+
+static TUNABLE_ATTR(max, S_IWUSR | S_IRUGO, show_tunable_max,
+		store_tunable_max);
+
 static struct tunable_attribute *tunable_sysfs_attrs[] = {
 	&tun_attr_autotune,	/* to (de)activate auto tuning */
+	&tun_attr_min,		/* to play with the tunable min value */
+	&tun_attr_max,		/* to play with the tunable max value */
 	NULL,
 };
 
@@ -75,6 +83,8 @@ static int add_tunable_attrs(struct auto
  * @kobj:	tunable associated kobject
  * @attr:	tunable attribute to read. Can be one of:
  *			tun_attr_autotune
+ *			tun_attr_min
+ *			tun_attr_max
  * @buf:	output buffer
  *
  * Forwards any read call to the show method of the owning attribute
@@ -102,6 +112,8 @@ static ssize_t tunable_attr_show(struct 
  * @kobj:	tunable associated kobject
  * @attr:	tunable attribute to update. Can be one of:
  *			tun_attr_autotune
+ *			tun_attr_min
+ *			tun_attr_max
  * @buf:	input buffer
  * @count:	input buffer length (including the '\0')
  *
@@ -145,8 +157,10 @@ decl_subsys(tunables, &tunables_ktype, N
  * @tunable:	tunable structure to be registered
  *
  * Called by register_tunable()
- * The tunable is a kobject with 1 attributes:
+ * The tunable is a kobject with 3 attributes:
  *	autotune (rw): enables to (de)activate the auto tuning for the tunable
+ *	min (rw): enables to play with the min tunable value
+ *	max (rw): enables to play with the max tunable value
  *
  * Returns:	0 - successful
  *		<0 - failure

--


* [PATCH 5/6] AKT - per namespace tunables
  2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
                   ` (3 preceding siblings ...)
  2007-01-30 10:11 ` [PATCH 4/6] AKT - min and max kobjects Nadia.Derbey
@ 2007-01-30 10:11 ` Nadia.Derbey
  2007-01-30 10:11 ` [PATCH 6/6] AKT - automatic tuning applied to some kernel components Nadia.Derbey
  5 siblings, 0 replies; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC (permalink / raw)
  To: akpm, randy.dunlap; +Cc: linux-kernel, Nadia Derbey

[-- Attachment #1: per_namespace_tunables.patch --]
[-- Type: text/plain, Size: 7381 bytes --]

[PATCH 05/06]


This patch introduces all that is needed to process per namespace tunables.
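
For illustration only: the lookup added in kernel/autotune/akt.c
(get_ns_tunable()) takes the byte offset of a tunable inside init_ipc_ns and
applies that offset to the current namespace. Below is a minimal userspace
sketch of that idea. The types struct tunable and struct ns, the variable
init_ns and the helper rebase_tunable() are hypothetical stand-ins for
struct auto_tune, struct ipc_namespace, init_ipc_ns and get_ns_tunable();
they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define TUNABLE_IPC_NS 0x04	/* same flag value as in the patch */

/* Hypothetical stand-ins for struct auto_tune and struct ipc_namespace. */
struct tunable {
	int flags;
	unsigned long max;
};

struct ns {
	int msg_ctlmni;
	struct tunable msgmni_akt;
};

static struct ns init_ns;	/* plays the role of init_ipc_ns */

/*
 * Rebase a tunable that lives inside init_ns onto another namespace:
 * same byte offset, different base address.
 */
static struct tunable *rebase_tunable(struct tunable *p, struct ns *cur)
{
	ptrdiff_t off;

	if (!(p->flags & TUNABLE_IPC_NS))
		return p;	/* global tunable: nothing to translate */

	off = (char *)p - (char *)&init_ns;
	assert(off >= 0 && (size_t)off < sizeof(init_ns));

	return (struct tunable *)((char *)cur + off);
}
```

The sketch makes two properties of this approach visible: the flag is read
from the init-namespace copy, and the translation is only meaningful when the
pointer passed in really points inside the init namespace structure.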


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 include/linux/akt.h   |   12 ++++++
 kernel/autotune/akt.c |   94 +++++++++++++++++++++++++++++++++++++-------------
 2 files changed, 83 insertions(+), 23 deletions(-)

Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- linux-2.6.20-rc4.orig/include/linux/akt.h	2007-01-29 15:47:40.000000000 +0100
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 15:57:59.000000000 +0100
@@ -126,6 +126,7 @@ struct auto_tune {
  */
 #define AUTO_TUNE_ENABLE  0x01
 #define TUNABLE_REGISTERED  0x02
+#define TUNABLE_IPC_NS      0x04
 
 
 /*
@@ -171,6 +172,8 @@ static inline int is_tunable_registered(
 	}
 
 
+#define DECLARE_TUNABLE(s)	struct auto_tune s;
+
 #define DEFINE_TUNABLE(s, thr, min, max, tun, chk, type)		\
 	struct auto_tune s = TUNABLE_INIT(#s, thr, min, max, tun, chk, type)
 
@@ -182,6 +185,13 @@ static inline int is_tunable_registered(
 		(s).max.abs_value = _max;	\
 	} while (0)
 
+#define init_tunable_ipcns(ns, s, thr, min, max, tun, chk, type)	\
+	do {								\
+		DEFINE_TUNABLE(s, thr, min, max, tun, chk, type);	\
+		s.flags |= TUNABLE_IPC_NS;				\
+		ns->s = s;						\
+	} while (0)
+
 
 static inline void set_autotuning_routine(struct auto_tune *tunable,
 					auto_tune_fn fn)
@@ -236,7 +246,9 @@ extern ssize_t store_tunable_max(struct 
 #else	/* CONFIG_AKT */
 
 
+#define DECLARE_TUNABLE(s)
 #define DEFINE_TUNABLE(s, thresh, min, max, tun, chk, type)
+#define init_tunable_ipcns(ns, s, th, m, M, tun, chk, type)  do { } while (0)
 #define set_tunable_min_max(s, min, max)         do { } while (0)
 #define set_autotuning_routine(s, fn)            do { } while (0)
 
Index: linux-2.6.20-rc4/kernel/autotune/akt.c
===================================================================
--- linux-2.6.20-rc4.orig/kernel/autotune/akt.c	2007-01-29 15:50:31.000000000 +0100
+++ linux-2.6.20-rc4/kernel/autotune/akt.c	2007-01-29 16:02:10.000000000 +0100
@@ -32,6 +32,7 @@
  *              store_tunable_min          (exported)
  *              show_tunable_max           (exported)
  *              store_tunable_max          (exported)
+ *              get_ns_tunable             (static)
  */
 
 #include <linux/init.h>
@@ -43,6 +44,10 @@
 #define AKT_AUTO   1
 #define AKT_MANUAL 0
 
+static struct auto_tune *get_ns_tunable(struct auto_tune *);
+
+
+
 /**
  * register_tunable - Inserts a tunable structure into sysfs
  * @tun:	tunable structure to be registered
@@ -149,17 +154,20 @@ EXPORT_SYMBOL_GPL(unregister_tunable);
 ssize_t show_tuning_mode(struct auto_tune *tun_addr, char *buf)
 {
 	int valid;
+	struct auto_tune *which;
 
 	if (tun_addr == NULL) {
 		printk(KERN_ERR "AKT: tunable address is invalid\n");
 		return -EINVAL;
 	}
 
-	spin_lock(&tun_addr->tunable_lck);
+	which = get_ns_tunable(tun_addr);
 
-	valid = is_auto_tune_enabled(tun_addr);
+	spin_lock(&which->tunable_lck);
 
-	spin_unlock(&tun_addr->tunable_lck);
+	valid = is_auto_tune_enabled(which);
+
+	spin_unlock(&which->tunable_lck);
 
 	return snprintf(buf, PAGE_SIZE, "%d\n", valid);
 }
@@ -183,6 +191,7 @@ ssize_t store_tuning_mode(struct auto_tu
 			size_t count)
 {
 	int new_value;
+	struct auto_tune *which;
 
 	if (sscanf(buffer, "%d", &new_value) != 1)
 		return -EINVAL;
@@ -195,18 +204,20 @@ ssize_t store_tuning_mode(struct auto_tu
 		return -EINVAL;
 	}
 
-	spin_lock(&tun_addr->tunable_lck);
+	which = get_ns_tunable(tun_addr);
+
+	spin_lock(&which->tunable_lck);
 
 	switch (new_value) {
 	case AKT_AUTO:
-		tun_addr->flags |= AUTO_TUNE_ENABLE;
+		which->flags |= AUTO_TUNE_ENABLE;
 		break;
 	case AKT_MANUAL:
-		tun_addr->flags &= ~AUTO_TUNE_ENABLE;
+		which->flags &= ~AUTO_TUNE_ENABLE;
 		break;
 	}
 
-	spin_unlock(&tun_addr->tunable_lck);
+	spin_unlock(&which->tunable_lck);
 
 	return strnlen(buffer, PAGE_SIZE);
 }
@@ -227,17 +238,20 @@ ssize_t store_tuning_mode(struct auto_tu
 ssize_t show_tunable_min(struct auto_tune *tun_addr, char *buf)
 {
 	ssize_t rc;
+	struct auto_tune *which;
 
 	if (tun_addr == NULL) {
 		printk(KERN_ERR "AKT: tunable address is invalid\n");
 		return -EINVAL;
 	}
 
-	spin_lock(&tun_addr->tunable_lck);
+	which = get_ns_tunable(tun_addr);
+
+	spin_lock(&which->tunable_lck);
 
-	rc = snprintf(buf, PAGE_SIZE, "%lu\n", tun_addr->min.value);
+	rc = snprintf(buf, PAGE_SIZE, "%lu\n", which->min.value);
 
-	spin_unlock(&tun_addr->tunable_lck);
+	spin_unlock(&which->tunable_lck);
 
 	return rc;
 }
@@ -259,6 +273,7 @@ ssize_t store_tunable_min(struct auto_tu
 			size_t count)
 {
 	ssize_t rc;
+	struct auto_tune *which;
 	ulong new_value;
 
 	if (sscanf(buf, "%lu", &new_value) != 1)
@@ -269,16 +284,18 @@ ssize_t store_tunable_min(struct auto_tu
 		return -EINVAL;
 	}
 
-	spin_lock(&tun_addr->tunable_lck);
+	which = get_ns_tunable(tun_addr);
 
-	if (new_value >= tun_addr->min.abs_value &&
-					new_value < tun_addr->max.value) {
-		tun_addr->min.value = new_value;
+	spin_lock(&which->tunable_lck);
+
+	if (new_value >= which->min.abs_value &&
+					new_value < which->max.value) {
+		which->min.value = new_value;
 		rc = strnlen(buf, PAGE_SIZE);
 	} else
 		rc = -EINVAL;
 
-	spin_unlock(&tun_addr->tunable_lck);
+	spin_unlock(&which->tunable_lck);
 
 	return rc;
 }
@@ -299,17 +316,20 @@ ssize_t store_tunable_min(struct auto_tu
 ssize_t show_tunable_max(struct auto_tune *tun_addr, char *buf)
 {
 	ssize_t rc;
+	struct auto_tune *which;
 
 	if (tun_addr == NULL) {
 		printk(KERN_ERR "AKT: tunable address is invalid\n");
 		return -EINVAL;
 	}
 
-	spin_lock(&tun_addr->tunable_lck);
+	which = get_ns_tunable(tun_addr);
+
+	spin_lock(&which->tunable_lck);
 
-	rc = snprintf(buf, PAGE_SIZE, "%lu\n", tun_addr->max.value);
+	rc = snprintf(buf, PAGE_SIZE, "%lu\n", which->max.value);
 
-	spin_unlock(&tun_addr->tunable_lck);
+	spin_unlock(&which->tunable_lck);
 
 	return rc;
 }
@@ -331,6 +351,7 @@ ssize_t store_tunable_max(struct auto_tu
 			size_t count)
 {
 	ssize_t rc;
+	struct auto_tune *which;
 	ulong new_value;
 
 	if (sscanf(buf, "%lu", &new_value) != 1)
@@ -341,16 +362,43 @@ ssize_t store_tunable_max(struct auto_tu
 		return -EINVAL;
 	}
 
-	spin_lock(&tun_addr->tunable_lck);
+	which = get_ns_tunable(tun_addr);
+
+	spin_lock(&which->tunable_lck);
 
-	if (new_value <= tun_addr->max.abs_value &&
-					new_value > tun_addr->min.value) {
-		tun_addr->max.value = new_value;
+	if (new_value <= which->max.abs_value &&
+					new_value > which->min.value) {
+		which->max.value = new_value;
 		rc = strnlen(buf, PAGE_SIZE);
 	} else
 		rc = -EINVAL;
 
-	spin_unlock(&tun_addr->tunable_lck);
+	spin_unlock(&which->tunable_lck);
 
 	return rc;
 }
+
+
+/**
+ * get_ns_tunable - Gets the tunable structure for the current namespace
+ * @p:	pointer to the tunable for the init namespace
+ *
+ * This routine gets the actual auto_tune structure for the tunables that are
+ * per namespace (presently only ipc ones).
+ *
+ * Returns:	>0 - pointer to the tunable structure for the current ns
+ */
+static struct auto_tune *get_ns_tunable(struct auto_tune *p)
+{
+	if (p->flags & TUNABLE_IPC_NS) {
+		char *shift = (char *) p;
+		struct ipc_namespace *ns = current->nsproxy->ipc_ns;
+
+		shift = (shift - (char *) &init_ipc_ns) + (char *) ns;
+
+		return (struct auto_tune *) shift;
+	}
+
+	return p;
+}
+

--


* [PATCH 6/6] AKT - automatic tuning applied to some kernel components
  2007-01-30 10:11 [PATCH 0/6] AKT - Automatic Kernel Tunables Nadia.Derbey
                   ` (4 preceding siblings ...)
  2007-01-30 10:11 ` [PATCH 5/6] AKT - per namespace tunables Nadia.Derbey
@ 2007-01-30 10:11 ` Nadia.Derbey
  5 siblings, 0 replies; 10+ messages in thread
From: Nadia.Derbey @ 2007-01-30 10:11 UTC (permalink / raw)
  To: akpm, randy.dunlap; +Cc: linux-kernel, Nadia Derbey

[-- Attachment #1: auto_tune_applied.patch --]
[-- Type: text/plain, Size: 14631 bytes --]

[PATCH 06/06]


The following kernel components register a tunable structure and call the
auto-tuning routine:
  . file system
  . shared memory (per namespace)
  . semaphore (per namespace)
  . message queues (per namespace)
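
As a worked illustration of the adjustment rule implemented by
maxfiles_auto_tuning() in the fs/file_table.c part of this patch, here is a
userspace sketch of the same integer arithmetic. The helpers clamp_int(),
tune_up() and tune_down() are hypothetical names used only for this example;
locking and the AKT_UP / AKT_DOWN command dispatch are left out.

```c
#include <assert.h>

/* Hypothetical helper: clamp v into [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Upward step: once usage reaches thr% of the limit, grow the limit by
 * (100 - thr)%, capped at max. Returns the (possibly unchanged) limit.
 */
static int tune_up(int used, int limit, int thr, int min, int max)
{
	assert(thr > 0 && thr <= 100);

	if (used >= limit * thr / 100 && limit < max)
		return clamp_int(limit * (200 - thr) / 100, min, max);
	return limit;
}

/*
 * Downward step: once usage falls under thr% of the previous limit,
 * shrink back to that previous limit, floored at min.
 */
static int tune_down(int used, int limit, int thr, int min, int max)
{
	assert(thr > 0 && thr <= 100);

	if (used < limit * thr / (200 - thr) && limit > min)
		return clamp_int(limit * 100 / (200 - thr), min, max);
	return limit;
}
```

With a threshold of 80, a limit of 8192 and bounds [8192, 16384], the limit
grows to 9830 once 6553 files are in use. Shrinking from 9830 computes 8191
because of integer division, so it is the lower bound that brings the limit
back to exactly 8192.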


Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>


---
 fs/file_table.c     |   82 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/akt.h |    1 
 include/linux/ipc.h |    6 +++
 init/main.c         |    1 
 ipc/msg.c           |   20 ++++++++++++
 ipc/sem.c           |   43 +++++++++++++++++++++++++++
 ipc/shm.c           |   76 +++++++++++++++++++++++++++++++++++++++++++++---
 7 files changed, 224 insertions(+), 5 deletions(-)

Index: linux-2.6.20-rc4/include/linux/akt.h
===================================================================
--- linux-2.6.20-rc4.orig/include/linux/akt.h	2007-01-29 15:57:59.000000000 +0100
+++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-29 16:05:58.000000000 +0100
@@ -262,5 +262,6 @@ static inline void init_auto_tuning(void
 #endif	/* CONFIG_AKT */
 
 extern void fork_late_init(void);
+extern void files_late_init(void);
 
 #endif /* AKT_H */
Index: linux-2.6.20-rc4/include/linux/ipc.h
===================================================================
--- linux-2.6.20-rc4.orig/include/linux/ipc.h	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/include/linux/ipc.h	2007-01-29 16:07:37.000000000 +0100
@@ -52,6 +52,7 @@ struct ipc_perm
 #ifdef __KERNEL__
 
 #include <linux/kref.h>
+#include <linux/akt.h>
 
 #define IPCMNI 32768  /* <= MAX_INT limit for ipc arrays (including sysctl changes) */
 
@@ -77,15 +78,20 @@ struct ipc_namespace {
 
 	int		sem_ctls[4];
 	int		used_sems;
+	DECLARE_TUNABLE(semmni_akt);
+	DECLARE_TUNABLE(semmns_akt);
 
 	int		msg_ctlmax;
 	int		msg_ctlmnb;
 	int		msg_ctlmni;
+	DECLARE_TUNABLE(msgmni_akt);
 
 	size_t		shm_ctlmax;
 	size_t		shm_ctlall;
 	int		shm_ctlmni;
 	int		shm_tot;
+	DECLARE_TUNABLE(shmmni_akt);
+	DECLARE_TUNABLE(shmall_akt);
 };
 
 extern struct ipc_namespace init_ipc_ns;
Index: linux-2.6.20-rc4/init/main.c
===================================================================
--- linux-2.6.20-rc4.orig/init/main.c	2007-01-29 15:33:43.000000000 +0100
+++ linux-2.6.20-rc4/init/main.c	2007-01-29 16:08:31.000000000 +0100
@@ -616,6 +616,7 @@ asmlinkage void __init start_kernel(void
 	page_writeback_init();
 	init_auto_tuning();
 	fork_late_init();
+	files_late_init();
 #ifdef CONFIG_PROC_FS
 	proc_root_init();
 #endif
Index: linux-2.6.20-rc4/fs/file_table.c
===================================================================
--- linux-2.6.20-rc4.orig/fs/file_table.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/fs/file_table.c	2007-01-29 16:10:37.000000000 +0100
@@ -21,6 +21,8 @@
 #include <linux/fsnotify.h>
 #include <linux/sysctl.h>
 #include <linux/percpu_counter.h>
+#include <linux/akt.h>
+#include <linux/akt_ops.h>
 
 #include <asm/atomic.h>
 
@@ -34,6 +36,71 @@ __cacheline_aligned_in_smp DEFINE_SPINLO
 
 static struct percpu_counter nr_files __cacheline_aligned_in_smp;
 
+#ifdef CONFIG_AKT
+
+static int get_nr_files(void);
+
+/********** automatic tuning **********/
+#define FILPTHRESH 80		/* threshold = 80% */
+
+/*
+ * FUNCTION:    This is the routine called to accomplish auto tuning for the
+ *              max_files tunable.
+ *
+ *              Upwards adjustment:
+ *                  Adjustment is needed if nr_files has reached
+ *                  (threshold / 100 * max_files)
+ *                  In that case, max_files is set to
+ *                  (tunable + max_files * (100 - threshold) / 100)
+ *
+ *              Downwards adjustment:
+ *                   Adjustment is needed if nr_files has fallen under
+ *                   (threshold / 100 * max_files previous value)
+ *                   In that case max_files is set back to its previous value,
+ *                   i.e. to (max_files * 100 / (200 - threshold))
+ *
+ * PARAMETERS:  cmd: controls the adjustment direction (up / down)
+ *              params: pointer to the registered tunable structure
+ *
+ * EXECUTION ENVIRONMENT: This routine should be called with the
+ *                        params->tunable_lck lock held
+ *
+ * RETURN VALUE: 1 if tunable has been adjusted
+ *               0 else
+ */
+static inline int maxfiles_auto_tuning(int cmd, struct auto_tune *params)
+{
+	int thr = params->threshold;
+	int min = params->min.value;
+	int max = params->max.value;
+	int tun = files_stat.max_files;
+
+	if (cmd == AKT_UP) {
+		if (get_nr_files() >= tun * thr / 100 && tun < max) {
+			int new = tun * (200 - thr) / 100;
+
+			files_stat.max_files = min(max, new);
+			return 1;
+		} else
+			return 0;
+	}
+
+	if (get_nr_files() < tun * thr / (200 - thr) && tun > min) {
+		int new = tun * 100 / (200 - thr);
+
+		files_stat.max_files = max(min, new);
+		return 1;
+	} else
+		return 0;
+}
+
+#endif /* CONFIG_AKT */
+
+/* The maximum value will be known later on */
+DEFINE_TUNABLE(maxfiles_akt, FILPTHRESH, 0, 0, &files_stat.max_files,
+		&nr_files, int);
+
+
 static inline void file_free_rcu(struct rcu_head *head)
 {
 	struct file *f =  container_of(head, struct file, f_u.fu_rcuhead);
@@ -44,6 +111,8 @@ static inline void file_free(struct file
 {
 	percpu_counter_dec(&nr_files);
 	call_rcu(&f->f_u.fu_rcuhead, file_free_rcu);
+
+	activate_auto_tuning(AKT_DOWN, &maxfiles_akt);
 }
 
 /*
@@ -91,6 +160,8 @@ struct file *get_empty_filp(void)
 	static int old_max;
 	struct file * f;
 
+	activate_auto_tuning(AKT_UP, &maxfiles_akt);
+
 	/*
 	 * Privileged users can go above max_files
 	 */
@@ -299,6 +370,17 @@ void __init files_init(unsigned long mem
 	files_stat.max_files = n; 
 	if (files_stat.max_files < NR_FILE)
 		files_stat.max_files = NR_FILE;
+
+	set_tunable_min_max(maxfiles_akt, n, n * 2);
+	set_autotuning_routine(&maxfiles_akt, maxfiles_auto_tuning);
+
 	files_defer_init();
 	percpu_counter_init(&nr_files, 0);
 } 
+
+void __init files_late_init(void)
+{
+	if (register_tunable(&maxfiles_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable file-max\n");
+}
Index: linux-2.6.20-rc4/ipc/msg.c
===================================================================
--- linux-2.6.20-rc4.orig/ipc/msg.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/ipc/msg.c	2007-01-29 16:11:57.000000000 +0100
@@ -36,6 +36,8 @@
 #include <linux/seq_file.h>
 #include <linux/mutex.h>
 #include <linux/nsproxy.h>
+#include <linux/akt.h>
+#include <linux/akt_ops.h>
 
 #include <asm/current.h>
 #include <asm/uaccess.h>
@@ -94,6 +96,11 @@ static void __ipc_init __msg_init_ns(str
 	ns->msg_ctlmnb = MSGMNB;
 	ns->msg_ctlmni = MSGMNI;
 	ipc_init_ids(ids, ns->msg_ctlmni);
+
+#define MSGTHRESH 80
+
+	init_tunable_ipcns(ns, msgmni_akt, MSGTHRESH, MSGMNI, IPCMNI,
+		&ns->msg_ctlmni, &ids->in_use, int);
 }
 
 #ifdef CONFIG_IPC_NS
@@ -133,6 +140,11 @@ void msg_exit_ns(struct ipc_namespace *n
 void __init msg_init(void)
 {
 	__msg_init_ns(&init_ipc_ns, &init_msg_ids);
+
+	if (register_tunable(&init_ipc_ns.msgmni_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable msgmni\n");
+
 	ipc_init_proc_interface("sysvipc/msg",
 				"       key      msqid perms      cbytes       qnum lspid lrpid   uid   gid  cuid  cgid      stime      rtime      ctime\n",
 				IPC_MSG_IDS, sysvipc_msg_proc_show);
@@ -262,6 +274,8 @@ asmlinkage long sys_msgget(key_t key, in
 
 	ns = current->nsproxy->ipc_ns;
 	
+	activate_auto_tuning(AKT_UP, &ns->msgmni_akt);
+
 	mutex_lock(&msg_ids(ns).mutex);
 	if (key == IPC_PRIVATE) 
 		ret = newque(ns, key, msgflg);
@@ -391,6 +405,7 @@ asmlinkage long sys_msgctl(int msqid, in
 	struct msg_queue *msq;
 	int err, version;
 	struct ipc_namespace *ns;
+	int destroyed = 0;
 
 	if (msqid < 0 || cmd < 0)
 		return -EINVAL;
@@ -555,11 +570,16 @@ asmlinkage long sys_msgctl(int msqid, in
 	}
 	case IPC_RMID:
 		freeque(ns, msq, msqid);
+		destroyed = 1;
 		break;
 	}
 	err = 0;
 out_up:
 	mutex_unlock(&msg_ids(ns).mutex);
+
+	if (destroyed)
+		activate_auto_tuning(AKT_DOWN, &ns->msgmni_akt);
+
 	return err;
 out_unlock_up:
 	msg_unlock(msq);
Index: linux-2.6.20-rc4/ipc/shm.c
===================================================================
--- linux-2.6.20-rc4.orig/ipc/shm.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/ipc/shm.c	2007-01-29 16:12:22.000000000 +0100
@@ -37,6 +37,8 @@
 #include <linux/seq_file.h>
 #include <linux/mutex.h>
 #include <linux/nsproxy.h>
+#include <linux/akt.h>
+#include <linux/akt_ops.h>
 
 #include <asm/uaccess.h>
 
@@ -75,17 +77,27 @@ static void __ipc_init __shm_init_ns(str
 	ns->shm_ctlmni = SHMMNI;
 	ns->shm_tot = 0;
 	ipc_init_ids(ids, 1);
+
+#define SHMTHRESH 80
+	init_tunable_ipcns(ns, shmmni_akt, SHMTHRESH, SHMMNI, IPCMNI,
+		&ns->shm_ctlmni, &ids->in_use, int);
+	init_tunable_ipcns(ns, shmall_akt, SHMTHRESH, SHMALL,
+		SHMMAX / PAGE_SIZE * (IPCMNI / 16), &ns->shm_ctlall,
+		&ns->shm_tot, size_t);
 }
 
-static void do_shm_rmid(struct ipc_namespace *ns, struct shmid_kernel *shp)
+static int do_shm_rmid(struct ipc_namespace *ns, struct shmid_kernel *shp)
 {
 	if (shp->shm_nattch){
 		shp->shm_perm.mode |= SHM_DEST;
 		/* Do not find it any more */
 		shp->shm_perm.key = IPC_PRIVATE;
 		shm_unlock(shp);
-	} else
+		return 0;
+	} else {
 		shm_destroy(ns, shp);
+		return 1;
+	}
 }
 
 #ifdef CONFIG_IPC_NS
@@ -125,6 +137,15 @@ void shm_exit_ns(struct ipc_namespace *n
 void __init shm_init (void)
 {
 	__shm_init_ns(&init_ipc_ns, &init_shm_ids);
+
+	if (register_tunable(&init_ipc_ns.shmmni_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable shmmni\n");
+
+	if (register_tunable(&init_ipc_ns.shmall_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable shmall\n");
+
 	ipc_init_proc_interface("sysvipc/shm",
 				"       key      shmid perms       size  cpid  lpid nattch   uid   gid  cuid  cgid      atime      dtime      ctime\n",
 				IPC_SHM_IDS, sysvipc_shm_proc_show);
@@ -206,6 +227,7 @@ static void shm_close (struct vm_area_st
 	int id = file->f_path.dentry->d_inode->i_ino;
 	struct shmid_kernel *shp;
 	struct ipc_namespace *ns;
+	int destroyed = 0;
 
 	ns = shm_file_ns(file);
 
@@ -217,11 +239,27 @@ static void shm_close (struct vm_area_st
 	shp->shm_dtim = get_seconds();
 	shp->shm_nattch--;
 	if(shp->shm_nattch == 0 &&
-	   shp->shm_perm.mode & SHM_DEST)
+	   shp->shm_perm.mode & SHM_DEST) {
 		shm_destroy(ns, shp);
-	else
+		destroyed = 1;
+	} else
 		shm_unlock(shp);
 	mutex_unlock(&shm_ids(ns).mutex);
+
+	if (destroyed) {
+		int rc;
+
+		rc = activate_auto_tuning(AKT_DOWN, &ns->shmmni_akt);
+		if (rc)
+			/*
+			 * shm_ctlmni has been adjusted ==> change
+			 * shm_ctlall value
+			 */
+			ns->shm_ctlall = ns->shm_ctlmax / PAGE_SIZE *
+				(ns->shm_ctlmni / 16);
+
+		activate_auto_tuning(AKT_DOWN, &ns->shmall_akt);
+	}
 }
 
 static int shm_mmap(struct file * file, struct vm_area_struct * vma)
@@ -355,9 +393,20 @@ asmlinkage long sys_shmget (key_t key, s
 	struct shmid_kernel *shp;
 	int err, id = 0;
 	struct ipc_namespace *ns;
+	int rc;
 
 	ns = current->nsproxy->ipc_ns;
 
+	rc = activate_auto_tuning(AKT_UP, &ns->shmmni_akt);
+	if (rc)
+		/*
+		 * shm_ctlmni has been adjusted ==> change shm_ctlall value
+		 */
+		ns->shm_ctlall = ns->shm_ctlmax / PAGE_SIZE
+				* (ns->shm_ctlmni / 16);
+
+	activate_auto_tuning(AKT_UP, &ns->shmall_akt);
+
 	mutex_lock(&shm_ids(ns).mutex);
 	if (key == IPC_PRIVATE) {
 		err = newseg(ns, key, shmflg, size);
@@ -516,6 +565,7 @@ asmlinkage long sys_shmctl (int shmid, i
 	struct shmid_kernel *shp;
 	int err, version;
 	struct ipc_namespace *ns;
+	int destroyed;
 
 	if (cmd < 0 || shmid < 0) {
 		err = -EINVAL;
@@ -701,8 +751,24 @@ asmlinkage long sys_shmctl (int shmid, i
 		if (err)
 			goto out_unlock_up;
 
-		do_shm_rmid(ns, shp);
+		destroyed = do_shm_rmid(ns, shp);
 		mutex_unlock(&shm_ids(ns).mutex);
+
+		if (destroyed) {
+			int rc;
+
+			rc = activate_auto_tuning(AKT_DOWN, &ns->shmmni_akt);
+			if (rc)
+				/*
+				 * shm_ctlmni has been adjusted ==> change
+				 * shm_ctlall value
+				 */
+				ns->shm_ctlall = ns->shm_ctlmax / PAGE_SIZE *
+					(ns->shm_ctlmni / 16);
+
+			activate_auto_tuning(AKT_DOWN, &ns->shmall_akt);
+		}
+
 		goto out;
 	}
 
Index: linux-2.6.20-rc4/ipc/sem.c
===================================================================
--- linux-2.6.20-rc4.orig/ipc/sem.c	2007-01-29 12:39:30.000000000 +0100
+++ linux-2.6.20-rc4/ipc/sem.c	2007-01-29 16:12:47.000000000 +0100
@@ -83,6 +83,8 @@
 #include <linux/seq_file.h>
 #include <linux/mutex.h>
 #include <linux/nsproxy.h>
+#include <linux/akt.h>
+#include <linux/akt_ops.h>
 
 #include <asm/uaccess.h>
 #include "util.h"
@@ -131,6 +133,12 @@ static void __ipc_init __sem_init_ns(str
 	ns->sc_semmni = SEMMNI;
 	ns->used_sems = 0;
 	ipc_init_ids(ids, ns->sc_semmni);
+
+#define SEMTHRESH 80
+	init_tunable_ipcns(ns, semmni_akt, SEMTHRESH, SEMMNI, IPCMNI,
+		&(ns->sc_semmni), &ids->in_use, int);
+	init_tunable_ipcns(ns, semmns_akt, SEMTHRESH, SEMMNS,
+		IPCMNI * SEMMSL, &(ns->sc_semmns), &ns->used_sems, int);
 }
 
 #ifdef CONFIG_IPC_NS
@@ -170,6 +178,15 @@ void sem_exit_ns(struct ipc_namespace *n
 void __init sem_init (void)
 {
 	__sem_init_ns(&init_ipc_ns, &init_sem_ids);
+
+	if (register_tunable(&init_ipc_ns.semmni_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable semmni\n");
+
+	if (register_tunable(&init_ipc_ns.semmns_akt))
+		printk(KERN_WARNING
+			"AKT: Failed registering tunable semmns\n");
+
 	ipc_init_proc_interface("sysvipc/sem",
 				"       key      semid perms      nsems   uid   gid  cuid  cgid      otime      ctime\n",
 				IPC_SEM_IDS, sysvipc_sem_proc_show);
@@ -263,11 +280,22 @@ asmlinkage long sys_semget (key_t key, i
 	int id, err = -EINVAL;
 	struct sem_array *sma;
 	struct ipc_namespace *ns;
+	int rc;
 
 	ns = current->nsproxy->ipc_ns;
 
 	if (nsems < 0 || nsems > ns->sc_semmsl)
 		return -EINVAL;
+
+	rc = activate_auto_tuning(AKT_UP, &ns->semmni_akt);
+	if (rc)
+		/*
+		 * sc_semmni has been adjusted ==> change sc_semmns value
+		 */
+		ns->sc_semmns = ns->sc_semmni * ns->sc_semmsl;
+
+	activate_auto_tuning(AKT_UP, &ns->semmns_akt);
+
 	mutex_lock(&sem_ids(ns).mutex);
 	
 	if (key == IPC_PRIVATE) {
@@ -899,6 +927,21 @@ static int semctl_down(struct ipc_namesp
 	case IPC_RMID:
 		freeary(ns, sma, semid);
 		err = 0;
+
+		{
+			int rc;
+
+			rc = activate_auto_tuning(AKT_DOWN, &ns->semmni_akt);
+			if (rc)
+				/*
+				 * sc_semmni has been adjusted ==>
+				 * change sc_semmns value
+				 */
+				ns->sc_semmns = ns->sc_semmni * ns->sc_semmsl;
+
+			activate_auto_tuning(AKT_DOWN, &ns->semmns_akt);
+		}
+
 		break;
 	case IPC_SET:
 		ipcp->uid = setbuf.uid;

--


* Re: [PATCH 1/6] AKT - Tunable structure and registration routines
  2007-01-30 10:11 ` [PATCH 1/6] AKT - Tunable structure and registration routines Nadia.Derbey
@ 2007-02-12 15:07   ` Andi Kleen
  2007-02-13 10:18     ` Nadia Derbey
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2007-02-12 15:07 UTC (permalink / raw)
  To: Nadia.Derbey; +Cc: akpm, randy.dunlap, linux-kernel

Nadia.Derbey@bull.net writes:
> +
> +This feature aims at making the kernel automatically change the tunables
> +values as it sees resources running out.

The only reason we have resource limits is to avoid DOS when one
resource consumes too much memory. When there is no such danger then
there isn't any reason to have a limit at all and it could just be
eliminated (or set to unlimited by default).

Your feature doesn't address the DOS and without that there isn't
any reason to have limits at all. So what's the point? 

I agree that some of the default limits we have are not very useful
on modern machines. I guess you're trying to address that.

I would suggest the following strategy:

- Review any limits we have and make sure they make sense.

- Figure out if they actually serve a useful purpose,
e.g. what happens when they are exceeded: is there a DOS?
If yes, can the DOS be addressed in a better way (e.g. by allowing the
resource to be shrunk by a shrinker callback)?
 
Some of the existing limits are clearly bogus, e.g. the limit
on shared memory.

For others I don't see a good alternative, e.g. if you don't limit
the number of files allocated, the only alternative would be to kill
processes when they allocate too many files. Is that really preferable
to an errno?

- If they serve a useful purpose then check if the default is useful
on a modern machine. Or make them scale with the amount of memory
like many limits already do.
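
A minimal sketch of what "scale with the amount of memory" can look like for
a file limit. The helper name default_max_files() is hypothetical, and the
two constants it relies on (roughly 1 KiB of kernel memory per open file, and
a budget of 10% of RAM) are assumptions of this example, not figures taken
from this thread. The floor of 8192 corresponds to NR_FILE.

```c
#include <assert.h>

#define NR_FILE_FLOOR 8192UL	/* NR_FILE (0x2000) */

/*
 * Hypothetical helper: derive a default file limit from memory size.
 * Assumes roughly 1 KiB of kernel memory per open file and a budget of
 * 10% of RAM; both figures are assumptions of this sketch.
 */
static unsigned long default_max_files(unsigned long mem_pages,
				       unsigned long page_size)
{
	unsigned long n;

	assert(page_size >= 1024);

	n = mem_pages * (page_size / 1024) / 10;
	return n < NR_FILE_FLOOR ? NR_FILE_FLOOR : n;
}
```

Under these assumptions a machine with 1 GiB of RAM in 4 KiB pages (262144
pages) gets a default of 104857 files, while a very small machine falls back
to the fixed floor.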

-Andi


* Re: [PATCH 1/6] AKT - Tunable structure and registration routines
  2007-02-12 15:07   ` Andi Kleen
@ 2007-02-13 10:18     ` Nadia Derbey
  2007-02-13 10:51       ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Nadia Derbey @ 2007-02-13 10:18 UTC (permalink / raw)
  To: Andi Kleen; +Cc: akpm, randy.dunlap, linux-kernel

Andi Kleen wrote:
> Nadia.Derbey@bull.net writes:
> 
>>+
>>+This feature aims at making the kernel automatically change the tunables
>>+values as it sees resources running out.
> 
> 
> The only reason we have resource limits is to avoid DOS when one
> resource consumes too much memory. When there is no such danger then
> there isn't any reason to have a limit at all and it could just be
> eliminated (or set to unlimited by default).

Automatic tuning is a way to set the limit to unlimited, in a sense,
isn't it? With this feature, we can leave the default limits as they
are for "every-day" usage and, when a particular application runs on
the machine, allow the limit to grow as needed.

> 
> Your feature doesn't address the DOS and without that there isn't
> any reason to have limits at all. So what's the point? 

As I told Eric Biederman in another mail, DoS protection is ensured in AKT
by exporting the min and max values for each tunable to sysfs (actually
Eric complained about these min and max :-( ). These are RW attributes
that make it possible for a sysadmin to set the max value a tunable can
ever reach, instead of letting it grow to huge values.
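
For illustration, the check applied when a new max is written through sysfs
(store_tunable_max() in patch 4/6) can be sketched in userspace as follows.
struct bounds and set_max() are hypothetical simplifications of
struct auto_tune and the store routine; the per-tunable spinlock and the
sysfs buffer parsing are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified view of a tunable's bounds. */
struct bounds {
	unsigned long min;	/* current lower bound */
	unsigned long max;	/* current upper bound, admin-adjustable */
	unsigned long abs_max;	/* hard ceiling chosen at registration */
};

/*
 * Accept a new upper bound only if it stays at or under the hard ceiling
 * and strictly above the current lower bound. Returns 0 or -EINVAL.
 */
static int set_max(struct bounds *b, unsigned long new_max)
{
	assert(b != NULL);

	if (new_max > b->abs_max || new_max <= b->min)
		return -EINVAL;

	b->max = new_max;
	return 0;
}
```

A rejected write leaves the previous max untouched, so the automatic tuning
can never be pushed past the ceiling fixed at registration time.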

> 
> I agree that some of the default limits we have are not very useful
> on modern machines. I guess you're trying to address that.

Yep

> 
> I would suggest the following strategy:
> 
> - Review any limits we have and make sure they make sense.
> 
> - Figure out if they actually serve a useful purpose,
> e.g. what happens when they are exceeded: is there a DOS?
> If yes, can the DOS be addressed in a better way (e.g. by allowing the
> resource to be shrunk by a shrinker callback)?
>  
> Some of the existing limits are clearly bogus, e.g. the limit
> on shared memory.
> 
> For others I don't see a good alternative, e.g. if you don't limit
> the number of files allocated, the only alternative would be to kill
> processes when they allocate too many files. Is that really preferable
> to an errno?

Agree with you, BUT between the default max_files and the "too many
files" situation, there is a gap that can be crossed by automatically
tuning max_files, isn't there? E.g. the max_files default value is NR_FILE
(0x2000), while Oracle expects it to be set to 0x10000.

> 
> - If they serve a useful purpose then check if the default is useful
> on a modern machine. Or make them scale with the amount of memory
> like many limits already do.
> 
> -Andi
> 

Regards,
Nadia



* Re: [PATCH 1/6] AKT - Tunable structure and registration routines
  2007-02-13 10:18     ` Nadia Derbey
@ 2007-02-13 10:51       ` Andi Kleen
  0 siblings, 0 replies; 10+ messages in thread
From: Andi Kleen @ 2007-02-13 10:51 UTC (permalink / raw)
  To: Nadia Derbey; +Cc: akpm, randy.dunlap, linux-kernel

On Tuesday 13 February 2007 11:18, Nadia Derbey wrote:
> Andi Kleen wrote:
> > Nadia.Derbey@bull.net writes:
> > 
> >>+
> >>+This feature aims at making the kernel automatically change the tunables
> >>+values as it sees resources running out.
> > 
> > 
> > The only reason we have resource limits is to avoid DOS when one
> > resource consumes too much memory. When there is no such danger then
> > there isn't any reason to have a limit at all and it could just be
> > eliminated (or set to unlimited by default).
> 
> Automatic tuning is a way to set the limit to unlimited, in a sense,
> isn't it? With this feature, we can leave the default limits as they
> are for "every-day" usage and, when a particular application runs on
> the machine, allow the limit to grow as needed.

That would be effectively no limit, so why not just do away with them
completely? You have to solve the DOS issues first of course, either way.
  
> > Your feature doesn't address the DOS and without that there isn't
> > any reason to have limits at all. So what's the point? 
> 
> As I told Eric Biederman in another mail, DoS protection is ensured in AKT
> by exporting the min and max values for each tunable to sysfs (actually
> Eric complained about these min and max :-( ). These are RW attributes
> that make it possible for a sysadmin to set the max value a tunable can
> ever reach, instead of letting it grow to huge values.

Then you could just set the limit always to that max value and would
get the same effect.

A min limit doesn't make much sense to me because nearly all Linux data
structures are allocated on demand only anyway (excluding mempools and a few
special cases) and min is defined as the current number of used resources.

> Agree with you, BUT between the default max_files and the "too many
> files" situation, there is a gap that can be crossed by automatically
> tuning max_files, isn't there? E.g. the max_files default value is NR_FILE
> (0x2000), while Oracle expects it to be set to 0x10000.

The reason NR_FILE is relatively low by default is mostly that the existing
ulimits suck :-) I.e. if you want to limit the memory consumption of a user,
you limit the number of the user's processes and the number of files per
process -- then the max memory they can allocate in files is
process ulimit * nr_files [in theory; in practice there are holes, like
in-flight fds in unix socket fd passing].

But of course that ends up with either NR_FILE being far too low, or the
process limit far too low, or too much memory anyway. In practice it doesn't
work particularly well.
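
The product mentioned above can be written down as a small userspace helper.
This is only a sketch: user_file_bound() is a hypothetical name, and it
assumes (as on Linux) that RLIM_INFINITY is the largest rlim_t value, so it
can double as a saturation value when the product would overflow.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Upper bound on the files one user can hold open, taken as
 * (process limit) * (files-per-process limit). RLIM_INFINITY on either
 * side means the product is unbounded.
 */
static rlim_t user_file_bound(rlim_t nproc, rlim_t nofile)
{
	if (nproc == RLIM_INFINITY || nofile == RLIM_INFINITY)
		return RLIM_INFINITY;
	if (nofile != 0 && nproc > RLIM_INFINITY / nofile)
		return RLIM_INFINITY;	/* product would overflow */
	return nproc * nofile;
}
```

The two inputs would typically come from getrlimit(RLIMIT_NPROC, ...) and
getrlimit(RLIMIT_NOFILE, ...). The bound counts file descriptors only; it
says nothing about the holes mentioned above.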

The real solution for that is "beancounter" which has been posted in several
forms recently (from OpenVZ and from Google). Basically it adds real limits
(for much more than just file descriptors) per uid or container.

With such infrastructure in place NR_FILE could be set to unlimited, as long
as you have a reasonable (large) default per uid.
 
Similar reasoning applies to other resources which are currently limited.

I think you're just attacking the symptoms here instead of the basic issue.
 
-Andi
