From: Randy Dunlap <randy.dunlap@oracle.com>
To: Nadia.Derbey@bull.net
Cc: linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 1/6] Tunable structure and registration routines
Date: Wed, 24 Jan 2007 16:32:18 -0800
Message-ID: <20070124163218.ec54891d.randy.dunlap@oracle.com>
In-Reply-To: <20070116063028.375290000@bull.net>

On Tue, 16 Jan 2007 07:15:17 +0100 Nadia.Derbey@bull.net wrote:

> [PATCH 01/06]
> 
> Defines the auto_tune structure: this is the structure that contains the
> information needed by the adjustment routine for a given tunable.
> Also defines the registration routines.
> 
> The fork kernel component defines a tunable structure for the threads-max
> tunable and registers it.
> 
> Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
> ---
>  Documentation/00-INDEX      |    2 
>  Documentation/auto_tune.txt |  333 ++++++++++++++++++++++++++++++++++++++++++++
>  fs/Kconfig                  |    2 
>  include/linux/akt.h         |  186 ++++++++++++++++++++++++
>  include/linux/akt_ops.h     |  186 ++++++++++++++++++++++++
>  init/main.c                 |    2 
>  kernel/Makefile             |    1 
>  kernel/autotune/Kconfig     |   30 +++
>  kernel/autotune/Makefile    |    7 
>  kernel/autotune/akt.c       |  123 ++++++++++++++++
>  kernel/fork.c               |   18 ++
>  11 files changed, 890 insertions(+)
> 
> Index: linux-2.6.20-rc4/Documentation/auto_tune.txt
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6.20-rc4/Documentation/auto_tune.txt	2007-01-15 14:19:18.000000000 +0100
> @@ -0,0 +1,333 @@
> +			Automatic Kernel Tunables
> +                        =========================
> +
> +		   Nadia Derbey (Nadia.Derbey@bull.net)
> +
> +
> +
> +This feature aims at making the kernel automatically change the tunables
> +values as it sees resources running out.
> +
> +The AKT framework is made of 2 parts:
> +
> +1) Kernel part:
> +Interfaces are provided to the kernel subsystems, to (un)register the
> +tunables that might be automatically tuned in the future.
> +
> +Registering a tunable consists in the following steps:

                                 s/in/of/

> +- a structure is declared and filled by the kernel subsystem for the
> +registered tunable
> +- that tunable structure is registered into sysfs
> +
> +Registration should be done during the kernel subsystem initialization step.

...

> +Any kernel subsystem that has registered a tunable should call
> +auto_tune_func() as follows:
> +
> ++-------------------------+--------------------------------------------+
> +| Step                    | Routine to call                            |
> ++-------------------------+--------------------------------------------+
> +| Declaration phase       | DEFINE_TUNABLE(name, values...);           |
> ++-------------------------+--------------------------------------------+
> +| Initialization routine  | set_tunable_min_max(name, min, max);       |
> +|                         | set_autotuning_routine(name, routine);     |
> +|                         | register_tunable(&name);                   |
> +| Note: the 1st 2 calls   |                                            |
> +|       are optional      |                                            |
> ++-------------------------+--------------------------------------------+
> +| Alloc                   | activate_auto_tuning(AKT_UP, &name);       |
> ++-------------------------+--------------------------------------------+
> +| Free                    | activate_auto_tuning(AKT_DOWN, &name);     |

So does Free always use AKT_DOWN?  Why does it matter?
Seems unneeded and inconsistent.
How does one activate a tunable for downward adjustment?

> ++-------------------------+--------------------------------------------+
> +| module_exit() routine   | unregister_tunable(&name);                 |
> ++-------------------------+--------------------------------------------+
> +
> +activate_auto_tuning is a static inline defined in akt.h, that does the
> +following:
> +. if <tunable is registered> and <auto tuning is allowd for tunable>

                                                    allowed

> +.   call the routine stored in tunable->auto_tune
> +
> +
> +The effect of the default automatic tuning routine is the following:
> +
> +           +----------------------------------------------------------------+
> +           |                 Tunable automatically adjustable               |
> +           +---------------+------------------------------------------------+
> +           |      NO       |                      YES                       |
> ++----------+---------------+------------------------------------------------+
> +| AKT_UP   | No effect     | If the tunable value exceeds the specified     |
> +|          |               | threshold, that value is increased up to a     |
> +|          |               | maximum value.                                 |
> +|          |               | The maximum value is specified during the      |
> +|          |               | tunable declaration and can be changed at any  |
> +|          |               | time through sysfs                             |
> ++----------+---------------+------------------------------------------------+
> +| AKT_DOWN | No effect     | If the tunable value falls under the specified |
> +|          |               | threshold, that value is decreased down to a   |
> +|          |               | minimum value.                                 |
> +|          |               | The minimum value is specified during the      |
> +|          |               | tunable declaration and can be changed at any  |
> +|          |               | time through sysfs                             |
> ++----------+---------------+------------------------------------------------+
> +
> +
> +1.6. Default automatic adjustment routine
> +
> +The last service provided by AKT at the kernel level is the default automatic
> +adjustment routine. As seen, above, this routine supports various tunables
> +types. It works as follows (only the AKT_UP direction is described here -
> +AKT_DOWN does the reverse operation):
> +
> +The 2nd parameter passed in to this routine is a pointer to a previously
> +registerd tunable structure. That structure contains the following fields (see
   registered

> +1.1 for the detailed description):
> +- threshold
> +- key
> +- min
> +- max
> +- tunable
> +- checked
> +
> +When this routine is entered, it does the following:
> +1. <*checked> is compared to <*tunable> * threshold
> +2. if <*checked> is greater, <*tunable> is set to:
> +	<*tunable> + (<*tunable> * (100 - threshold) / 100)
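
To make the two steps above concrete, here is a rough userspace sketch of
the AKT_UP direction. It is not the patch's code: the struct and function
names are made up, the threshold is assumed to be a percentage (so step 1
compares against tunable * threshold / 100), and the clamp to max is
assumed from the "increased up to a maximum value" wording earlier in the
document.

```c
/* Userspace sketch of the AKT_UP step; names are hypothetical. */
struct tunable_sketch {           /* stand-in for struct auto_tune */
	unsigned int threshold;   /* assumed to be a percentage, 1..99 */
	unsigned int max;         /* upper bound for *tunable */
	unsigned int *tunable;    /* value to adjust */
	unsigned int *checked;    /* subsystem's object counter */
};

/* Returns 1 if *tunable was raised, 0 otherwise. */
static int akt_up_sketch(struct tunable_sketch *t)
{
	unsigned int cur = *t->tunable;
	unsigned int next;

	/* step 1: compare *checked with threshold% of *tunable */
	if (*t->checked <= cur * t->threshold / 100)
		return 0;

	/* step 2: grow by (100 - threshold)% of the current value */
	next = cur + cur * (100 - t->threshold) / 100;

	/* assumed clamp, not spelled out in the numbered steps */
	if (next > t->max)
		next = t->max;

	*t->tunable = next;
	return next != cur;
}
```

With tunable = 1000, threshold = 80 and checked = 801, the check is
801 > 800, so tunable becomes 1000 + 1000 * 20 / 100 = 1200.
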
> +
> +
> +
> +1.6) akt and sysfs:
> +
...

> +
> +1.7) tunables that are namespace dependent
> +
...

> +
> +1.7.2) Initializing the tunable structure
> +
> +Then the tunable structure should be initialized by calling the following
> +routine:
> +
> +init_tunable_ipcns(namespace_ptr, structure_name, threshold, min, max,
> +		tunable_variable_ptr, checked_variable_ptr,
> +		tunable_variable_type);
> +
> +Parameters:
> +- namespace_ptr: pointer to the namespace the tunable belongs to.
> +
> +See DEFINE_TUNABLE for the other parameters

End with a period/full-stop '.'.

> +
> +1.7.3) Registering the tunable structure
> +
...

> +
> +2) User part:
> +
> +As seen above, the only way to activate automatic tuning is from user side:
> +- the directory /sys/tunables is created during the init phase.
> +- each time a tunable is registered by a kernel subsystem, a directory is
> +created for it under /sys/tunables.
> +- This directory contains 1 file for each tunable kobject attribute:

Please try to limit text documentation to 80 columns or less.

> ++-----------+---------------+-------------------+----------------------------+
> +| attribute | default value | how to set it     | effect                     |
> ++-----------+---------------+-------------------+----------------------------+
> +| autotune  | 0             | echo 1 > autotune | makes the tunable automatic|
> +|           |               | echo 0 > autotune | makes the tunable manual   |
> ++-----------+---------------+-------------------+----------------------------+
> +| max       | max value set | echo <M> > max    | sets the tunable max value |
> +|           | during tunable|                   | to <M>                     |
> +|           | definition    |                   |                            |
> ++-----------+---------------+-------------------+----------------------------+
> +| min       | min value set | echo <m> > min    | sets the tunable min value |
> +|           | during tunable|                   | to <m>                     |
> +|           | definition    |                   |                            |
> ++-----------+---------------+-------------------+----------------------------+
> +
> Index: linux-2.6.20-rc4/fs/Kconfig
> ===================================================================
> --- linux-2.6.20-rc4.orig/fs/Kconfig	2007-01-15 13:08:14.000000000 +0100
> +++ linux-2.6.20-rc4/fs/Kconfig	2007-01-15 14:20:20.000000000 +0100
> @@ -925,6 +925,8 @@ config PROC_KCORE
>  	bool "/proc/kcore support" if !ARM
>  	depends on PROC_FS && MMU
>  
> +source "kernel/autotune/Kconfig"

Why is that in the File systems menu?  Seems odd to me
for it to be there.  If it's just because it depends on
PROC_FS and SYSFS, then it should just go completely after
the File systems menu.

>  config PROC_VMCORE
>          bool "/proc/vmcore support (EXPERIMENTAL)"
>          depends on PROC_FS && EXPERIMENTAL && CRASH_DUMP
> Index: linux-2.6.20-rc4/include/linux/akt.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6.20-rc4/include/linux/akt.h	2007-01-15 14:26:24.000000000 +0100
> @@ -0,0 +1,186 @@
> +
> +#ifndef AKT_H
> +#define AKT_H
> +
> +#include <linux/types.h>
> +#include <linux/kobject.h>
> +
> +/*
> + * First parameter passed to the adjustment routine
> + */
> +#define AKT_UP   0   /* adjustment "up" */
> +#define AKT_DOWN 1   /* adjustment "down" */
> +
> +
> +struct auto_tune {
> +	spinlock_t tunable_lck; /* serializes access to the stucture fields */
> +	auto_tune_fn auto_tune; /* auto tuning routine registered by the */
> +				/* calling kernel susbsystem. If NULL, the */
> +				/* auto tuning routine that will be called */
> +				/* is the default one that processes uints */
> +	int (*check_parms)(struct auto_tune *);	/* min / max checking */
> +						/* routine ptr: points to */
> +						/* the appropriate routine */
> +						/* depending on the */
> +						/* tunable type */
> +	const char *name;
> +	char flags;	/* Only 2 bits are meaningful: */

Make flags unsigned char so that no sign bit is needed.
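
Roughly what is meant, as a standalone sketch. The bit names and helper
names below are hypothetical (the patch may already define its own); only
the bit meanings come from the comment in the struct.

```c
/* Sketch only: bit names are hypothetical, not taken from the patch. */
#define AKT_FLAG_AUTO       0x01u  /* bit 0: tunable may be auto-adjusted */
#define AKT_FLAG_REGISTERED 0x02u  /* bit 1: tunable has been registered */

struct flags_sketch {
	unsigned char flags;       /* all 8 bits usable, no sign bit */
};

/* Nonzero if bit 1 (registered) is set. */
static int sketch_is_registered(const struct flags_sketch *t)
{
	return (t->flags & AKT_FLAG_REGISTERED) != 0;
}

/* Sets bit 1 (registered) and leaves the other bits alone. */
static void sketch_set_registered(struct flags_sketch *t)
{
	t->flags |= AKT_FLAG_REGISTERED;
}
```

With an unsigned type, masks and comparisons on the field never have to
consider a negative value, even if bit 7 gets used later.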

> +			/* bit 0: set to 1 if the associated tunable can */
> +			/*        be automatically adjusted */
> +			/* bits 1: set to 1 if the tunable has been */
> +			/*         registered */
> +			/* bits 2-7: useless */

                                     unused ??

> +	char threshold;	/* threshold to enable the adjustment expressed as */
> +			/* a %age */
> +	struct typed_value min;	/* min value the tunable can ever reach */
> +				/* and associated show / store routines) */
> +	struct typed_value max;	/* max value the tunable can ever reach */
> +				/* and associated show / store routines) */
> +	void *tunable;	/* address of the tunable to adjust */
> +	void *checked;	/* address of the variable that is controlled by */
> +			/* the tunable. This is the calling subsystem's */
> +			/* object counter */
> +};
> +

...

> +
> +extern void fork_late_init(void);

Looks like the wrong header file for that extern.

> +#endif /* AKT_H */

> Index: linux-2.6.20-rc4/kernel/autotune/akt.c
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux-2.6.20-rc4/kernel/autotune/akt.c	2007-01-15 14:51:54.000000000 +0100
> @@ -0,0 +1,123 @@
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/akt.h>
> +
> +
> +

Too Much Whitespace.  :)

> +
> +
> +
> +/*
> + * FUNCTION:    Inserts a tunable structure into sysfs
> + *              This routine serves also as a checker for the tunable
> + *              structure fields.
> + *              This routine is called by any kernel subsystem that wants to
> + *              use akt services (automatic tunables adjustment) in the future
> + *
> + * NOTE: when calling this routine, the tunable structure should have already
> + *       been filled by defining it with DEFINE_TUNABLE()
> + *
> + * RETURN VALUE: 0: successful
> + *               <0 if failure
> + */

Please use kernel-doc format for function comment blocks.
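
For example, the register_tunable() comment could look something like the
block below. The wording is lifted from the existing comment; the function
body is only a stub so that the sketch compiles on its own, and the struct
is left opaque.

```c
#include <errno.h>
#include <stddef.h>

struct auto_tune;	/* opaque here; the real definition is in the patch */

/**
 * register_tunable - insert a tunable structure into sysfs
 * @tun: tunable structure, already filled in with DEFINE_TUNABLE()
 *
 * Also checks the tunable structure fields.  Called by any kernel
 * subsystem that wants to use the akt services later on.
 *
 * Returns 0 on success, <0 on failure.
 */
int register_tunable(struct auto_tune *tun)
{
	/* stub body so the sketch compiles; only the NULL check is kept */
	if (tun == NULL)
		return -EINVAL;
	return 0;
}
```

The comment opens with "/**", the first line is "name - short description",
and each parameter gets an "@name:" line.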

> +int register_tunable(struct auto_tune *tun)
> +{
> +	if (tun == NULL) {
> +		printk(KERN_ERR "\tBad tunable structure pointer (NULL)\n");

	Each printk() needs something that tells which module or part
	of the kernel it's coming from (sometimes called a prefix).
	And drop the \t (tab).  IOW, replace the tab with a prefix, e.g.:

		printk(KERN_ERR "autotune: Bad tunable structure NULL pointer\n");
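
One way to keep the prefix consistent across all of the messages is to
define it once.  Below is a userspace stand-in for that idea (printk() is
not available outside the kernel, so it formats into a buffer instead).
AKT_PREFIX and akt_err_sketch() are made-up names; in the kernel this would
simply be printk(KERN_ERR AKT_PREFIX "...").

```c
#include <stdarg.h>
#include <stdio.h>

#define AKT_PREFIX "autotune: "	/* hypothetical name for the prefix */

/*
 * Formats AKT_PREFIX followed by the caller's message into buf.
 * Returns the total length that was (or would have been) written,
 * or a negative value on a formatting error.
 */
static int akt_err_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n, m;

	/* write the prefix first */
	n = snprintf(buf, size, "%s", AKT_PREFIX);
	if (n < 0 || (size_t)n >= size)
		return n;

	/* then append the caller's formatted message */
	va_start(ap, fmt);
	m = vsnprintf(buf + n, size - n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

Called with "Bad threshold (%d) value\n" and 100, the buffer holds
"autotune: Bad threshold (100) value\n".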

> +		return -EINVAL;
> +	}
> +
> +	if (tun->threshold <= 0 || tun->threshold >= 100) {
> +		printk(KERN_ERR "\tBad threshold (%d) value "
> +			"- should be in the [1-99] interval\n",
> +			tun->threshold);

Replace \t with a prefix (and more below).

> +		return -EINVAL;
> +	}
> +
> +	if (tun->tunable == NULL) {
> +		printk(KERN_ERR "\tBad tunable pointer (NULL)\n");
> +		return -EINVAL;
> +	}
> +
> +	if (tun->checked == NULL) {
> +		printk(KERN_ERR "\tBad checked value pointer (NULL)\n");
> +		return -EINVAL;
> +	}
> +
> +	/*
> +	 * Check the min / max value
> +	 */
> +	if (tun->check_parms(tun)) {
> +		printk(KERN_ERR "\tBad min / max values\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +
> +/*
> + * FUNCTION:    Removes a tunable structure from sysfs.
> + *              This routine is called by any kernel subsystem that doesn't
> + *              need the akt services anymore
> + *
> + * NOTE:  reg_tun should point to a previously registered tunable
> + *
> + * RETURN VALUE: 0: successful
> + *               <0 if failure
> + */
> +int unregister_tunable(struct auto_tune *reg_tun)
> +{
> +	if (reg_tun == NULL) {
> +		printk(KERN_ERR "\tBad tunable address (NULL)\n");
> +		return -EINVAL;
> +	}
> +
> +	spin_lock(&reg_tun->tunable_lck);
> +
> +	BUG_ON(!is_tunable_registered(reg_tun));
> +
> +	reg_tun->flags = 0;
> +
> +	spin_unlock(&reg_tun->tunable_lck);
> +
> +	return 0;
> +}
> +

Too Much Whitespace....

> +
> +
> +EXPORT_SYMBOL_GPL(register_tunable);
> +EXPORT_SYMBOL_GPL(unregister_tunable);

---
~Randy
