lustre-devel-lustre.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Dilger <adilger@whamcloud.com>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] [PATCH 37/37] lustre: obd_sysfs: error-check value stored in jobid_var
Date: Wed, 27 Feb 2019 06:17:57 +0000	[thread overview]
Message-ID: <4271D74A-4B36-4781-A1DA-F5512CBCAF15@whamcloud.com> (raw)
In-Reply-To: <155053494716.24125.12637812266850199400.stgit@noble.brown>

On Feb 18, 2019, at 17:09, NeilBrown <neilb@suse.com> wrote:
> 
> The jobid_var sysfs attribute only has 3 meaningful values.
> Other values cause lustre_get_jobid() to return an error
> which is uniformly ignored.
> 
> To improve usability and resilience, check that the value
> written is acceptable before storing it.
> 
> Signed-off-by: NeilBrown <neilb@suse.com>

This will no longer be true once https://review.whamcloud.com/31691
commit 6488c0ec57de ("LU-10698 obdclass: allow specifying complex jobids")
is merged into your tree.  That allows specifying more useful jobid
values, similar to how coredump files can be named:

    Allow specifying a format string for the jobid_name variable to create
    a jobid for processes on the client.  The jobid_name is used when
    jobid_var=nodelocal, if jobid_name contains "%j", or as a fallback if
    getting the specified jobid_var from the environment fails.
    
    The jobid_node string allows the following escape sequences:
    
        %e = executable name
        %g = group ID
        %h = hostname (system utsname)
        %j = jobid from jobid_var environment variable
        %p = process ID
        %u = user ID
    
    Any unknown escape sequences are dropped. Other arbitrary characters
    pass through unmodified, up to the maximum jobid string size of 32,
    though whitespace within the jobid is not copied.

    This allows, for example, specifying an arbitrary prefix, such as the
    cluster name, in addition to the traditional "procname.uid" format,
    to distinguish between jobs running on clients in different clusters:
    
        lctl set_param jobid_var=nodelocal jobid_name=cluster2.%e.%u
    or
        lctl set_param jobid_var=SLURM_JOB_ID jobid_name=cluster2.%j.%e
    
    To use an environment-specified JobID, if available, but fall back to
    a static string for all processes that do not have a valid JobID:
    
        lctl set_param jobid_var=SLURM_JOB_ID jobid_name=unknown


Currently the "%j" function was removed from the kernel client, even
though there is no technical reason it can't work (i.e. all of the code
to implement it is available and exported).  This is actually super
useful for HPC cluster administrators to monitor per-job IO bandwidth
and IOPS on the server, and something that I think should be restored.

Cheers, Andreas

> ---
> drivers/staging/lustre/lustre/obdclass/obd_sysfs.c |   21 ++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/lustre/lustre/obdclass/obd_sysfs.c b/drivers/staging/lustre/lustre/obdclass/obd_sysfs.c
> index 57a6f2c2da1c..69ccc6a55947 100644
> --- a/drivers/staging/lustre/lustre/obdclass/obd_sysfs.c
> +++ b/drivers/staging/lustre/lustre/obdclass/obd_sysfs.c
> @@ -216,16 +216,25 @@ static ssize_t jobid_var_store(struct kobject *kobj, struct attribute *attr,
> 			       const char *buffer,
> 			       size_t count)
> {
> +	static char *valid[] = {
> +		JOBSTATS_DISABLE,
> +		JOBSTATS_PROCNAME_UID,
> +		JOBSTATS_NODELOCAL,
> +		NULL
> +	};
> +	int i;
> +
> 	if (!count || count > JOBSTATS_JOBID_VAR_MAX_LEN)
> 		return -EINVAL;
> 
> -	memset(obd_jobid_var, 0, JOBSTATS_JOBID_VAR_MAX_LEN + 1);
> -
> -	memcpy(obd_jobid_var, buffer, count);
> +	for (i = 0; valid[i]; i++)
> +		if (sysfs_streq(buffer, valid[i]))
> +			break;
> +	if (!valid[i])
> +		return -EINVAL;
> 
> -	/* Trim the trailing '\n' if any */
> -	if (obd_jobid_var[count - 1] == '\n')
> -		obd_jobid_var[count - 1] = 0;
> +	memset(obd_jobid_var, 0, JOBSTATS_JOBID_VAR_MAX_LEN + 1);
> +	strcpy(obd_jobid_var, valid[i]);
> 
> 	return count;
> }
> 
> 

Cheers, Andreas
---
Andreas Dilger
Principal Lustre Architect
Whamcloud

  reply	other threads:[~2019-02-27  6:17 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-19  0:09 [lustre-devel] [PATCH 00/37] More lustre patches from obdclass NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 01/37] lustre: obdclass: char obd_ioctl_getdata type NeilBrown
2019-02-24 18:35   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 02/37] lustre: llite: don't use class_setup_tunables() NeilBrown
2019-02-24 16:35   ` James Simmons
2019-02-25 22:27     ` NeilBrown
2019-02-26 22:18       ` James Simmons
2019-02-24 16:52   ` [lustre-devel] [PATCH 03/37] lustre: embed typ_kobj if obd_type James Simmons
2019-02-25 22:38     ` NeilBrown
2019-02-26 20:41       ` Simmons, James A.
2019-02-19  0:09 ` [lustre-devel] [PATCH 14/37] lustre: llog: change lgh_refcount to struct kref NeilBrown
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 13/37] lustre: llog: remove lgh_hdr_lock NeilBrown
2019-02-24 20:29   ` James Simmons
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 07/37] lustre: obd_type: discard obd_type_lock NeilBrown
2019-02-24 17:02   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 08/37] lustre: obdclass: don't copy ops structures in to new type NeilBrown
2019-02-24 17:03   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 16/37] lustre: obdclass: typo: Banlance -> Balance NeilBrown
2019-02-24 17:39   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 05/37] lustre: obd_type: use typ_kobj.name as typ_name NeilBrown
2019-02-24 16:56   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 03/37] lustre: embed typ_kobj if obd_type NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 09/37] lustre: obdclass: fix module load locking NeilBrown
2019-02-24 17:04   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 17/37] lustre: simplify lprocfs_read_frac_helper NeilBrown
2019-02-24 17:52   ` James Simmons
2019-02-26 23:59     ` NeilBrown
2019-02-27  1:06       ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 18/37] lustre: obdclass: discard lprocfs_single/seq_release NeilBrown
2019-02-24 17:53   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 12/37] lustre: remove unused function in linkea NeilBrown
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 10/37] lustre: kernelcomm: pass correct gfp_t to kmalloc NeilBrown
2019-02-24 17:05   ` James Simmons
2019-02-25 18:16     ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 21/37] lustre: remove several MAX_STRING_SIZE defines NeilBrown
2019-02-24 19:07   ` James Simmons
2019-02-27  0:41     ` NeilBrown
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 19/37] lustre: discard lprocfs_strnstr() NeilBrown
2019-02-24 17:53   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 15/37] lustre: llog_obd: Convert loc_refcount to refcount_t NeilBrown
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 11/37] lustre: kernelcomm: make libcfs_kkuc_msg_put static NeilBrown
2019-02-24 17:15   ` James Simmons
2019-02-26 23:45     ` NeilBrown
2019-02-27 22:36       ` James Simmons
2019-02-27 22:37   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 04/37] lustre: collect all resource releasing for obj_type NeilBrown
2019-02-24 16:54   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 06/37] lustre: obd_type: discard obd_types linked list NeilBrown
2019-02-24 17:00   ` James Simmons
2019-02-19  0:09 ` [lustre-devel] [PATCH 20/37] lustre: convert rsi_sem to a spinlock NeilBrown
2019-02-25 18:16   ` Andreas Dilger
2019-02-27  0:22     ` NeilBrown
2019-02-27  1:00       ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 32/37] lustre: portals_handle: rename ops to owner NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 22/37] lustre: lprocfs: use log2.h macros instead of shift loop NeilBrown
2019-02-24 18:09   ` James Simmons
2019-02-26 20:55   ` James Simmons
2019-02-27  0:51     ` NeilBrown
2019-02-27  0:54       ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 25/37] lustre: deprecate libcfs_debug_vmsg2 NeilBrown
2019-02-24 20:02   ` James Simmons
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 30/37] lustre: handle: move refcount into the lustre_handle NeilBrown
2019-02-27  6:32   ` Andreas Dilger
2019-02-27 21:48     ` NeilBrown
2019-02-27 22:14       ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 28/37] lustre: remove scope and source from class_incref and class_decref NeilBrown
2019-02-27  6:52   ` Andreas Dilger
2019-02-28  0:39     ` NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 26/37] lustre: remove libcfs_debug_vmsg2 NeilBrown
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 23/37] lustre: prefer to use tabs for alignment NeilBrown
2019-02-24 18:51   ` James Simmons
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 27/37] lustre: discard lu_ref NeilBrown
2019-02-24 20:28   ` James Simmons
2019-02-27  1:17     ` NeilBrown
2019-02-27  5:35       ` Andreas Dilger
2019-03-01  6:45         ` Mike Pershin
2019-02-19  0:09 ` [lustre-devel] [PATCH 29/37] lustre: handles: discard h_owner in favour of h_ops NeilBrown
2019-02-27  6:37   ` Andreas Dilger
2019-02-27 21:41     ` NeilBrown
2019-02-28  6:41       ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 33/37] lustre: portals_handle: remove locking from class_handle2object() NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 34/37] lustre: portals_handle: use hlist for hash lists NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 31/37] lustre: discard OBD_FREE_RCU NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 24/37] lustre: lu_object: remove extra newline from debug printing NeilBrown
2019-02-24 19:08   ` James Simmons
2019-02-25 18:16   ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 35/37] lustre: portals_handle: discard h_lock NeilBrown
2019-02-19  0:09 ` [lustre-devel] [PATCH 37/37] lustre: obd_sysfs: error-check value stored in jobid_var NeilBrown
2019-02-27  6:17   ` Andreas Dilger [this message]
2019-03-01  2:35     ` NeilBrown
2019-03-01  8:32       ` Andreas Dilger
2019-03-01 14:30         ` Patrick Farrell
2019-03-14  0:34           ` NeilBrown
2019-03-14 14:12             ` Patrick Farrell
2019-03-14 22:56               ` NeilBrown
2019-03-14 23:05               ` Andreas Dilger
2019-02-19  0:09 ` [lustre-devel] [PATCH 36/37] lustre: remove unused fields from struct obd_device NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4271D74A-4B36-4781-A1DA-F5512CBCAF15@whamcloud.com \
    --to=adilger@whamcloud.com \
    --cc=lustre-devel@lists.lustre.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).