From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1785EC47094 for ; Mon, 7 Jun 2021 18:40:32 +0000 (UTC) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9F33261108 for ; Mon, 7 Jun 2021 18:40:31 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9F33261108 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=tempfail smtp.mailfrom=linux-audit-bounces@redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1623091230; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:list-id:list-help: list-unsubscribe:list-subscribe:list-post; bh=XHhgWFBwSCLnESg6/CD9o+og35Qj78Cv49JuYoOnDJE=; b=WAzXJdTzyN2JBb4pMycyR8lidLWChjo063GXUkqIEdje7puhgZ8KNd6PJYOwht4/Fr3cOY VnFs91S61Sz/SJ54WbYzUuM3LuedvxZljqWQkwcaV8PhE2KWzC5IQsgU9pCxvh5kvqL4fD CWXELS3Ro9No9PwPJx6Q9QICf60R0sI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-18-DaDhqeWMNXGoJrqqoohzYQ-1; Mon, 07 Jun 2021 14:40:29 -0400 X-MC-Unique: DaDhqeWMNXGoJrqqoohzYQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8DF421B18BC3; Mon, 7 Jun 2021 18:40:26 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 662A85D736; Mon, 7 Jun 2021 18:40:26 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id CBF1B4ED79; Mon, 7 Jun 2021 18:40:25 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id 157IeOWT031669 for ; Mon, 7 Jun 2021 14:40:24 -0400 Received: by smtp.corp.redhat.com (Postfix) id 088465C1D1; Mon, 7 Jun 2021 18:40:24 +0000 (UTC) Received: from madcap2.tricolour.ca (unknown [10.3.128.13]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 1D5245C1C2; Mon, 7 Jun 2021 18:40:16 +0000 (UTC) Date: Mon, 7 Jun 2021 14:40:14 -0400 From: Richard Guy Briggs To: Paul Moore Subject: Re: [RFC PATCH] audit: reduce the number of kauditd_thread wakeups Message-ID: <20210607184013.GP447005@madcap2.tricolour.ca> References: <162294979358.65917.10329704371767126341.stgit@olly> MIME-Version: 1.0 In-Reply-To: <162294979358.65917.10329704371767126341.stgit@olly> User-Agent: Mutt/1.10.1 (2018-07-13) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-loop: linux-audit@redhat.com Cc: linux-audit@redhat.com X-BeenThere: linux-audit@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: Linux Audit Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linux-audit-bounces@redhat.com Errors-To: linux-audit-bounces@redhat.com X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=linux-audit-bounces@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Disposition: inline Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit On 2021-06-05 23:23, Paul Moore wrote: > [NOTE: As this is an RFC patch, I wanted to add some commentary at > the top of the patch description explaining where this patch came > from and what testing has been done. This patch is a derivative > of another unreleased patch that removed all of the wake up calls > from the audit_log_start() and audit_log_end() functions; while > that patch worked well, there we some record losees with high > volume burst traffic in the case of `auditctl -a task,never` or > CONFIG_AUDITSYSCALL=n. As this patch doesn't completely remove > the wake up calls these two cases should behave similarly to how > they do today. As far as testing is concerned, this patch passes > both the audit-testsuite and selinux-testsuite without problem and > with expected audit queue losses (no losses outside of the tests > which attempt to generate losses). This patch also preserves the > ability for the system to continue to function, in the > `auditctl -a exit,always -S all` case, albeit very slowly.] > > When audit is enabled, the kauditd_thread() function runs in its own > kernel thread, emptying the audit record queue and sending the > entries to userspace via different means. As part of its normal > processing it goes to sleep after emptying the queue and waits for > the audit functions to wake it. > > The current audit kernel code attempts to wake the kauditd thread > after each entry is added to the queue. Considering that a single > syscall can have multiple audit records, with each wake attempt > involving locking and additional processing, this can be rather > wasteful. In an effort to limit the number of wake attempts without > unnecessarily risking the audit queue size this patch does the > following: > > * In the case of syscall auditing the wake up call is moved from > audit_log_end() to audit_log_exit() meaning that the kauditd > thread is only woken once, at the end of the syscall, after all of > the syscall's audit records have been added to the queue. > > * In the case where audit records are generated outside of a syscall > context, the wake up call in audit_log_end() is preserved in order > to ensure that these records do not suffer any additional latency > or put unnecessary pressure on the queue. This is done through > some additional checking in audit_log_start() and an additional > flag in the audit_buffer struct. > > * The audit_log_start() function never attempts to wake the kauditd > thread. In the current code this is only done when the queue is > under pressure, but at that point it is extremely likely that the > thread is already active or scheduled, do we should be able to > safely remove this wake attempt. > > * Always wake the kauditd thread after processing management and > user records in audit_receive_msg(). This should be relatively > low frequency compared to the other audit sources, and doing a > wake call here helps keep record latency low and the queue size > in check. > > * The kauditd thread itself is now a bit better at handling high > volume audit record bursts. Previously after emptying the queue > the thread would wake every process that was blocked and then go > to sleep; possibly going to sleep just a flood of new records > are added to the queue. The new kauditd thread approach wakes all > of the blocked processes and then reschedules itself in > anticipation of new records. When the thread returns to execution > it checks the queue and if there are any records present it > immediately starts processing them, if the queue is empty the > kauditd thread goes back to sleep. > > Signed-off-by: Paul Moore This looks good to me. Thank you for the thorough description. I can see how this work was provoked given some of the other work in progress and some of the minor changes that will be needed to those other works as a result. Acked-by: Richard Guy Briggs > --- > kernel/audit.c | 66 ++++++++++++++++++++++++++++++++++++++---------------- > kernel/audit.h | 2 ++ > kernel/auditsc.c | 2 ++ > 3 files changed, 51 insertions(+), 19 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index 551a394bc8f42..7c3368835bb71 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -197,9 +197,10 @@ static struct audit_ctl_mutex { > * to place it on a transmit queue. Multiple audit_buffers can be in > * use simultaneously. */ > struct audit_buffer { > - struct sk_buff *skb; /* formatted skb ready to send */ > - struct audit_context *ctx; /* NULL or associated context */ > - gfp_t gfp_mask; > + struct sk_buff *skb; > + struct audit_context *ctx; > + gfp_t gfp_mask; > + unsigned int kauditd_wake:1; > }; > > struct audit_reply { > @@ -804,6 +805,17 @@ static void kauditd_send_multicast_skb(struct sk_buff *skb) > nlmsg_multicast(sock, copy, 0, AUDIT_NLGRP_READLOG, GFP_KERNEL); > } > > +/** > + * kauditd_wakeup - Wake the kauditd_thread > + * > + * Description: > + * Wake up the kauditd_thread thread so that it can process the audit queues. > + */ > +void kauditd_wakeup(void) > +{ > + wake_up_interruptible(&kauditd_wait); > +} > + > /** > * kauditd_thread - Worker thread to send audit records to userspace > * @dummy: unused > @@ -832,6 +844,7 @@ static int kauditd_thread(void *dummy) > portid = ac->portid; > rcu_read_unlock(); > > +all_queues: > /* attempt to flush the hold queue */ > rc = kauditd_send_queue(sk, portid, > &audit_hold_queue, UNICAST_RETRIES, > @@ -861,25 +874,33 @@ static int kauditd_thread(void *dummy) > kauditd_send_multicast_skb, > (sk ? > kauditd_retry_skb : kauditd_hold_skb)); > - if (ac && rc < 0) > + if (ac && rc < 0) { > auditd_reset(ac); > - sk = NULL; > + goto disconnect; > + } > + > + /* we have processed all the queues so wake everyone who might > + * be waiting, reschedule ourselves to give others a chance to > + * run, and check the queue once again when we return; if > + * everything is quiet then we can go back to sleep */ > + wake_up(&audit_backlog_wait); > + cond_resched(); > + if (skb_queue_len(&audit_queue)) > + goto all_queues; > > +disconnect: > /* drop our netns reference, no auditd sends past this line */ > + sk = NULL; > if (net) { > put_net(net); > net = NULL; > } > > - /* we have processed all the queues so wake everyone */ > - wake_up(&audit_backlog_wait); > - > /* NOTE: we want to wake up if there is anything on the queue, > * regardless of if an auditd is connected, as we need to > * do the multicast send and rotate records from the > * main queue to the retry/hold queues */ > - wait_event_freezable(kauditd_wait, > - (skb_queue_len(&audit_queue) ? 1 : 0)); > + wait_event_freezable(kauditd_wait, skb_queue_len(&audit_queue)); > } > > return 0; > @@ -1509,6 +1530,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > break; > } > > + /* poke the kauditd_thread to handle any records we generated */ > + kauditd_wakeup(); > + > return err < 0 ? err : 0; > } > > @@ -1750,6 +1774,7 @@ static struct audit_buffer *audit_buffer_alloc(struct audit_context *ctx, > > ab->ctx = ctx; > ab->gfp_mask = gfp_mask; > + ab->kauditd_wake = 0; > > return ab; > > @@ -1831,14 +1856,10 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, > > while (audit_backlog_limit && > (skb_queue_len(&audit_queue) > audit_backlog_limit)) { > - /* wake kauditd to try and flush the queue */ > - wake_up_interruptible(&kauditd_wait); > - > /* sleep if we are allowed and we haven't exhausted our > * backlog wait limit */ > if (gfpflags_allow_blocking(gfp_mask) && (stime > 0)) { > long rtime = stime; > - > DECLARE_WAITQUEUE(wait, current); > > add_wait_queue_exclusive(&audit_backlog_wait, > @@ -1864,10 +1885,14 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, > return NULL; > } > > - audit_get_stamp(ab->ctx, &t, &serial); > /* cancel dummy context to enable supporting records */ > if (ctx) > ctx->dummy = 0; > + /* force a kauditd wakeup if the normal syscall exit isn't doing it */ > + if (!ctx || !ctx->in_syscall) > + ab->kauditd_wake = 1; > + > + audit_get_stamp(ab->ctx, &t, &serial); > audit_log_format(ab, "audit(%llu.%03lu:%u): ", > (unsigned long long)t.tv_sec, t.tv_nsec/1000000, serial); > > @@ -2365,8 +2390,8 @@ int audit_signal_info(int sig, struct task_struct *t) > * > * We can not do a netlink send inside an irq context because it blocks (last > * arg, flags, is not set to MSG_DONTWAIT), so the audit buffer is placed on a > - * queue and a kthread is scheduled to remove them from the queue outside the > - * irq context. May be called in any context. > + * queue where a kthread processes the buffer from outside the irq context. > + * May be called in any context. > */ > void audit_log_end(struct audit_buffer *ab) > { > @@ -2385,9 +2410,12 @@ void audit_log_end(struct audit_buffer *ab) > nlh = nlmsg_hdr(skb); > nlh->nlmsg_len = skb->len - NLMSG_HDRLEN; > > - /* queue the netlink packet and poke the kauditd thread */ > + /* queue the netlink packet */ > skb_queue_tail(&audit_queue, skb); > - wake_up_interruptible(&kauditd_wait); > + > + /* only wakeup kauditd if we can't do it later */ > + if (ab->kauditd_wake) > + kauditd_wakeup(); > } else > audit_log_lost("rate limit exceeded"); > > diff --git a/kernel/audit.h b/kernel/audit.h > index 3b9c0945225a1..b5f677017e975 100644 > --- a/kernel/audit.h > +++ b/kernel/audit.h > @@ -213,6 +213,8 @@ static inline int audit_hash_ino(u32 ino) > /* Indicates that audit should log the full pathname. */ > #define AUDIT_NAME_FULL -1 > > +extern void kauditd_wakeup(void); > + > extern int audit_match_class(int class, unsigned syscall); > extern int audit_comparator(const u32 left, const u32 op, const u32 right); > extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right); > diff --git a/kernel/auditsc.c b/kernel/auditsc.c > index 47fb48f42c934..bad776497438f 100644 > --- a/kernel/auditsc.c > +++ b/kernel/auditsc.c > @@ -1600,6 +1600,8 @@ static void audit_log_exit(void) > audit_log_end(ab); > if (call_panic) > audit_panic("error converting sid to string"); > + > + kauditd_wakeup(); > } > > /** > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://listman.redhat.com/mailman/listinfo/linux-audit - RGB -- Richard Guy Briggs Sr. S/W Engineer, Kernel Security, Base Operating Systems Remote, Ottawa, Red Hat Canada IRC: rgb, SunRaycer Voice: +1.647.777.2635, Internal: (81) 32635 -- Linux-audit mailing list Linux-audit@redhat.com https://listman.redhat.com/mailman/listinfo/linux-audit