From mboxrd@z Thu Jan 1 00:00:00 1970 From: Paul Moore Subject: Re: [PATCH] audit: make sure we don't let the retry queue grow without bounds Date: Mon, 10 Apr 2017 17:04:15 -0400 Message-ID: References: <149185766728.14543.3818679159914063670.stgit@sifl> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Received: from mx1.redhat.com (ext-mx07.extmail.prod.ext.phx2.redhat.com [10.5.110.31]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 5FA34179E9 for ; Mon, 10 Apr 2017 21:04:18 +0000 (UTC) Received: from mail-vk0-f68.google.com (mail-vk0-f68.google.com [209.85.213.68]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 90B11C04B326 for ; Mon, 10 Apr 2017 21:04:16 +0000 (UTC) Received: by mail-vk0-f68.google.com with SMTP id z204so13721338vkd.2 for ; Mon, 10 Apr 2017 14:04:16 -0700 (PDT) In-Reply-To: <149185766728.14543.3818679159914063670.stgit@sifl> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: linux-audit-bounces@redhat.com Errors-To: linux-audit-bounces@redhat.com To: Paul Moore Cc: Seth Forshee , linux-audit@redhat.com List-Id: linux-audit@redhat.com On Mon, Apr 10, 2017 at 4:54 PM, Paul Moore wrote: > From: Paul Moore > > The retry queue is intended to provide a temporary buffer in the case > of transient errors when communicating with auditd, it is not meant > as a long life queue, that functionality is provided by the hold > queue. > > This patch fixes a problem identified by Seth where the retry queue > could grow uncontrollably if an auditd instance did not connect to > the kernel to drain the queues. This commit fixes this by doing the > following: > > * Make sure we always call auditd_reset() if we decide the connection > with audit is really dead. There were some cases in > kauditd_hold_skb() where we did not reset the connection, this patch > relocates the reset calls to kauditd_thread() so all the error > conditions are caught and the connection reset. As a side effect, > this means we could move auditd_reset() and get rid of the forward > definition at the top of kernel/audit.c. > > * We never checked the status of the auditd connection when > processing the main audit queue which meant that the retry queue > could grow unchecked. This patch adds a call to auditd_reset() > after the main queue has been processed if auditd is not connected, > the auditd_reset() call will make sure the retry and hold queues are > correctly managed/flushed so that the retry queue remains reasonable. > > Reported-by: Seth Forshee > Signed-off-by: Paul Moore > --- > kernel/audit.c | 67 +++++++++++++++++++++++++++----------------------------- > 1 file changed, 32 insertions(+), 35 deletions(-) FWIW, I just ran this for about 30m randomly starting and stopping auditd under heavy audit load and didn't notice any change in memory usage on my test system. If I don't hear anything negative by tomorrow I'll send this up to Linus. > diff --git a/kernel/audit.c b/kernel/audit.c > index 2f4964cfde0b..a871bf80fde1 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -160,7 +160,6 @@ static LIST_HEAD(audit_freelist); > > /* queue msgs to send via kauditd_task */ > static struct sk_buff_head audit_queue; > -static void kauditd_hold_skb(struct sk_buff *skb); > /* queue msgs due to temporary unicast send problems */ > static struct sk_buff_head audit_retry_queue; > /* queue msgs waiting for new auditd connection */ > @@ -454,30 +453,6 @@ static void auditd_set(int pid, u32 portid, struct net *net) > } > > /** > - * auditd_reset - Disconnect the auditd connection > - * > - * Description: > - * Break the auditd/kauditd connection and move all the queued records into the > - * hold queue in case auditd reconnects. > - */ > -static void auditd_reset(void) > -{ > - struct sk_buff *skb; > - > - /* if it isn't already broken, break the connection */ > - rcu_read_lock(); > - if (auditd_conn.pid) > - auditd_set(0, 0, NULL); > - rcu_read_unlock(); > - > - /* flush all of the main and retry queues to the hold queue */ > - while ((skb = skb_dequeue(&audit_retry_queue))) > - kauditd_hold_skb(skb); > - while ((skb = skb_dequeue(&audit_queue))) > - kauditd_hold_skb(skb); > -} > - > -/** > * kauditd_print_skb - Print the audit record to the ring buffer > * @skb: audit record > * > @@ -505,9 +480,6 @@ static void kauditd_rehold_skb(struct sk_buff *skb) > { > /* put the record back in the queue at the same place */ > skb_queue_head(&audit_hold_queue, skb); > - > - /* fail the auditd connection */ > - auditd_reset(); > } > > /** > @@ -544,9 +516,6 @@ static void kauditd_hold_skb(struct sk_buff *skb) > /* we have no other options - drop the message */ > audit_log_lost("kauditd hold queue overflow"); > kfree_skb(skb); > - > - /* fail the auditd connection */ > - auditd_reset(); > } > > /** > @@ -567,6 +536,30 @@ static void kauditd_retry_skb(struct sk_buff *skb) > } > > /** > + * auditd_reset - Disconnect the auditd connection > + * > + * Description: > + * Break the auditd/kauditd connection and move all the queued records into the > + * hold queue in case auditd reconnects. > + */ > +static void auditd_reset(void) > +{ > + struct sk_buff *skb; > + > + /* if it isn't already broken, break the connection */ > + rcu_read_lock(); > + if (auditd_conn.pid) > + auditd_set(0, 0, NULL); > + rcu_read_unlock(); > + > + /* flush all of the main and retry queues to the hold queue */ > + while ((skb = skb_dequeue(&audit_retry_queue))) > + kauditd_hold_skb(skb); > + while ((skb = skb_dequeue(&audit_queue))) > + kauditd_hold_skb(skb); > +} > + > +/** > * auditd_send_unicast_skb - Send a record via unicast to auditd > * @skb: audit record > * > @@ -758,6 +751,7 @@ static int kauditd_thread(void *dummy) > NULL, kauditd_rehold_skb); > if (rc < 0) { > sk = NULL; > + auditd_reset(); > goto main_queue; > } > > @@ -767,6 +761,7 @@ static int kauditd_thread(void *dummy) > NULL, kauditd_hold_skb); > if (rc < 0) { > sk = NULL; > + auditd_reset(); > goto main_queue; > } > > @@ -775,16 +770,18 @@ static int kauditd_thread(void *dummy) > * unicast, dump failed record sends to the retry queue; if > * sk == NULL due to previous failures we will just do the > * multicast send and move the record to the retry queue */ > - kauditd_send_queue(sk, portid, &audit_queue, 1, > - kauditd_send_multicast_skb, > - kauditd_retry_skb); > + rc = kauditd_send_queue(sk, portid, &audit_queue, 1, > + kauditd_send_multicast_skb, > + kauditd_retry_skb); > + if (sk == NULL || rc < 0) > + auditd_reset(); > + sk = NULL; > > /* drop our netns reference, no auditd sends past this line */ > if (net) { > put_net(net); > net = NULL; > } > - sk = NULL; > > /* we have processed all the queues so wake everyone */ > wake_up(&audit_backlog_wait); > > -- > Linux-audit mailing list > Linux-audit@redhat.com > https://www.redhat.com/mailman/listinfo/linux-audit -- paul moore www.paul-moore.com