[BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd

From: Toshiyuki Okajima @ 2013-10-11  1:36 UTC
  To: viro, eparis; +Cc: linux-audit, linux-kernel, toshi.okajima

Hi. 

The following reproducer causes the auditd daemon to hang.
(The hang is released once audit_backlog_wait_time has elapsed.)
 # auditctl -a exit,always -S all
 # reboot


I reproduced the hang on KVM and captured a crash dump.
Analyzing the dump, I found that the auditd daemon was hung in audit_log_start.
(I have confirmed this on linux-3.12-rc4.)

Like this:
crash> bt 1426
PID: 1426   TASK: ffff88007b63e040  CPU: 1   COMMAND: "auditd"
 #0 [ffff88007cb93918] __schedule at ffffffff8155d980
 #1 [ffff88007cb939b0] schedule at ffffffff8155de99
 #2 [ffff88007cb939c0] schedule_timeout at ffffffff8155b840
 #3 [ffff88007cb93a60] audit_log_start at ffffffff810d3ce5
 #4 [ffff88007cb93b20] audit_log_config_change at ffffffff810d3ece
 #5 [ffff88007cb93b60] audit_receive_msg at ffffffff810d4fd6
 #6 [ffff88007cb93c00] audit_receive at ffffffff810d5173
 #7 [ffff88007cb93c30] netlink_unicast at ffffffff814c5269
 #8 [ffff88007cb93c90] netlink_sendmsg at ffffffff814c6386
 #9 [ffff88007cb93d20] sock_sendmsg at ffffffff814813c0
#10 [ffff88007cb93e30] SYSC_sendto at ffffffff81481524
#11 [ffff88007cb93f70] sys_sendto at ffffffff8148157e
#12 [ffff88007cb93f80] system_call_fastpath at ffffffff81568052
    RIP: 00007f5c47f7fba3  RSP: 00007fffcf21a118  RFLAGS: 00010202
    RAX: 000000000000002c  RBX: ffffffff81568052  RCX: 0000000000000000
    RDX: 0000000000000030  RSI: 00007fffcf21e7d0  RDI: 0000000000000003
    RBP: 00007fffcf21e7d0   R8: 00007fffcf21a130   R9: 000000000000000c
    R10: 0000000000000000  R11: 0000000000000293  R12: ffffffff8148157e
    R13: ffff88007cb93f78  R14: 0000000000000020  R15: 0000000000000030
    ORIG_RAX: 000000000000002c  CS: 0033  SS: 002b


The reason is that the auditd daemon cannot consume its own backlog
while it is sleeping in schedule_timeout, called from audit_log_start.
So auditd ends up waiting on itself. That is a deadlock!

Therefore, I think audit_log_start should not wait on the backlog
when it is executed by the auditd daemon itself.
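
The reasoning above can be illustrated with a small userspace model. This is
only a sketch, not kernel code: struct backlog_model, model_log_start and the
skip_for_consumer flag are hypothetical names, and time is simulated as loop
ticks instead of schedule_timeout. It assumes the consumer drains one record
per tick, and only while it is not the task that is waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct backlog_model {
	int queued;       /* records waiting for the consumer */
	int limit;        /* stands in for the audit backlog limit */
	int wait_ticks;   /* stands in for audit_backlog_wait_time */
	int consumer_pid; /* stands in for audit_pid */
};

/*
 * Model of a task asking for room in the queue.
 * Returns the number of ticks spent waiting and sets *got_room to
 * whether the caller may queue a record afterwards.
 */
static int model_log_start(struct backlog_model *m, int caller_pid,
			   bool skip_for_consumer, bool *got_room)
{
	int waited = 0;

	assert(m->limit > 0 && m->wait_ticks >= 0);

	/* Guard in the spirit of the proposal: the consumer never waits
	 * on its own queue, and no record is started for it. */
	if (skip_for_consumer && caller_pid == m->consumer_pid) {
		*got_room = false;
		return 0;
	}

	while (m->queued >= m->limit && waited < m->wait_ticks) {
		/* The consumer drains one record per tick, but only
		 * while it is not the task blocked in this loop. */
		if (caller_pid != m->consumer_pid)
			m->queued--;
		waited++;
	}

	*got_room = m->queued < m->limit;
	return waited;
}
```

In this model, a full queue with the consumer as caller and the guard disabled
waits for all of wait_ticks and still gets no room, which mirrors the hang
being released only by the timeout. With the guard enabled the same call
returns at once, and any other caller gets room after a single tick.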

For example, I made the following patch as a fix.
--------------------------------------------------------------
The auditd daemon can itself execute audit_log_start, which can then hang
because only the auditd daemon can consume the backlog.
So audit_log_start should not handle the backlog when it is executed by
the auditd daemon; otherwise the daemon hangs inside wait_for_auditd.

Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
---
 kernel/audit.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 7b0e23a..86c389e 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
 	int reserve;
 	unsigned long timeout_start = jiffies;
 
+	if (audit_pid && (audit_pid == current->pid))
+		return NULL;
+
 	if (audit_initialized != AUDIT_INITIALIZED)
 		return NULL;
 
-- 
1.5.5.6
