* [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd @ 2013-10-11 1:36 Toshiyuki Okajima 2013-10-11 9:33 ` Gao feng 0 siblings, 1 reply; 22+ messages in thread From: Toshiyuki Okajima @ 2013-10-11 1:36 UTC (permalink / raw) To: viro, eparis; +Cc: linux-audit, linux-kernel, toshi.okajima Hi. The following reproducer causes auditd daemon hang up. (But the hang up is released after the audit_backlog_wait_time passes.) # auditctl -a exit,always -S all # reboot I reproduced the hangup on KVM, and then got a crash dump. After I analyzed the dump, I found auditd daemon hung up in audit_log_start. (I have confirmed it on linux-3.12-rc4.) Like this: crash> bt 1426 PID: 1426 TASK: ffff88007b63e040 CPU: 1 COMMAND: "auditd" #0 [ffff88007cb93918] __schedule at ffffffff8155d980 #1 [ffff88007cb939b0] schedule at ffffffff8155de99 #2 [ffff88007cb939c0] schedule_timeout at ffffffff8155b840 #3 [ffff88007cb93a60] audit_log_start at ffffffff810d3ce5 #4 [ffff88007cb93b20] audit_log_config_change at ffffffff810d3ece #5 [ffff88007cb93b60] audit_receive_msg at ffffffff810d4fd6 #6 [ffff88007cb93c00] audit_receive at ffffffff810d5173 #7 [ffff88007cb93c30] netlink_unicast at ffffffff814c5269 #8 [ffff88007cb93c90] netlink_sendmsg at ffffffff814c6386 #9 [ffff88007cb93d20] sock_sendmsg at ffffffff814813c0 #10 [ffff88007cb93e30] SYSC_sendto at ffffffff81481524 #11 [ffff88007cb93f70] sys_sendto at ffffffff8148157e #12 [ffff88007cb93f80] system_call_fastpath at ffffffff81568052 RIP: 00007f5c47f7fba3 RSP: 00007fffcf21a118 RFLAGS: 00010202 RAX: 000000000000002c RBX: ffffffff81568052 RCX: 0000000000000000 RDX: 0000000000000030 RSI: 00007fffcf21e7d0 RDI: 0000000000000003 RBP: 00007fffcf21e7d0 R8: 00007fffcf21a130 R9: 000000000000000c R10: 0000000000000000 R11: 0000000000000293 R12: ffffffff8148157e R13: ffff88007cb93f78 R14: 0000000000000020 R15: 0000000000000030 ORIG_RAX: 000000000000002c CS: 0033 SS: 002b The reason is that auditd daemon itself cannot consume its backlog while audit_log_start is calling schedule_timeout on auditd daemon. So, that is a deadlock! Therefore, I think audit_log_start shouldn't handle auditd's backlog when auditd daemon executes audit_log_start. For example, I made the following fix patch. -------------------------------------------------------------- auditd daemon can execute the audit_log_start, and then it can cause a hang up because only auditd daemon can consume the backlog. So, audit_log_start executed by auditd daemon should not handle the backlog in case auditd daemon hangs up (while wait_for_auditd is calling). Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> --- kernel/audit.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 7b0e23a..86c389e 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, int reserve; unsigned long timeout_start = jiffies; + if (audit_pid && (audit_pid == current->pid)) + return NULL; + if (audit_initialized != AUDIT_INITIALIZED) return NULL; -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd 2013-10-11 1:36 [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd Toshiyuki Okajima @ 2013-10-11 9:33 ` Gao feng 2013-10-11 12:29 ` Toshiyuki Okajima (smtp-b.css) 0 siblings, 1 reply; 22+ messages in thread From: Gao feng @ 2013-10-11 9:33 UTC (permalink / raw) To: Toshiyuki Okajima; +Cc: viro, eparis, linux-audit, linux-kernel On 10/11/2013 09:36 AM, Toshiyuki Okajima wrote: > Hi. > > The following reproducer causes auditd daemon hang up. > (But the hang up is released after the audit_backlog_wait_time passes.) > # auditctl -a exit,always -S all > # reboot > > > I reproduced the hangup on KVM, and then got a crash dump. > After I analyzed the dump, I found auditd daemon hung up in audit_log_start. > (I have confirmed it on linux-3.12-rc4.) > > Like this: > crash> bt 1426 > PID: 1426 TASK: ffff88007b63e040 CPU: 1 COMMAND: "auditd" > #0 [ffff88007cb93918] __schedule at ffffffff8155d980 > #1 [ffff88007cb939b0] schedule at ffffffff8155de99 > #2 [ffff88007cb939c0] schedule_timeout at ffffffff8155b840 > #3 [ffff88007cb93a60] audit_log_start at ffffffff810d3ce5 > #4 [ffff88007cb93b20] audit_log_config_change at ffffffff810d3ece > #5 [ffff88007cb93b60] audit_receive_msg at ffffffff810d4fd6 > #6 [ffff88007cb93c00] audit_receive at ffffffff810d5173 > #7 [ffff88007cb93c30] netlink_unicast at ffffffff814c5269 > #8 [ffff88007cb93c90] netlink_sendmsg at ffffffff814c6386 > #9 [ffff88007cb93d20] sock_sendmsg at ffffffff814813c0 > #10 [ffff88007cb93e30] SYSC_sendto at ffffffff81481524 > #11 [ffff88007cb93f70] sys_sendto at ffffffff8148157e > #12 [ffff88007cb93f80] system_call_fastpath at ffffffff81568052 > RIP: 00007f5c47f7fba3 RSP: 00007fffcf21a118 RFLAGS: 00010202 > RAX: 000000000000002c RBX: ffffffff81568052 RCX: 0000000000000000 > RDX: 0000000000000030 RSI: 00007fffcf21e7d0 RDI: 0000000000000003 > RBP: 00007fffcf21e7d0 R8: 00007fffcf21a130 R9: 000000000000000c > R10: 0000000000000000 R11: 0000000000000293 R12: ffffffff8148157e > R13: ffff88007cb93f78 R14: 0000000000000020 R15: 0000000000000030 > ORIG_RAX: 000000000000002c CS: 0033 SS: 002b > > > The reason is that auditd daemon itself cannot consume its backlog > while audit_log_start is calling schedule_timeout on auditd daemon. > So, that is a deadlock! > > Therefore, I think audit_log_start shouldn't handle auditd's backlog > when auditd daemon executes audit_log_start. > > For example, I made the following fix patch. > -------------------------------------------------------------- > auditd daemon can execute the audit_log_start, and then it can cause > a hang up because only auditd daemon can consume the backlog. > So, audit_log_start executed by auditd daemon should not handle the backlog > in case auditd daemon hangs up (while wait_for_auditd is calling). > > Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> > --- > kernel/audit.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index 7b0e23a..86c389e 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, > int reserve; > unsigned long timeout_start = jiffies; > > + if (audit_pid && (audit_pid == current->pid)) > + return NULL; > + audit_log_start can be called in interrupt context, such as iptables AUDIT module, we can't use current here. please try the patch below. diff --git a/kernel/audit.c b/kernel/audit.c index 7b0e23a..1f35f3d 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -270,9 +270,13 @@ static int audit_log_config_change(char *function_name, int new, int old, int allow_changes) { struct audit_buffer *ab; + gfp_t gfp_mask = GFP_KERNEL; int rc = 0; - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); + if (audit_pid && audit_pid == current->pid) + gfp_mask = GFP_ATOMIC; + + ab = audit_log_start(NULL, gfp_mask, AUDIT_CONFIG_CHANGE); if (unlikely(!ab)) return rc; audit_log_format(ab, "%s=%d old=%d", function_name, new, old); Thanks ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd 2013-10-11 9:33 ` Gao feng @ 2013-10-11 12:29 ` Toshiyuki Okajima (smtp-b.css) 2013-10-15 4:43 ` [BUG][PATCH] audit: audit_log_start running on auditd should not stop Toshiyuki Okajima 0 siblings, 1 reply; 22+ messages in thread From: Toshiyuki Okajima (smtp-b.css) @ 2013-10-11 12:29 UTC (permalink / raw) To: Gao feng; +Cc: toshi.okajima, viro, eparis, linux-kernel Gao-san, (2013/10/11 18:33), Gao feng wrote: > On 10/11/2013 09:36 AM, Toshiyuki Okajima wrote: >> Hi. >> >> The following reproducer causes auditd daemon hang up. >> (But the hang up is released after the audit_backlog_wait_time passes.) >> # auditctl -a exit,always -S all >> # reboot >> >> >> I reproduced the hangup on KVM, and then got a crash dump. >> After I analyzed the dump, I found auditd daemon hung up in audit_log_start. >> (I have confirmed it on linux-3.12-rc4.) >> >> Like this: >> crash> bt 1426 >> PID: 1426 TASK: ffff88007b63e040 CPU: 1 COMMAND: "auditd" >> #0 [ffff88007cb93918] __schedule at ffffffff8155d980 >> #1 [ffff88007cb939b0] schedule at ffffffff8155de99 >> #2 [ffff88007cb939c0] schedule_timeout at ffffffff8155b840 >> #3 [ffff88007cb93a60] audit_log_start at ffffffff810d3ce5 >> #4 [ffff88007cb93b20] audit_log_config_change at ffffffff810d3ece >> #5 [ffff88007cb93b60] audit_receive_msg at ffffffff810d4fd6 >> #6 [ffff88007cb93c00] audit_receive at ffffffff810d5173 >> #7 [ffff88007cb93c30] netlink_unicast at ffffffff814c5269 >> #8 [ffff88007cb93c90] netlink_sendmsg at ffffffff814c6386 >> #9 [ffff88007cb93d20] sock_sendmsg at ffffffff814813c0 >> #10 [ffff88007cb93e30] SYSC_sendto at ffffffff81481524 >> #11 [ffff88007cb93f70] sys_sendto at ffffffff8148157e >> #12 [ffff88007cb93f80] system_call_fastpath at ffffffff81568052 >> RIP: 00007f5c47f7fba3 RSP: 00007fffcf21a118 RFLAGS: 00010202 >> RAX: 000000000000002c RBX: ffffffff81568052 RCX: 0000000000000000 >> RDX: 0000000000000030 RSI: 00007fffcf21e7d0 RDI: 0000000000000003 >> RBP: 00007fffcf21e7d0 R8: 00007fffcf21a130 R9: 000000000000000c >> R10: 0000000000000000 R11: 0000000000000293 R12: ffffffff8148157e >> R13: ffff88007cb93f78 R14: 0000000000000020 R15: 0000000000000030 >> ORIG_RAX: 000000000000002c CS: 0033 SS: 002b >> >> >> The reason is that auditd daemon itself cannot consume its backlog >> while audit_log_start is calling schedule_timeout on auditd daemon. >> So, that is a deadlock! >> >> Therefore, I think audit_log_start shouldn't handle auditd's backlog >> when auditd daemon executes audit_log_start. >> >> For example, I made the following fix patch. >> -------------------------------------------------------------- >> auditd daemon can execute the audit_log_start, and then it can cause >> a hang up because only auditd daemon can consume the backlog. >> So, audit_log_start executed by auditd daemon should not handle the backlog >> in case auditd daemon hangs up (while wait_for_auditd is calling). >> >> Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> >> --- >> kernel/audit.c | 3 +++ >> 1 files changed, 3 insertions(+), 0 deletions(-) >> >> diff --git a/kernel/audit.c b/kernel/audit.c >> index 7b0e23a..86c389e 100644 >> --- a/kernel/audit.c >> +++ b/kernel/audit.c >> @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, >> int reserve; >> unsigned long timeout_start = jiffies; >> >> + if (audit_pid && (audit_pid == current->pid)) >> + return NULL; >> + > > audit_log_start can be called in interrupt context, such as iptables AUDIT module, > we can't use current here. > please try the patch below. > > diff --git a/kernel/audit.c b/kernel/audit.c > index 7b0e23a..1f35f3d 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -270,9 +270,13 @@ static int audit_log_config_change(char *function_name, int new, int old, > int allow_changes) > { > struct audit_buffer *ab; > + gfp_t gfp_mask = GFP_KERNEL; > int rc = 0; > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE); > + if (audit_pid && audit_pid == current->pid) > + gfp_mask = GFP_ATOMIC; > + > + ab = audit_log_start(NULL, gfp_mask, AUDIT_CONFIG_CHANGE); > if (unlikely(!ab)) > return rc; > audit_log_format(ab, "%s=%d old=%d", function_name, new, old); > > Thanks for your suggestion. I will try it. Regards, Toshiyuki Okajima > Thanks > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-11 12:29 ` Toshiyuki Okajima (smtp-b.css) @ 2013-10-15 4:43 ` Toshiyuki Okajima 2013-10-15 6:30 ` Gao feng 0 siblings, 1 reply; 22+ messages in thread From: Toshiyuki Okajima @ 2013-10-15 4:43 UTC (permalink / raw) To: Gao feng, viro, eparis; +Cc: toshi.okajima, linux-kernel The backlog cannot be consumed when audit_log_start is running on auditd even if audit_log_start calls wait_for_auditd to consume it. The situation is a deadlock because only auditd can consume the backlog. If the other process needs to send the backlog, it can be also stopped by the deadlock. So, audit_log_start running on auditd should not stop. You can see the deadlock with the following reproducer: # auditctl -a exit,always -S all # reboot Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Cc: gaofeng@cn.fujitsu.com --- kernel/audit.c | 3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 7b0e23a..ce1fb38 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, int reserve; unsigned long timeout_start = jiffies; + if (audit_pid && audit_pid == current->pid) + gfp_mask &= ~__GFP_WAIT; + if (audit_initialized != AUDIT_INITIALIZED) return NULL; -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-15 4:43 ` [BUG][PATCH] audit: audit_log_start running on auditd should not stop Toshiyuki Okajima @ 2013-10-15 6:30 ` Gao feng 2013-10-15 7:07 ` Toshiyuki Okajima 2013-10-23 19:55 ` [BUG][PATCH] " Richard Guy Briggs 0 siblings, 2 replies; 22+ messages in thread From: Gao feng @ 2013-10-15 6:30 UTC (permalink / raw) To: Toshiyuki Okajima; +Cc: viro, eparis, linux-kernel Hi Toshiyuki-san, On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote: > The backlog cannot be consumed when audit_log_start is running on auditd > even if audit_log_start calls wait_for_auditd to consume it. > The situation is a deadlock because only auditd can consume the backlog. > If the other process needs to send the backlog, it can be also stopped > by the deadlock. > > So, audit_log_start running on auditd should not stop. > > You can see the deadlock with the following reproducer: > # auditctl -a exit,always -S all > # reboot > > Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> > Cc: gaofeng@cn.fujitsu.com > --- > kernel/audit.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index 7b0e23a..ce1fb38 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, > int reserve; > unsigned long timeout_start = jiffies; > > + if (audit_pid && audit_pid == current->pid) > + gfp_mask &= ~__GFP_WAIT; > + > if (audit_initialized != AUDIT_INITIALIZED) > return NULL; > > Hmm, I see, There may be other code paths that auditd can call audit_log_start except audit_log_config_change. so it's better to handle this problem in audit_log_start. but current task is only meaningful when gfp_mask & __GFP_WAIT is true. so maybe the below patch is what you want. diff --git a/kernel/audit.c b/kernel/audit.c index 7b0e23a..10b4545 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct audit_context struct audit_buffer *ab = NULL; struct timespec t; unsigned int uninitialized_var(serial); - int reserve; + int reserve = 5; /* Allow atomic callers to go up to five + entries over the normal backlog limit */ + unsigned long timeout_start = jiffies; if (audit_initialized != AUDIT_INITIALIZED) @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct audit_contex if (unlikely(audit_filter_type(type))) return NULL; - if (gfp_mask & __GFP_WAIT) - reserve = 0; - else - reserve = 5; /* Allow atomic callers to go up to five - entries over the normal backlog limit */ + if (gfp_mask & __GFP_WAIT) { + if (audit_pid && audit_pid == current->pid) + gfp_mask &= ~__GFP_WAIT; + else + reserve = 0; + } while (audit_backlog_limit && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserv Thanks ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-15 6:30 ` Gao feng @ 2013-10-15 7:07 ` Toshiyuki Okajima 2013-10-15 7:58 ` [BUG][PATCH V3] " Toshiyuki Okajima 2013-10-23 19:55 ` [BUG][PATCH] " Richard Guy Briggs 1 sibling, 1 reply; 22+ messages in thread From: Toshiyuki Okajima @ 2013-10-15 7:07 UTC (permalink / raw) To: Gao feng; +Cc: toshi.okajima, toshi.okajima, viro, eparis, linux-kernel Hi Gao-san, (2013/10/15 15:30), Gao feng wrote: > Hi Toshiyuki-san, > On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote: >> The backlog cannot be consumed when audit_log_start is running on auditd >> even if audit_log_start calls wait_for_auditd to consume it. >> The situation is a deadlock because only auditd can consume the backlog. >> If the other process needs to send the backlog, it can be also stopped >> by the deadlock. >> >> So, audit_log_start running on auditd should not stop. >> >> You can see the deadlock with the following reproducer: >> # auditctl -a exit,always -S all >> # reboot >> >> Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> >> Cc: gaofeng@cn.fujitsu.com >> --- >> kernel/audit.c | 3 +++ >> 1 files changed, 3 insertions(+), 0 deletions(-) >> >> diff --git a/kernel/audit.c b/kernel/audit.c >> index 7b0e23a..ce1fb38 100644 >> --- a/kernel/audit.c >> +++ b/kernel/audit.c >> @@ -1098,6 +1098,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, >> int reserve; >> unsigned long timeout_start = jiffies; >> >> + if (audit_pid && audit_pid == current->pid) >> + gfp_mask &= ~__GFP_WAIT; >> + >> if (audit_initialized != AUDIT_INITIALIZED) >> return NULL; >> >> > > Hmm, I see, There may be other code paths that auditd can call audit_log_start except Yeah. I found it when I reviewed the code paths to audit_log_start after I got your comment. sys_sendto -> sock_sendmsg -> netlink_unicast -> audit_receive -> audit_receive_skb -> audit_receive_msg -> audit_log_config_change -> audit_log_start -> audit_log_common_recv_msg *** -> audit_log_start *** > audit_log_config_change. so it's better to handle this problem in audit_log_start. > > but current task is only meaningful when gfp_mask & __GFP_WAIT is true. > so maybe the below patch is what you want. OK. I considered the code size, but I didn't consider the execution speed. audit_log_start doesn't always need to execute "if (audit_pid && audit_pid == current->pid)"! I will send your revised patch later. Thanks, Toshiyuki Okajima > > diff --git a/kernel/audit.c b/kernel/audit.c > index 7b0e23a..10b4545 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct audit_context > struct audit_buffer *ab = NULL; > struct timespec t; > unsigned int uninitialized_var(serial); > - int reserve; > + int reserve = 5; /* Allow atomic callers to go up to five > + entries over the normal backlog limit */ > + > unsigned long timeout_start = jiffies; > > if (audit_initialized != AUDIT_INITIALIZED) > @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct audit_contex > if (unlikely(audit_filter_type(type))) > return NULL; > > - if (gfp_mask & __GFP_WAIT) > - reserve = 0; > - else > - reserve = 5; /* Allow atomic callers to go up to five > - entries over the normal backlog limit */ > + if (gfp_mask & __GFP_WAIT) { > + if (audit_pid && audit_pid == current->pid) > + gfp_mask &= ~__GFP_WAIT; > + else > + reserve = 0; > + } > > while (audit_backlog_limit > && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserv > > > Thanks > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [BUG][PATCH V3] audit: audit_log_start running on auditd should not stop 2013-10-15 7:07 ` Toshiyuki Okajima @ 2013-10-15 7:58 ` Toshiyuki Okajima 2013-10-15 9:41 ` Gao feng 0 siblings, 1 reply; 22+ messages in thread From: Toshiyuki Okajima @ 2013-10-15 7:58 UTC (permalink / raw) To: Gao feng, viro, eparis; +Cc: Toshiyuki Okajima, linux-kernel The backlog cannot be consumed when audit_log_start is running on auditd even if audit_log_start calls wait_for_auditd to consume it. The situation is the deadlock because only auditd can consume the backlog. If the other process needs to send the backlog, it can be also stopped by the deadlock. So, audit_log_start running on auditd should not stop. You can see the deadlock with the following reproducer: # auditctl -a exit,always -S all # reboot Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Cc: gaofeng@cn.fujitsu.com --- kernel/audit.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 7b0e23a..29cfc94 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, struct audit_buffer *ab = NULL; struct timespec t; unsigned int uninitialized_var(serial); - int reserve; + int reserve = 5; /* Allow atomic callers to go up to five + entries over the normal backlog limit */ unsigned long timeout_start = jiffies; if (audit_initialized != AUDIT_INITIALIZED) @@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, if (unlikely(audit_filter_type(type))) return NULL; - if (gfp_mask & __GFP_WAIT) - reserve = 0; - else - reserve = 5; /* Allow atomic callers to go up to five - entries over the normal backlog limit */ + if (gfp_mask & __GFP_WAIT) { + if (audit_pid && audit_pid == current->pid) + gfp_mask &= ~__GFP_WAIT; + else + reserve = 0; + } while (audit_backlog_limit && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) { -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH V3] audit: audit_log_start running on auditd should not stop 2013-10-15 7:58 ` [BUG][PATCH V3] " Toshiyuki Okajima @ 2013-10-15 9:41 ` Gao feng 0 siblings, 0 replies; 22+ messages in thread From: Gao feng @ 2013-10-15 9:41 UTC (permalink / raw) To: Toshiyuki Okajima; +Cc: viro, eparis, linux-kernel On 10/15/2013 03:58 PM, Toshiyuki Okajima wrote: > The backlog cannot be consumed when audit_log_start is running on auditd > even if audit_log_start calls wait_for_auditd to consume it. > The situation is the deadlock because only auditd can consume the backlog. > If the other process needs to send the backlog, it can be also stopped > by the deadlock. > > So, audit_log_start running on auditd should not stop. > > You can see the deadlock with the following reproducer: > # auditctl -a exit,always -S all > # reboot > > Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> > Cc: gaofeng@cn.fujitsu.com > --- > kernel/audit.c | 14 ++++++++------ > 1 files changed, 8 insertions(+), 6 deletions(-) > Looks good to me, thanks! Reviewed-by: Gao feng <gaofeng@cn.fujitsu.com> > diff --git a/kernel/audit.c b/kernel/audit.c > index 7b0e23a..29cfc94 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, > struct audit_buffer *ab = NULL; > struct timespec t; > unsigned int uninitialized_var(serial); > - int reserve; > + int reserve = 5; /* Allow atomic callers to go up to five > + entries over the normal backlog limit */ > unsigned long timeout_start = jiffies; > > if (audit_initialized != AUDIT_INITIALIZED) > @@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, > if (unlikely(audit_filter_type(type))) > return NULL; > > - if (gfp_mask & __GFP_WAIT) > - reserve = 0; > - else > - reserve = 5; /* Allow atomic callers to go up to five > - entries over the normal backlog limit */ > + if (gfp_mask & __GFP_WAIT) { > + if (audit_pid && audit_pid == current->pid) > + gfp_mask &= ~__GFP_WAIT; > + else > + reserve = 0; > + } > > while (audit_backlog_limit > && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) { > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-15 6:30 ` Gao feng 2013-10-15 7:07 ` Toshiyuki Okajima @ 2013-10-23 19:55 ` Richard Guy Briggs 2013-10-24 5:55 ` Gao feng 2013-10-25 1:36 ` Toshiyuki Okajima 1 sibling, 2 replies; 22+ messages in thread From: Richard Guy Briggs @ 2013-10-23 19:55 UTC (permalink / raw) To: linux-audit, linux-kernel; +Cc: Toshiyuki Okajima, Gao feng, eparis On Tue, Oct 15, 2013 at 02:30:34PM +0800, Gao feng wrote: > Hi Toshiyuki-san, Toshiuki and Gao, > On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote: > > The backlog cannot be consumed when audit_log_start is running on auditd > > even if audit_log_start calls wait_for_auditd to consume it. > > The situation is a deadlock because only auditd can consume the backlog. > > If the other process needs to send the backlog, it can be also stopped > > by the deadlock. > > > > So, audit_log_start running on auditd should not stop. > > > > You can see the deadlock with the following reproducer: > > # auditctl -a exit,always -S all > > # reboot > Hmm, I see, There may be other code paths that auditd can call audit_log_start except > audit_log_config_change. so it's better to handle this problem in audit_log_start. > > but current task is only meaningful when gfp_mask & __GFP_WAIT is true. > so maybe the below patch is what you want. I have been following this thread with interest. I like the general evolution of this patch. The first patch was a bit too abrupt, dropping too much, but this one makes much more sense. I would be tempted to make the reserve even bigger. I see that you should be using a kernel that has included commit 8ac1c8d5 (which made it into v3.12-rc3) audit: fix endless wait in audit_log_start() That was an obvious bug, but I was still concerned about the cause of the initial wait. There are other fixes and ideas in the works that should alleviate some of the pressure to make the service more usable. https://lkml.org/lkml/2013/9/18/453 I have tested with and without this v3 patch and I don't see any significant difference with the reproducer provided above. I'm also testing with a reproducer of the endless wait bug (readahead-collector). What are your expected results? What are your actual results in each case? How are they different? > diff --git a/kernel/audit.c b/kernel/audit.c > index 7b0e23a..10b4545 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct audit_context > struct audit_buffer *ab = NULL; > struct timespec t; > unsigned int uninitialized_var(serial); > - int reserve; > + int reserve = 5; /* Allow atomic callers to go up to five > + entries over the normal backlog limit */ > + > unsigned long timeout_start = jiffies; > > if (audit_initialized != AUDIT_INITIALIZED) > @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct audit_contex > if (unlikely(audit_filter_type(type))) > return NULL; > > - if (gfp_mask & __GFP_WAIT) > - reserve = 0; > - else > - reserve = 5; /* Allow atomic callers to go up to five > - entries over the normal backlog limit */ > + if (gfp_mask & __GFP_WAIT) { > + if (audit_pid && audit_pid == current->pid) > + gfp_mask &= ~__GFP_WAIT; > + else > + reserve = 0; > + } > > while (audit_backlog_limit > && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserv - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer Kernel Security AMER ENG Base Operating Systems Remote, Ottawa, Canada Voice: +1.647.777.2635 Internal: (81) 32635 Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-23 19:55 ` [BUG][PATCH] " Richard Guy Briggs @ 2013-10-24 5:55 ` Gao feng 2013-10-24 19:35 ` Richard Guy Briggs 2013-10-25 1:36 ` Toshiyuki Okajima 1 sibling, 1 reply; 22+ messages in thread From: Gao feng @ 2013-10-24 5:55 UTC (permalink / raw) To: Richard Guy Briggs; +Cc: linux-audit, linux-kernel, Toshiyuki Okajima, eparis On 10/24/2013 03:55 AM, Richard Guy Briggs wrote: > On Tue, Oct 15, 2013 at 02:30:34PM +0800, Gao feng wrote: >> Hi Toshiyuki-san, > > Toshiuki and Gao, > >> On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote: >>> The backlog cannot be consumed when audit_log_start is running on auditd >>> even if audit_log_start calls wait_for_auditd to consume it. >>> The situation is a deadlock because only auditd can consume the backlog. >>> If the other process needs to send the backlog, it can be also stopped >>> by the deadlock. >>> >>> So, audit_log_start running on auditd should not stop. >>> >>> You can see the deadlock with the following reproducer: >>> # auditctl -a exit,always -S all >>> # reboot > >> Hmm, I see, There may be other code paths that auditd can call audit_log_start except >> audit_log_config_change. so it's better to handle this problem in audit_log_start. >> >> but current task is only meaningful when gfp_mask & __GFP_WAIT is true. >> so maybe the below patch is what you want. > > I have been following this thread with interest. I like the general > evolution of this patch. The first patch was a bit too abrupt, dropping > too much, but this one makes much more sense. I would be tempted to > make the reserve even bigger. > > I see that you should be using a kernel that has included commit > 8ac1c8d5 (which made it into v3.12-rc3) > audit: fix endless wait in audit_log_start() > That was an obvious bug, include or not include? The problem is, if the audit_backlog_limit is 3, but there are 5 tasks calling audit_log_start, so 2 tasks will wait auditd to consume audit_skb_queue. if before auditd consumes skbs, somebody want to kill auditd, and auditd will set the audit_pid to zero, this will triger an audit message. so auditd will wait for himself. and this waiting is endless, since auditd cann't consume audit_skb_queue any more. the commit 8ac1c8d5 prevent this problem happening. because if once a task is blocked over the audit_backlog_wait_time. the audit_backlog_wait_time will be set to zero(audit_backlog_wait_overflow which is zero). so the other tasks will not wait anymore. but I'm confused if this is what we expected? these audit messages will lost once any task is blocked over audit_backlog_wait_time. So, AFAIK if commit 8ac1c8d5 exist, this patch is not necessary, but we still do something to fix the problem commit 8ac1c8d5 brings. but I was still concerned about the cause of > the initial wait. There are other fixes and ideas in the works that > should alleviate some of the pressure to make the service more usable. > https://lkml.org/lkml/2013/9/18/453 > > I have tested with and without this v3 patch and I don't see any > significant difference with the reproducer provided above. I'm also > testing with a reproducer of the endless wait bug (readahead-collector). > > What are your expected results? What are your actual results in each > case? How are they different? > >> diff --git a/kernel/audit.c b/kernel/audit.c >> index 7b0e23a..10b4545 100644 >> --- a/kernel/audit.c >> +++ b/kernel/audit.c >> @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct audit_context >> struct audit_buffer *ab = NULL; >> struct timespec t; >> unsigned int uninitialized_var(serial); >> - int reserve; >> + int reserve = 5; /* Allow atomic callers to go up to five >> + entries over the normal backlog limit */ >> + >> unsigned long timeout_start = jiffies; >> >> if (audit_initialized != AUDIT_INITIALIZED) >> @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct audit_contex >> if (unlikely(audit_filter_type(type))) >> return NULL; >> >> - if (gfp_mask & __GFP_WAIT) >> - reserve = 0; >> - else >> - reserve = 5; /* Allow atomic callers to go up to five >> - entries over the normal backlog limit */ >> + if (gfp_mask & __GFP_WAIT) { >> + if (audit_pid && audit_pid == current->pid) >> + gfp_mask &= ~__GFP_WAIT; >> + else >> + reserve = 0; >> + } >> >> while (audit_backlog_limit >> && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserv > > - RGB > > -- > Richard Guy Briggs <rbriggs@redhat.com> > Senior Software Engineer > Kernel Security > AMER ENG Base Operating Systems > Remote, Ottawa, Canada > Voice: +1.647.777.2635 > Internal: (81) 32635 > Alt: +1.613.693.0684x3545 > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-24 5:55 ` Gao feng @ 2013-10-24 19:35 ` Richard Guy Briggs 0 siblings, 0 replies; 22+ messages in thread From: Richard Guy Briggs @ 2013-10-24 19:35 UTC (permalink / raw) To: Gao feng; +Cc: linux-audit, linux-kernel, Toshiyuki Okajima, eparis On Thu, Oct 24, 2013 at 01:55:37PM +0800, Gao feng wrote: > On 10/24/2013 03:55 AM, Richard Guy Briggs wrote: > > On Tue, Oct 15, 2013 at 02:30:34PM +0800, Gao feng wrote: > >> Hi Toshiyuki-san, > > > > Toshiuki and Gao, > > > >> On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote: > >>> The backlog cannot be consumed when audit_log_start is running on auditd > >>> even if audit_log_start calls wait_for_auditd to consume it. > >>> The situation is a deadlock because only auditd can consume the backlog. > >>> If the other process needs to send the backlog, it can be also stopped > >>> by the deadlock. > >>> > >>> So, audit_log_start running on auditd should not stop. > >>> > >>> You can see the deadlock with the following reproducer: > >>> # auditctl -a exit,always -S all > >>> # reboot > > > >> Hmm, I see, There may be other code paths that auditd can call audit_log_start except > >> audit_log_config_change. so it's better to handle this problem in audit_log_start. > >> > >> but current task is only meaningful when gfp_mask & __GFP_WAIT is true. > >> so maybe the below patch is what you want. > > > > I have been following this thread with interest. I like the general > > evolution of this patch. The first patch was a bit too abrupt, dropping > > too much, but this one makes much more sense. I would be tempted to > > make the reserve even bigger. > > > > I see that you should be using a kernel that has included commit > > 8ac1c8d5 (which made it into v3.12-rc3) > > audit: fix endless wait in audit_log_start() > > That was an obvious bug, > > include or not include? You say you tested on v3.12-rc4, which does (should?) include this fix. > The problem is, if the audit_backlog_limit is 3, but there are 5 tasks > calling audit_log_start, so 2 tasks will wait auditd to consume > audit_skb_queue. if before auditd consumes skbs, somebody want to kill > auditd, and auditd will set the audit_pid to zero, this will triger an > audit message. so auditd will wait for himself. and this waiting is endless, > since auditd cann't consume audit_skb_queue any more. > > the commit 8ac1c8d5 prevent this problem happening. because if once a task is > blocked over the audit_backlog_wait_time. the audit_backlog_wait_time will > be set to zero(audit_backlog_wait_overflow which is zero). so the other tasks > will not wait anymore. but I'm confused if this is what we expected? these > audit messages will lost once any task is blocked over audit_backlog_wait_time. This is configurable. The default is this behaviour, so that if there is a problem with the audit subsystem, it will give a reasonable chance for auditd to catch up, then give up, make noise in the syslog (and configured colsoles) and then continue. Other auditd configurations allow the system to halt if this condition occurs. There is another patch coming that will set that audit_backlog_wait_time back to the original value once auditd has recovered. It is not upstream yet. > So, AFAIK if commit 8ac1c8d5 exist, this patch is not necessary, but > we still do something to fix the problem commit 8ac1c8d5 brings. This is the condition in which I am interested. Do you have a way to clearly show this condition without your patch and a way to show it is solved with your patch? I'm looking for a clear reproducer. > but I was still concerned about the cause of > > the initial wait. There are other fixes and ideas in the works that > > should alleviate some of the pressure to make the service more usable. > > https://lkml.org/lkml/2013/9/18/453 > > > > I have tested with and without this v3 patch and I don't see any > > significant difference with the reproducer provided above. I'm also > > testing with a reproducer of the endless wait bug (readahead-collector). > > > > What are your expected results? What are your actual results in each > > case? How are they different? > > > >> diff --git a/kernel/audit.c b/kernel/audit.c > >> index 7b0e23a..10b4545 100644 > >> --- a/kernel/audit.c > >> +++ b/kernel/audit.c > >> @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct audit_context > >> struct audit_buffer *ab = NULL; > >> struct timespec t; > >> unsigned int uninitialized_var(serial); > >> - int reserve; > >> + int reserve = 5; /* Allow atomic callers to go up to five > >> + entries over the normal backlog limit */ > >> + > >> unsigned long timeout_start = jiffies; > >> > >> if (audit_initialized != AUDIT_INITIALIZED) > >> @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct audit_contex > >> if (unlikely(audit_filter_type(type))) > >> return NULL; > >> > >> - if (gfp_mask & __GFP_WAIT) > >> - reserve = 0; > >> - else > >> - reserve = 5; /* Allow atomic callers to go up to five > >> - entries over the normal backlog limit */ > >> + if (gfp_mask & __GFP_WAIT) { > >> + if (audit_pid && audit_pid == current->pid) > >> + gfp_mask &= ~__GFP_WAIT; > >> + else > >> + reserve = 0; > >> + } > >> > >> while (audit_backlog_limit > >> && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserv > > > > - RGB - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer Kernel Security AMER ENG Base Operating Systems Remote, Ottawa, Canada Voice: +1.647.777.2635 Internal: (81) 32635 Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-23 19:55 ` [BUG][PATCH] " Richard Guy Briggs 2013-10-24 5:55 ` Gao feng @ 2013-10-25 1:36 ` Toshiyuki Okajima 2013-10-25 15:12 ` Eric Paris 1 sibling, 1 reply; 22+ messages in thread From: Toshiyuki Okajima @ 2013-10-25 1:36 UTC (permalink / raw) To: Richard Guy Briggs Cc: toshi.okajima, linux-audit, linux-kernel, Gao feng, eparis Hi. Thanks for your interest. (2013/10/24 4:55), Richard Guy Briggs wrote: > On Tue, Oct 15, 2013 at 02:30:34PM +0800, Gao feng wrote: >> Hi Toshiyuki-san, > > Toshiuki and Gao, > >> On 10/15/2013 12:43 PM, Toshiyuki Okajima wrote: >>> The backlog cannot be consumed when audit_log_start is running on auditd >>> even if audit_log_start calls wait_for_auditd to consume it. >>> The situation is a deadlock because only auditd can consume the backlog. >>> If the other process needs to send the backlog, it can be also stopped >>> by the deadlock. >>> >>> So, audit_log_start running on auditd should not stop. >>> >>> You can see the deadlock with the following reproducer: >>> # auditctl -a exit,always -S all >>> # reboot > >> Hmm, I see, There may be other code paths that auditd can call audit_log_start except >> audit_log_config_change. so it's better to handle this problem in audit_log_start. >> >> but current task is only meaningful when gfp_mask & __GFP_WAIT is true. >> so maybe the below patch is what you want. > > I have been following this thread with interest. I like the general > evolution of this patch. The first patch was a bit too abrupt, dropping > too much, but this one makes much more sense. I would be tempted to > make the reserve even bigger. > > I see that you should be using a kernel that has included commit > 8ac1c8d5 (which made it into v3.12-rc3) > audit: fix endless wait in audit_log_start() > That was an obvious bug, but I was still concerned about the cause of > the initial wait. There are other fixes and ideas in the works that > should alleviate some of the pressure to make the service more usable. > https://lkml.org/lkml/2013/9/18/453 > > I have tested with and without this v3 patch and I don't see any > significant difference with the reproducer provided above. I'm also > testing with a reproducer of the endless wait bug (readahead-collector). > > What are your expected results? What are your actual results in each > case? How are they different? OK. I try to describe. 1) I found that the system cannot reboot smoothly because auditd daemon can stop for 60 seconds. 2) As the result of my investigation of this problem, I found that audit_log_start executed on auditd called wait_for_auditd and stopped for "audit_backlog_wait_time". Calling wait_for_auditd is to let auditd consume the backlog during this sleep. However, when wait_for_auditd is called on auditd, the backlog cannot be consumed during this time. In addition, audit_log_start called by other processes can also stop while auditd is stopping. So, wait_for_auditd called on auditd is meaningless. 3) Therefore, I made the v3 patch that audit_log_start executed on auditd doesn't call wait_for_auditd. 4) On my environment, I couldn't see the situation that auditd stopped for 60 seconds by using this patch. 5) So, I thought this problem could be fixed. But after I received your comment, I encountered the other problem that auditd can stop for 60 seconds when I tried this patch on the other environment installed "systemd". So, I think you encountered the other problem that was the same as mine. The following situation was occurred: zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz PID: 1 TASK: ffff88007c7d8000 CPU: 0 COMMAND: "systemd" #0 [ffff88007c7758d8] __schedule at ffffffff815f231a #1 [ffff88007c775950] schedule at ffffffff815f2a79 #2 [ffff88007c775960] schedule_timeout at ffffffff815f0f93 #3 [ffff88007c775a10] audit_log_start at ffffffff810d7d09 #4 [ffff88007c775ab0] audit_log_common_recv_msg at ffffffff810d8068 #5 [ffff88007c775b00] audit_receive_msg at ffffffff810d8ca0 #6 [ffff88007c775bb0] audit_receive at ffffffff810d8f98 #7 [ffff88007c775be0] netlink_unicast at ffffffff8150bf31 #8 [ffff88007c775c30] netlink_sendmsg at ffffffff8150c2b1 #9 [ffff88007c775cc0] sock_sendmsg at ffffffff814ca3fc #10 [ffff88007c775e50] sys_sendto at ffffffff814ccd6d #11 [ffff88007c775f80] system_call_fastpath at ffffffff815fc259 RIP: 00007fbee5653543 RSP: 00007fff23d684b8 RFLAGS: 00000246 RAX: 000000000000002c RBX: ffffffff815fc259 RCX: ffffffffffffffff RDX: 0000000000000078 RSI: 00007fff23d5d8c0 RDI: 0000000000000003 RBP: 0000000000000003 R8: 00007fff23d5d8b0 R9: 000000000000000c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000050 R13: 00007fff23d66380 R14: 000000000000046b R15: 0000000000000067 ORIG_RAX: 000000000000002c CS: 0033 SS: 002b PID: 488 TASK: ffff880036edcbf0 CPU: 1 COMMAND: "auditd" #0 [ffff88007ba93a98] __schedule at ffffffff815f231a #1 [ffff88007ba93b10] schedule at ffffffff815f2a79 #2 [ffff88007ba93b20] schedule_preempt_disabled at ffffffff815f2d0e #3 [ffff88007ba93b30] __mutex_lock_slowpath at ffffffff815f1903 #4 [ffff88007ba93b90] mutex_lock at ffffffff815f143a #5 [ffff88007ba93bb0] audit_receive at ffffffff810d8f71 #6 [ffff88007ba93be0] netlink_unicast at ffffffff8150bf31 #7 [ffff88007ba93c30] netlink_sendmsg at ffffffff8150c2b1 #8 [ffff88007ba93cc0] sock_sendmsg at ffffffff814ca3fc #9 [ffff88007ba93e50] sys_sendto at ffffffff814ccd6d #10 [ffff88007ba93f80] system_call_fastpath at ffffffff815fc259 RIP: 00007fc256ff16f3 RSP: 00007fffff6b39d8 RFLAGS: 00010202 RAX: 000000000000002c RBX: ffffffff815fc259 RCX: 0000000000000000 RDX: 0000000000000030 RSI: 00007fffff6b3a00 RDI: 0000000000000003 RBP: 0000000000000003 R8: 00007fffff6b39f0 R9: 000000000000000c R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000005 R13: 00007fffff6ba3f0 R14: 00000000000003e9 R15: 0000000000000020 ORIG_RAX: 000000000000002c CS: 0033 SS: 002b systemd |auditd -------------------------------------------+----------------------------------- ... | -> audit_receive |... -> mutex_lock(&audit_cmd_mutex) |-> audit_receive ... -> audit_log_start | -> mutex_lock(&audit_cmd_mutex) -> wait_for_auditd | // wait for systemd -> schedule_timeout(60*HZ) | zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz V3 patch cannot fix the above problem. V3 patch only fixes the problem that audit_log_start on auditd stops for 60 seconds in vain. The other problem is that auditd stops for about 60 seconds by the mutex lock which the other process gets. If the other process is "systemd", changing V3 patch like the following fixes the other problem. (I confirm that the new v3 patch can fix the other problem which I encountered on installed "systemd" environment.) zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz @@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, if (unlikely(audit_filter_type(type))) return NULL; - if (gfp_mask & __GFP_WAIT) - reserve = 0; - else - reserve = 5; /* Allow atomic callers to go up to five - entries over the normal backlog limit */ + if (gfp_mask & __GFP_WAIT) { + if (audit_pid && (current->pid == 1 || audit_pid == current->pid)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + gfp_mask &= ~__GFP_WAIT; + else + reserve = 0; + } zzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz However, if the other process is not "systemd", another problem cannot be corrected by a simple correction. At present, a drastic method to fix the other problem doesn't occur to me. Regards, Toshiyuki Okajima > >> diff --git a/kernel/audit.c b/kernel/audit.c >> index 7b0e23a..10b4545 100644 >> --- a/kernel/audit.c >> +++ b/kernel/audit.c >> @@ -1095,7 +1095,9 @@ struct audit_buffer *audit_log_start(struct audit_context >> struct audit_buffer *ab = NULL; >> struct timespec t; >> unsigned int uninitialized_var(serial); >> - int reserve; >> + int reserve = 5; /* Allow atomic callers to go up to five >> + entries over the normal backlog limit */ >> + >> unsigned long timeout_start = jiffies; >> >> if (audit_initialized != AUDIT_INITIALIZED) >> @@ -1104,11 +1106,12 @@ struct audit_buffer *audit_log_start(struct audit_contex >> if (unlikely(audit_filter_type(type))) >> return NULL; >> >> - if (gfp_mask & __GFP_WAIT) >> - reserve = 0; >> - else >> - reserve = 5; /* Allow atomic callers to go up to five >> - entries over the normal backlog limit */ >> + if (gfp_mask & __GFP_WAIT) { >> + if (audit_pid && audit_pid == current->pid) >> + gfp_mask &= ~__GFP_WAIT; >> + else >> + reserve = 0; >> + } >> >> while (audit_backlog_limit >> && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserv > > - RGB > > -- > Richard Guy Briggs <rbriggs@redhat.com> > Senior Software Engineer > Kernel Security > AMER ENG Base Operating Systems > Remote, Ottawa, Canada > Voice: +1.647.777.2635 > Internal: (81) 32635 > Alt: +1.613.693.0684x3545 > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-25 1:36 ` Toshiyuki Okajima @ 2013-10-25 15:12 ` Eric Paris 2013-10-28 9:20 ` Toshiyuki Okajima 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs 0 siblings, 2 replies; 22+ messages in thread From: Eric Paris @ 2013-10-25 15:12 UTC (permalink / raw) To: Toshiyuki Okajima; +Cc: Richard Guy Briggs, linux-audit, linux-kernel, Gao feng On Fri, 2013-10-25 at 10:36 +0900, Toshiyuki Okajima wrote: > systemd |auditd > -------------------------------------------+----------------------------------- > ... | > -> audit_receive |... > -> mutex_lock(&audit_cmd_mutex) |-> audit_receive > ... -> audit_log_start | -> mutex_lock(&audit_cmd_mutex) > -> wait_for_auditd | // wait for systemd > -> schedule_timeout(60*HZ) | Ugggh, definitely a problem. Adding a similar hack to systemd really does not seem like an acceptable answer. It seems to me that in audit_receive_msg() case AUDIT_USER: case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG: case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2: we do not need to hold the audit_cmd_mutex. So a quick and dirty patch should be to just drop the mutex there (and we need to verify there aren't issues running the audit_filter_user() without the lock). That will take care of systemd and anything USING audit. It still means that you could race with something configuring audit and auditd shutting down. Seems like a good quick and dirty 'fix' while we work on a better fix... To take care of that I think maybe we could drop the cmd_mutex every time we call audit_log_start. That's not necessarily going to be pretty. Maybe make a new switch at the top of the function which knows which operations we are going to have to allocate an audit_buffer. Drop the lock, allocate the buffer, then retake the lock to finish running audit_receive_msg().... Maybe that second option isn't so hard and we can go directly after that instead of just dealing with userspace audit messages? Thoughts? ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [BUG][PATCH] audit: audit_log_start running on auditd should not stop 2013-10-25 15:12 ` Eric Paris @ 2013-10-28 9:20 ` Toshiyuki Okajima 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs 1 sibling, 0 replies; 22+ messages in thread From: Toshiyuki Okajima @ 2013-10-28 9:20 UTC (permalink / raw) To: Eric Paris Cc: toshi.okajima, Richard Guy Briggs, linux-audit, linux-kernel, Gao feng Hi. (2013/10/26 0:12), Eric Paris wrote: > On Fri, 2013-10-25 at 10:36 +0900, Toshiyuki Okajima wrote: > >> systemd |auditd >> -------------------------------------------+----------------------------------- >> ... | >> -> audit_receive |... >> -> mutex_lock(&audit_cmd_mutex) |-> audit_receive >> ... -> audit_log_start | -> mutex_lock(&audit_cmd_mutex) >> -> wait_for_auditd | // wait for systemd >> -> schedule_timeout(60*HZ) | > > Ugggh, definitely a problem. Adding a similar hack to systemd really > does not seem like an acceptable answer. It seems to me that in I think so, too. We should fix it against the various cases. > audit_receive_msg() > > case AUDIT_USER: > case AUDIT_FIRST_USER_MSG ... AUDIT_LAST_USER_MSG: > case AUDIT_FIRST_USER_MSG2 ... AUDIT_LAST_USER_MSG2: > > we do not need to hold the audit_cmd_mutex. So a quick and dirty patch > should be to just drop the mutex there (and we need to verify there > aren't issues running the audit_filter_user() without the lock). That > will take care of systemd and anything USING audit. It still means that > you could race with something configuring audit and auditd shutting > down. Seems like a good quick and dirty 'fix' while we work on a better > fix... > > To take care of that I think maybe we could drop the cmd_mutex every > time we call audit_log_start. That's not necessarily going to be > pretty. Maybe make a new switch at the top of the function which knows > which operations we are going to have to allocate an audit_buffer. Drop > the lock, allocate the buffer, then retake the lock to finish running > audit_receive_msg().... > > Maybe that second option isn't so hard and we can go directly after that > instead of just dealing with userspace audit messages? > > Thoughts? Does it mean that we can also fix the problem only in the userspace? Even if we fix userspace process (auditd, readahead-collector and systemd) only, the problem would happen again if a new userspace audit process is implemented. Therefore, I think we should fix only in the kernel. Sorry, but I don't have clear method to fix it. Regards, Toshiyuki Okajima > > > ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls 2013-10-25 15:12 ` Eric Paris 2013-10-28 9:20 ` Toshiyuki Okajima @ 2013-12-05 2:45 ` Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 1/3] selinux: call WARN_ONCE() instead of calling audit_log_start() Richard Guy Briggs ` (3 more replies) 1 sibling, 4 replies; 22+ messages in thread From: Richard Guy Briggs @ 2013-12-05 2:45 UTC (permalink / raw) To: linux-audit, linux-kernel Cc: Richard Guy Briggs, Eric Paris, Gao feng, Toshiyuki Okajima There is a race condition between systemd and auditd: systemd |auditd -------------------------------------------+----------------------------------- ... | -> audit_receive |... -> mutex_lock(&audit_cmd_mutex) |-> audit_receive ... -> audit_log_start | -> mutex_lock(&audit_cmd_mutex) -> wait_for_auditd | // wait for systemd -> schedule_timeout(60*HZ) | This fix will take care of systemd and anything USING audit. It still means that we could race with something configuring audit and auditd shutting down. The idea of dropping the lock at the top of audit_receive_msg() isn't as clean as I had hoped, with AUDIT_ADD_RULE, AUDIT_TRIM, AUDIT_MAKE_EQUIV all potentially allocating additional audit buffers indirectly through trim_marked(). It may make sense to have trim_marked() send its queue through a new thread. Richard Guy Briggs (3): selinux: call WARN_ONCE() instead of calling audit_log_start() smack: call WARN_ONCE() instead of calling audit_log_start() audit: drop audit_cmd_lock in AUDIT_USER family of cases kernel/audit.c | 2 ++ security/selinux/ss/services.c | 12 ++++-------- security/smack/smack_lsm.c | 5 ++--- 3 files changed, 8 insertions(+), 11 deletions(-) ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 1/3] selinux: call WARN_ONCE() instead of calling audit_log_start() 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs @ 2013-12-05 2:45 ` Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 2/3] smack: " Richard Guy Briggs ` (2 subsequent siblings) 3 siblings, 0 replies; 22+ messages in thread From: Richard Guy Briggs @ 2013-12-05 2:45 UTC (permalink / raw) To: linux-audit, linux-kernel Cc: Richard Guy Briggs, Eric Paris, Gao feng, Toshiyuki Okajima Two of the conditions in selinux_audit_rule_match() should never happen and the third indicates a race that should be retried. Remove the calls to audit_log() (which call audit_log_start()) and deal with the errors in the caller, logging only once if the condition is met. Calling audit_log_start() in this location makes buffer allocation and locking more complicated in the calling tree (audit_filter_user()). Signed-off-by: Richard Guy Briggs <rgb@redhat.com> --- security/selinux/ss/services.c | 12 ++++-------- 1 files changed, 4 insertions(+), 8 deletions(-) diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index b4feecc..f4dda05 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -2938,25 +2938,21 @@ int selinux_audit_rule_match(u32 sid, u32 field, u32 op, void *vrule, struct selinux_audit_rule *rule = vrule; int match = 0; - if (!rule) { - audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR, - "selinux_audit_rule_match: missing rule\n"); + if (unlikely(!rule)) { + WARN_ONCE(1, "selinux_audit_rule_match: missing rule\n"); return -ENOENT; } read_lock(&policy_rwlock); if (rule->au_seqno < latest_granting) { - audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR, - "selinux_audit_rule_match: stale rule\n"); match = -ESTALE; goto out; } ctxt = sidtab_search(&sidtab, sid); - if (!ctxt) { - audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR, - "selinux_audit_rule_match: unrecognized SID %d\n", + if (unlikely(!ctxt)) { + WARN_ONCE(1, "selinux_audit_rule_match: unrecognized SID %d\n", sid); match = -ENOENT; goto out; -- 1.7.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* [PATCH 2/3] smack: call WARN_ONCE() instead of calling audit_log_start() 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 1/3] selinux: call WARN_ONCE() instead of calling audit_log_start() Richard Guy Briggs @ 2013-12-05 2:45 ` Richard Guy Briggs 2013-12-06 18:40 ` Casey Schaufler 2013-12-05 2:45 ` [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases Richard Guy Briggs 2013-12-05 7:15 ` [RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop Toshiyuki Okajima 3 siblings, 1 reply; 22+ messages in thread From: Richard Guy Briggs @ 2013-12-05 2:45 UTC (permalink / raw) To: linux-audit, linux-kernel Cc: Richard Guy Briggs, Eric Paris, Gao feng, Toshiyuki Okajima Remove the call to audit_log() (which call audit_log_start()) and deal with the errors in the caller, logging only once if the condition is met. Calling audit_log_start() in this location makes buffer allocation and locking more complicated in the calling tree (audit_filter_user()). Signed-off-by: Richard Guy Briggs <rgb@redhat.com> --- security/smack/smack_lsm.c | 5 ++--- 1 files changed, 2 insertions(+), 3 deletions(-) diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 8825375..185e2e7 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -3615,9 +3615,8 @@ static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule, struct smack_known *skp; char *rule = vrule; - if (!rule) { - audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR, - "Smack: missing rule\n"); + if (unlikely(!rule)) { + WARN_ONCE(1, "Smack: missing rule\n"); return -ENOENT; } -- 1.7.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] smack: call WARN_ONCE() instead of calling audit_log_start() 2013-12-05 2:45 ` [PATCH 2/3] smack: " Richard Guy Briggs @ 2013-12-06 18:40 ` Casey Schaufler 2013-12-08 22:17 ` Richard Guy Briggs 0 siblings, 1 reply; 22+ messages in thread From: Casey Schaufler @ 2013-12-06 18:40 UTC (permalink / raw) To: Richard Guy Briggs, linux-audit, linux-kernel; +Cc: Casey Schaufler On 12/4/2013 6:45 PM, Richard Guy Briggs wrote: > Remove the call to audit_log() (which call audit_log_start()) and deal with > the errors in the caller, logging only once if the condition is met. Calling > audit_log_start() in this location makes buffer allocation and locking more > complicated in the calling tree (audit_filter_user()). > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> I'm not opposed to this change, but have you actually tried it? > --- > security/smack/smack_lsm.c | 5 ++--- > 1 files changed, 2 insertions(+), 3 deletions(-) > > diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c > index 8825375..185e2e7 100644 > --- a/security/smack/smack_lsm.c > +++ b/security/smack/smack_lsm.c > @@ -3615,9 +3615,8 @@ static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule, > struct smack_known *skp; > char *rule = vrule; > > - if (!rule) { > - audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR, > - "Smack: missing rule\n"); > + if (unlikely(!rule)) { > + WARN_ONCE(1, "Smack: missing rule\n"); > return -ENOENT; > } > ^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: [PATCH 2/3] smack: call WARN_ONCE() instead of calling audit_log_start() 2013-12-06 18:40 ` Casey Schaufler @ 2013-12-08 22:17 ` Richard Guy Briggs 0 siblings, 0 replies; 22+ messages in thread From: Richard Guy Briggs @ 2013-12-08 22:17 UTC (permalink / raw) To: Casey Schaufler; +Cc: linux-audit, linux-kernel On Fri, Dec 06, 2013 at 10:40:56AM -0800, Casey Schaufler wrote: > On 12/4/2013 6:45 PM, Richard Guy Briggs wrote: > > Remove the call to audit_log() (which call audit_log_start()) and deal with > > the errors in the caller, logging only once if the condition is met. Calling > > audit_log_start() in this location makes buffer allocation and locking more > > complicated in the calling tree (audit_filter_user()). > > > > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> > > I'm not opposed to this change, but have you actually tried it? No, do you have concerns? It is a projection based on the selinux_audit_match_rule() patch in the same set. > > --- > > security/smack/smack_lsm.c | 5 ++--- > > 1 files changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c > > index 8825375..185e2e7 100644 > > --- a/security/smack/smack_lsm.c > > +++ b/security/smack/smack_lsm.c > > @@ -3615,9 +3615,8 @@ static int smack_audit_rule_match(u32 secid, u32 field, u32 op, void *vrule, > > struct smack_known *skp; > > char *rule = vrule; > > > > - if (!rule) { > > - audit_log(actx, GFP_ATOMIC, AUDIT_SELINUX_ERR, > > - "Smack: missing rule\n"); > > + if (unlikely(!rule)) { > > + WARN_ONCE(1, "Smack: missing rule\n"); > > return -ENOENT; > > } > > > - RGB -- Richard Guy Briggs <rbriggs@redhat.com> Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat Remote, Ottawa, Canada Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545 ^ permalink raw reply [flat|nested] 22+ messages in thread
* [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 1/3] selinux: call WARN_ONCE() instead of calling audit_log_start() Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 2/3] smack: " Richard Guy Briggs @ 2013-12-05 2:45 ` Richard Guy Briggs 2013-12-09 2:31 ` Toshiyuki Okajima 2013-12-05 7:15 ` [RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop Toshiyuki Okajima 3 siblings, 1 reply; 22+ messages in thread From: Richard Guy Briggs @ 2013-12-05 2:45 UTC (permalink / raw) To: linux-audit, linux-kernel Cc: Richard Guy Briggs, Eric Paris, Gao feng, Toshiyuki Okajima, Richard Guy Briggs We do not need to hold the audit_cmd_mutex for this family of cases. The possible exception to this is the call to audit_filter_user(), so drop the lock immediately after. To help in fixing the race we are trying to avoid, make sure that nothing called by audit_filter_user() calls audit_log_start(). In particular, watch out for *_audit_rule_match(). This fix will take care of systemd and anything USING audit. It still means that we could race with something configuring audit and auditd shutting down. Signed-off-by: Richard Guy Briggs <rgb@tricolour.ca> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> --- kernel/audit.c | 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 4689012..4cbc945 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -713,6 +713,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) if (err) break; } + mutex_unlock(&audit_cmd_mutex); audit_log_common_recv_msg(&ab, msg_type); if (msg_type != AUDIT_USER_TTY) audit_log_format(ab, " msg='%.1024s'", @@ -729,6 +730,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) } audit_set_pid(ab, NETLINK_CB(skb).portid); audit_log_end(ab); + mutex_lock(&audit_cmd_mutex); } break; case AUDIT_ADD_RULE: -- 1.7.1 ^ permalink raw reply related [flat|nested] 22+ messages in thread
* Re: [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases 2013-12-05 2:45 ` [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases Richard Guy Briggs @ 2013-12-09 2:31 ` Toshiyuki Okajima 0 siblings, 0 replies; 22+ messages in thread From: Toshiyuki Okajima @ 2013-12-09 2:31 UTC (permalink / raw) To: Richard Guy Briggs Cc: linux-audit, linux-kernel, Eric Paris, Gao feng, Richard Guy Briggs, toshi.okajima On Wed, 4 Dec 2013 21:45:56 -0500 Richard Guy Briggs <rgb@redhat.com> wrote: > We do not need to hold the audit_cmd_mutex for this family of cases. The > possible exception to this is the call to audit_filter_user(), so drop the lock > immediately after. To help in fixing the race we are trying to avoid, make > sure that nothing called by audit_filter_user() calls audit_log_start(). In > particular, watch out for *_audit_rule_match(). > > This fix will take care of systemd and anything USING audit. It still means > that we could race with something configuring audit and auditd shutting down. > > Signed-off-by: Richard Guy Briggs <rgb@tricolour.ca> > Signed-off-by: Richard Guy Briggs <rgb@redhat.com> When I have tested a patch with my reproducer, a hangup by this problem doesn't occur. //// reproducer //// # auditctl -a exit,always -S all # reboot ///////////////////////// So, this fix looks good to me. Please add: Reported-by: toshi.okajima@jp.fujitsu.com Tested-by: toshi.okajima@jp.fujitsu.com Thanks. > --- > kernel/audit.c | 2 ++ > 1 files changed, 2 insertions(+), 0 deletions(-) > > diff --git a/kernel/audit.c b/kernel/audit.c > index 4689012..4cbc945 100644 > --- a/kernel/audit.c > +++ b/kernel/audit.c > @@ -713,6 +713,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > if (err) > break; > } > + mutex_unlock(&audit_cmd_mutex); > audit_log_common_recv_msg(&ab, msg_type); > if (msg_type != AUDIT_USER_TTY) > audit_log_format(ab, " msg='%.1024s'", > @@ -729,6 +730,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh) > } > audit_set_pid(ab, NETLINK_CB(skb).portid); > audit_log_end(ab); > + mutex_lock(&audit_cmd_mutex); > } > break; > case AUDIT_ADD_RULE: > -- > 1.7.1 > -- Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> ^ permalink raw reply [flat|nested] 22+ messages in thread
* [RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs ` (2 preceding siblings ...) 2013-12-05 2:45 ` [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases Richard Guy Briggs @ 2013-12-05 7:15 ` Toshiyuki Okajima 3 siblings, 0 replies; 22+ messages in thread From: Toshiyuki Okajima @ 2013-12-05 7:15 UTC (permalink / raw) To: Eric Paris, Richard Guy Briggs Cc: linux-audit, linux-kernel, toshi.okajima, Gao feng Hi. Please apply this patch because the problem that audit_receive called by auditd process is hung up by other process (systemd) which has already called it is fixed. This patch fixes the problem that auditd hangs up by itself. Thanks. --- The backlog cannot be consumed when audit_log_start is running on auditd even if audit_log_start calls wait_for_auditd to consume it. The situation is the deadlock because only auditd can consume the backlog. If the other process needs to send the backlog, it can be also stopped by the deadlock. So, audit_log_start running on auditd should not stop. You can see the deadlock with the following reproducer: # auditctl -a exit,always -S all # reboot Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com> Reviewed-by: gaofeng@cn.fujitsu.com --- kernel/audit.c | 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) diff --git a/kernel/audit.c b/kernel/audit.c index 7b0e23a..29cfc94 100644 --- a/kernel/audit.c +++ b/kernel/audit.c @@ -1095,7 +1095,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, struct audit_buffer *ab = NULL; struct timespec t; unsigned int uninitialized_var(serial); - int reserve; + int reserve = 5; /* Allow atomic callers to go up to five + entries over the normal backlog limit */ unsigned long timeout_start = jiffies; if (audit_initialized != AUDIT_INITIALIZED) @@ -1104,11 +1105,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask, if (unlikely(audit_filter_type(type))) return NULL; - if (gfp_mask & __GFP_WAIT) - reserve = 0; - else - reserve = 5; /* Allow atomic callers to go up to five - entries over the normal backlog limit */ + if (gfp_mask & __GFP_WAIT) { + if (audit_pid && audit_pid == current->pid) + gfp_mask &= ~__GFP_WAIT; + else + reserve = 0; + } while (audit_backlog_limit && skb_queue_len(&audit_skb_queue) > audit_backlog_limit + reserve) { -- 1.5.5.6 ^ permalink raw reply related [flat|nested] 22+ messages in thread
end of thread, other threads:[~2013-12-09 2:33 UTC | newest] Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-10-11 1:36 [BUG][PATCH][RFC] audit: hang up in audit_log_start executed on auditd Toshiyuki Okajima 2013-10-11 9:33 ` Gao feng 2013-10-11 12:29 ` Toshiyuki Okajima (smtp-b.css) 2013-10-15 4:43 ` [BUG][PATCH] audit: audit_log_start running on auditd should not stop Toshiyuki Okajima 2013-10-15 6:30 ` Gao feng 2013-10-15 7:07 ` Toshiyuki Okajima 2013-10-15 7:58 ` [BUG][PATCH V3] " Toshiyuki Okajima 2013-10-15 9:41 ` Gao feng 2013-10-23 19:55 ` [BUG][PATCH] " Richard Guy Briggs 2013-10-24 5:55 ` Gao feng 2013-10-24 19:35 ` Richard Guy Briggs 2013-10-25 1:36 ` Toshiyuki Okajima 2013-10-25 15:12 ` Eric Paris 2013-10-28 9:20 ` Toshiyuki Okajima 2013-12-05 2:45 ` [PATCH 0/3] audit: remove audit_log_start() contention in AUDIT_USER type calls Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 1/3] selinux: call WARN_ONCE() instead of calling audit_log_start() Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 2/3] smack: " Richard Guy Briggs 2013-12-06 18:40 ` Casey Schaufler 2013-12-08 22:17 ` Richard Guy Briggs 2013-12-05 2:45 ` [PATCH 3/3] audit: drop audit_cmd_lock in AUDIT_USER family of cases Richard Guy Briggs 2013-12-09 2:31 ` Toshiyuki Okajima 2013-12-05 7:15 ` [RESEND][BUG][PATCH V3] audit: audit_log_start running on auditd should not stop Toshiyuki Okajima
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).