linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Boot regression caused by kauditd
@ 2017-04-26 20:48 Cong Wang
  2017-04-26 21:20 ` Paul Moore
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2017-04-26 20:48 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

Hi,

I don't know if anyone else already reported this, but I randomly get
the following kernel crash during boot (Fedora 18, I know it's pretty
old!!):

[   20.940715] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000004
[   20.942122] IP: get_net+0x7/0xd
[   20.942739] PGD 0
[   20.942741]
[   20.943370] Oops: 0002 [#1] SMP
[   20.943910] CPU: 0 PID: 56 Comm: kauditd Not tainted 4.11.0-rc7+ #499
[   20.945056] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.2-20150714_191134- 04/01/2014
[   20.946675] task: ffff8800783f8040 task.stack: ffffc90000660000
[   20.947687] RIP: 0010:get_net+0x7/0xd
[   20.948310] RSP: 0018:ffffc90000663e98 EFLAGS: 00010202
[   20.949409] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 9dd28c6188bbd6db
[   20.951093] RDX: ffffffff8111556a RSI: 0000000057539c73 RDI: 0000000000000000
[   20.952382] RBP: ffffc90000663e98 R08: 0000000000000000 R09: 0000000000000000
[   20.953658] R10: ffffc90000663e20 R11: 0000000000000000 R12: 0000000000000000
[   20.954946] R13: 0000000000000000 R14: ffff8800783f8040 R15: ffff8800783f8040
[   20.956144] FS:  0000000000000000(0000) GS:ffff88007d200000(0000)
knlGS:0000000000000000
[   20.957628] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   20.958672] CR2: 0000000000000004 CR3: 0000000062ea3000 CR4: 00000000000406f0
[   20.959943] Call Trace:
[   20.960380]  kauditd_thread+0x60/0x1dc
[   20.961027]  ? signal_pending_state+0x2f/0x2f
[   20.961771]  ? auditd_reset+0x5f/0x5f
[   20.962407]  kthread+0x107/0x10f
[   20.962963]  ? __list_del_entry+0x22/0x22
[   20.963657]  ret_from_fork+0x2e/0x40
[   20.964267] Code: 00 00 5b 41 5c 5d c3 31 c0 83 bf 84 00 00 00 00
55 48 89 e5 75 0c 8b 87 cc 00 00 00 2b 87 c8 00 00 00 5d c3 55 48 89
f8 48 89 e5 <f0> ff 47 04 5d c3 f0 ff 4f 04 74 01 c3 55 48 89 e5 e8 68
6b 7a
[   20.967651] RIP: get_net+0x7/0xd RSP: ffffc90000663e98
[   20.968526] CR2: 0000000000000004
[   20.969094] ---[ end trace aab4cbd6ae264eab ]---
[   20.969886] Kernel panic - not syncing: Fatal exception
[   20.971312] Kernel Offset: disabled
[   20.971928] ---[ end Kernel panic - not syncing: Fatal exception


I guess commit 264d509637d95f9404e introduced this because I never see
this crash before pulling net-next today, but given the fact I only
saw this once so far, I can't just revert it to verify if it is the
case... It is probably some race condition.

Please let me know if you need any other information.

Thanks!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-26 20:48 Boot regression caused by kauditd Cong Wang
@ 2017-04-26 21:20 ` Paul Moore
  2017-04-27  5:04   ` Cong Wang
  2017-04-27 20:31   ` Cong Wang
  0 siblings, 2 replies; 13+ messages in thread
From: Paul Moore @ 2017-04-26 21:20 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML

On Wed, Apr 26, 2017 at 4:48 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> Hi,
>
> I don't know if anyone else already reported this, but I randomly get
> the following kernel crash during boot (Fedora 18, I know it's pretty
> old!!):
>
> [   20.940715] BUG: unable to handle kernel NULL pointer dereference
> at 0000000000000004
> [   20.942122] IP: get_net+0x7/0xd
> [   20.942739] PGD 0
> [   20.942741]
> [   20.943370] Oops: 0002 [#1] SMP
> [   20.943910] CPU: 0 PID: 56 Comm: kauditd Not tainted 4.11.0-rc7+ #499
> [   20.945056] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.8.2-20150714_191134- 04/01/2014
> [   20.946675] task: ffff8800783f8040 task.stack: ffffc90000660000
> [   20.947687] RIP: 0010:get_net+0x7/0xd
> [   20.948310] RSP: 0018:ffffc90000663e98 EFLAGS: 00010202
> [   20.949409] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 9dd28c6188bbd6db
> [   20.951093] RDX: ffffffff8111556a RSI: 0000000057539c73 RDI: 0000000000000000
> [   20.952382] RBP: ffffc90000663e98 R08: 0000000000000000 R09: 0000000000000000
> [   20.953658] R10: ffffc90000663e20 R11: 0000000000000000 R12: 0000000000000000
> [   20.954946] R13: 0000000000000000 R14: ffff8800783f8040 R15: ffff8800783f8040
> [   20.956144] FS:  0000000000000000(0000) GS:ffff88007d200000(0000)
> knlGS:0000000000000000
> [   20.957628] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   20.958672] CR2: 0000000000000004 CR3: 0000000062ea3000 CR4: 00000000000406f0
> [   20.959943] Call Trace:
> [   20.960380]  kauditd_thread+0x60/0x1dc
> [   20.961027]  ? signal_pending_state+0x2f/0x2f
> [   20.961771]  ? auditd_reset+0x5f/0x5f
> [   20.962407]  kthread+0x107/0x10f
> [   20.962963]  ? __list_del_entry+0x22/0x22
> [   20.963657]  ret_from_fork+0x2e/0x40
> [   20.964267] Code: 00 00 5b 41 5c 5d c3 31 c0 83 bf 84 00 00 00 00
> 55 48 89 e5 75 0c 8b 87 cc 00 00 00 2b 87 c8 00 00 00 5d c3 55 48 89
> f8 48 89 e5 <f0> ff 47 04 5d c3 f0 ff 4f 04 74 01 c3 55 48 89 e5 e8 68
> 6b 7a
> [   20.967651] RIP: get_net+0x7/0xd RSP: ffffc90000663e98
> [   20.968526] CR2: 0000000000000004
> [   20.969094] ---[ end trace aab4cbd6ae264eab ]---
> [   20.969886] Kernel panic - not syncing: Fatal exception
> [   20.971312] Kernel Offset: disabled
> [   20.971928] ---[ end Kernel panic - not syncing: Fatal exception
>
>
> I guess commit 264d509637d95f9404e introduced this because I never see
> this crash before pulling net-next today, but given the fact I only
> saw this once so far, I can't just revert it to verify if it is the
> case... It is probably some race condition.
>
> Please let me know if you need any other information.

Hi,

Thanks for the report, this is the only one like it that I've seen.
I'm looking at the code in Linus' tree and I'm not seeing anything
obvious ... looking at the trace above it appears that the problem is
when get_net() goes to bump the refcount and the passed net pointer is
NULL; unless I'm missing something, the only way this would happen in
kauditd_thread() is if the auditd_conn.pid value is non-zero but the
auditd_conn.net pointer is NULL.

That shouldn't happen.

The only way I could see that even being possible is if the
sock_net(NETLINK_CB(skb).sk) call in audit_receive_msg() returned NULL
which in turned was passed as the net pointer (third parameter) in the
auditd_set() call.  Once again, I don't think this should ever be
possible?  Am I missing something?

I realize you aren't able to reproduce this reliably, but if you do,
any chance you try it with Linus' tree?  I'd like to see if we can
rule out the changes in net-next (my testing doesn't typically include
net-next patches).

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-26 21:20 ` Paul Moore
@ 2017-04-27  5:04   ` Cong Wang
  2017-04-27 20:31   ` Cong Wang
  1 sibling, 0 replies; 13+ messages in thread
From: Cong Wang @ 2017-04-27  5:04 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
> Hi,
>
> Thanks for the report, this is the only one like it that I've seen.
> I'm looking at the code in Linus' tree and I'm not seeing anything
> obvious ... looking at the trace above it appears that the problem is
> when get_net() goes to bump the refcount and the passed net pointer is
> NULL; unless I'm missing something, the only way this would happen in
> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
> auditd_conn.net pointer is NULL.
>
> That shouldn't happen.
>
> The only way I could see that even being possible is if the
> sock_net(NETLINK_CB(skb).sk) call in audit_receive_msg() returned NULL
> which in turned was passed as the net pointer (third parameter) in the
> auditd_set() call.  Once again, I don't think this should ever be
> possible?  Am I missing something?

I don't have time to look into it yet, I think I can take a look tomorrow.

>
> I realize you aren't able to reproduce this reliably, but if you do,
> any chance you try it with Linus' tree?  I'd like to see if we can
> rule out the changes in net-next (my testing doesn't typically include
> net-next patches).
>

Will do, I tried to boot for ~6 times today to reproduce it but just saw
it once.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-26 21:20 ` Paul Moore
  2017-04-27  5:04   ` Cong Wang
@ 2017-04-27 20:31   ` Cong Wang
  2017-04-27 21:35     ` Cong Wang
  1 sibling, 1 reply; 13+ messages in thread
From: Cong Wang @ 2017-04-27 20:31 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
> Thanks for the report, this is the only one like it that I've seen.
> I'm looking at the code in Linus' tree and I'm not seeing anything
> obvious ... looking at the trace above it appears that the problem is
> when get_net() goes to bump the refcount and the passed net pointer is
> NULL; unless I'm missing something, the only way this would happen in
> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
> auditd_conn.net pointer is NULL.
>
> That shouldn't happen.
>

Looking at the code that reads/writes the global auditd_conn,
I don't see how it even works with RCU+spinlock, RCU plays
with pointers and you have to make a copy as its name implies.
But it looks like you simply use RCU+spinlock as a traditional
rwlock, it doesn't work.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-27 20:31   ` Cong Wang
@ 2017-04-27 21:35     ` Cong Wang
  2017-04-27 21:45       ` Cong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2017-04-27 21:35 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

[-- Attachment #1: Type: text/plain, Size: 1093 bytes --]

On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
>> Thanks for the report, this is the only one like it that I've seen.
>> I'm looking at the code in Linus' tree and I'm not seeing anything
>> obvious ... looking at the trace above it appears that the problem is
>> when get_net() goes to bump the refcount and the passed net pointer is
>> NULL; unless I'm missing something, the only way this would happen in
>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>> auditd_conn.net pointer is NULL.
>>
>> That shouldn't happen.
>>
>
> Looking at the code that reads/writes the global auditd_conn,
> I don't see how it even works with RCU+spinlock, RCU plays
> with pointers and you have to make a copy as its name implies.
> But it looks like you simply use RCU+spinlock as a traditional
> rwlock, it doesn't work.

The attached patch seems working for me, I tried to boot my
VM for 4 times, so far no crash or warning.

Please let me know if it looks reasonable to you.

[-- Attachment #2: auditd-rcu.diff --]
[-- Type: text/plain, Size: 8231 bytes --]

diff --git a/kernel/audit.c b/kernel/audit.c
index a871bf8..182d31b 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -110,7 +110,6 @@ struct audit_net {
  * @pid: auditd PID
  * @portid: netlink portid
  * @net: the associated network namespace
- * @lock: spinlock to protect write access
  *
  * Description:
  * This struct is RCU protected; you must either hold the RCU lock for reading
@@ -120,8 +119,9 @@ static struct auditd_connection {
 	int pid;
 	u32 portid;
 	struct net *net;
-	spinlock_t lock;
-} auditd_conn;
+	struct rcu_head rcu;
+} *auditd_conn;
+static DEFINE_SPINLOCK(auditd_conn_lock);
 
 /* If audit_rate_limit is non-zero, limit the rate of sending audit records
  * to that number per second.  This prevents DoS attacks, but results in
@@ -223,9 +223,11 @@ struct audit_reply {
 int auditd_test_task(const struct task_struct *task)
 {
 	int rc;
+	pid_t pid;
 
 	rcu_read_lock();
-	rc = (auditd_conn.pid && task->tgid == auditd_conn.pid ? 1 : 0);
+	pid = rcu_dereference(auditd_conn)->pid;
+	rc = (pid && task->tgid == pid ? 1 : 0);
 	rcu_read_unlock();
 
 	return rc;
@@ -426,30 +428,37 @@ static int audit_set_failure(u32 state)
 	return audit_do_config_change("audit_failure", &audit_failure, state);
 }
 
+static void auditd_conn_free(struct rcu_head *rcu)
+{
+	struct auditd_connection *cn = container_of(rcu, struct auditd_connection, rcu);
+
+	if (cn->net)
+		put_net(cn->net);
+	kfree(cn);
+}
+
 /**
- * auditd_set - Set/Reset the auditd connection state
- * @pid: auditd PID
- * @portid: auditd netlink portid
- * @net: auditd network namespace pointer
+ * auditd_set - Set the auditd connection state
+ * @new_coon: the new auditd connection
  *
  * Description:
  * This function will obtain and drop network namespace references as
  * necessary.
  */
-static void auditd_set(int pid, u32 portid, struct net *net)
+static void auditd_set(struct auditd_connection *new_conn)
 {
+	struct auditd_connection *old_conn;
 	unsigned long flags;
 
-	spin_lock_irqsave(&auditd_conn.lock, flags);
-	auditd_conn.pid = pid;
-	auditd_conn.portid = portid;
-	if (auditd_conn.net)
-		put_net(auditd_conn.net);
-	if (net)
-		auditd_conn.net = get_net(net);
-	else
-		auditd_conn.net = NULL;
-	spin_unlock_irqrestore(&auditd_conn.lock, flags);
+	if (new_conn->net)
+		get_net(new_conn->net);
+	spin_lock_irqsave(&auditd_conn_lock, flags);
+	old_conn = rcu_dereference_protected(auditd_conn,
+					lockdep_is_held(&auditd_conn_lock));
+	rcu_assign_pointer(auditd_conn, new_conn);
+	spin_unlock_irqrestore(&auditd_conn_lock, flags);
+
+	call_rcu(&old_conn->rcu, auditd_conn_free);
 }
 
 /**
@@ -537,19 +546,24 @@ static void kauditd_retry_skb(struct sk_buff *skb)
 
 /**
  * auditd_reset - Disconnect the auditd connection
+ * @flags: GFP flags
  *
  * Description:
  * Break the auditd/kauditd connection and move all the queued records into the
  * hold queue in case auditd reconnects.
  */
-static void auditd_reset(void)
+static void auditd_reset(gfp_t flags)
 {
 	struct sk_buff *skb;
+	struct auditd_connection *null_conn;
 
+	null_conn = kzalloc(sizeof(*null_conn), flags);
+	if (!null_conn)
+		return;
 	/* if it isn't already broken, break the connection */
 	rcu_read_lock();
-	if (auditd_conn.pid)
-		auditd_set(0, 0, NULL);
+	if (rcu_dereference(auditd_conn)->pid)
+		auditd_set(null_conn);
 	rcu_read_unlock();
 
 	/* flush all of the main and retry queues to the hold queue */
@@ -585,15 +599,15 @@ static int auditd_send_unicast_skb(struct sk_buff *skb)
 	 *       section netlink_unicast() should safely return an error */
 
 	rcu_read_lock();
-	if (!auditd_conn.pid) {
+	if (!rcu_dereference(auditd_conn)->pid) {
 		rcu_read_unlock();
 		rc = -ECONNREFUSED;
 		goto err;
 	}
-	net = auditd_conn.net;
+	net = rcu_dereference(auditd_conn)->net;
 	get_net(net);
 	sk = audit_get_sk(net);
-	portid = auditd_conn.portid;
+	portid = rcu_dereference(auditd_conn)->portid;
 	rcu_read_unlock();
 
 	rc = netlink_unicast(sk, skb, portid, 0);
@@ -605,7 +619,7 @@ static int auditd_send_unicast_skb(struct sk_buff *skb)
 
 err:
 	if (rc == -ECONNREFUSED)
-		auditd_reset();
+		auditd_reset(GFP_KERNEL);
 	return rc;
 }
 
@@ -735,14 +749,14 @@ static int kauditd_thread(void *dummy)
 	while (!kthread_should_stop()) {
 		/* NOTE: see the lock comments in auditd_send_unicast_skb() */
 		rcu_read_lock();
-		if (!auditd_conn.pid) {
+		if (!rcu_dereference(auditd_conn)->pid) {
 			rcu_read_unlock();
 			goto main_queue;
 		}
-		net = auditd_conn.net;
+		net = rcu_dereference(auditd_conn)->net;
 		get_net(net);
 		sk = audit_get_sk(net);
-		portid = auditd_conn.portid;
+		portid = rcu_dereference(auditd_conn)->portid;
 		rcu_read_unlock();
 
 		/* attempt to flush the hold queue */
@@ -751,7 +765,7 @@ static int kauditd_thread(void *dummy)
 					NULL, kauditd_rehold_skb);
 		if (rc < 0) {
 			sk = NULL;
-			auditd_reset();
+			auditd_reset(GFP_KERNEL);
 			goto main_queue;
 		}
 
@@ -761,7 +775,7 @@ static int kauditd_thread(void *dummy)
 					NULL, kauditd_hold_skb);
 		if (rc < 0) {
 			sk = NULL;
-			auditd_reset();
+			auditd_reset(GFP_KERNEL);
 			goto main_queue;
 		}
 
@@ -774,7 +788,7 @@ static int kauditd_thread(void *dummy)
 					kauditd_send_multicast_skb,
 					kauditd_retry_skb);
 		if (sk == NULL || rc < 0)
-			auditd_reset();
+			auditd_reset(GFP_KERNEL);
 		sk = NULL;
 
 		/* drop our netns reference, no auditd sends past this line */
@@ -1103,7 +1117,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		s.enabled		= audit_enabled;
 		s.failure		= audit_failure;
 		rcu_read_lock();
-		s.pid			= auditd_conn.pid;
+		s.pid			= rcu_dereference(auditd_conn)->pid;
 		rcu_read_unlock();
 		s.rate_limit		= audit_rate_limit;
 		s.backlog_limit		= audit_backlog_limit;
@@ -1139,17 +1153,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			int new_pid = s.pid;
 			pid_t auditd_pid;
 			pid_t requesting_pid = task_tgid_vnr(current);
+			struct auditd_connection *new;
 
+			new = kmalloc(sizeof(*new), GFP_KERNEL);
+			if (!new)
+				return -ENOMEM;
 			/* test the auditd connection */
 			audit_replace(requesting_pid);
 
 			rcu_read_lock();
-			auditd_pid = auditd_conn.pid;
+			auditd_pid = rcu_dereference(auditd_conn)->pid;
 			/* only the current auditd can unregister itself */
 			if ((!new_pid) && (requesting_pid != auditd_pid)) {
 				rcu_read_unlock();
 				audit_log_config_change("audit_pid", new_pid,
 							auditd_pid, 0);
+				kfree(new);
 				return -EACCES;
 			}
 			/* replacing a healthy auditd is not allowed */
@@ -1157,6 +1176,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				rcu_read_unlock();
 				audit_log_config_change("audit_pid", new_pid,
 							auditd_pid, 0);
+				kfree(new);
 				return -EEXIST;
 			}
 			rcu_read_unlock();
@@ -1166,15 +1186,16 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 							auditd_pid, 1);
 
 			if (new_pid) {
+				new->pid = new_pid;
+				new->portid = NETLINK_CB(skb).portid;
+				new->net = sock_net(NETLINK_CB(skb).sk);
 				/* register a new auditd connection */
-				auditd_set(new_pid,
-					   NETLINK_CB(skb).portid,
-					   sock_net(NETLINK_CB(skb).sk));
+				auditd_set(new);
 				/* try to process any backlog */
 				wake_up_interruptible(&kauditd_wait);
 			} else
 				/* unregister the auditd connection */
-				auditd_reset();
+				auditd_reset(GFP_KERNEL);
 		}
 		if (s.mask & AUDIT_STATUS_RATE_LIMIT) {
 			err = audit_set_rate_limit(s.rate_limit);
@@ -1448,8 +1469,8 @@ static void __net_exit audit_net_exit(struct net *net)
 	struct audit_net *aunet = net_generic(net, audit_net_id);
 
 	rcu_read_lock();
-	if (net == auditd_conn.net)
-		auditd_reset();
+	if (net == rcu_dereference(auditd_conn)->net)
+		auditd_reset(GFP_ATOMIC);
 	rcu_read_unlock();
 
 	netlink_kernel_release(aunet->sk);
@@ -1470,8 +1491,9 @@ static int __init audit_init(void)
 	if (audit_initialized == AUDIT_DISABLED)
 		return 0;
 
-	memset(&auditd_conn, 0, sizeof(auditd_conn));
-	spin_lock_init(&auditd_conn.lock);
+	auditd_conn = kzalloc(sizeof(*auditd_conn), GFP_KERNEL);
+	if (!auditd_conn)
+		return -ENOMEM;
 
 	skb_queue_head_init(&audit_queue);
 	skb_queue_head_init(&audit_retry_queue);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-27 21:35     ` Cong Wang
@ 2017-04-27 21:45       ` Cong Wang
  2017-04-27 22:38         ` Paul Moore
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2017-04-27 21:45 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

[-- Attachment #1: Type: text/plain, Size: 1205 bytes --]

On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
>>> Thanks for the report, this is the only one like it that I've seen.
>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>> obvious ... looking at the trace above it appears that the problem is
>>> when get_net() goes to bump the refcount and the passed net pointer is
>>> NULL; unless I'm missing something, the only way this would happen in
>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>> auditd_conn.net pointer is NULL.
>>>
>>> That shouldn't happen.
>>>
>>
>> Looking at the code that reads/writes the global auditd_conn,
>> I don't see how it even works with RCU+spinlock, RCU plays
>> with pointers and you have to make a copy as its name implies.
>> But it looks like you simply use RCU+spinlock as a traditional
>> rwlock, it doesn't work.
>
> The attached patch seems working for me, I tried to boot my
> VM for 4 times, so far no crash or warning.
>

Or even better, save a memory allocation for reset path...

[-- Attachment #2: auditd-rcu.diff --]
[-- Type: text/plain, Size: 6715 bytes --]

diff --git a/kernel/audit.c b/kernel/audit.c
index a871bf8..9953dbe 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -110,7 +110,7 @@ struct audit_net {
  * @pid: auditd PID
  * @portid: netlink portid
  * @net: the associated network namespace
- * @lock: spinlock to protect write access
+ * @rcu: rcu head
  *
  * Description:
  * This struct is RCU protected; you must either hold the RCU lock for reading
@@ -120,8 +120,9 @@ static struct auditd_connection {
 	int pid;
 	u32 portid;
 	struct net *net;
-	spinlock_t lock;
-} auditd_conn;
+	struct rcu_head rcu;
+} *auditd_conn, null_conn;
+static DEFINE_SPINLOCK(auditd_conn_lock);
 
 /* If audit_rate_limit is non-zero, limit the rate of sending audit records
  * to that number per second.  This prevents DoS attacks, but results in
@@ -223,9 +224,11 @@ struct audit_reply {
 int auditd_test_task(const struct task_struct *task)
 {
 	int rc;
+	pid_t pid;
 
 	rcu_read_lock();
-	rc = (auditd_conn.pid && task->tgid == auditd_conn.pid ? 1 : 0);
+	pid = rcu_dereference(auditd_conn)->pid;
+	rc = (pid && task->tgid == pid ? 1 : 0);
 	rcu_read_unlock();
 
 	return rc;
@@ -426,30 +429,39 @@ static int audit_set_failure(u32 state)
 	return audit_do_config_change("audit_failure", &audit_failure, state);
 }
 
+static void auditd_conn_free(struct rcu_head *rcu)
+{
+	struct auditd_connection *cn = container_of(rcu, struct auditd_connection, rcu);
+
+	if (cn == &null_conn)
+		return;
+	if (cn->net)
+		put_net(cn->net);
+	kfree(cn);
+}
+
 /**
- * auditd_set - Set/Reset the auditd connection state
- * @pid: auditd PID
- * @portid: auditd netlink portid
- * @net: auditd network namespace pointer
+ * auditd_set - Set the auditd connection state
+ * @new_coon: the new auditd connection
  *
  * Description:
  * This function will obtain and drop network namespace references as
  * necessary.
  */
-static void auditd_set(int pid, u32 portid, struct net *net)
+static void auditd_set(struct auditd_connection *new_conn)
 {
+	struct auditd_connection *old_conn;
 	unsigned long flags;
 
-	spin_lock_irqsave(&auditd_conn.lock, flags);
-	auditd_conn.pid = pid;
-	auditd_conn.portid = portid;
-	if (auditd_conn.net)
-		put_net(auditd_conn.net);
-	if (net)
-		auditd_conn.net = get_net(net);
-	else
-		auditd_conn.net = NULL;
-	spin_unlock_irqrestore(&auditd_conn.lock, flags);
+	if (new_conn->net)
+		get_net(new_conn->net);
+	spin_lock_irqsave(&auditd_conn_lock, flags);
+	old_conn = rcu_dereference_protected(auditd_conn,
+					lockdep_is_held(&auditd_conn_lock));
+	rcu_assign_pointer(auditd_conn, new_conn);
+	spin_unlock_irqrestore(&auditd_conn_lock, flags);
+
+	call_rcu(&old_conn->rcu, auditd_conn_free);
 }
 
 /**
@@ -548,8 +560,8 @@ static void auditd_reset(void)
 
 	/* if it isn't already broken, break the connection */
 	rcu_read_lock();
-	if (auditd_conn.pid)
-		auditd_set(0, 0, NULL);
+	if (rcu_dereference(auditd_conn)->pid)
+		auditd_set(&null_conn);
 	rcu_read_unlock();
 
 	/* flush all of the main and retry queues to the hold queue */
@@ -585,15 +597,15 @@ static int auditd_send_unicast_skb(struct sk_buff *skb)
 	 *       section netlink_unicast() should safely return an error */
 
 	rcu_read_lock();
-	if (!auditd_conn.pid) {
+	if (!rcu_dereference(auditd_conn)->pid) {
 		rcu_read_unlock();
 		rc = -ECONNREFUSED;
 		goto err;
 	}
-	net = auditd_conn.net;
+	net = rcu_dereference(auditd_conn)->net;
 	get_net(net);
 	sk = audit_get_sk(net);
-	portid = auditd_conn.portid;
+	portid = rcu_dereference(auditd_conn)->portid;
 	rcu_read_unlock();
 
 	rc = netlink_unicast(sk, skb, portid, 0);
@@ -735,14 +747,14 @@ static int kauditd_thread(void *dummy)
 	while (!kthread_should_stop()) {
 		/* NOTE: see the lock comments in auditd_send_unicast_skb() */
 		rcu_read_lock();
-		if (!auditd_conn.pid) {
+		if (!rcu_dereference(auditd_conn)->pid) {
 			rcu_read_unlock();
 			goto main_queue;
 		}
-		net = auditd_conn.net;
+		net = rcu_dereference(auditd_conn)->net;
 		get_net(net);
 		sk = audit_get_sk(net);
-		portid = auditd_conn.portid;
+		portid = rcu_dereference(auditd_conn)->portid;
 		rcu_read_unlock();
 
 		/* attempt to flush the hold queue */
@@ -1103,7 +1115,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 		s.enabled		= audit_enabled;
 		s.failure		= audit_failure;
 		rcu_read_lock();
-		s.pid			= auditd_conn.pid;
+		s.pid			= rcu_dereference(auditd_conn)->pid;
 		rcu_read_unlock();
 		s.rate_limit		= audit_rate_limit;
 		s.backlog_limit		= audit_backlog_limit;
@@ -1139,17 +1151,22 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 			int new_pid = s.pid;
 			pid_t auditd_pid;
 			pid_t requesting_pid = task_tgid_vnr(current);
+			struct auditd_connection *new;
 
+			new = kmalloc(sizeof(*new), GFP_KERNEL);
+			if (!new)
+				return -ENOMEM;
 			/* test the auditd connection */
 			audit_replace(requesting_pid);
 
 			rcu_read_lock();
-			auditd_pid = auditd_conn.pid;
+			auditd_pid = rcu_dereference(auditd_conn)->pid;
 			/* only the current auditd can unregister itself */
 			if ((!new_pid) && (requesting_pid != auditd_pid)) {
 				rcu_read_unlock();
 				audit_log_config_change("audit_pid", new_pid,
 							auditd_pid, 0);
+				kfree(new);
 				return -EACCES;
 			}
 			/* replacing a healthy auditd is not allowed */
@@ -1157,6 +1174,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 				rcu_read_unlock();
 				audit_log_config_change("audit_pid", new_pid,
 							auditd_pid, 0);
+				kfree(new);
 				return -EEXIST;
 			}
 			rcu_read_unlock();
@@ -1166,10 +1184,11 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
 							auditd_pid, 1);
 
 			if (new_pid) {
+				new->pid = new_pid;
+				new->portid = NETLINK_CB(skb).portid;
+				new->net = sock_net(NETLINK_CB(skb).sk);
 				/* register a new auditd connection */
-				auditd_set(new_pid,
-					   NETLINK_CB(skb).portid,
-					   sock_net(NETLINK_CB(skb).sk));
+				auditd_set(new);
 				/* try to process any backlog */
 				wake_up_interruptible(&kauditd_wait);
 			} else
@@ -1448,7 +1467,7 @@ static void __net_exit audit_net_exit(struct net *net)
 	struct audit_net *aunet = net_generic(net, audit_net_id);
 
 	rcu_read_lock();
-	if (net == auditd_conn.net)
+	if (net == rcu_dereference(auditd_conn)->net)
 		auditd_reset();
 	rcu_read_unlock();
 
@@ -1470,8 +1489,9 @@ static int __init audit_init(void)
 	if (audit_initialized == AUDIT_DISABLED)
 		return 0;
 
-	memset(&auditd_conn, 0, sizeof(auditd_conn));
-	spin_lock_init(&auditd_conn.lock);
+	auditd_conn = kzalloc(sizeof(*auditd_conn), GFP_KERNEL);
+	if (!auditd_conn)
+		return -ENOMEM;
 
 	skb_queue_head_init(&audit_queue);
 	skb_queue_head_init(&audit_retry_queue);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-27 21:45       ` Cong Wang
@ 2017-04-27 22:38         ` Paul Moore
  2017-04-27 23:41           ` Cong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Moore @ 2017-04-27 22:38 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML

On Thu, Apr 27, 2017 at 5:45 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
>>>> Thanks for the report, this is the only one like it that I've seen.
>>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>>> obvious ... looking at the trace above it appears that the problem is
>>>> when get_net() goes to bump the refcount and the passed net pointer is
>>>> NULL; unless I'm missing something, the only way this would happen in
>>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>>> auditd_conn.net pointer is NULL.
>>>>
>>>> That shouldn't happen.
>>>>
>>>
>>> Looking at the code that reads/writes the global auditd_conn,
>>> I don't see how it even works with RCU+spinlock, RCU plays
>>> with pointers and you have to make a copy as its name implies.
>>> But it looks like you simply use RCU+spinlock as a traditional
>>> rwlock, it doesn't work.
>>
>> The attached patch seems working for me, I tried to boot my
>> VM for 4 times, so far no crash or warning.
>>
>
> Or even better, save a memory allocation for reset path...

I need to step away from my laptop for the evening so I can't give
this a proper review until tomorrow (sending patches as attachments
makes it difficult to review), but on quick glance I did notice a few
small things I would like to see changed.  However, since there is no
normal commit description and sign-off, I'm guessing you sent these
out as a suggestion and not a proper patch submission, yes/no?  If
that's the case, I'll work up a proper fix tomorrow and share it with
you for comment/review, but if you were planning on sending a proper
patch let me know and I'll wait until I see something in my inbox from
you.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-27 22:38         ` Paul Moore
@ 2017-04-27 23:41           ` Cong Wang
  2017-04-28  0:47             ` Paul Moore
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2017-04-27 23:41 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

On Thu, Apr 27, 2017 at 3:38 PM, Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Apr 27, 2017 at 5:45 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
>>>>> Thanks for the report, this is the only one like it that I've seen.
>>>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>>>> obvious ... looking at the trace above it appears that the problem is
>>>>> when get_net() goes to bump the refcount and the passed net pointer is
>>>>> NULL; unless I'm missing something, the only way this would happen in
>>>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>>>> auditd_conn.net pointer is NULL.
>>>>>
>>>>> That shouldn't happen.
>>>>>
>>>>
>>>> Looking at the code that reads/writes the global auditd_conn,
>>>> I don't see how it even works with RCU+spinlock, RCU plays
>>>> with pointers and you have to make a copy as its name implies.
>>>> But it looks like you simply use RCU+spinlock as a traditional
>>>> rwlock, it doesn't work.
>>>
>>> The attached patch seems working for me, I tried to boot my
>>> VM for 4 times, so far no crash or warning.
>>>
>>
>> Or even better, save a memory allocation for reset path...
>
> I need to step away from my laptop for the evening so I can't give
> this a proper review until tomorrow (sending patches as attachments
> makes it difficult to review), but on quick glance I did notice a few
> small things I would like to see changed.  However, since there is no
> normal commit description and sign-off, I'm guessing you sent these
> out as a suggestion and not a proper patch submission, yes/no?  If
> that's the case, I'll work up a proper fix tomorrow and share it with
> you for comment/review, but if you were planning on sending a proper
> patch let me know and I'll wait until I see something in my inbox from
> you.

I want you to give it sanity check before I submit a formal one. ;)
If you don't reject it, I will send a formal one with description and SoB.

Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-27 23:41           ` Cong Wang
@ 2017-04-28  0:47             ` Paul Moore
  2017-04-28 15:30               ` Paul Moore
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Moore @ 2017-04-28  0:47 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML

In that case please send a proper inline patch to the audit mailing list
and we'll review it.

Thanks.
--
paul moore
www.paul-moore.com



On April 27, 2017 7:41:45 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:

> On Thu, Apr 27, 2017 at 3:38 PM, Paul Moore <paul@paul-moore.com> wrote:
>> On Thu, Apr 27, 2017 at 5:45 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
>>>>>> Thanks for the report, this is the only one like it that I've seen.
>>>>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>>>>> obvious ... looking at the trace above it appears that the problem is
>>>>>> when get_net() goes to bump the refcount and the passed net pointer is
>>>>>> NULL; unless I'm missing something, the only way this would happen in
>>>>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>>>>> auditd_conn.net pointer is NULL.
>>>>>>
>>>>>> That shouldn't happen.
>>>>>>
>>>>>
>>>>> Looking at the code that reads/writes the global auditd_conn,
>>>>> I don't see how it even works with RCU+spinlock, RCU plays
>>>>> with pointers and you have to make a copy as its name implies.
>>>>> But it looks like you simply use RCU+spinlock as a traditional
>>>>> rwlock, it doesn't work.
>>>>
>>>> The attached patch seems working for me, I tried to boot my
>>>> VM for 4 times, so far no crash or warning.
>>>>
>>>
>>> Or even better, save a memory allocation for reset path...
>>
>> I need to step away from my laptop for the evening so I can't give
>> this a proper review until tomorrow (sending patches as attachments
>> makes it difficult to review), but on quick glance I did notice a few
>> small things I would like to see changed.  However, since there is no
>> normal commit description and sign-off, I'm guessing you sent these
>> out as a suggestion and not a proper patch submission, yes/no?  If
>> that's the case, I'll work up a proper fix tomorrow and share it with
>> you for comment/review, but if you were planning on sending a proper
>> patch let me know and I'll wait until I see something in my inbox from
>> you.
>
> I want you to give it sanity check before I submit a formal one. ;)
> If you don't reject it, I will send a formal one with description and SoB.
>
> Thanks.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-28  0:47             ` Paul Moore
@ 2017-04-28 15:30               ` Paul Moore
  2017-04-28 16:11                 ` Cong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Moore @ 2017-04-28 15:30 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML

On Thu, Apr 27, 2017 at 8:47 PM, Paul Moore <paul@paul-moore.com> wrote:
> In that case please send a proper inline patch to the audit mailing list
> and we'll review it.
>
> Thanks.

Now that I'm back in front of a proper screen/keyboard I've been
looking over your patch and while you are very right in that the
current RCU usage is very wrong, there are quite a few things I would
like to see changed in your patch ... I'm working on something right
now, I'll post an RFC draft to the audit list and CC you once I get
this sorted out, expect something in a few hours.

Also, once you've had a look at this new patch, and assuming you are
okay with it, I'd like to add your sign-off to it.  This may not be
your patch exactly, but a significant portion of it is borrowed from
your patch yesterday.

> On April 27, 2017 7:41:45 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
>> On Thu, Apr 27, 2017 at 3:38 PM, Paul Moore <paul@paul-moore.com> wrote:
>>> On Thu, Apr 27, 2017 at 5:45 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> On Thu, Apr 27, 2017 at 2:35 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>> On Thu, Apr 27, 2017 at 1:31 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>>>> On Wed, Apr 26, 2017 at 2:20 PM, Paul Moore <paul@paul-moore.com> wrote:
>>>>>>> Thanks for the report, this is the only one like it that I've seen.
>>>>>>> I'm looking at the code in Linus' tree and I'm not seeing anything
>>>>>>> obvious ... looking at the trace above it appears that the problem is
>>>>>>> when get_net() goes to bump the refcount and the passed net pointer is
>>>>>>> NULL; unless I'm missing something, the only way this would happen in
>>>>>>> kauditd_thread() is if the auditd_conn.pid value is non-zero but the
>>>>>>> auditd_conn.net pointer is NULL.
>>>>>>>
>>>>>>> That shouldn't happen.
>>>>>>>
>>>>>>
>>>>>> Looking at the code that reads/writes the global auditd_conn,
>>>>>> I don't see how it even works with RCU+spinlock, RCU plays
>>>>>> with pointers and you have to make a copy as its name implies.
>>>>>> But it looks like you simply use RCU+spinlock as a traditional
>>>>>> rwlock, it doesn't work.
>>>>>
>>>>> The attached patch seems working for me, I tried to boot my
>>>>> VM for 4 times, so far no crash or warning.
>>>>>
>>>>
>>>> Or even better, save a memory allocation for reset path...
>>>
>>> I need to step away from my laptop for the evening so I can't give
>>> this a proper review until tomorrow (sending patches as attachments
>>> makes it difficult to review), but on quick glance I did notice a few
>>> small things I would like to see changed.  However, since there is no
>>> normal commit description and sign-off, I'm guessing you sent these
>>> out as a suggestion and not a proper patch submission, yes/no?  If
>>> that's the case, I'll work up a proper fix tomorrow and share it with
>>> you for comment/review, but if you were planning on sending a proper
>>> patch let me know and I'll wait until I see something in my inbox from
>>> you.
>>
>> I want you to give it sanity check before I submit a formal one. ;)
>> If you don't reject it, I will send a formal one with description and SoB.
>>
>> Thanks.
>
>



-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-28 15:30               ` Paul Moore
@ 2017-04-28 16:11                 ` Cong Wang
  2017-04-28 16:26                   ` Paul Moore
  0 siblings, 1 reply; 13+ messages in thread
From: Cong Wang @ 2017-04-28 16:11 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

On Fri, Apr 28, 2017 at 8:30 AM, Paul Moore <paul@paul-moore.com> wrote:
> On Thu, Apr 27, 2017 at 8:47 PM, Paul Moore <paul@paul-moore.com> wrote:
>> In that case please send a proper inline patch to the audit mailing list
>> and we'll review it.
>>
>> Thanks.
>
> Now that I'm back in front of a proper screen/keyboard I've been
> looking over your patch and while you are very right in that the
> current RCU usage is very wrong, there are quite a few things I would
> like to see changed in your patch ... I'm working on something right
> now, I'll post an RFC draft to the audit list and CC you once I get
> this sorted out, expect something in a few hours.
>
> Also, once you've had a look at this new patch, and assuming you are
> okay with it, I'd like to add your sign-off to it.  This may not be
> your patch exactly, but a significant portion of it is borrowed from
> your patch yesterday.

So your review process is: if people's V1 patch is not perfect, you
will rewrite it by yourself?

But the normal review process is: people need to address feedback
and send V2, V3 etc..

That's too odd. Someday, no one will be willing to work on audit
patches except yourself.

No offense, just don't feel your review process is cooperative...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-28 16:11                 ` Cong Wang
@ 2017-04-28 16:26                   ` Paul Moore
  2017-05-02  5:50                     ` Cong Wang
  0 siblings, 1 reply; 13+ messages in thread
From: Paul Moore @ 2017-04-28 16:26 UTC (permalink / raw)
  To: Cong Wang; +Cc: LKML

On Fri, Apr 28, 2017 at 12:11 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Apr 28, 2017 at 8:30 AM, Paul Moore <paul@paul-moore.com> wrote:
>> On Thu, Apr 27, 2017 at 8:47 PM, Paul Moore <paul@paul-moore.com> wrote:
>>> In that case please send a proper inline patch to the audit mailing list
>>> and we'll review it.
>>>
>>> Thanks.
>>
>> Now that I'm back in front of a proper screen/keyboard I've been
>> looking over your patch and while you are very right in that the
>> current RCU usage is very wrong, there are quite a few things I would
>> like to see changed in your patch ... I'm working on something right
>> now, I'll post an RFC draft to the audit list and CC you once I get
>> this sorted out, expect something in a few hours.
>>
>> Also, once you've had a look at this new patch, and assuming you are
>> okay with it, I'd like to add your sign-off to it.  This may not be
>> your patch exactly, but a significant portion of it is borrowed from
>> your patch yesterday.
>
> So your review process is: if people's V1 patch is not perfect, you
> will rewrite it by yourself?

As I mentioned earlier I didn't get a chance to properly review your
patch yesterday for two important reasons: 1) it hit my inbox at the
end of my day and I simply didn't have time and 2) you sent it as an
attachment which makes it hard to review and provide feedback.  I took
a closer look at your patch this morning and noticed a number of
things that needed additional work as well as some merge/porting
things; considering I never saw a response from you on my last email
asking for an inline patch submission and taking into account where we
are in the merge window (I'd like to submit this fix during the v4.12
merge window) I went ahead and put together a patch based on your
prototype.

You'll see I just posted it and CC'd you (our emails probably crossed
paths), asking if it was okay to add your sign-off and give you
credit.  I'm only trying to speed up the process.  There is no malice
here, I actually thought I was helping you out ... I suppose the old
adage rings true: no good deed goes unpunished ;)

-- 
paul moore
www.paul-moore.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Boot regression caused by kauditd
  2017-04-28 16:26                   ` Paul Moore
@ 2017-05-02  5:50                     ` Cong Wang
  0 siblings, 0 replies; 13+ messages in thread
From: Cong Wang @ 2017-05-02  5:50 UTC (permalink / raw)
  To: Paul Moore; +Cc: LKML

On Fri, Apr 28, 2017 at 9:26 AM, Paul Moore <paul@paul-moore.com> wrote:
> On Fri, Apr 28, 2017 at 12:11 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Fri, Apr 28, 2017 at 8:30 AM, Paul Moore <paul@paul-moore.com> wrote:
>>> On Thu, Apr 27, 2017 at 8:47 PM, Paul Moore <paul@paul-moore.com> wrote:
>>>> In that case please send a proper inline patch to the audit mailing list
>>>> and we'll review it.
>>>>
>>>> Thanks.
>>>
>>> Now that I'm back in front of a proper screen/keyboard I've been
>>> looking over your patch and while you are very right in that the
>>> current RCU usage is very wrong, there are quite a few things I would
>>> like to see changed in your patch ... I'm working on something right
>>> now, I'll post an RFC draft to the audit list and CC you once I get
>>> this sorted out, expect something in a few hours.
>>>
>>> Also, once you've had a look at this new patch, and assuming you are
>>> okay with it, I'd like to add your sign-off to it.  This may not be
>>> your patch exactly, but a significant portion of it is borrowed from
>>> your patch yesterday.
>>
>> So your review process is: if people's V1 patch is not perfect, you
>> will rewrite it by yourself?
>
> As I mentioned earlier I didn't get a chance to properly review your
> patch yesterday for two important reasons: 1) it hit my inbox at the
> end of my day and I simply didn't have time and 2) you sent it as an
> attachment which makes it hard to review and provide feedback.  I took
> a closer look at your patch this morning and noticed a number of
> things that needed additional work as well as some merge/porting
> things; considering I never saw a response from you on my last email
> asking for an inline patch submission and taking into account where we
> are in the merge window (I'd like to submit this fix during the v4.12
> merge window) I went ahead and put together a patch based on your
> prototype.
>
> You'll see I just posted it and CC'd you (our emails probably crossed
> paths), asking if it was okay to add your sign-off and give you
> credit.  I'm only trying to speed up the process.  There is no malice
> here, I actually thought I was helping you out ... I suppose the old
> adage rings true: no good deed goes unpunished ;)

Sure, it is of course faster since you are the maintainer, but this is
not how we work as a whole community. I am sure it could be very
offensive if you just rewrite other new people's patch only to speed
up.

Speed is one thing, collaboration is another, for the long term the
latter is much more important than the former for kernel community.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-05-02  5:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-26 20:48 Boot regression caused by kauditd Cong Wang
2017-04-26 21:20 ` Paul Moore
2017-04-27  5:04   ` Cong Wang
2017-04-27 20:31   ` Cong Wang
2017-04-27 21:35     ` Cong Wang
2017-04-27 21:45       ` Cong Wang
2017-04-27 22:38         ` Paul Moore
2017-04-27 23:41           ` Cong Wang
2017-04-28  0:47             ` Paul Moore
2017-04-28 15:30               ` Paul Moore
2017-04-28 16:11                 ` Cong Wang
2017-04-28 16:26                   ` Paul Moore
2017-05-02  5:50                     ` Cong Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).