* [B.A.T.M.A.N.] [PATCH] batman-adv: Fix crash on module shutdown with multiple ifaces
@ 2011-04-15 18:24 Linus Lüssing
2011-04-16 7:54 ` Sven Eckelmann
0 siblings, 1 reply; 5+ messages in thread
From: Linus Lüssing @ 2011-04-15 18:24 UTC (permalink / raw)
To: b.a.t.m.a.n
hardif_remove_interfaces() removes all hard interfaces from the
hardif_list before freeing and cleaning up any device. However the clean
up procedures in orig_hash_del_if()
(hardif_remove_interface()->hardif_disable_interface()->
orig_hash_del_if())
need the other interfaces still to be present in the hardif_list.
Otherwise it won't renumber any preceding interfaces, which leads to an
unhandled kernel paging request in orig_node_del_if()'s "/* copy second
part */" due to wrong hard_if->if_num's.
With this commit the interface removal on module shutdown will be done
in the same way as removing single interfaces from batman only: One
interface will be removed and cleaned at a time.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
---
hard-interface.c | 15 ++++-----------
1 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/hard-interface.c b/hard-interface.c
index b3058e4..f039a3d 100644
--- a/hard-interface.c
+++ b/hard-interface.c
@@ -490,20 +490,13 @@ static void hardif_remove_interface(struct hard_iface *hard_iface)
 void hardif_remove_interfaces(void)
 {
 	struct hard_iface *hard_iface, *hard_iface_tmp;
-	struct list_head if_queue;
 
-	INIT_LIST_HEAD(&if_queue);
-
-	spin_lock(&hardif_list_lock);
-	list_for_each_entry_safe(hard_iface, hard_iface_tmp,
-				 &hardif_list, list) {
+	rtnl_lock();
+	list_for_each_entry_safe(hard_iface, hard_iface_tmp, &hardif_list, list) {
+		spin_lock(&hardif_list_lock);
 		list_del_rcu(&hard_iface->list);
-		list_add_tail(&hard_iface->list, &if_queue);
-	}
-	spin_unlock(&hardif_list_lock);
+		spin_unlock(&hardif_list_lock);
 
-	rtnl_lock();
-	list_for_each_entry_safe(hard_iface, hard_iface_tmp, &if_queue, list) {
 		hardif_remove_interface(hard_iface);
 	}
 	rtnl_unlock();
--
1.7.4.1
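The failure mode described in the commit message can be reproduced in a small userspace model. The sketch below is only an illustration under simplifying assumptions: struct sim_iface, struct sim_orig and all sim_* functions are hypothetical stand-ins, not batman-adv code, and a stale if_num is reported as a failed removal instead of causing a bad memory access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_IFS 8

/* Hypothetical stand-in for struct hard_iface: only what the sketch needs. */
struct sim_iface {
	int if_num;	/* index into every per-interface array */
	bool in_list;	/* true while still linked in the simulated hardif_list */
};

/* Hypothetical stand-in for one originator's per-interface counters. */
struct sim_orig {
	int counters[MAX_IFS];
	int num_ifs;	/* number of valid slots in counters[] */
};

/*
 * Drop slot del_if_num: keep the first part, shift the second part down.
 * Returns false when del_if_num is stale (out of range); the sketch
 * reports the bad index instead of touching memory out of bounds.
 */
static bool sim_orig_del_if(struct sim_orig *orig, int del_if_num)
{
	int i;

	if (del_if_num < 0 || del_if_num >= orig->num_ifs)
		return false;

	/* copy second part */
	for (i = del_if_num; i < orig->num_ifs - 1; i++)
		orig->counters[i] = orig->counters[i + 1];
	orig->num_ifs--;
	return true;
}

/* Remove one interface and renumber those still linked in the list. */
static bool sim_remove_iface(struct sim_orig *orig, struct sim_iface *ifaces,
			     size_t n, size_t victim)
{
	size_t i;

	if (!sim_orig_del_if(orig, ifaces[victim].if_num))
		return false;

	for (i = 0; i < n; i++) {
		if (i == victim || !ifaces[i].in_list)
			continue;	/* unlinked interfaces are never renumbered */
		if (ifaces[i].if_num > ifaces[victim].if_num)
			ifaces[i].if_num--;
	}
	return true;
}

/* Old shutdown order: unlink everything first, then clean up one by one. */
size_t sim_shutdown_bulk(size_t n)
{
	struct sim_iface ifaces[MAX_IFS];
	struct sim_orig orig = { .num_ifs = (int)n };
	size_t i, ok = 0;

	assert(n <= MAX_IFS);
	for (i = 0; i < n; i++)
		ifaces[i] = (struct sim_iface){ .if_num = (int)i, .in_list = false };
	for (i = 0; i < n; i++)
		ok += sim_remove_iface(&orig, ifaces, n, i);
	return ok;	/* number of removals that used a valid if_num */
}

/* Patched order: unlink and clean up one interface at a time. */
size_t sim_shutdown_one_by_one(size_t n)
{
	struct sim_iface ifaces[MAX_IFS];
	struct sim_orig orig = { .num_ifs = (int)n };
	size_t i, ok = 0;

	assert(n <= MAX_IFS);
	for (i = 0; i < n; i++)
		ifaces[i] = (struct sim_iface){ .if_num = (int)i, .in_list = true };
	for (i = 0; i < n; i++) {
		ifaces[i].in_list = false;
		ok += sim_remove_iface(&orig, ifaces, n, i);
	}
	return ok;
}
```

In this model, sim_shutdown_bulk(3) completes only two removals because the last interface still carries its original if_num after the array has shrunk, while sim_shutdown_one_by_one(3) completes all three.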
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Fix crash on module shutdown with multiple ifaces
From: Sven Eckelmann @ 2011-04-16 7:54 UTC (permalink / raw)
To: b.a.t.m.a.n
Linus Lüssing wrote:
> hardif_remove_interfaces() removes all hard interfaces from the
> hardif_list before freeing and cleaning up any device. However the clean
> up procedures in orig_hash_del_if()
> (hardif_remove_interface()->hardif_disable_interface()->
> orig_hash_del_if())
> need the other interfaces still to be present in the hardif_list.
> Otherwise it won't renumber any preceding interfaces, which leads to an
> unhandled kernel paging request in orig_node_del_if()'s "/* copy second
> part */" due to wrong hard_if->if_num's.
>
> With this commit the interface removal on module shutdown will be done
> in the same way as removing single interfaces from batman only: One
> interface will be removed and cleaned at a time.
>
> Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Please use --patience as requested in
http://www.open-mesh.org/wiki/open-mesh/Contribute
Please show us (as part of the commit message) why the information in
http://www.open-mesh.org/projects/batman-adv/repository/revisions/132b776c22c9b71962a3ed9a33e5b6f9218dae1b
isn't valid anymore and explain why it is safe to use the spin_lock only
inside the loop (but it would have to protect the loop in normal situations).
Kind regards,
Sven
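The question about lock scope can be made concrete with a userspace sketch. This is only an illustration under stated assumptions: struct node, shared_push(), detach_all() and detach_one() are hypothetical names, a pthread mutex stands in for hardif_list_lock, and a plain singly linked list stands in for the RCU-protected hardif_list. detach_all() mirrors the pre-patch idea of moving every entry to a private queue inside one critical section. detach_one() shows one way to remove a single entry at a time while still reading the list head under the lock; it is not what the posted patch does, since the patch walks the list with list_for_each_entry_safe() and only locks around list_del_rcu().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical list node; the kernel code uses struct list_head instead. */
struct node {
	struct node *next;
	int id;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *shared_list;	/* protected by list_lock */

/* Add a node at the head of the shared list. */
void shared_push(struct node *n)
{
	pthread_mutex_lock(&list_lock);
	n->next = shared_list;
	shared_list = n;
	pthread_mutex_unlock(&list_lock);
}

/*
 * Variant A: one critical section covers the whole list. Every entry is
 * moved to a private queue that the caller can process without the lock.
 */
struct node *detach_all(void)
{
	struct node *queue;

	pthread_mutex_lock(&list_lock);
	queue = shared_list;
	shared_list = NULL;
	pthread_mutex_unlock(&list_lock);
	return queue;
}

/*
 * Variant B: unlink exactly one entry per call. The head pointer is read
 * and updated under the lock, so no traversal cursor is carried across
 * unlocked sections. Returns NULL once the list is empty.
 */
struct node *detach_one(void)
{
	struct node *n;

	pthread_mutex_lock(&list_lock);
	n = shared_list;
	if (n)
		shared_list = n->next;
	pthread_mutex_unlock(&list_lock);

	if (n)
		n->next = NULL;
	return n;
}
```

A caller using variant B would loop on detach_one() and clean up each returned entry before asking for the next, so the entries not yet removed stay visible in the shared list during each cleanup step.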
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Fix crash on module shutdown with multiple ifaces
From: Sven Eckelmann @ 2011-04-16 8:25 UTC (permalink / raw)
To: b.a.t.m.a.n
Sven Eckelmann wrote:
> Linus Lüssing wrote:
> > hardif_remove_interfaces() removes all hard interfaces from the
> > hardif_list before freeing and cleaning up any device. However the clean
> > up procedures in orig_hash_del_if()
> > (hardif_remove_interface()->hardif_disable_interface()->
> > orig_hash_del_if())
> > need the other interfaces still to be present in the hardif_list.
> > Otherwise it won't renumber any preceding interfaces, which leads to an
> > unhandled kernel paging request in orig_node_del_if()'s "/* copy second
> > part */" due to wrong hard_if->if_num's.
> >
> > With this commit the interface removal on module shutdown will be done
> > in the same way as removing single interfaces from batman only: One
> > interface will be removed and cleaned at a time.
> >
> > Signed-off-by: Linus Lüssing <linus.luessing@web.de>
>
> Please use --patience as requested in
> http://www.open-mesh.org/wiki/open-mesh/Contribute
>
> Please show us (as part of the commit message) why the information in
> http://www.open-mesh.org/projects/batman-adv/repository/revisions/132b776c
> 22c9b71962a3ed9a33e5b6f9218dae1b isn't valid anymore and explain why it is
> safe to use the spin_lock only inside the loop (but it would have to
> protect the loop in normal situations).
Sorry, this was not the correct commit. The commit which fixed a problematic
locking behaviour was 5d4b5a4d, but I didn't give a lockdep output there.
The other question must still be answered.
Btw. what is the status of the sysfs_addrm_finish vs. rtnl_lock patch?
Kind regards,
Sven
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Fix crash on module shutdown with multiple ifaces
From: Linus Lüssing @ 2011-04-27 16:02 UTC (permalink / raw)
To: Sven Eckelmann; +Cc: b.a.t.m.a.n
On Sat, Apr 16, 2011 at 09:54:48AM +0200, Sven Eckelmann wrote:
> Linus Lüssing wrote:
> > hardif_remove_interfaces() removes all hard interfaces from the
> > hardif_list before freeing and cleaning up any device. However the clean
> > up procedures in orig_hash_del_if()
> > (hardif_remove_interface()->hardif_disable_interface()->
> > orig_hash_del_if())
> > need the other interfaces still to be present in the hardif_list.
> > Otherwise it won't renumber any preceding interfaces, which leads to an
> > unhandled kernel paging request in orig_node_del_if()'s "/* copy second
> > part */" due to wrong hard_if->if_num's.
> >
> > With this commit the interface removal on module shutdown will be done
> > in the same way as removing single interfaces from batman only: One
> > interface will be removed and cleaned at a time.
> >
> > Signed-off-by: Linus Lüssing <linus.luessing@web.de>
>
>
> Please use --patience as requested in
> http://www.open-mesh.org/wiki/open-mesh/Contribute
>
> Please show us (as part of the commit message) why the information in
> http://www.open-mesh.org/projects/batman-adv/repository/revisions/132b776c22c9b71962a3ed9a33e5b6f9218dae1b
> isn't valid anymore and explain why it is safe to use the spin_lock only
> inside the loop (but it would have to protect the loop in normal situations).
>
> Kind regards,
> Sven
Hi Sven,
Ah, oki doki, didn't know about commit 5d4b5a4d and yes, a revert
of that commit looks kind of similar to my patch.
Commit 5d4b5a4d together with your statement confuse me a little. The
commit message does not say anything about a locking dependency
issue, but seems to be a performance patch (which does not seem like
such a severe problem to me, as removing/adding interfaces /
removing the batman-adv module does not happen that frequently in
common setups ;) ). Could you explain a little further which
combinations of locks could introduce a problem?
Hmm, for the "and explain why it is safe to use the spin_lock
only" part, agreed, I think it was initially a mistake of mine.
And usually this would not protect us from a new interface being
added or an interface being removed from batman during a
NETDEV_REGISTER/UNREGISTER event while we are trying to flush the
if_list.
However, just before calling hardif_remove_interfaces(), we are
calling unregister_netdevice_notifier(&hard_if_notifier).
So as far as I know, no hardif_add_interface() or
hardif_remove_interface() and corresponding list_add/del_rcu for the
if_list should be called anymore.
Cheers, Linus
PS: And it's actually not an "unhandled kernel paging request" but
a "Null pointer dereference". Also see this ticket:
http://www.open-mesh.org/issues/147
I'm also wondering why we'd actually need the rtnl_lock() in
hardif_remove_interfaces() then with that reasoning. What
operation in hardif_remove_interface() (without the 's') needs to
be protected with the rtnl_lock()? Could we place the rtnl_lock a
little tighter instead to also fix the issue from here?
http://www.open-mesh.org/issues/145
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv: Fix crash on module shutdown with multiple ifaces
From: Sven Eckelmann @ 2011-04-29 19:58 UTC (permalink / raw)
To: Linus Lüssing; +Cc: b.a.t.m.a.n
Linus Lüssing wrote:
> Ah, oki doki, didn't know about commit 5d4b5a4d and yes, a revert
> of that commit looks kind of similar to my patch.
>
> Commit 5d4b5a4d together with your statement confuse me a little. The
> commit message does not say anything about a locking dependency
> issue, but seems to be a performance patch (which does not seem like
> such a severe problem to me, as removing/adding interfaces /
> removing the batman-adv module does not happen that frequently in
> common setups ;) ). Could you explain a little further which
> combinations of locks could introduce a problem?
No
> Hmm, for the "and explain why it is safe to use the spin_lock
> only" part, agreed, I think it was initially a mistake of mine.
> And usually this would not protect us from a new interface being
> added or an interface being removed from batman during a
> NETDEV_REGISTER/UNREGISTER event while we are trying to flush the
> if_list.
> However, just before calling hardif_remove_interfaces(), we are
> calling unregister_netdevice_notifier(&hard_if_notifier).
> So as far as I know, no hardif_add_interface() or
> hardif_remove_interface() and corresponding list_add/del_rcu for the
> if_list should be called anymore.
Interesting assumption, but how did you ensure that everything is in a
synchronous state? The network core also uses rcu - and it doesn't use the
atomic notifier functions.
> Cheers, Linus
>
> PS: And it's actually not an "unhandled kernel paging request" but
> a "Null pointer dereference". Also see this ticket:
> http://www.open-mesh.org/issues/147
>
> I'm also wondering why we'd actually need the rtnl_lock() in
> hardif_remove_interfaces() then with that reasoning. What
> operation in hardif_remove_interface() (without the 's') needs to
> be protected with the rtnl_lock()? Could we place the rtnl_lock a
> little tighter instead to also fix the issue from here?
> http://www.open-mesh.org/issues/145
See 132b776c22c9b71962a3ed9a33e5b6f9218dae1b
I will propose two different patches.
Regards,
Sven