* [Ocfs2-devel] [PATCH] ocfs2: fix a panic problem caused by o2cb_ctl
@ 2019-01-25 6:05 Jia Guo
2019-01-25 12:30 ` Joseph Qi
0 siblings, 1 reply; 3+ messages in thread
From: Jia Guo @ 2019-01-25 6:05 UTC (permalink / raw)
To: ocfs2-devel
In the process of creating a node, it will cause NULL pointer reference
in kernel if o2cb_ctl failed in the interval
(mkdir, o2cb_set_node_attribute(node_num)] in function o2cb_add_node.
The node num is initialized to 0 in function o2nm_node_group_make_item,
o2nm_node_group_drop_item will mistake the node number 0 for a
valid node number when we delete the node before the node number is set
correctly. If the local node number of the current host happens to be 0,
cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while
o2hb_thread still running. The panic stack is generated as follows:
o2hb_thread
\-o2hb_do_disk_heartbeat
\-o2hb_check_own_slot
|-slot = ®->hr_slots[o2nm_this_node()];
//o2nm_this_node() return O2NM_INVALID_NODE_NUM
We need to check whether the node number is set when we delete the node.
Signed-off-by: Jia Guo <guojia12@huawei.com>
---
fs/ocfs2/cluster/nodemanager.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/fs/ocfs2/cluster/nodemanager.c b/fs/ocfs2/cluster/nodemanager.c
index 0e4166c..992ccc0 100644
--- a/fs/ocfs2/cluster/nodemanager.c
+++ b/fs/ocfs2/cluster/nodemanager.c
@@ -620,10 +620,16 @@ static void o2nm_node_group_drop_item(struct config_group *group,
{
struct o2nm_node *node = to_o2nm_node(item);
struct o2nm_cluster *cluster = to_o2nm_cluster(group->cg_item.ci_parent);
+ int nd_num_set = 0;
- o2net_disconnect_node(node);
+ /* nd_num might be 0 if the node number hasn't been set.. */
+ if (cluster->cl_nodes[node->nd_num] == node) {
+ nd_num_set = 1;
+
+ if (nd_num_set)
+ o2net_disconnect_node(node);
- if (cluster->cl_has_local &&
+ if (nd_num_set && cluster->cl_has_local &&
(cluster->cl_local_node == node->nd_num)) {
cluster->cl_has_local = 0;
cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
@@ -638,8 +644,7 @@ static void o2nm_node_group_drop_item(struct config_group *group,
if (node->nd_ipv4_address)
rb_erase(&node->nd_ip_node, &cluster->cl_node_ip_tree);
- /* nd_num might be 0 if the node number hasn't been set.. */
- if (cluster->cl_nodes[node->nd_num] == node) {
+ if (nd_num_set) {
cluster->cl_nodes[node->nd_num] = NULL;
clear_bit(node->nd_num, cluster->cl_nodes_bitmap);
}
^ permalink raw reply related [flat|nested] 3+ messages in thread
* [Ocfs2-devel] [PATCH] ocfs2: fix a panic problem caused by o2cb_ctl
2019-01-25 6:05 [Ocfs2-devel] [PATCH] ocfs2: fix a panic problem caused by o2cb_ctl Jia Guo
@ 2019-01-25 12:30 ` Joseph Qi
2019-01-26 1:49 ` Jia Guo
0 siblings, 1 reply; 3+ messages in thread
From: Joseph Qi @ 2019-01-25 12:30 UTC (permalink / raw)
To: ocfs2-devel
On 19/1/25 14:05, Jia Guo wrote:
> In the process of creating a node, it will cause NULL pointer reference
NULL pointer dereference
> in kernel if o2cb_ctl failed in the interval
> (mkdir, o2cb_set_node_attribute(node_num)] in function o2cb_add_node.
>
> The node num is initialized to 0 in function o2nm_node_group_make_item,
> o2nm_node_group_drop_item will mistake the node number 0 for a
> valid node number when we delete the node before the node number is set
> correctly. If the local node number of the current host happens to be 0,
> cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while
> o2hb_thread still running. The panic stack is generated as follows:
>
> o2hb_thread
> \-o2hb_do_disk_heartbeat
> \-o2hb_check_own_slot
> |-slot = ®->hr_slots[o2nm_this_node()];
> //o2nm_this_node() return O2NM_INVALID_NODE_NUM
>
> We need to check whether the node number is set when we delete the node.
>
> Signed-off-by: Jia Guo <guojia12@huawei.com>
> ---
> fs/ocfs2/cluster/nodemanager.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ocfs2/cluster/nodemanager.c b/fs/ocfs2/cluster/nodemanager.c
> index 0e4166c..992ccc0 100644
> --- a/fs/ocfs2/cluster/nodemanager.c
> +++ b/fs/ocfs2/cluster/nodemanager.c
> @@ -620,10 +620,16 @@ static void o2nm_node_group_drop_item(struct config_group *group,
> {
> struct o2nm_node *node = to_o2nm_node(item);
> struct o2nm_cluster *cluster = to_o2nm_cluster(group->cg_item.ci_parent);
> + int nd_num_set = 0;
>
> - o2net_disconnect_node(node);
> + /* nd_num might be 0 if the node number hasn't been set.. */
> + if (cluster->cl_nodes[node->nd_num] == node) {
> + nd_num_set = 1;
> +
> + if (nd_num_set)
> + o2net_disconnect_node(node);
>
> - if (cluster->cl_has_local &&
> + if (nd_num_set && cluster->cl_has_local &&
> (cluster->cl_local_node == node->nd_num)) {
> cluster->cl_has_local = 0;
> cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
> @@ -638,8 +644,7 @@ static void o2nm_node_group_drop_item(struct config_group *group,
> if (node->nd_ipv4_address)
> rb_erase(&node->nd_ip_node, &cluster->cl_node_ip_tree);
Why we have to this if node number hasn't been set?
If this is also under the condition, we may rewrite this patch like:
if (nd_num_set) {
do_something();
}
Thanks,
Joseph
>
> - /* nd_num might be 0 if the node number hasn't been set.. */
> - if (cluster->cl_nodes[node->nd_num] == node) {
> + if (nd_num_set) {
> cluster->cl_nodes[node->nd_num] = NULL;
> clear_bit(node->nd_num, cluster->cl_nodes_bitmap);
> }
>
^ permalink raw reply [flat|nested] 3+ messages in thread
* [Ocfs2-devel] [PATCH] ocfs2: fix a panic problem caused by o2cb_ctl
2019-01-25 12:30 ` Joseph Qi
@ 2019-01-26 1:49 ` Jia Guo
0 siblings, 0 replies; 3+ messages in thread
From: Jia Guo @ 2019-01-26 1:49 UTC (permalink / raw)
To: ocfs2-devel
On 2019/1/25 20:30, Joseph Qi wrote:
>
>
> On 19/1/25 14:05, Jia Guo wrote:
>> In the process of creating a node, it will cause NULL pointer reference
>
> NULL pointer dereference
Fix it in patch v2.
>
>> in kernel if o2cb_ctl failed in the interval
>> (mkdir, o2cb_set_node_attribute(node_num)] in function o2cb_add_node.
>>
>> The node num is initialized to 0 in function o2nm_node_group_make_item,
>> o2nm_node_group_drop_item will mistake the node number 0 for a
>> valid node number when we delete the node before the node number is set
>> correctly. If the local node number of the current host happens to be 0,
>> cluster->cl_local_node will be set to O2NM_INVALID_NODE_NUM while
>> o2hb_thread still running. The panic stack is generated as follows:
>>
>> o2hb_thread
>> \-o2hb_do_disk_heartbeat
>> \-o2hb_check_own_slot
>> |-slot = ®->hr_slots[o2nm_this_node()];
>> //o2nm_this_node() return O2NM_INVALID_NODE_NUM
>>
>> We need to check whether the node number is set when we delete the node.
>>
>> Signed-off-by: Jia Guo <guojia12@huawei.com>
>> ---
>> fs/ocfs2/cluster/nodemanager.c | 13 +++++++++----
>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/ocfs2/cluster/nodemanager.c b/fs/ocfs2/cluster/nodemanager.c
>> index 0e4166c..992ccc0 100644
>> --- a/fs/ocfs2/cluster/nodemanager.c
>> +++ b/fs/ocfs2/cluster/nodemanager.c
>> @@ -620,10 +620,16 @@ static void o2nm_node_group_drop_item(struct config_group *group,
>> {
>> struct o2nm_node *node = to_o2nm_node(item);
>> struct o2nm_cluster *cluster = to_o2nm_cluster(group->cg_item.ci_parent);
>> + int nd_num_set = 0;
>>
>> - o2net_disconnect_node(node);
>> + /* nd_num might be 0 if the node number hasn't been set.. */
>> + if (cluster->cl_nodes[node->nd_num] == node) {
>> + nd_num_set = 1;
>> +
>> + if (nd_num_set)
>> + o2net_disconnect_node(node);
>>
>> - if (cluster->cl_has_local &&
>> + if (nd_num_set && cluster->cl_has_local &&
>> (cluster->cl_local_node == node->nd_num)) {
>> cluster->cl_has_local = 0;
>> cluster->cl_local_node = O2NM_INVALID_NODE_NUM;
>> @@ -638,8 +644,7 @@ static void o2nm_node_group_drop_item(struct config_group *group,
>> if (node->nd_ipv4_address)
>> rb_erase(&node->nd_ip_node, &cluster->cl_node_ip_tree);
>
> Why we have to this if node number hasn't been set?
> If this is also under the condition, we may rewrite this patch like:
> if (nd_num_set) {
> do_something();
> }
o2cb_ctl set attribute in the following order:
1?mkdir, create configfs item
2?set IP port attribute
3?set IP address attribute
4?set node number attribute
IP address is set before the node number, we need to do erase whether the node number
is set successfully or not.
This patch can still rewrite as you suggest, fix it in patch v2.
Thanks,
Jia
>
> Thanks,
> Joseph
>
>>
>> - /* nd_num might be 0 if the node number hasn't been set.. */
>> - if (cluster->cl_nodes[node->nd_num] == node) {
>> + if (nd_num_set) {
>> cluster->cl_nodes[node->nd_num] = NULL;
>> clear_bit(node->nd_num, cluster->cl_nodes_bitmap);
>> }
>>
> .
>
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2019-01-26 1:49 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-25 6:05 [Ocfs2-devel] [PATCH] ocfs2: fix a panic problem caused by o2cb_ctl Jia Guo
2019-01-25 12:30 ` Joseph Qi
2019-01-26 1:49 ` Jia Guo
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.