OF_DYNAMIC node lifecycle

* OF_DYNAMIC node lifecycle
@ 2014-06-18 20:07 ` Grant Likely
  0 siblings, 0 replies; 48+ messages in thread
From: Grant Likely @ 2014-06-18 20:07 UTC (permalink / raw)
  To: Nathan Fontenot, Tyrel Datwyler
  Cc: Benjamin Herrenschmidt, linuxppc-dev,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Pantelis Antoniou

Hi Nathan and Tyrel,

I'm looking into lifecycle issues on nodes modified by OF_DYNAMIC, and
I'm hoping you can help me. Right now, pseries seems to be the only
user of OF_DYNAMIC, but making OF_DYNAMIC work has a huge impact on
the entire kernel because it requires all DT code to manage reference
counting with iterating over nodes. Most users simply get it wrong.
Pantelis did some investigation and found that the reference counts on
a running kernel are all over the place. I have my doubts that any
code really gets it right.

The problem is that users need to know when it is appropriate to call
of_node_get()/of_node_put(). All list traversals that exit early need
an extra call to of_node_put(), and code that is searching for a node
in the tree and holding a reference to it needs to call of_node_get().

I've got a few pseries questions:
- What are the changes being requested by pseries firmware? Is it only
CPUs and memory nodes, or does it manipulate things all over the tree?
- How frequent are the changes? How many changes would be likely over
the runtime of the system?
- Are you able to verify that removed nodes are actually able to be
freed correctly? Do you have any testcases for node removal?

I'm thinking very seriously about changing the locking semantics of DT
code entirely so that most users never have to worry about
of_node_get/put at all. If the DT code is switched to use rcu
primitives for tree iteration (which also means making DT code use
list_head, something I'm already investigating), then instead of
trying to figure out of_node_get/put rules, callers could use
rcu_read_lock()/rcu_read_unlock() to protect the region that is
searching over nodes, and only call of_node_get() if the node pointer
is needed outside the rcu read-side lock.

I'd really like to be rid of the node reference counting entirely, but
I can't figure out a way of doing that safely, so I'd settle for
making it a lot easier to get correct.

Thoughts?

g.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 48+ messages in thread