On Sun, Jun 23, 2013 at 03:51:29PM +0400, Glauber Costa wrote:
> On Fri, Jun 21, 2013 at 11:00:21AM +0200, Michal Hocko wrote:
> > On Thu 20-06-13 17:12:01, Michal Hocko wrote:
> > > I am bisecting it again. It is quite tedious, though, because good case
> > > is hard to be sure about.
> >
> > OK, so now I converged to 2d4fc052 (inode: convert inode lru list to generic lru
> > list code.) in my tree and I have double checked it matches what is in
> > the linux-next. This doesn't help much to pin point the issue I am
> > afraid :/
> >
> Can you revert this patch (easiest way ATM is to rewind your tree to a point
> right before it) and apply the following patch?
>
> As Dave has mentioned, it is very likely that this bug was already there, we
> were just not ever checking imbalances. The attached patch would tell us at
> least if the imbalance was there before. If this is the case, I would suggest
> turning the BUG condition into a WARN_ON_ONCE since we would be officially
> not introducing any regression. It is no less of a bug, though, and we should
> keep looking for it.
>
> The main change from before / after the patch is that we are now keeping things
> per node. One possibility of having this BUGing would be to have an inode to be
> inserted into one node-lru and removed from another. I cannot see how it could
> happen, because kernel pages are stable in memory and are not moved from node
> to node. We could still have some sort of weird bug in the node calculation
> function. In any case, would it be possible for you to artificially restrict
> your setup to a single node ? Although I have no idea how to do that, we seem
> to have no parameter to disable numa. Maybe booting with less memory, enough to
> fit a single node?
>

The patch: