* slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
@ 2015-09-02 23:13 Linus Torvalds
2015-09-03 0:48 ` Andrew Morton
2015-09-03 0:51 ` Mike Snitzer
0 siblings, 2 replies; 42+ messages in thread
From: Linus Torvalds @ 2015-09-02 23:13 UTC (permalink / raw)
To: Mike Snitzer, Dave Chinner, Christoph Lameter, Pekka Enberg,
Andrew Morton, David Rientjes, Joonsoo Kim
Cc: dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka,
Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen,
linux-mm
On Wed, Sep 2, 2015 at 10:39 AM, Mike Snitzer <snitzer@redhat.com> wrote:
>
> - last but not least: add SLAB_NO_MERGE flag to mm/slab_common and
> disable slab merging for all of DM's slabs (XFS will also use
> SLAB_NO_MERGE once merged).
So I'm not at all convinced this is the right thing to do. In fact,
I'm pretty convinced it shouldn't be done this way. Since those
commits were at the top of your tree, I just didn't pull them, but
took the rest..
You are basically making this one-sided decision based on your notion
of convenience, and just forcing that thing unconditionally on people.
Your rationale seems _totally_ bogus: you say that it's to be able to
observe the sizes of the dm slabs without using slab debugging.
First off, you don't have to enable slab debugging. You can just
disable slab merging. It's called "slab_nomerge". It does exactly
what you would think it does.
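[For reference: "slab_nomerge" is a boot-time kernel parameter, not a build or runtime option, so trying it means appending it to the kernel command line. An illustrative grub-style entry (the image name and root device here are made up):

```
linux /boot/vmlinuz-4.2.0 root=/dev/sda1 ro slab_nomerge
```

With that in place every kmem_cache gets its own line in /proc/slabinfo instead of being folded into a compatible cache.]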
And what is it that makes dm slabs such a special little princess?
What makes you think that the fact that _you_ want to look at slab
statistics means that everybody else suddenly must have separate slabs
for dm, and dm only? Or xfs?
The other "rationale" was that not merging slabs limits
cross-subsystem memory corruption. Again, what the _hell_ is special
about device mapper that dm - and only dm - would make this a special
thing? That is just pure and utter garbage. Again, we already have
that "slab_nomerge" option, exactly so that when odd slab corruption
issues happen (they are rare, but they do occasionally happen), you
can try that to see if that pinpoints the problem more. And it is
*not* limited to some random set of subsystems. Which makes it clearly
superior to your broken approach, wouldn't you agree?
The only possible true rationale for why dm is special is "because dm
is such a buggy piece of sh*t that it's much more likely to have these
slab corruption bugs than anything else, so I'm just protecting the
rest of the system".
Is that really your rationale? Somehow I doubt it. But if it is, you
really should have said so. At least then it would make sense why this
thing came in through the dm tree, and why dm is so special that it -
and only it - would disable slab merging.
So I'm not pulling things like this from the device mapper tree. There
is just no excuse that I can see for something like SLAB_NO_MERGE to
go through the dm tree in the first place, but that's doubly true when
the rationale for these things was bogus and had nothing whatsoever
to do with dm.
Things like this aren't supposed to come in through random irrelevant
trees like this, and with no discussion (at least judging by the
commits) with the maintainers of the other pieces of code.
If you have issues with slab merging, then those should be discussed
as such, not as some magical and bogus dm or xfs special case when
they damn well aren't, and damn well will never be.
Yes, I'm annoyed. This was not done well. I realize that everybody
thinks that _their_ code is so special and exceptional that
"obviously" they should be treated specially, but I don't see that
that is the case at all in this case.
If you want to argue that slab merging should be disabled by default,
then that is an argument that I'm willing to believe might be valid
("the downsides are bigger than the upsides"). Or if you are able to
explain why dm really _is_ special, that's an option too. But this
kind of "random subsystems decide unilaterally to not follow the
normal rules" is not acceptable. Not when the "arguments" for it have
absolutely nothing in particular to do with that subsystem.
Linus
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
2015-09-02 23:13 slab-nomerge (was Re: [git pull] device mapper changes for 4.3) Linus Torvalds
@ 2015-09-03 0:48 ` Andrew Morton
2015-09-03 0:53 ` Mike Snitzer
1 sibling, 1 reply; 42+ messages in thread
From: Andrew Morton @ 2015-09-03 0:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mike Snitzer, Dave Chinner, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
Viresh Kumar, Heinz Mauelshagen, linux-mm
On Wed, 2 Sep 2015 16:13:44 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, Sep 2, 2015 at 10:39 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > - last but not least: add SLAB_NO_MERGE flag to mm/slab_common and
> > disable slab merging for all of DM's slabs (XFS will also use
> > SLAB_NO_MERGE once merged).
>
> So I'm not at all convinced this is the right thing to do. In fact,
> I'm pretty convinced it shouldn't be done this way. Since those
> commits were at the top of your tree, I just didn't pull them, but
> took the rest..
I don't have problems with the patch itself, really. It only affects
callers who use SLAB_NO_MERGE, and those developers can make their
own decisions.
It is a bit sad to de-optimise dm for all users for all time in order
to make life a bit easier for dm's developers, but maybe that's a
decent tradeoff.
What I do have a problem with is that, afaict, the patch appeared on
linux-mm for the first time just yesterday. It didn't cc the slab
developers, it isn't in linux-next, and the pull request didn't cc
linux-kernel, linux-mm or the slab/mm developers. Bad!
I'd like the slab developers to have time to understand and review
this change, please. Partly so they have a chance to provide feedback
for the usual reasons, but also to help them understand the effect
their design choice had on client subsystems.
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
2015-09-03 0:48 ` Andrew Morton
@ 2015-09-03 0:53 ` Mike Snitzer
0 siblings, 0 replies; 42+ messages in thread
From: Mike Snitzer @ 2015-09-03 0:53 UTC (permalink / raw)
To: Andrew Morton
Cc: Linus Torvalds, Dave Chinner, Christoph Lameter, Pekka Enberg,
David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
Viresh Kumar, Heinz Mauelshagen, linux-mm
On Wed, Sep 02 2015 at 8:48pm -0400, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 2 Sep 2015 16:13:44 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
> > On Wed, Sep 2, 2015 at 10:39 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> > >
> > > - last but not least: add SLAB_NO_MERGE flag to mm/slab_common and
> > > disable slab merging for all of DM's slabs (XFS will also use
> > > SLAB_NO_MERGE once merged).
> >
> > So I'm not at all convinced this is the right thing to do. In fact,
> > I'm pretty convinced it shouldn't be done this way. Since those
> > commits were at the top of your tree, I just didn't pull them, but
> > took the rest..
>
> I don't have problems with the patch itself, really. It only affects
> callers who use SLAB_NO_MERGE and those developers can make
> their own decisions.
>
> It is a bit sad to de-optimise dm for all users for all time in order
> to make life a bit easier for dm's developers, but maybe that's a
> decent tradeoff.
>
> What I do have a problem with is that afaict the patch appeared on
> linux-mm for the first time just yesterday. Didn't cc slab developers,
> it isn't in linux-next, didn't cc linux-kernel or linux-mm or slab/mm
> developers on the pull request. Bad!
Yep, noted. Won't happen again.
> I'd like the slab developers to have time to understand and review this
> change, please. Partly so they have a chance to provide feedback for
> the usual reasons, but also to help them understand the effect their
> design choice had on client subsystems.
Sure, sorry to force the issue like I did.
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
2015-09-02 23:13 slab-nomerge (was Re: [git pull] device mapper changes for 4.3) Linus Torvalds
@ 2015-09-03 0:51 ` Mike Snitzer
1 sibling, 0 replies; 42+ messages in thread
From: Mike Snitzer @ 2015-09-03 0:51 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Chinner, Christoph Lameter, Pekka Enberg, Andrew Morton,
David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
Viresh Kumar, Heinz Mauelshagen, linux-mm
On Wed, Sep 02 2015 at 7:13pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Wed, Sep 2, 2015 at 10:39 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> >
> > - last but not least: add SLAB_NO_MERGE flag to mm/slab_common and
> > disable slab merging for all of DM's slabs (XFS will also use
> > SLAB_NO_MERGE once merged).
>
> So I'm not at all convinced this is the right thing to do. In fact,
> I'm pretty convinced it shouldn't be done this way. Since those
> commits were at the top of your tree, I just didn't pull them, but
> took the rest..
OK, thanks.
> You are basically making this one-sided decision based on your notion
> of convenience, and just forcing that thing unconditionally on people.
The switch to slab merging was forced on everyone without proper
notice. What I made possible with SLAB_NO_MERGE is for each subsystem
to decide if it would prefer not to allow slab merging.
> Your rationale seems _totally_ bogus: you say that it's to be able to
> observe the sizes of the dm slabs without using slab debugging.
>
> First off, you don't have to enable slab debugging. You can just
> disable slab merging. It's called "slab_nomerge". It does exactly
> what you would think it does.
I'm well aware of slab_nomerge. I called it out in my commit message.
> And what is it that makes dm slabs such a special little princess?
> What makes you think that the fact that _you_ want to look at slab
> statistics means that everybody else suddenly must have separate slabs
> for dm, and dm only? Or xfs?
From where I sit it is much more useful to have separate slabs. If a
case were actually made for slab merging I could change my view. But
as of now these trump the stated benefits of slab merging:
1) useful slab usage stats
2) fault isolation from other subsystems
> The other "rationale" was that not merging slabs limits
> cross-subsystem memory corruption. Again, what the _hell_ is special
> about device mapper that dm - and only dm - would make this a special
> thing? That is just pure and utter garbage. Again, we already have
> that "slab_nomerge" option, exactly so that when odd slab corruption
> issues happen (they are rare, but they do occasionally happen), you
> can try that to see if that pinpoints the problem more. And it is
> *not* limited to some random set of subsystems. Which makes it clearly
> superior to your broken approach, wouldn't you agree?
I'm not interested in deciding such things for everyone. I added a
flag that enables piecewise opt-in to unshared slabs for subsystems
that really don't want shared slabs. Aside from improved accounting,
the point is to not allow other crap code (e.g. staging or whatever)
to impact other subsystems via shared slabs.
> The only possible true rationale for why dm is special is "because dm
> is such a buggy piece of sh*t that it's much more likely to have these
> slab corruption bugs than anything else, so I'm just protecting the
> rest of the system".
>
> Is that really your rationale? Somehow I doubt it. But if it is, you
> really should have said so. At least then it would make sense why this
> thing came in through the dm tree, and why dm is so special that it -
> and only it - would disable slab merging.
The 3 lines that added SLAB_NO_MERGE were pretty damn clean.
SLAB_NO_MERGE gives subsystems a choice they didn't have before, one
they frankly probably never knew they had to care about, because they
didn't know slabs were being merged. I asked around enough to know
I'm not an idiot for having missed the memo on slab merging.
Lack of awareness aside, nobody ever _convincingly_ detailed why slab
merging was pushed on everyone. Look at the header for commit 12220de
("mm/slab: support slab merge") -- now that is some seriously weak
justification! I sought more insight on "why slab merging?" and all I
found was this in Documentation/vm/slub.txt:
  "Slab merging
   ------------
   If no debug options are specified then SLUB may merge similar slabs
   together in order to reduce overhead and increase cache hotness of
   objects. slabinfo -a displays which slabs were merged together."
I couldn't even find which package provides slabinfo so I could run
"slabinfo -a"! And the hand-wavy "reduce overhead and increase cache
hotness of objects" frankly sucks.
> So I'm not pulling things like this from the device mapper tree. There
> is just no excuse that I can see for something like SLAB_NO_MERGE to
> go through the dm tree in the first place, but that's doubly true when
> the rationale for these things was bogus and had nothing whatsoever
> to do with dm.
As DM maintainer I do have a choice about how the subsystem is
architected.
> Things like this aren't supposed to come in through random irrelevant
> trees like this, and with no discussion (at least judging by the
> commits) with the maintainers of the other pieces of code.
DM is irrelevant now? Because I pissed you off? Or because you truly
think that?
This is the first and hopefully last time I get flamed by you. I
shouldn't have pushed for this change so aggressively. The lack of
feedback from mm people shouldn't have been taken by me as an implied
"we forced it on you a year ago, fuck you". But I'm genuinely _not_
appreciative of this change to shared slabs, so I took action to
restore what I hold to be the right way to design system software.
> If you have issues with slab merging, then those should be discussed
> as such, not as some magical and bogus dm or xfs special case when
> they damn well aren't, and damn well will never be.
>
> Yes, I'm annoyed. This was not done well. I realize that everybody
> thinks that _their_ code is so special and exceptional that
> "obviously" they should be treated specially, but I don't see that
> that is the case here at all.
>
> If you want to argue that slab merging should be disabled by default,
> then that is an argument that I'm willing to believe might be valid
> ("the downsides are bigger than the upsides"). Or if you are able to
> explain why dm really _is_ special, that's an option too. But this
> kind of "random subsystems decide unilaterally to not follow the
> normal rules" is not acceptable. Not when the "arguments" for it have
> absolutely nothing in particular to do with that subsystem.
DM isn't special. I never intended it to come off like it is. I don't
want slab merging, but as a middle ground I made it so each subsystem
is left to decide whether to use it or not.
I clearly was the first to take issue with slab merging by calling it
out with patches. In doing so, Dave Chinner said he'd rather avoid
using shared slabs in XFS. Pretty sure XFS isn't irrelevant yet. I'd
wager there would be a flood of other subsystems opting to use
SLAB_NO_MERGE. I can appreciate that as something the pro-slab-merge
camp would like to avoid (the more that opt out, the more useless
slab merging becomes).
It is messed up that no _real_ justification was given for slab
merging yet it was pushed on everyone. Thankfully it hasn't been
unstable (which backs up your point), but I'd still love to understand
how it is so beneficial. Is it a significant win? If so, where? Or is
it a micro-optimization at the expense of both accounting and fault
isolation?
Mike
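[The per-cache accounting Mike is arguing for is what /proc/slabinfo shows when caches stay unmerged: one line per cache, with object counts and sizes. A toy sketch of that view, run against a canned two-line sample rather than the live file -- the cache names and counts here are made up:

```shell
# Fields mimic /proc/slabinfo: name, active_objs, objsize.
# With merging, a cache like dm_io can be folded into a same-sized
# generic cache, and this per-cache breakdown is lost.
printf 'dm_io 512 40\nkmalloc-64 9000 64\n' |
awk '{ printf "%s: %s objs x %s bytes = %d bytes\n", $1, $2, $3, $2*$3 }'
```

On a real system the slabinfo tool (built from the kernel source tree, which is why no distro package provides it) reports the same data plus which caches were merged via "slabinfo -a".]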
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
2015-09-03 0:51 ` Mike Snitzer
@ 2015-09-03 1:21 ` Linus Torvalds
2015-09-03 2:31 ` Mike Snitzer
2015-09-03 6:02 ` Dave Chinner
1 sibling, 2 replies; 42+ messages in thread
From: Linus Torvalds @ 2015-09-03 1:21 UTC (permalink / raw)
To: Mike Snitzer
Cc: Dave Chinner, Christoph Lameter, Pekka Enberg, Andrew Morton,
David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
Viresh Kumar, Heinz Mauelshagen, linux-mm
On Wed, Sep 2, 2015 at 5:51 PM, Mike Snitzer <snitzer@redhat.com> wrote:
>
> What I made possible with SLAB_NO_MERGE is for each subsystem to decide
> if they would prefer to not allow slab merging.
.. and why is that a choice that even makes sense at that level?
Seriously.
THAT is the fundamental issue here.
There are absolutely zero reasons this is dm-specific, but it is
equally true that there are absolutely zero reasons that it is
xyzzy-specific, for any random value of 'xyzzy'.
And THAT is why I'm fairly convinced that the whole approach is bogus
and broken.
And note that that bogosity is separate from how this was done. It's
a broken approach, but it was also done wrong. Two totally separate
issues, but together it sure is annoying.
> From where I sit it is much more useful to have separate slabs. Could
> be if a case was actually made for slab merging I'd change my view. But
> as of now these trump the stated benefits of slab merging:
> 1) useful slab usage stats
> 2) fault isolation from other subsystems
.. and again, absolutely NEITHER of those has anything to do with
"subsystem X". Can you really not see how *illogical* it is to make
this a subsystem choice?
So explain to me why you made it so?
> The 3 lines that added SLAB_NO_MERGE were pretty damn clean.
No. It really seriously wasn't.
The code may be simple, but it sure isn't "pretty damn clean", exactly
because I think the whole concept is fundamentally illogical. See
above.
As I mentioned in my email: if your point is that "slab_nomerge" has
the wrong default value, then that is a different discussion, and one
that may well be valid.
But the whole concept of "random slabs can mark themselves no-merge
for no obvious reasons" is broken. That was my argument, and you don't
seem to get it. And even if it turns out not to be broken (please
explain), it still should have been discussed.
> SLAB_NO_MERGE gives subsystems a choice they didn't have before and they
> frankly probably never knew they had to care about it because they didn't
> know slabs were being merged. I asked around enough to know I'm not an
> idiot for having missed the memo on slab merging.
Put another way: things have been merged for years, and you didn't
even notice.
Seriously. I'm not exaggerating about "for years". At least for slub,
it's been that way since it was initially merged, back in 2007. Yeah,
it may have taken a while for slub to then become one of the major
allocators, but it's been the default in at least Fedora for years
and years too, afaik, so it's not like slub is something odd and
unusual.
You seem to argue that "not being aware of it" means that it's
surprising and should be disabled. But quite frankly, wouldn't you say
that "it hasn't caused any obvious problems" is at _least_ as likely
an explanation for you not being aware of it?
Because clearly, that lack of statistics and the possible
cross-subsystem corruption hasn't actually been a pressing concern in
reality.
But suddenly it became such a big issue that you just _had_ to fix it,
right? After seven years it's suddenly *so* important that dm
absolutely has to disable it. And it really had to be dm that did it
for its caches, rather than just use "slab_nomerge". Despite there
not being anything dm-specific about that choice.
Now tell me, what was the rationale for this all again?
Because really, I'm not seeing it. And I'm _particularly_ not seeing
why it then had to be sneaked in like this.
Linus
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 1:21 ` Linus Torvalds @ 2015-09-03 2:31 ` Mike Snitzer 2015-09-03 3:10 ` Christoph Lameter 2015-09-03 3:11 ` Linus Torvalds 2015-09-03 6:02 ` Dave Chinner 1 sibling, 2 replies; 42+ messages in thread From: Mike Snitzer @ 2015-09-03 2:31 UTC (permalink / raw) To: Linus Torvalds Cc: Heinz Mauelshagen, Andrew Morton, Viresh Kumar, Dave Chinner, Joe Thornber, Pekka Enberg, linux-mm, dm-devel, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, David Rientjes, Joonsoo Kim, Christoph Lameter, Alasdair G Kergon On Wed, Sep 02 2015 at 9:21pm -0400, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Sep 2, 2015 at 5:51 PM, Mike Snitzer <snitzer@redhat.com> wrote: > > > > What I made possible with SLAB_NO_MERGE is for each subsystem to decide > > if they would prefer to not allow slab merging. > > .. and why is that a choice that even makes sense at that level? > > Seriously. > > THAT is the fundamental issue here. > > There are absolutely zero reasons this is dm-specific, but it is > equally true that there are absolutely zero reasons that it is > xyzzy-specific, for any random value of 'xyzzy'. > > And THAT is why I'm fairly convinced that the whole approach is bogus > and broken. Why do we even have slab creation flags? Andrew seemed much more reasonable about this. > And note that that bogosity is separate from how this was done. It's a > broken approach, but it was also done wrong. Two totally separate > issues, but together it sure is annoying. > > > From where I sit it is much more useful to have separate slabs. Could > > be if a case was actually made for slab merging I'd change my view. But > > as of now these trump the stated benefits of slab merging: > > 1) useful slab usage stats > > 2) fault isolation from other subsystems > > .. and again, absolutely NEITHER of those have anything to do with > "subsystem X". OK, I get that I'm unimportant. 
You can stop beating me over my irrelevant subsystem maintainer head now... But when longstanding isolation and functionality is removed in the name of microoptimizations its difficult to accept -- even if the realization occurs years after the fact. > Can you really not see how *illogical* it is to make this a subsystem choice? > > So explain to me why you made it so? > > > The 3 lines that added SLAB_NO_MERGE were pretty damn clean. > > No. It really seriously wasn't. > > The code may be simple, but it sure isn't "pretty damn clean", exactly > because I think the whole concept is fundamentally illogical. See > above. Yeah, your circular logic doesn't help me. You defined your argument in terms of unsubstantiated claims of me being illogical. What is illogical about wanting DM to: 1) have useful slab accounting 2) have fault isolation from other slab consumers 3) not impose 1+2 on all other subsystems ? I guess I'm just supposed to accept that slab merging is or isn't. There is no in-between (unless I create a slab with SLAB_DESTROY_BY_RCU) > As I mentioned in my email: if your point is that "slab_nomerge" has > the wrong default value, then that is a different discussion, and one > that may well be valid. > > But the whole concept of "random slabs can mark themselves no-merge > for no obvious reasons" is broken. That was my argument, and you don't > seem to get it. I'm not getting it because I don't understand why you really care. What implied benefits come with slab merging that I'm painfully unaware of? Andrew said DM would miss out on performance benefits. I'd obviously not want to do that; but said performance benefits haven't been made apparent. > And even if it turns out not to be broken (please explain), it still > should have been discussed. See above ;) > > SLAB_NO_MERGE gives subsystems a choice they didn't have before and they > > frankly probably never knew they had to care about it because they didn't > > know slabs were being merged. 
I asked around enough to know I'm not an > > idiot for having missed the memo on slab merging. > > Put another way: things have been merged for years, and you didn't even notice. > > Seriously. I'm not exaggerating about "for years". At least for slub, > it's been that way since it was initially merged, back in 2007. > Yeah, it may have taken a while for slub to then become one of the > major allocators, but it's been the default in at least Fedora for > years and years too, afaik, so it's not like slub is something odd and > unusual. You're also coming at this from a position that shared slabs are automatically good because they have been around for years. For those years I've not had a need to debug a leak in code I maintain; so I didn't notice slabs were merged. I also haven't observed slab corruption being the cause of crashes in DM, block or SCSI. > You seem to argue that "not being aware of it" means that it's > surprising and should be disabled. But quite frankly, wouldn't you say > that "it hasn't caused any obvious problems" is at _least_ as likely > an explanation for you not being aware of it? Sure. > Because clearly, that lack of statistics and the possible > cross-subsystem corruption hasn't actually been a pressing concern in > reality. Agreed. > But suddenly it became such a big issue that you just _had_ to fix it, > right? After seven years it's suddenly *so* important that dm > absolutely has to disable it. And it really had to be dm that did it > for its caches, rather than just use "slab_nomerge". The ship sailed on disabling it for everyone. It is the new norm. I cannot push RHEL to flip-flop slab characteristics (at least not until the next major release). > Despite there not being anything dm-specific about that choice. > > Now tell me, what was the rationale for this all again? I was the first to want the option to opt-out on a per slab basis. And you're shooting the messenger. Calling me illogical. > Because really, I'm not seeing it. 
And I'm _particularly_ not seeing > why it then had to be sneaked in like this. And I'm sneaky too... Sneaking isn't what this was. Apologies if that's how it came off. I can appreciate why you might think that. But like I said to Andrew: won't happen again. I'm off the next 5 days. I don't think either of us cares _that_ strongly about this particular issue. I've noted my process flaws. I'll calm down and this will just be some unfortunate thing that happened. But I'd still like some pointers/help on what makes slab merging so beneficial. I'm sure Christoph and others have justification. But if not then yes the default to slab merging probably should be revisited. Mike -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org ^ permalink raw reply [flat|nested] 42+ messages in thread
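For context, the opt-out under discussion boils down to passing one extra flag at cache-creation time. A minimal sketch of how a subsystem might use it (hedged: SLAB_NO_MERGE here is the flag from the reverted patch, and the cache name and struct are invented for illustration):

```c
/* Sketch only: SLAB_NO_MERGE is the flag from the patch under
 * discussion; "example_cache" and struct example_obj are invented. */
static struct kmem_cache *example_cache;

static int __init example_init(void)
{
	example_cache = kmem_cache_create("example_cache",
					  sizeof(struct example_obj),
					  0,		 /* default alignment */
					  SLAB_NO_MERGE, /* opt out of cache merging */
					  NULL);	 /* no constructor */
	if (!example_cache)
		return -ENOMEM;
	return 0;
}
```

With the flag set, the cache keeps its own line in /proc/slabinfo instead of being folded into a size-compatible cache; without it, the allocator is free to merge.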
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 2:31 ` Mike Snitzer @ 2015-09-03 3:10 ` Christoph Lameter 2015-09-03 4:55 ` Andrew Morton 2015-09-03 3:11 ` Linus Torvalds 1 sibling, 1 reply; 42+ messages in thread From: Christoph Lameter @ 2015-09-03 3:10 UTC (permalink / raw) To: Mike Snitzer Cc: Linus Torvalds, Heinz Mauelshagen, Andrew Morton, Viresh Kumar, Dave Chinner, Joe Thornber, Pekka Enberg, linux-mm, dm-devel, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, David Rientjes, Joonsoo Kim, Alasdair G Kergon On Wed, 2 Sep 2015, Mike Snitzer wrote: > You're also coming at this from a position that shared slabs are > automatically good because they have been around for years. > > For those years I've not had a need to debug a leak in code I maintain; > so I didn't notice slabs were merged. I also haven't observed slab > corruption being the cause of crashes in DM, block or SCSI. Hmmm... That's unusual. I have seen numerous leaks and corruptions that were debugged using the additional debug code in the slab allocators. Merging and debugging can be switched on at runtime if necessary and then you will have a clear separation to be able to track down the offending code as well as detailed problem reports that help to figure out what was wrong. It is then typically even possible to fix these bugs without getting the subsystem specialists involved. > > Because clearly, that lack of statistics and the possible > > cross-subsystem corruption hasn't actually been a pressing concern in > > reality. > > Agreed. To the extent that even SLAB has now adopted cache merging. > But I'd still like some pointers/help on what makes slab merging so > beneficial. I'm sure Christoph and others have justification. But if > not then yes the default to slab merging probably should be revisited. Well, we have discussed the pros and cons for merging a couple of times but the general consensus was that it is beneficial. 
Performance on modern CPUs is very sensitive to cache footprint, and reducing the overhead of metadata for object allocation is a worthwhile goal. Also, objects are more likely to be kept cache hot if they can be used by multiple subsystems. Slab merging also helps with reducing fragmentation since the free objects on one page can be used for other purposes. Check out the linux-mm archives for these discussions. This has been such an advantage that the feature was ported to SLAB (to much more significant effect than SLUB, since SLAB is a pig with metadata per node, per cpu and per kmem_cache). And yes, sorry, the consequence is that you no longer have a choice. Both slab allocators default to merging. SLAB had some difficulty staying competitive in performance without that. Joonsoo Kim made SLAB more competitive last year and one of the optimizations was to also support merging. ^ permalink raw reply [flat|nested] 42+ messages in thread
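For reference, the existing knobs this exchange keeps alluding to are kernel command-line parameters. A hedged sketch of how they might be used (the cache name is purely illustrative; the exact debug-flag letters are documented in Documentation/vm/slub.txt):

```
# Disable all slab cache merging, system-wide:
slab_nomerge

# Or: enable SLUB debugging (F = sanity checks, Z = red zoning,
# P = poisoning, U = alloc/free tracking) for one named cache only;
# a debug-enabled cache is automatically excluded from merging:
slub_debug=FZPU,example_cache
```

The second form gives per-cache separation plus corruption detection, at a runtime cost; the first only disables merging.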
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 3:10 ` Christoph Lameter @ 2015-09-03 4:55 ` Andrew Morton 2015-09-03 6:09 ` Pekka Enberg 0 siblings, 1 reply; 42+ messages in thread From: Andrew Morton @ 2015-09-03 4:55 UTC (permalink / raw) To: Christoph Lameter Cc: Mike Snitzer, Linus Torvalds, Heinz Mauelshagen, Viresh Kumar, Dave Chinner, Joe Thornber, Pekka Enberg, linux-mm, dm-devel, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, David Rientjes, Joonsoo Kim, Alasdair G Kergon On Wed, 2 Sep 2015 22:10:12 -0500 (CDT) Christoph Lameter <cl@linux.com> wrote: > > But I'd still like some pointers/help on what makes slab merging so > > beneficial. I'm sure Christoph and others have justification. But if > > not then yes the default to slab merging probably should be revisited. > > ... > > Check out the linux-mm archives for these dissussions. Somewhat OT, but... The question Mike asks should be comprehensively answered right there in the switch-to-merging patch's changelog. The fact that it is not answered in the appropriate place and that we're reduced to vaguely waving at the list archives is a fail. And a lesson! ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 4:55 ` Andrew Morton @ 2015-09-03 6:09 ` Pekka Enberg 2015-09-03 8:53 ` Dave Chinner 0 siblings, 1 reply; 42+ messages in thread From: Pekka Enberg @ 2015-09-03 6:09 UTC (permalink / raw) To: Andrew Morton Cc: Christoph Lameter, Mike Snitzer, Linus Torvalds, Heinz Mauelshagen, Viresh Kumar, Dave Chinner, Joe Thornber, linux-mm, dm-devel, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, David Rientjes, Joonsoo Kim, Alasdair G Kergon Hi Andrew, On Wed, 2 Sep 2015 22:10:12 -0500 (CDT) Christoph Lameter <cl@linux.com> wrote: >> > But I'd still like some pointers/help on what makes slab merging so >> > beneficial. I'm sure Christoph and others have justification. But if >> > not then yes the default to slab merging probably should be revisited. >> >> ... >> >> Check out the linux-mm archives for these dissussions. On Thu, Sep 3, 2015 at 7:55 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > Somewhat OT, but... The question Mike asks should be comprehensively > answered right there in the switch-to-merging patch's changelog. > > The fact that it is not answered in the appropriate place and that > we're reduced to vaguely waving at the list archives is a fail. And a > lesson! Slab merging is a technique to reduce memory footprint and memory fragmentation. Joonsoo reports 3% slab memory reduction after boot when he added the feature to SLAB: commit 12220dea07f1ac6ac717707104773d771c3f3077 Author: Joonsoo Kim <iamjoonsoo.kim@lge.com> Date: Thu Oct 9 15:26:24 2014 -0700 mm/slab: support slab merge Slab merge is good feature to reduce fragmentation. If new creating slab have similar size and property with exsitent slab, this feature reuse it rather than creating new one. As a result, objects are packed into fewer slabs so that fragmentation is reduced. Below is result of my testing. 
* After boot, sleep 20; cat /proc/meminfo | grep Slab <Before> Slab: 25136 kB <After> Slab: 24364 kB We can save 3% memory used by slab. For supporting this feature in SLAB, we need to implement SLAB specific kmem_cache_flag() and __kmem_cache_alias(), because SLUB implements some SLUB specific processing related to debug flag and object size change on these functions. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> We don't have benchmarks to directly measure its performance impact but you should see its effect via something like netperf that stresses the allocator heavily. The assumed benefit is that you're able to recycle cache hot objects much more efficiently as SKB cache and friends are merged to regular kmalloc caches. In any case, reducing kernel memory footprint already is a big win for various use cases, so keeping slab merging on by default is desirable. - Pekka ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 6:09 ` Pekka Enberg @ 2015-09-03 8:53 ` Dave Chinner 0 siblings, 0 replies; 42+ messages in thread From: Dave Chinner @ 2015-09-03 8:53 UTC (permalink / raw) To: Pekka Enberg Cc: Andrew Morton, Christoph Lameter, Mike Snitzer, Linus Torvalds, Heinz Mauelshagen, Viresh Kumar, Joe Thornber, linux-mm, dm-devel, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, David Rientjes, Joonsoo Kim, Alasdair G Kergon On Thu, Sep 03, 2015 at 09:09:24AM +0300, Pekka Enberg wrote: > Hi Andrew, > > On Wed, 2 Sep 2015 22:10:12 -0500 (CDT) Christoph Lameter <cl@linux.com> wrote: > >> > But I'd still like some pointers/help on what makes slab merging so > >> > beneficial. I'm sure Christoph and others have justification. But if > >> > not then yes the default to slab merging probably should be revisited. > >> > >> ... > >> > >> Check out the linux-mm archives for these dissussions. > > On Thu, Sep 3, 2015 at 7:55 AM, Andrew Morton <akpm@linux-foundation.org> wrote: > > Somewhat OT, but... The question Mike asks should be comprehensively > > answered right there in the switch-to-merging patch's changelog. > > > > The fact that it is not answered in the appropriate place and that > > we're reduced to vaguely waving at the list archives is a fail. And a > > lesson! > > Slab merging is a technique to reduce memory footprint and memory > fragmentation. Joonsoo reports 3% slab memory reduction after boot > when he added the feature to SLAB: I'm not sure whether you are trying to indicate that it was justified in the commit message or indicate how little justification there was... > commit 12220dea07f1ac6ac717707104773d771c3f3077 > Author: Joonsoo Kim <iamjoonsoo.kim@lge.com> > Date: Thu Oct 9 15:26:24 2014 -0700 > > mm/slab: support slab merge > > Slab merge is good feature to reduce fragmentation. 
If new creating slab > have similar size and property with exsitent slab, this feature reuse it > rather than creating new one. As a result, objects are packed into fewer > slabs so that fragmentation is reduced. A partial page or two in a newly allocated slab is not "fragmentation". They are simply free objects in the cache that haven't been allocated yet. Fragmentation occurs when large numbers of objects are freed so the pages end up mostly empty but cannot be freed because there are still 1 or 2 objects in use on them. As such, if there was fragmentation and slab merging fixed it, I'd expect to be seeing a much larger reduction in memory usage.... > Below is result of my testing. > > * After boot, sleep 20; cat /proc/meminfo | grep Slab > > <Before> > Slab: 25136 kB > > <After> > Slab: 24364 kB > > We can save 3% memory used by slab. The numbers don't support the conclusion. Memory used from boot to boot always varies by a small amount - a slight difference in the number of files accessed by the boot process can account for this. Also, you can't measure slab fragmentation by measuring the amount of memory used. You have to look at object counts in each slab and work out the percentage of free vs allocated objects. So there's no evidence that this 772kb difference in memory footprint can even be attributed to slab merging. What about the rest of the slab fragmentation problem space? It's not even mentioned in the commit, but that's really what is important to long running machines. IOWs, where's the description of the problem that needs fixing? What's the example workload that demonstrates the problem? What are the before and after measurements of the workloads that generate significant slab fragmentation? What's the long term impact of the change (e.g. a busy server with an uptime of several weeks)? Is the fragmentation level reduced? Increased? Not significant? What impact does this have on subsystems with shrinkers that are now operating on shared slabs? 
Do the shrinkers still work as effectively as they used to? Do they now cause slab fragmentation, and if they do, does it self correct under continued memory pressure? And with the patch being merged without a single reviewed-by or acked-by, I'm sitting here wondering how we managed to fail software engineering 101 so badly here? Cheers, Dave. -- Dave Chinner dchinner@redhat.com ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 2:31 ` Mike Snitzer 2015-09-03 3:10 ` Christoph Lameter @ 2015-09-03 3:11 ` Linus Torvalds 1 sibling, 0 replies; 42+ messages in thread From: Linus Torvalds @ 2015-09-03 3:11 UTC (permalink / raw) To: Mike Snitzer Cc: Heinz Mauelshagen, Andrew Morton, Viresh Kumar, Dave Chinner, Joe Thornber, Pekka Enberg, linux-mm, dm-devel, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, David Rientjes, Joonsoo Kim, Christoph Lameter, Alasdair G Kergon On Wed, Sep 2, 2015 at 7:31 PM, Mike Snitzer <snitzer@redhat.com> wrote: > > Why do we even have slab creation flags? Ehh? Because they are meaningful? Things like SLAB_DESTROY_BY_RCU have real semantic meaning. The subsystem that creates the slab *cares*, and it makes sense because that kind of choice really fundamentally is a per-slab choice. >> .. and again, absolutely NEITHER of those have anything to do >> "subsystem X". > > OK, I get that I'm unimportant. You can stop beating me over my > irrelevant subsystem maintainer head now... What the hell is your problem? At no point did I state that you are any less important than anything else. But this issue is simply not in any way specific to dm. dm is not any less important than anything else, but dm is also not magically *more* important than everything else. Really. Then you seem to take it personally, but please realize that that is *your* issue, not mine. > But when longstanding isolation and functionality is removed in the name > of microoptimizations its difficult to accept -- even if the realization > occurs years after the fact. Bullshit. You didn't notice. For years. It just wasn't important. Just admit it. Those things you now tout as so important are complete non-issues. But more importantly, and this is what you seem to not really get at all, is that it's STILL not dm-specific. If you think that isolation is so important, then tell me why isolation is only important for dm? 
Why isn't it important for everything else? What makes dm so special? Really. I've asked you three times now, and you seem to not get it, you just think I'm trying to put you in your place. I'm not. I'm asking a serious question: what makes dm so special that it has to have different allocation logic from everything else. And *THAT* is why SLAB_DESTROY_BY_RCU is different from your SLAB_NO_MERGE. Because I can actually answer the question: "What makes sighand_cachep need SLAB_DESTROY_BY_RCU but not other users?" with a real technical reason. > I'm not getting it because I don't understand why you really care. What > implied benefits come with slab merging that I'm painfully unaware of? It does actually have less overhead, for one thing. The separation of slabs doesn't cost you just in the slab data structure itself, but in the memory fragmentation. Having multiple slabs share the backing pool of pages uses less memory. > You're also coming at this from a position that shared slabs are > automatically good because they have been around for years. No, I'm really not. Christ, have you read anything I wrote? I'm ok with discussing the "the defaults should be turned around". But at least we *have* an option to turn that default around, so when people care (because they are trying to chase down a slab corruption issue, for example), they can do so. Your patch actually gets rid of that choice, and forces things the other way around. So I would argue that your patch actually makes things *worse*. It hardcodes an arbitrary choice, and it's not even a choice that makes obvious sense. And no, the memory fragmentation issue isn't just made up. One of the downsides of slab was historically that it used a lot of memory, and to be honest, I suspect the percpu queues have made things worse. At least sharing the backing store minimizes the effect of that somewhat. 
We used to have numbers for this all, but it's really approaching a decade since the whole initial SLUB vs SLAB things, so I don't know where to point you. But the reason I say "it's not a choice that makes obvious sense" isn't even because I'm convinced that the merging is always the best option. I *am* convinced that it has real upsides, but I also agree that it has downsides. But at least as it is right now, the system admin can make a choice. You arbitrarily wanted to take that choice away for dm, without apparently even knowing what the upsides of merging might be. But the *real* issue I have with it is the completely random "dm is different from everything else" thing. Which is bogus. That's what I wanted to know: what makes dm so special that it should be different from everything else? And apparently you don't have an answer to that. You just took my repeated questioning to mean that you're worthless. That wasn't the intent. It was very much a literal "what's so different about dm that it would act differently from everything else"? > The ship sailed on disabling it for everyone. It is the new norm. I > cannot push RHEL to flip-flop slab characteristics (at least not until > the next major release). But you can. Today. Put "slab_nomerge" on the kernel command line. Really. If you care, you can do that. And if you _don't_ care, then clearly not doing that doesn't hurt either. > I was the first to want the option to opt-out on a per slab basis. And > you're shooting the messenger. Calling me illogical. But the opt-in shouldn't be *you*, it should be the system maintainer who can actually tune for his load, or cares about memory use, or wants to debug, or any number of issues. See? Btw, I do agree that the "all or nothing" approach of "slab_nomerge" isn't optimal. But you made things *worse*. You took a tunable, and made it non-tunable, without apparently even knowing what it tuned for. 
Sure, it was a damn coarse-grained tunable, but you made *that* worse too, since with your code it's not tunable at all for dm. So your version isn't actually any more "fine-grained". Now, what might be interesting - *if* people actually want to tune just one set of slabs and not another - might be to extend the "slab_nomerge" option to actually take a pattern of slab names, and match that way. So then you could say "slab_nomerge=dm_* slab_nomerge=xfs*", and you'd not merge dm or xfs slabs. I wouldn't mind that kind of approach at all. But please understand _why_ I wouldn't mind it: I wouldn't mind it exactly because you didn't take tuning choice away from people, but because such a patch would actually give people control of it. And it *wouldn't* be dm-specific, because other people might ask to not merge ext4 slabs or whatever. And for a similar reason, I actually wouldn't mind switching the default around for merging. I'm *not* married to the "we have to merge slab caches by default" model. It used to make sense, and I know I've seen numbers (I'm pretty sure Christoph Lameter had several talks about it back in the days), but things can change. But what doesn't make sense is to make random willy-nilly decisions on a basis that makes no sense. And I do claim that random subsystems just unilaterally deciding that they don't care about system default memory management falls under that "makes no sense" heading. Linus ^ permalink raw reply [flat|nested] 42+ messages in thread
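The sighand_cachep case Linus contrasts with is real kernel code: in kernel/fork.c the cache is created with SLAB_DESTROY_BY_RCU (the flag set below is paraphrased from the 4.x-era sources and may differ slightly by version):

```c
/* Paraphrased from kernel/fork.c: a flag that genuinely is a per-slab
 * semantic choice. SLAB_DESTROY_BY_RCU guarantees the backing page is
 * only returned to the page allocator after an RCU grace period, so a
 * lockless reader racing with free still sees *some* sighand_struct
 * there -- never unrelated memory. */
sighand_cachep = kmem_cache_create("sighand_cache",
				   sizeof(struct sighand_struct), 0,
				   SLAB_HWCACHE_ALIGN | SLAB_PANIC |
				   SLAB_DESTROY_BY_RCU,
				   sighand_ctor);
```

This is the shape of argument Linus is asking for: the flag encodes a correctness requirement of this object type, not a reporting preference of its maintainer.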
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 1:21 ` Linus Torvalds 2015-09-03 2:31 ` Mike Snitzer @ 2015-09-03 6:02 ` Dave Chinner 2015-09-03 6:13 ` Pekka Enberg ` (2 more replies) 1 sibling, 3 replies; 42+ messages in thread From: Dave Chinner @ 2015-09-03 6:02 UTC (permalink / raw) To: Linus Torvalds Cc: Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm On Wed, Sep 02, 2015 at 06:21:02PM -0700, Linus Torvalds wrote: > On Wed, Sep 2, 2015 at 5:51 PM, Mike Snitzer <snitzer@redhat.com> wrote: > > > > What I made possible with SLAB_NO_MERGE is for each subsystem to decide > > if they would prefer to not allow slab merging. > > .. and why is that a choice that even makes sense at that level? > > Seriously. > > THAT is the fundamental issue here. It makes a lot more sense than you think, Linus. One of the reasons slab caches exist is to separate objects of identical characteristics from the heap allocator so that they are all grouped together in memory and so can be allocated/freed efficiently. This helps prevent heap fragmentation, allows objects to pack as tightly together as possible, gives direct measurement of the number of objects, the memory usage, the fragmentation factor, etc. Containment of memory corruption is another historical reason for slab separation (proof: current memory debugging options always cause slab separation). Slab merging is the exact opposite of this - we're taking homogeneous objects and mixing them with other homogeneous slabs containing different objects with different lifetimes. Indeed, we are even mixing them back into the slabs used for the heap, despite the fact the original purpose of named slabs was to separate allocation from the heap... 
Don't get me wrong - this isn't necessarily bad - but I'm just pointing out that slab merging is doing the opposite of what slabs were originally intended for. Indeed, a lot of people use slab caches just because it's a nice encapsulation, not for any specific performance, visibility or anti-fragmentation purposes. I have no problems with automatically merging slabs created like this. However, the fact that we are merging slabs automatically for all slabs now has made me think a bit deeper about the problems that can result from this. > There are absolutely zero reasons this is dm-specific, but it is > equally true that there are absolutely zero reasons that it is > xyzzy-specific, for any random value of 'xyzzy'. Right, it's not xyzzy-specific where 'xyzzy' is a subsystem. The flag application is actually *object specific*. That is, it is the use of the individual objects that determines whether a slab should be merged or not. e.g. Slab fragmentation levels are affected more than anything by mixing objects with different life times in the same slab. i.e. if we free all the short lived objects from a page but there is one long lived object on the page then that page is pinned and we free no memory. Do that to enough pages in the slab, and we end up with a badly fragmented slab. With slab merging, we have no control over what slabs are merged. We may be merging slabs with objects that have vastly different life times. Hence merging may actually be making one of the underlying causes of slab fragmentation worse rather than better. It really depends on what slabs get merged together and that's largely random chance - you don't get to pick the size of your structures.... Another contributor to slab fragmentation is when allocation order is very different to object freeing order. Pages in the slab get filled up using an algorithm that optimises for temporal locality. i.e. it will fill a partial page before moving on to the next partial page or allocating a new page. 
If the freeing of objects doesn't have the same temporal locality as allocation then when the slab grows and shrinks we end up with fragmentation. Mixing different object types into the same pages pretty much guarantees that we'll be mixing objects of different alloc/freeing order. Further, rapid growth and shrinking of a slab cache due to memory demand can cause fragmentation. Caches that have this problem are usually those that have a shrinker associated with them. The shrinker causes objects to have a variable, unpredictable lifetime and hence can break allocation/freeing locality (as per above, even for single object slabs). Minimising the effect of this reclaim fragmentation is often held up as the example of why slab merging is good - the other object types fill all the holes and hence reduce the overall fragmentation of the slab. Further, the density of the reclaimable objects is lower, so the slab doesn't fragment as much. On the surface, this looks like a big win but it's not - it's actually a major problem for slab reclaim and it manifests when there are large bursts of allocation activity followed by sudden reclaim activity. When the slab grows rapidly, we get the majority of objects on a page being of one type, but a couple will be of a different type. Then, under memory pressure, the shrinker can only free the majority of objects on a page, guaranteeing the slab will remain fragmented under memory pressure. Continuing to run the shrinker won't result in any more memory being freed from the merged slab and so we are stuck with unfixable slab fragmentation. However, if the slab with a shrinker only contains one kind of object, when it becomes fragmented due to variable object lifetime, continued memory pressure will cause it to keep shrinking and hence will eventually correct the fragmentation problem. This is a much more robust configuration - the system will self correct without user intervention being necessary. 
IOWs, slab merging prevents us from implementing effective active fragmentation management algorithms and hence prevents us from reducing slab fragmentation via improved shrinker reclaim algorithms. Simply put: slab merging reduces the effectiveness of shrinker based slab reclaim. A key observation I just made: we are extremely lucky that many of the critical slab caches in the system are not affected by merging. A slab cache with a constructor will not get merged, and that means inode caches do not get merged. Hence, despite slab merging being enabled, one of the largest memory consuming slabs in the system does not get merged and hence the shrinker has been able to do its job without interference. Hence we've avoided the worst outcome of merging slabs by default by luck rather than good management. Moving on from fragmentation: Slab caches can also back mempools. Mempools are used to guarantee forwards progress under memory pressure, so it's important to have visibility into their behaviour. Hence it makes sense to ensure these don't get merged with other slabs so they are accounted accurately and we can see exactly the demand being placed on these critical slabs under heavy memory pressure. I've made use of this several times over the past few years to discover why a system is floundering under heavy memory pressure (e.g. writeback way slower than it should have been because the xfs_ioend mempool was operating in 1-in, 1-out mode)... So, when I said that I could use the SLAB_NO_MERGE for some caches in XFS and acked the patch, I was referring to exactly this sort of usage - the slabs that back mempools and the slabs that have a shrinker for reclaim should have this flag set. 4 of 17 named slabs in XFS need this flag - the rest I don't really care about because their memory usage can be inferred from the shrinkable slab cache sizes. Managing slab caches and fragmentation is anything but simple and there is no one right solution. 
Slab merging in some cases makes sense, but there are several very good reasons for not merging a slab. The right solution is often difficult for people without object-specific expertise to understand, but that goes for just about everything in the kernel these days. BTW, it is trivial to achieve SLAB_NO_MERGE simply by supplying a dummy constructor to the slab initialisation. I'd much prefer SLAB_NO_MERGE or some variant, though. Cheers, Dave. -- Dave Chinner dchinner@redhat.com ^ permalink raw reply [flat|nested] 42+ messages in thread
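The dummy-constructor escape hatch Dave mentions works because the merge logic never merges caches that have a constructor. A hedged sketch of the trick (the cache and struct names are invented, not actual XFS identifiers):

```c
/* The slab merge code skips any cache with a constructor, so an empty
 * one is a crude way to get SLAB_NO_MERGE behaviour today. Names here
 * are illustrative only. */
static void example_nomerge_ctor(void *obj)
{
	/* intentionally empty: exists only to defeat slab merging */
}

example_cache = kmem_cache_create("example_nomerge_cache",
				  sizeof(struct example_obj),
				  0, 0, example_nomerge_ctor);
```

The cost is that the constructor runs on every newly allocated page's objects, so a dedicated no-merge flag would express the same intent without the (tiny) overhead and without the obfuscation.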
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 6:02 ` Dave Chinner @ 2015-09-03 6:13 ` Pekka Enberg 2015-09-03 10:29 ` Jesper Dangaard Brouer 2015-09-03 15:02 ` Linus Torvalds 2 siblings, 0 replies; 42+ messages in thread From: Pekka Enberg @ 2015-09-03 6:13 UTC (permalink / raw) To: Dave Chinner Cc: Linus Torvalds, Mike Snitzer, Christoph Lameter, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm On Thu, Sep 3, 2015 at 9:02 AM, Dave Chinner <dchinner@redhat.com> wrote: > One of the reasons slab caches exist is to separate objects of > identical characteristics from the heap allocator so that they are > all grouped together in memory and so can be allocated/freed > efficiently. This helps prevent heap fragmentation, allows objects > to pack as tightly together as possible, gives direct measurement of > the number of objects, the memory usage, the fragmentation factor, > etc. Containment of memory corruption is another historical reason > for slab separation (proof: current memory debugging options always > causes slab separation). > > Slab merging is the exact opposite of this - we're taking homogenous > objects and mixing them with other homogneous containing different > objects with different life times. Indeed, we are even mixing them > back into the slabs used for the heap, despite the fact the original > purpose of named slabs was to separate allocation from the heap... > > Don't get me wrong - this isn't necessarily bad - but I'm just > pointing out that slab merging is doing the opposite of what slabs > were originally intended for. Indeed, a lot of people use slab > caches just because it's anice encapsulation, not for any specific > performance, visibility or anti-fragmentation purposes. I have no > problems with automatically merging slabs created like this. Yes, absolutely. 
An alternative to slab merging is to actually reduce the number of caches
we create in the first place, and use kmalloc() wherever possible.

- Pekka
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
From: Jesper Dangaard Brouer @ 2015-09-03 10:29 UTC (permalink / raw)
To: Dave Chinner
Cc: brouer, Linus Torvalds, Mike Snitzer, Christoph Lameter, Pekka Enberg,
    Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
    Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar,
    Heinz Mauelshagen, linux-mm

On Thu, 3 Sep 2015 16:02:47 +1000
Dave Chinner <dchinner@redhat.com> wrote:

> On Wed, Sep 02, 2015 at 06:21:02PM -0700, Linus Torvalds wrote:
> > On Wed, Sep 2, 2015 at 5:51 PM, Mike Snitzer <snitzer@redhat.com> wrote:
> > >
> > > What I made possible with SLAB_NO_MERGE is for each subsystem to decide
> > > if they would prefer to not allow slab merging.
> >
> > .. and why is that a choice that even makes sense at that level?
> >
> > Seriously.
> >
> > THAT is the fundamental issue here.
>
> It makes a lot more sense than you think, Linus.
> [...]
>
> On the surface, this looks like a big win but it's not - it's
> actually a major problem for slab reclaim and it manifests when
> there are large bursts of allocation activity followed by sudden
> reclaim activity. When the slab grows rapidly, we get the majority
> of objects on a page being of one type, but a couple will be of a
> different type. Then, under memory pressure, the shrinker can only
> free the majority of objects on a page, guaranteeing the slab
> will remain fragmented under memory pressure. Continuing to run the
> shrinker won't result in any more memory being freed from the merged
> slab, and so we are stuck with unfixable slab fragmentation.
>
> However, if the slab with a shrinker only contains one kind of
> object, when it becomes fragmented due to variable object lifetimes,
> continued memory pressure will cause it to keep shrinking and hence
> will eventually correct the fragmentation problem. This is a much
> more robust configuration - the system will self-correct without
> user intervention being necessary.
>
> IOWs, slab merging prevents us from implementing effective active
> fragmentation management algorithms and hence prevents us from
> reducing slab fragmentation via improved shrinker reclaim
> algorithms. Simply put: slab merging reduces the effectiveness of
> shrinker based slab reclaim.

I'm buying into the problem of variable object lifetimes sharing the
same slub.

With the SLAB bulk free API I'm introducing, we can speed up the slub
slowpath by freeing several objects with a single cmpxchg_double, BUT
these objects need to belong to the same page. Thus, as Dave describes
with merging, other users of the same-size objects might end up holding
onto objects scattered across several pages, which gives the bulk free
fewer opportunities.

That would be a technical argument for introducing a SLAB_NO_MERGE flag
per slab. But I want to do some measurements before making any decision.
And it might be hard to show for my use-case of SKB free, because SKB
allocs will likely be dominating the 256-byte slab anyhow.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
From: Christoph Lameter @ 2015-09-03 16:19 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Dave Chinner, Linus Torvalds, Mike Snitzer, Pekka Enberg,
    Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel,
    Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal,
    Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm

On Thu, 3 Sep 2015, Jesper Dangaard Brouer wrote:

> > IOWs, slab merging prevents us from implementing effective active
> > fragmentation management algorithms and hence prevents us from
> > reducing slab fragmentation via improved shrinker reclaim
> > algorithms. Simply put: slab merging reduces the effectiveness of
> > shrinker based slab reclaim.
>
> I'm buying into the problem of variable object lifetime sharing the
> same slub.

Well, yeah, I see the logic of the argument, but what I have seen in
practice is that access to objects becomes rather random over time.
inodes and dentries are used by multiple underlying volumes/mountpoints
etc. They are expired individually etc etc. The references to objects
become garbled over time anyways.

What I would be interested in is some means by which locality of objects
of different caches can be explicitly specified. This would allow the
placing together of multiple objects in the same page frame. F.e.
dentries and inodes and other metadata of a filesystem that is related.
This would enhance the locality of the data and allow better
defragmentation. But we are talking here about a totally different
allocator design.

> With the SLAB bulk free API I'm introducing, we can speedup slub
> slowpath, by free several objects with a single cmpxchg_double, BUT
> these objects need to belong to the same page.
> Thus, as Dave describe with merging, other users of the same size
> objects might end up holding onto objects scattered across several
> pages, which gives the bulk free less opportunities.

This happens regardless, as far as I can tell. On boot-up you may end up
for a time in special situations where that is true.

> That would be a technical argument for introducing a SLAB_NO_MERGE flag
> per slab. But I want to do some measurement before making any
> decision. And it might be hard to show for my use-case of SKB free,
> because SKB allocs will likely be dominating 256 bytes slab anyhow.

With the skbs you would want to place the skb data together with the
packet data and other network-related objects, right? Maybe we can think
out an allocator that can store objects related to a specific action in
a page frame that can then be tossed as a whole.
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
From: Jesper Dangaard Brouer @ 2015-09-04 9:10 UTC (permalink / raw)
To: Christoph Lameter
Cc: Dave Chinner, Linus Torvalds, Mike Snitzer, Pekka Enberg,
    Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel,
    Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal,
    Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, brouer

On Thu, 3 Sep 2015 11:19:53 -0500 (CDT)
Christoph Lameter <cl@linux.com> wrote:

> On Thu, 3 Sep 2015, Jesper Dangaard Brouer wrote:
>
> > I'm buying into the problem of variable object lifetime sharing the
> > same slub.
> [...]
>
> > With the SLAB bulk free API I'm introducing, we can speedup slub
> > slowpath, by free several objects with a single cmpxchg_double, BUT
> > these objects need to belong to the same page.
> > Thus, as Dave describe with merging, other users of the same size
> > objects might end up holding onto objects scattered across several
> > pages, which gives the bulk free less opportunities.
>
> This happens regardless as far as I can tell. On boot up you may end up
> for a time in special situations where that is true.

That is true, which is also why the below measurements should be taken
with a grain of salt, as benchmarking is done within 10 min of boot-up.

> > That would be a technical argument for introducing a SLAB_NO_MERGE flag
> > per slab. But I want to do some measurement before making any
> > decision. And it might be hard to show for my use-case of SKB free,
> > because SKB allocs will likely be dominating 256 bytes slab anyhow.

I'll give you some preliminary measurements on my patchset, which uses
the new SLAB bulk free API for SKBs in the TX completion of the ixgbe
NIC driver (function ixgbe_clean_tx_irq() will bulk free max 32 SKBs).
Basic test-type is IPv4 forwarding, on a single CPU (i7-4790K CPU @
4.00GHz), with generator pktgen sending 14 Mpps (using script
samples/pktgen/pktgen_sample03_burst_single_flow.sh).

Test setup notes
* Kernel: 4.1.0-mmotm-2015-08-24-16-12+ #261 SMP
  - with patches "detached freelist" and Christoph's irqon/off fix.

Config /etc/sysctl.conf ::

 net/ipv4/conf/default/rp_filter = 0
 net/ipv4/conf/all/rp_filter = 0
 # Forwarding performance is affected by early demux
 net/ipv4/ip_early_demux = 0
 net.ipv4.ip_forward = 1

Setup::

 $ base_device_setup.sh ixgbe3
 $ base_device_setup.sh ixgbe4
 $ netfilter_unload_modules.sh ; netfilter_unload_modules.sh ; rmmod nf_reject_ipv4
 $ ip neigh add 172.16.0.66 dev ixgbe4 lladdr 00:aa:aa:aa:aa:aa
 # GRO negatively affects forwarding performance (at least for the UDP test)
 $ ethtool -K ixgbe4 gro off tso off gso off
 $ ethtool -K ixgbe3 gro off tso off gso off

First I tested a non-patched kernel with/without "slab_nomerge".
(Single-CPU IP forwarding of UDP packets)
 * Normal      : 2049166 pps
 * slab_nomerge: 2053440 pps
 * Diff: +4274 pps and -1.02 ns
 * Nanosec diff shows we are below the accuracy of the system
Thus, results are the same.

Using bulking changes the picture. Bulk free of max 32 SKBs in ixgbe
TX-DMA completion:
 * Bulk-free32: 2091218 pps
 * Diff to "Normal" case above: +42052 pps and 9.81 ns
 * Nanosec diff is significant (enough above accuracy level of system)
 * Summary: Pretty nice improvement!

Same test with "slab_nomerge":
 * slab_nomerge: 2121703 pps
 * Diff to above: +30485 pps and -6.87 ns
 * Nanosec diff was up to 3 ns between testruns, so this 6 ns is still valid
 * Summary: slab_nomerge did make a difference!

Total improvement is quite significant: +72537 pps and -16.68 ns (+3.5%)

It is important to be critical about your own measurements. What is the
real cause of this change? Let's see what happens if we tune the SLUB
per-CPU structures to have more "room", instead of using "slab_nomerge".
Tuning::

 echo 256 > /sys/kernel/slab/skbuff_head_cache/cpu_partial
 echo 9 > /sys/kernel/slab/skbuff_head_cache/min_partial

Test with bulk-free32 and SLUB tuning:
 * slub-tuned: 2110842 pps
 * Note this gets very close to "slab_nomerge"
   - 2121703 - 2110842 = 10861 pps
   - (1/2121703*10^9)-(1/2110842*10^9) = -2.42 ns
 * Nanosec diff around 2.5 ns is not significant enough; call the results
   the same

Thus, I could achieve the same performance results by tuning SLUB as I
could with "slab_nomerge". Maybe the advantage from "slab_nomerge" was
just that I got my "own" per-CPU structures, and thus implicitly more
per-CPU memory for myself?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
From: Christoph Lameter @ 2015-09-04 14:13 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Dave Chinner, Linus Torvalds, Mike Snitzer, Pekka Enberg,
    Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel,
    Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal,
    Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm

On Fri, 4 Sep 2015, Jesper Dangaard Brouer wrote:

> Thus, I could achieve the same performance results by tuning SLUB as I
> could with "slab_nomerge". Maybe the advantage from "slab_nomerge" was
> just that I got my "own" per CPU structures, and this implicitly larger
> per CPU memory for myself?

Well, if multiple slabs are merged, then there is potential pressure on
the per-node locks if huge numbers of objects are concurrently retrieved
from the per-node partial lists by two different subsystems. So cache
merging can increase contention and thereby reduce performance. What you
did with tuning is to reduce that contention by increasing the per-cpu
pages that do not require locks.
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
From: Sergey Senozhatsky @ 2015-09-04 6:35 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: Dave Chinner, Linus Torvalds, Mike Snitzer, Christoph Lameter,
    Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel,
    Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal,
    Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm,
    Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 1841 bytes --]

On (09/03/15 12:29), Jesper Dangaard Brouer wrote:
[..]
> I'm buying into the problem of variable object lifetime sharing the
> same slub.
>
> With the SLAB bulk free API I'm introducing, we can speedup slub
> slowpath, by free several objects with a single cmpxchg_double, BUT
> these objects need to belong to the same page.
> Thus, as Dave describe with merging, other users of the same size
> objects might end up holding onto objects scattered across several
> pages, which gives the bulk free less opportunities.
>
> That would be a technical argument for introducing a SLAB_NO_MERGE flag
> per slab. But I want to do some measurement before making any
> decision. And it might be hard to show for my use-case of SKB free,
> because SKB allocs will likely be dominating 256 bytes slab anyhow.

Out of curiosity, I did some quite simple-minded "slab_nomerge = 0" vs.
"slab_nomerge = 1" tests today on my old x86_64 box (4gigs of RAM, ext4, 4.2.0-next-20150903): - git clone https://github.com/git/git; make -j8; package (archlinux); cleanup; - create a container; untar gcc-5-20150901.tar.bz2; make -j6; package; cleanup; I modified /proc/slabinfo to show a total `unused` space size, accounted in cache_show() (reset in slab_start()): __unused_objs_sz += (sinfo.num_objs - sinfo.active_objs) * s->size; and captured that value every second tail -1 /proc/slabinfo >> slab_unused FWIW, files with the numbers attached (+ gnuplot graph). Those numbers are not really representative, but well, they are what they are -- on my small x86_64 box with very limited resources and under pretty general tests "slab_nomerge = 1" shows better numbers. (hm... can embedded benefit from disabling merging?). I think (as Dave Chinner said), preventing active and passive slabs merging does not sound so crazy. Just my 2 cents. -ss [-- Attachment #2: slab_unused_nomerge --] [-- Type: text/plain, Size: 14836 bytes --] 101440 135360 135360 135360 135360 135360 118976 127168 127168 112832 156032 88448 129408 102784 102784 102784 102784 102784 98816 176640 176640 145920 137728 137728 129536 84480 143872 88256 78016 78016 78016 74048 74048 74048 74048 96576 96576 96576 96576 96576 96576 96576 153920 153920 102720 104768 93248 105536 80960 80960 66624 66624 66624 66624 66624 66624 138752 142848 130560 130560 130560 130560 124928 100352 100352 100352 69632 69632 53248 53248 53248 53248 53248 53248 69632 69632 69632 69632 100352 112640 98304 110592 95232 86016 66560 66560 44032 117760 117760 117760 87040 87040 82944 82944 82944 82944 118784 118784 118784 104448 104448 104448 104448 104448 104448 65536 49152 49152 49152 49152 124928 124928 77824 160768 160768 214016 150528 150528 121856 89088 89088 89088 160768 160768 142336 142336 142336 142336 107520 107520 109568 109568 109568 109568 109568 109568 109568 109568 109568 109568 109568 109568 109568 109568 111616 111616 
[... raw per-second sample values omitted ...]

[-- Attachment #3: slab_unused_merge --]
[-- Type: text/plain, Size: 15652 bytes --]

[... raw per-second sample values omitted ...]
393984 424000 395712 433088 449984 419008 379328 346880 481600 463680 499264 439424 412928 471936 471744 460096 364672 411456 393728 443584 396992 442816 446912 446912 388800 444352 351488 432576 371840 399552 365376 445888 462336 470016 397952 444032 449152 306752 377024 310464 370752 352768 442176 434176 337152 402560 447104 453376 413120 393920 324352 295680 275968 352256 150016 178688 370688 346112 159232 263488 444736 458624 454592 226624 246848 226688 296320 229632 255936 371392 333888 300736 270016 280512 327104 285504 301888 298688 282944 281216 295872 311616 345664 303744 308992 275840 278272 221760 248768 237696 214464 174144 192960 226880 226176 265728 201920 146816 213376 167360 157632 111488 168448 259712 276096 262400 347648 269632 402560 461696 384384 394688 369472 306560 299136 308160 277184 242880 441216 419712 441152 383872 421824 481792 396480 395392 501440 539264 381696 396864 346688 344448 288832 326272 380096 363776 373248 297024 318016 319424 262912 260352 318144 315200 315200 345088 345472 345728 346112 346112 241152 240000 249472 324800 220672 349568 308352 465536 506752 531392 477952 404608 521728 363520 404928 445504 359104 394432 284736 354176 359680 344576 338752 492288 420928 426368 393984 340288 387904 351936 289792 283776 305536 306304 331072 393728 401024 434176 336448 373184 420800 430336 417792 341312 370176 364416 366656 354368 302592 268160 346688 333248 347200 302208 311360 326720 347968 377152 373312 381760 363136 347648 351488 386624 410688 415104 453504 512576 488832 449600 455680 488512 375104 396416 382528 374656 308352 406784 387712 417024 385472 370432 371264 417152 396032 359872 326912 349696 343232 346944 342784 334272 335616 374720 366784 380032 420672 410944 424384 401280 448576 364864 457024 414912 413376 391744 391360 341376 338432 365056 355776 395968 330624 376960 386944 401088 403648 404160 405952 279232 387328 380160 360576 393664 372608 385024 350336 379520 367424 375744 385280 364864 371584 363392 359360 
385472 361792 378176 353856 342016 326016 328832 348032 278592 354432 343232 348160 298496 336704 262976 301632 307584 330240 335744 360064 363392 332288 365312 327232 342656 337920 315264 323904 326080 322944 318784 322496 283776 287168 280576 266624 258176 320256 320896 323712 313472 318080 319040 322816 242752 298880 310912 309056 291776 299520 312448 290176 317568 311040 287104 302272 273920 308608 275392 247488 286656 234688 201920 195008 193728 178752 207104 273280 281408 302976 347072 318912 311680 358720 399616 363968 326528 361408 340672 364096 344384 327360 309888 283328 315648 283200 312256 353792 352192 358592 326464 319872 309696 325312 326336 348032 359168 363584 365120 293312 330368 295488 361408 356608 337984 300288 338816 269376 324672 309760 255040 347648 249088 285568 346496 376960 325120 381248 300480 362752 369984 400768 368640 368448 275392 368000 363648 377856 290240 336192 312064 353344 313280 240704 241472 241472 181312 222976 252672 266496 260160 238144 238144 202304 197952 197952 197952 197952 251520 259968 271872 293440 286208 291328 282176 281152 261376 261376 257280 257280 298432 298432 283136 299520 299520 299520 299520 327936 315392 315392 315392 308544 283648 256256 234560 230464 248128 242368 239808 239808 239808 240000 386496 367168 446528 431360 374720 392576 403456 367616 302592 319936 367680 371776 334848 338240 328192 317632 326912 375104 349248 311808 266432 258240 295680 253760 331904 266752 359936 334528 411904 481408 486976 432384 432384 491584 422080 326400 392000 344384 322496 324352 318848 336960 328832 306560 269568 311936 371584 379328 351360 377536 373888 385920 472000 450688 433536 446400 387584 430848 355584 358144 344256 424704 377664 438272 444992 460992 426048 426304 412352 384768 374208 384064 386496 383424 410304 406720 427776 413760 351680 449600 436480 437248 438720 421440 423232 325888 390848 417280 440704 381952 411392 393216 488256 397504 390912 346176 288896 278464 265792 363392 381376 345792 364544 
366592 366400 313792 366080 340800 386496 366464 392000 410368 433664 433728 446080 423616 464192 435840 425088 448320 429184 377536 407552 322944 400320 354880 358848 353984 315008 377216 319808 329280 272320 341056 308800 294976 287040 374336 339712 278912 279872 276416 285056 256000 318336 401984 422656 379072 396800 355200 431424 425664 416064 413632 426304 363392 364992 345856 379776 368704 340928 283648 236736 313920 377792 358400 381376 381504 395008 397440 390144 363456 363456 460672 426624 418112 379136 375616 347776 311616 341824 400512 431168 391872 406528 388736 402112 392640 490496 459520 492416 449472 491392 440832 424000 450048 379712 416192 324608 448896 336576 425088 385920 385920 379264 372096 346304 275712 309632 305856 307328 320000 345472 346496 362368 362368 335104 363968 391424 368768 353344 301312 329536 313408 310208 354176 374400 371328 404672 299712 405184 445312 382848 369472 416576 425280 397376 414912 411776 415296 409408 426816 327872 367744 356032 329408 378048 360512 313920 388032 397888 406016 385792 422336 376384 320000 285376 376256 296320 334080 332288 328832 275520 283264 278656 385984 385984 266432 370304 334272 356352 403136 388160 381760 351680 349376 349248 320960 261952 389504 315328 311488 385408 302976 282560 297728 257088 309824 173952 351424 414336 414400 371648 285440 329088 284480 306752 329984 373760 376064 428032 411264 464448 460864 480256 504384 500992 491392 408896 406848 388224 399488 362240 401536 390720 339200 345344 247808 370752 381824 435328 463168 388032 410816 401600 397952 327744 333696 378432 349952 367552 468992 427200 454720 389632 484416 393664 504448 378304 503616 367936 472448 447680 432128 413696 420672 422592 509952 697152 701760 685696 735168 706880 654400 680384 659200 553216 557760 578048 624640 609472 621760 586688 618944 534208 609792 538752 508032 629120 530432 579072 535360 568640 608768 483648 624000 632000 600704 596672 584768 554240 589376 588224 519040 557248 614016 617088 589888 
608192 608192 626688 628672 621120 589120 607872 684736 706496 682240 683200 631296 651456 627840 667648 656768 692416 607680 707072 572800 683648 540608 358208 513856 567104 580480 476608 318784 648960 628608 634624 679360 658304 461824 808512 758208 758208 768192 520960 542336 483200 663424 744064 739072 637184 760320 773120 672384 793728 691456 783360 742400 720128 683328 748992 660416 540352 506816 543808 552704 541248 481728 506816 548160 511552 542016 506304 510976 511232 452736 471488 467328 511744 485120 456576 435648 542912 651136 559936 595648 637504 596224 492352 554496 535232 497984 569152 562112 539264 556672 492480 376960 427968 429760 435648 497408 519168 502976 334272 463680 279616 404352 491776 486912 434624 508096 519488 474816 462272 503808 480768 468736 468736 529728 487872 487872 490240 490752 510080 449856 570112 582848 492096 479552 499648 536960 536960 536960 629440 658432 578368 549056 564736 524992 509440 540608 530688 526912 511936 503552 475456 470400 440768 431744 402304 393984 401152 416640 446720 448896 398208 483776 464320 376256 479040 370816 369792 466752 466432 368128 324480 390208 383040 447744 488064 367104 465920 297984 387392 291648 356352 450304 453760 458048 468544 370496 587264 589312 566400 584576 573248 663872 647360 644480 558784 507584 582656 611776 521984 439040 573312 596608 530176 465856 525184 505152 458048 396928 439104 667264 437440 493888 613632 597952 413056 590208 543104 374848 530432 491200 598400 607104 590784 548672 572352 572352 572352 491264 484096 582400 582400 582400 582400 582400 668096 613568 613568 613568 613568 629376 629376 629376 629376 629504 543680 539840 539840 533376 525312 568192 569536 569536 529856 539264 507136 381824 516672 395968 477056 434752 447360 455040 474752 474752 451264 449472 449472 449472 394944 333568 369472 348928 311360 346816 364032 177088 314368 390784 343552 338112 335680 332288 332288 332288 373824 373824 373824 373824 416960 430784 419264 407680 427392 408832 366976 
438080 406400 357120 292736 281536 333888 360704 336064 423296 222848 250304 426688 429248 382464 362432 347392 453184 536704 398976 376256 443072 395008 396288 368704 393408 399872 366784 381376 415744 366336 346048 349632 329216 344384 283712 281600 205632 295552 345664 343488 304576 364864 361984 375040 344448 371840 356352 344192 367424 324352 295552 286400 281600 281600 281600 278080 278080 278080 278080 278080 235968 279936 278080 243520 244352 244352 244352 375360 370752 368448 368448 374208 374208 323904 376256 220736 247104 222016 222016 222464 220672 282560 262784 262784 241664 241664 290432 290432 290432 291264 286272 286272 280192 280192 329856 274624 275712 355072 288704 288704 288704 272448 260160 260160 206016 206272 197760 204416 204416 272768 283968 280064 336896 336896 336896 306688 306688 321728 316480 309184 309184 300608 300608 300608 298560 279168 279808 279808 267136 264832 166720 117568 117568 117568 230080 279040 307008 313152 311936 316288 380352 381824 342912 342912 395200 321088 305856 305856 305856 306048 309888 395776 398016 395008 353920 353920 322048 354624 354624 404928 401856 376320 376320 376320 368384 368384 368384 336192 314368 311104 311104 311104 302080 293888 322176 322304 313984 313984 313984 313984 287104 285568 285568 283072 315072 315072 339648 339648 339648 339648 331904 308800 308800 584640 563136 573440 537408 537408 526400 526400 523840 523840 523840 523840 523840 523840 523840 523840 520768 520768 520768 520768 520768 496128 496128 496128 557248 557248 557248 554304 548928 570944 570944 570944 570944 570944 570944 570944 568256 565952 561984 559104 555648 551296 551296 551296 551296 621888 [-- Attachment #4: merge_vs_nomerge.png --] [-- Type: image/png, Size: 34956 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04  6:35             ` Sergey Senozhatsky
@ 2015-09-04  7:01               ` Linus Torvalds
  2015-09-04  7:59                 ` Sergey Senozhatsky
  0 siblings, 1 reply; 42+ messages in thread
From: Linus Torvalds @ 2015-09-04  7:01 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer,
	Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes,
	Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber,
	Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar,
	Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

On Thu, Sep 3, 2015 at 11:35 PM, Sergey Senozhatsky
<sergey.senozhatsky.work@gmail.com> wrote:
>
> Out of curiosity, I did some quite simple-minded
> "slab_nomerge = 0" vs. "slab_nomerge = 1" tests today on my old
> x86_64 box (4gigs of RAM, ext4, 4.2.0-next-20150903):

So out of interest, was this slab or slub? Also, how repeatable is
this? The memory usage between two boots tends to be rather fragile -
some of the bigger slab users are dentries and inodes, and various
filesystem scanning events will end up skewing things a _lot_.

But if it turns out that the numbers are pretty stable, and sharing
really doesn't save memory, then that is certainly a big failure. I
think Christoph did much of his work for bigger machines where one of
the SLAB issues was the NUMA overhead, and who knows - maybe it worked
well for the load and machine in question, but not necessarily
elsewhere.

Interesting.

                  Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 42+ messages in thread
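[The merging behavior at issue in this thread can be illustrated with a simplified model: under SLUB, two caches are only combined when their aligned object sizes round to the same slot size and neither cache carries a debugging or no-merge flag. The sketch below is illustrative only — `mergeable`, the `NEVER_MERGE` set, and the fixed 8-byte alignment are simplifications for this archive, not the kernel's actual find_mergeable() code.]

```python
# Simplified model of SLUB's cache-merging criterion: two caches can
# share backing slabs only if their aligned object sizes match and
# neither carries a debug/poison/no-merge flag.  Illustrative only.

ALIGN = 8  # word alignment assumed for this sketch

# Flags that disqualify a cache from merging (illustrative subset).
NEVER_MERGE = {"SLAB_RED_ZONE", "SLAB_POISON", "SLAB_STORE_USER",
               "SLAB_DESTROY_BY_RCU", "SLAB_NO_MERGE"}

def aligned_size(size: int, align: int = ALIGN) -> int:
    """Round an object size up to the next multiple of the alignment."""
    return (size + align - 1) // align * align

def mergeable(size_a, flags_a, size_b, flags_b):
    """True if two caches would merge under this simplified rule."""
    if (set(flags_a) | set(flags_b)) & NEVER_MERGE:
        return False
    return aligned_size(size_a) == aligned_size(size_b)

# A 60-byte and a 64-byte cache round to the same 64-byte slot, so
# they merge; a poisoned cache never does.
print(mergeable(60, [], 64, []))               # True
print(mergeable(64, ["SLAB_POISON"], 64, []))  # False
```

[This is also why slab debugging and "slab_nomerge" have the same visible side effect: both keep every cache separate in /proc/slabinfo.]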
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04  7:01               ` Linus Torvalds
@ 2015-09-04  7:59                 ` Sergey Senozhatsky
  2015-09-04  9:56                   ` Sergey Senozhatsky
  ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Sergey Senozhatsky @ 2015-09-04  7:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sergey Senozhatsky, Jesper Dangaard Brouer, Dave Chinner,
	Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton,
	David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
	Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
	Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 16835 bytes --]

On (09/04/15 00:01), Linus Torvalds wrote:
> On Thu, Sep 3, 2015 at 11:35 PM, Sergey Senozhatsky
> <sergey.senozhatsky.work@gmail.com> wrote:
> >
> > Out of curiosity, I did some quite simple-minded
> > "slab_nomerge = 0" vs. "slab_nomerge = 1" tests today on my old
> > x86_64 box (4gigs of RAM, ext4, 4.2.0-next-20150903):
>
> So out of interest, was this slab or slub? Also, how repeatable is
> this? The memory usage between two boots tends to be rather fragile -
> some of the bigger slab users are dentries and inodes, and various
> filesystem scanning events will end up skewing things a _lot_.
>
> But if it turns out that the numbers are pretty stable, and sharing
> really doesn't save memory, then that is certainly a big failure. I
> think Christoph did much of his work for bigger machines where one of
> the SLAB issues was the NUMA overhead, and who knows - maybe it worked
> well for the load and machine in question, but not necessarily
> elsewhere.
>
> Interesting.
>

grep SLAB .config
# CONFIG_SLAB is not set
CONFIG_SLABINFO=y

grep SLUB .config
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
CONFIG_SLUB_CPU_PARTIAL=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set

The numbers are stable on my box. Did another round of tests.
Please find attached (hope attachments are OK):

-- git clone glibc; make -j8; package; clean up

It differs on both busy and idle systems. I was a bit surprised to
see 0 unused memory

.. 33472 56128 56128 0 0 0 0 0 0 0 0 0 0 0 59392 59392 59392 ..

But I went through the corresponding slabinfo (I track slabinfo too);
and yes, zero unused objects.

slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ext4_groupinfo_1k 36 36 224 18 1 : tunables 0 0 0 : slabdata 2 2 0
ext4_groupinfo_4k 7412 7412 232 17 1 : tunables 0 0 0 : slabdata 436 436 0
sda2 117 117 104 39 1 : tunables 0 0 0 : slabdata 3 3 0
sd_ext_cdb 128 128 32 128 1 : tunables 0 0 0 : slabdata 1 1 0
scsi_sense_cache 224 224 128 32 1 : tunables 0 0 0 : slabdata 7 7 0
scsi_cmd_cache 166 234 448 18 2 : tunables 0 0 0 : slabdata 13 13 0
sgpool-128 16 16 4096 8 8 : tunables 0 0 0 : slabdata 2 2 0
sgpool-64 48 48 2048 16 8 : tunables 0 0 0 : slabdata 3 3 0
sgpool-32 64 64 1024 16 4 : tunables 0 0 0 : slabdata 4 4 0
sgpool-16 64 64 512 16 2 : tunables 0 0 0 : slabdata 4 4 0
sgpool-8 176 176 256 16 1 : tunables 0 0 0 : slabdata 11 11 0
scsi_data_buffer 0 0 24 170 1 : tunables 0 0 0 : slabdata 0 0 0
ip6-frags 0 0 280 29 2 : tunables 0 0 0 : slabdata 0 0 0
fib6_nodes 128 128 64 64 1 : tunables 0 0 0 : slabdata 2 2 0
ip6_dst_cache 42 42 384 21 2 : tunables 0 0 0 : slabdata 2 2 0
PINGv6 0 0 1472 22 8 : tunables 0 0 0 : slabdata 0 0 0
RAWv6 22 22 1472 22 8 : tunables 0 0 0 : slabdata 1 1 0
UDPLITEv6 0 0 1472 22 8 : tunables 0 0 0 : slabdata 0 0 0
UDPv6 44 44 1472 22 8 : tunables 0 0 0 : slabdata 2 2 0
tw_sock_TCPv6 0 0 272 30 2 : tunables 0 0 0 : slabdata 0 0 0
request_sock_TCPv6 0 0 312 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCPv6 0 0 2752 11 8 : tunables 0 0 0 : slabdata 0 0 0
bsg_cmd 0 0 312 26 2 : tunables 0 0 0 : slabdata 0 0 0
mqueue_inode_cache 25 25 1280 25 8 : tunables 0 0 0 : slabdata 1 1 0
hugetlbfs_inode_cache 18 18 872 18 4 : tunables 0 0 0 : slabdata 1 1 0
jbd2_transaction_s 100 100 320 25 2 : tunables 0 0 0 : slabdata 4 4 0
jbd2_inode 340 340 48 85 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_journal_handle 204 204 80 51 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_journal_head 136 136 120 34 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_revoke_table_s 1024 1024 16 256 1 : tunables 0 0 0 : slabdata 4 4 0
jbd2_revoke_record_s 128 128 32 128 1 : tunables 0 0 0 : slabdata 1 1 0
ext4_inode_cache 2178 2178 1744 18 8 : tunables 0 0 0 : slabdata 121 121 0
ext4_free_data 192 192 64 64 1 : tunables 0 0 0 : slabdata 3 3 0
ext4_allocation_context 64 64 128 32 1 : tunables 0 0 0 : slabdata 2 2 0
ext4_prealloc_space 52 52 152 26 1 : tunables 0 0 0 : slabdata 2 2 0
ext4_system_zone 816 816 40 102 1 : tunables 0 0 0 : slabdata 8 8 0
ext4_io_end 224 224 72 56 1 : tunables 0 0 0 : slabdata 4 4 0
ext4_extent_status 3876 3876 40 102 1 : tunables 0 0 0 : slabdata 38 38 0
kioctx 0 0 896 18 4 : tunables 0 0 0 : slabdata 0 0 0
aio_kiocb 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dio 0 0 704 23 4 : tunables 0 0 0 : slabdata 0 0 0
fasync_cache 42 42 96 42 1 : tunables 0 0 0 : slabdata 1 1 0
pid_namespace 0 0 2256 14 8 : tunables 0 0 0 : slabdata 0 0 0
posix_timers_cache 0 0 264 31 2 : tunables 0 0 0 : slabdata 0 0 0
UNIX 110 110 1472 22 8 : tunables 0 0 0 : slabdata 5 5 0
ip4-frags 0 0 264 31 2 : tunables 0 0 0 : slabdata 0 0 0
ip_mrt_cache 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0
UDP-Lite 0 0 1344 24 8 : tunables 0 0 0 : slabdata 0 0 0
tcp_bind_bucket 64 64 64 64 1 : tunables 0 0 0 : slabdata 1 1 0
inet_peer_cache 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0
ip_fib_trie 340 340 48 85 1 : tunables 0 0 0 : slabdata 4 4 0
ip_fib_alias 292 292 56 73 1 : tunables 0 0 0 : slabdata 4 4 0
ip_dst_cache 64 64 256 16 1 : tunables 0 0 0 : slabdata 4 4 0
PING 0 0 1280 25 8 : tunables 0 0 0 : slabdata 0 0 0
RAW 25 25 1280 25 8 : tunables 0 0 0 : slabdata 1 1 0
UDP 96 96 1344 24 8 : tunables 0 0 0 : slabdata 4 4 0
tw_sock_TCP 0 0 272 30 2 : tunables 0 0 0 : slabdata 0 0 0
request_sock_TCP 0 0 312 26 2 : tunables 0 0 0 : slabdata 0 0 0
TCP 12 12 2560 12 8 : tunables 0 0 0 : slabdata 1 1 0
eventpoll_pwq 224 224 72 56 1 : tunables 0 0 0 : slabdata 4 4 0
eventpoll_epi 192 192 128 32 1 : tunables 0 0 0 : slabdata 6 6 0
inotify_inode_mark 120 120 136 30 1 : tunables 0 0 0 : slabdata 4 4 0
blkdev_queue 22 22 2816 11 8 : tunables 0 0 0 : slabdata 2 2 0
blkdev_requests 322 322 344 23 2 : tunables 0 0 0 : slabdata 14 14 0
blkdev_ioc 88 88 184 22 1 : tunables 0 0 0 : slabdata 4 4 0
bio-0 315 315 192 21 1 : tunables 0 0 0 : slabdata 15 15 0
biovec-256 56 96 4096 8 8 : tunables 0 0 0 : slabdata 12 12 0
biovec-128 16 16 2048 16 8 : tunables 0 0 0 : slabdata 1 1 0
biovec-64 64 64 1024 16 4 : tunables 0 0 0 : slabdata 4 4 0
biovec-16 64 64 256 16 1 : tunables 0 0 0 : slabdata 4 4 0
uid_cache 64 64 128 32 1 : tunables 0 0 0 : slabdata 2 2 0
sock_inode_cache 153 153 960 17 4 : tunables 0 0 0 : slabdata 9 9 0
skbuff_fclone_cache 90 90 448 18 2 : tunables 0 0 0 : slabdata 5 5 0
skbuff_head_cache 320 320 256 16 1 : tunables 0 0 0 : slabdata 20 20 0
configfs_dir_cache 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
file_lock_cache 64 64 256 16 1 : tunables 0 0 0 : slabdata 4 4 0
file_lock_ctx 156 156 104 39 1 : tunables 0 0 0 : slabdata 4 4 0
net_namespace 0 0 4480 7 8 : tunables 0 0 0 : slabdata 0 0 0
shmem_inode_cache 1023 1023 1048 31 8 : tunables 0 0 0 : slabdata 33 33 0
pool_workqueue 64 64 256 16 1 : tunables 0 0 0 : slabdata 4 4 0
proc_inode_cache 1309 1309 928 17 4 : tunables 0 0 0 : slabdata 77 77 0
sigqueue 100 100 160 25 1 : tunables 0 0 0 : slabdata 4 4 0
bdev_cache 96 96 1344 24 8 : tunables 0 0 0 : slabdata 4 4 0
kernfs_node_cache 17836 17836 152 26 1 : tunables 0 0 0 : slabdata 686 686 0
mnt_cache 108 108 448 18 2 : tunables 0 0 0 : slabdata 6 6 0
filp 1757 1998 448 18 2 : tunables 0 0 0 : slabdata 111 111 0
inode_cache 9234 9234 872 18 4 : tunables 0 0 0 : slabdata 513 513 0
dentry 15036 15036 288 28 2 : tunables 0 0 0 : slabdata 537 537 0
names_cache 32 32 4096 8 8 : tunables 0 0 0 : slabdata 4 4 0
buffer_head 11427 11427 104 39 1 : tunables 0 0 0 : slabdata 293 293 0
nsproxy 170 170 48 85 1 : tunables 0 0 0 : slabdata 2 2 0
vm_area_struct 4462 4462 176 23 1 : tunables 0 0 0 : slabdata 194 194 0
mm_struct 112 112 1152 28 8 : tunables 0 0 0 : slabdata 4 4 0
fs_cache 105 105 192 21 1 : tunables 0 0 0 : slabdata 5 5 0
files_cache 95 95 832 19 4 : tunables 0 0 0 : slabdata 5 5 0
signal_cache 225 225 1280 25 8 : tunables 0 0 0 : slabdata 9 9 0
sighand_cache 182 182 2240 14 8 : tunables 0 0 0 : slabdata 13 13 0
task_struct 187 192 4928 6 8 : tunables 0 0 0 : slabdata 32 32 0
cred_jar 2179 2368 128 32 1 : tunables 0 0 0 : slabdata 74 74 0
Acpi-Operand 1680 1680 72 56 1 : tunables 0 0 0 : slabdata 30 30 0
Acpi-ParseExt 204 204 80 51 1 : tunables 0 0 0 : slabdata 4 4 0
Acpi-Parse 292 292 56 73 1 : tunables 0 0 0 : slabdata 4 4 0
Acpi-State 204 204 80 51 1 : tunables 0 0 0 : slabdata 4 4 0
Acpi-Namespace 1122 1122 40 102 1 : tunables 0 0 0 : slabdata 11 11 0
anon_vma_chain 4096 4096 64 64 1 : tunables 0 0 0 : slabdata 64 64 0
anon_vma 2472 2472 168 24 1 : tunables 0 0 0 : slabdata 103 103 0
pid 256 256 128 32 1 : tunables 0 0 0 : slabdata 8 8 0
radix_tree_node 2016 2016 584 28 4 : tunables 0 0 0 : slabdata 72 72 0
trace_event_file 1058 1058 88 46 1 : tunables 0 0 0 : slabdata 23 23 0
ftrace_event_field 2550 2550 48 85 1 : tunables 0 0 0 : slabdata 30 30 0
idr_layer_cache 300 300 2096 15 8 : tunables 0 0 0 : slabdata 20 20 0
page->ptl 2117 2117 56 73 1 : tunables 0 0 0 : slabdata 29 29 0
dma-kmalloc-8192 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-4096 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-2048 0 0 2048 16 8 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-1024 0 0 1024 16 4 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-512 0 0 512 16 2 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-256 0 0 256 16 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-32 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-16 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-8 0 0 8 512 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-192 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0
dma-kmalloc-96 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0
kmalloc-8192 44 44 8192 4 8 : tunables 0 0 0 : slabdata 11 11 0
kmalloc-4096 200 200 4096 8 8 : tunables 0 0 0 : slabdata 25 25 0
kmalloc-2048 816 816 2048 16 8 : tunables 0 0 0 : slabdata 51 51 0
kmalloc-1024 672 672 1024 16 4 : tunables 0 0 0 : slabdata 42 42 0
kmalloc-512 544 544 512 16 2 : tunables 0 0 0 : slabdata 34 34 0
kmalloc-256 1344 1344 256 16 1 : tunables 0 0 0 : slabdata 84 84 0
kmalloc-192 903 903 192 21 1 : tunables 0 0 0 : slabdata 43 43 0
kmalloc-128 3168 3168 128 32 1 : tunables 0 0 0 : slabdata 99 99 0
kmalloc-96 1092 1092 96 42 1 : tunables 0 0 0 : slabdata 26 26 0
kmalloc-64 7424 7424 64 64 1 : tunables 0 0 0 : slabdata 116 116 0
kmalloc-32 1792 1792 32 128 1 : tunables 0 0 0 : slabdata 14 14 0
kmalloc-16 3584 3584 16 256 1 : tunables 0 0 0 : slabdata 14 14 0
kmalloc-8 5120 5120 8 512 1 : tunables 0 0 0 : slabdata 10 10 0
kmem_cache_node 224 224 128 32 1 : tunables 0 0 0 : slabdata 7 7 0
kmem_cache 189 189 192 21 1 : tunables 0 0 0 : slabdata 9 9 0

	-ss

[-- Attachment #2: slab_glibc_nomerge --]
[-- Type: text/plain, Size: 3491 bytes --]

[... per-sample unused-slab byte counts elided ...]
[... per-sample unused-slab byte counts elided ...]

[-- Attachment #3: slab_glibc_merge --]
[-- Type: text/plain, Size: 3780 bytes --]

[... per-sample unused-slab byte counts elided ...]

[-- Attachment #4: glibc-merge_vs_nomerge.png --]
[-- Type: image/png, Size: 30161 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-04 7:59 ` Sergey Senozhatsky @ 2015-09-04 9:56 ` Sergey Senozhatsky 2015-09-04 14:05 ` Christoph Lameter 2015-09-04 14:11 ` Linus Torvalds 2 siblings, 0 replies; 42+ messages in thread From: Sergey Senozhatsky @ 2015-09-04 9:56 UTC (permalink / raw) To: Linus Torvalds Cc: Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky, Sergey Senozhatsky [-- Attachment #1: Type: text/plain, Size: 354 bytes --] On (09/04/15 16:59), Sergey Senozhatsky wrote: > > It differs on both busy and idle systems. > 1) IDLE system right after reboot and 2) under `desktop machine` load (ssh, firefox, vim, etc.). So yes, the behaviour seems to be stable on my box. Only gnuplot graphs attached this time (let me know if files with the numbers are of any interest). -ss [-- Attachment #2: idle-merge_vs_nomerge.png --] [-- Type: image/png, Size: 17240 bytes --] [-- Attachment #3: desktop-merge_vs_nomerge.png --] [-- Type: image/png, Size: 20249 bytes --] ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-04 7:59 ` Sergey Senozhatsky 2015-09-04 9:56 ` Sergey Senozhatsky @ 2015-09-04 14:05 ` Christoph Lameter 2015-09-04 14:11 ` Linus Torvalds 2 siblings, 0 replies; 42+ messages in thread
From: Christoph Lameter @ 2015-09-04 14:05 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Linus Torvalds, Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

[-- Attachment #1: Type: TEXT/PLAIN, Size: 532 bytes --]

On Fri, 4 Sep 2015, Sergey Senozhatsky wrote:
> But I went through the corresponding slabinfo (I track slabinfo too); and yes,
> zero unused objects.

Please use the slabinfo tool. What you see in /proc/slabinfo is generated for slab compatibility and may not show useful numbers. Run

gcc -o slabinfo tools/vm/slabinfo.c
slabinfo -T

to get an overview of the fragmentation etc. state of the slab caches. Run slabinfo to get individual cache statistics. It would be helpful to compare the output with and without merging.
[-- Attachment #2: Type: TEXT/PLAIN, Size: 3491 bytes --]

(raw per-sample slab size data elided)

[-- Attachment #3: Type: TEXT/PLAIN, Size: 3780 bytes --]

(raw per-sample slab size data elided)

[-- Attachment #4: Type: IMAGE/PNG, Size: 30161 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-04 7:59 ` Sergey Senozhatsky 2015-09-04 9:56 ` Sergey Senozhatsky 2015-09-04 14:05 ` Christoph Lameter @ 2015-09-04 14:11 ` Linus Torvalds 2015-09-05 2:09 ` Sergey Senozhatsky 2 siblings, 1 reply; 42+ messages in thread
From: Linus Torvalds @ 2015-09-04 14:11 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

On Fri, Sep 4, 2015 at 12:59 AM, Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
>
> But I went through the corresponding slabinfo (I track slabinfo too); and yes,
> zero unused objects.

Ahh. I should have realized - the number you are actually tracking is meaningless. The "unused objects" thing is not really tracked well.

/proc/slabinfo ends up not showing the percpu queue state, so things look "used" when they are really just on the percpu queues for that slab. So the "unused" number you are tracking is not really meaningful, and the zeroes you are seeing are just a symptom of that: slabinfo isn't "exact" enough.

So you should probably do the statistics on something that is more meaningful: the actual number of pages that have been allocated (which would be numslabs times pages-per-slab).

Linus
-- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 42+ messages in thread
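Linus's "numslabs times pages-per-slab" can be read straight out of the /proc/slabinfo columns. A minimal sketch, assuming the documented slabinfo v2 column layout; the sample line below is illustrative (its values were chosen to match the dentry cache figures quoted later in the thread), not captured from the systems discussed here:

```python
# Estimate real memory held by a cache from a /proc/slabinfo line:
#   allocated bytes = num_slabs * pages_per_slab * PAGE_SIZE
# slabinfo v2 column layout:
#   name active_objs num_objs objsize objperslab pagesperslab
#   : tunables ... : slabdata active_slabs num_slabs sharedavail
PAGE_SIZE = 4096  # assumption; on a live system use os.sysconf("SC_PAGE_SIZE")

def slab_bytes(line: str) -> tuple:
    fields = line.split()
    name = fields[0]
    pages_per_slab = int(fields[5])
    num_slabs = int(fields[-2])  # second-to-last field of the slabdata group
    return name, num_slabs * pages_per_slab * PAGE_SIZE

# Illustrative line: 1650 slabs of 2 pages each -> 13516800 bytes,
# consistent with the dentry row in the Sample #408 dump below.
sample = "dentry 46200 46200 288 28 2 : tunables 0 0 0 : slabdata 1650 1650 0"
print(slab_bytes(sample))
```

This sidesteps the percpu-queue problem Linus describes: whole slabs are counted regardless of whether their objects look "used" or sit on a percpu queue.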
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-04 14:11 ` Linus Torvalds @ 2015-09-05 2:09 ` Sergey Senozhatsky 0 siblings, 0 replies; 42+ messages in thread From: Sergey Senozhatsky @ 2015-09-05 2:09 UTC (permalink / raw) To: Linus Torvalds, Christoph Lameter Cc: Sergey Senozhatsky, Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky [-- Attachment #1: Type: text/plain, Size: 2566 bytes --] On (09/04/15 07:11), Linus Torvalds wrote: > > > > But I went through the corresponding slabinfo (I track slabinfo too); and yes, > > zero unused objects. > > Ahh. I should have realized - the number you are actually tracking is > meaningless. The "unused objects" thing is not really tracked well. > > /proc/slabinfo ends up not showing the percpu queue state, so things > look "used" when they are really just on the percpu queues for that > slab.So the "unused" number you are tracking is not really meaningful, > and the zeroes you are seeing is just a symptom of that: slabinfo > isn't "exact" enough. > > So you should probably do the statistics on something that is more > meaningful: the actual number of pages that have been allocated (which > would be numslabs times pages-per-slab). Aha... Didn't know that, sorry. Christoph Lameter wrote: > Please use the slabinfo tool. What you see in /proc/slabinfo is generated > for slab compatibility and may not show useful numbers. > OK. 
I did another round of tests:

git clone git://sourceware.org/git/glibc.git
make -j8
package (xz)
rm -fr glibc

From slabinfo -T output:

Slabcaches :    91      Aliases  : 118->69   Active:  65
Memory used: 60.0M      # Loss   : 13.2M     MRatio: 28%
# Objects  : 162.4K     # PartObj: 10.6K     ORatio:  6%

Per Cache         Average        Min        Max      Total
---------------------------------------------------------
#Objects             2.4K         11      19.0K     162.4K
#Slabs                108          1       1.8K       7.0K
#PartSlab              34          0       1.6K       2.2K
%PartSlab              7%         0%        86%        31%
PartObjs                6          0       4.7K      10.6K
% PartObj              3%         0%        33%         6%
Memory             923.9K       8.1K      10.2M      60.0M
Used               720.3K       8.0K       9.7M      46.8M
Loss               203.6K          0       6.1M      13.2M

Per Object        Average        Min        Max
---------------------------------------------
Memory                290          8       8.1K
User                  288          8       8.1K
Loss                    1          0         64

I took the "Memory used: 60.0M # Loss : 13.2M MRatio: 28%" line and generated 3 graphs:
-- "Memory used" MM
-- "Loss" LOSS
-- "MRatio" RATIO
for "slab_nomerge = 0" and "slab_nomerge = 1".

... And those are sort of interesting. I was expecting to see more diverged behaviours.

Attached. Please let me know if you want to see files with the numbers (slabinfo -T only).

-ss

[-- Attachment #2: glibc-RATIO-merge_vs_nomerge.png --]
[-- Type: image/png, Size: 15874 bytes --]

[-- Attachment #3: glibc-LOSS-merge_vs_nomerge.png --]
[-- Type: image/png, Size: 16482 bytes --]

[-- Attachment #4: glibc-MM-merge_vs_nomerge.png --]
[-- Type: image/png, Size: 16937 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
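The "Memory used / Loss / MRatio" totals line is easy to scrape per sample. A hedged sketch of one way to turn it into plottable numbers; the regex and unit handling are illustrative, not the actual script used in this thread:

```python
import re

# Parse a "slabinfo -T" totals line such as:
#   Memory used: 60.0M # Loss : 13.2M MRatio: 28%
# into bytes (for the MM/LOSS graphs) and a percentage (for RATIO).
UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_totals(line: str) -> dict:
    m = re.search(
        r"Memory used:\s*([\d.]+)([KMG]?)\s*#\s*Loss\s*:\s*([\d.]+)([KMG]?)"
        r"\s*MRatio:\s*(\d+)%",
        line,
    )
    if not m:
        raise ValueError("not a totals line: %r" % line)
    used = float(m.group(1)) * UNITS[m.group(2)]
    loss = float(m.group(3)) * UNITS[m.group(4)]
    return {"used": int(used), "loss": int(loss), "ratio": int(m.group(5))}

sample = "Memory used: 60.0M # Loss : 13.2M MRatio: 28%"
print(parse_totals(sample))
```

One record like this per sample, written as columns, is all gnuplot needs to draw the merge-vs-nomerge curves.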
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-05 2:09 ` Sergey Senozhatsky (?) @ 2015-09-05 20:33 ` Linus Torvalds 2015-09-07 8:44 ` Sergey Senozhatsky -1 siblings, 1 reply; 42+ messages in thread
From: Linus Torvalds @ 2015-09-05 20:33 UTC (permalink / raw)
To: Sergey Senozhatsky
Cc: Christoph Lameter, Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

On Fri, Sep 4, 2015 at 7:09 PM, Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> wrote:
>
> Aha... Didn't know that, sorry.

Hey, I didn't react to it either, until you pointed out the oddity of "no free slab memory". Very easy to overlook.

> ... And those are sort of interesting. I was expecting to see more
> diverged behaviours.
>
> Attached.

So I'm not sure how really conclusive these graphs are, but they are certainly fun to look at. So I have a few reactions:

- that 'nomerge' spike at roughly 780s is interesting. I wonder why it does that.

- it would be interesting to see - for example - which slabs are the top memory users, and not _just_ the total (it could clarify the spike, for example). That's obviously something that works much better for the no-merge case, but could your script be changed to show (say) the "top 5 slabs". Showing all of them would probably be too messy, but "top 5" could be interesting.

- assuming the times are comparable, it looks like 'merge' really is noticeably faster. But that might just be noise too, so this may not be real data.

- regardless of how meaningful the graphs are, and whether they really tell us anything, I do like the concept, and I'd love to see people do things like this more often. Visualization to show behavior is great.
That last point in particular means that if you scripted this and your scripts aren't *too* ugly and not too tied to your particular setup, I think it would perhaps not be a bad idea to encourage plots like this by making those kinds of scripts available in the kernel tree. That's particularly true if you used something like the tools/testing/ktest/ scripts to run these things automatically (which can be a *big* issue to show that something is actually stable across multiple boots, and see the variance).

So maybe these graphs are meaningful, and maybe they aren't. But I'd still like to see more of them ;)

Linus

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-05 20:33 ` Linus Torvalds @ 2015-09-07 8:44 ` Sergey Senozhatsky 2015-09-08 0:22 ` Sergey Senozhatsky 0 siblings, 1 reply; 42+ messages in thread
From: Sergey Senozhatsky @ 2015-09-07 8:44 UTC (permalink / raw)
To: Linus Torvalds
Cc: Sergey Senozhatsky, Christoph Lameter, Jesper Dangaard Brouer, Dave Chinner, Mike Snitzer, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

[-- Attachment #1: Type: text/plain, Size: 19729 bytes --]

On (09/05/15 13:33), Linus Torvalds wrote:
> > ... And those are sort of interesting. I was expecting to see more
> > diverged behaviours.
> >
> > Attached.

Hello, sorry for the long reply.

> So I'm not sure how really conclusive these graphs are, but they are
> certainly fun to look at. So I have a few reactions:
>
> - that 'nomerge' spike at roughly 780s is interesting. I wonder why
> it does that.

Please find some stats below (with TOP 5 slabs). ~780s looks like the time when the glibc build script begins to package glibc (gzip, xz...).

> - it would be interesting to see - for example - which slabs are the
> top memory users, and not _just_ the total (it could clarify the
> spike, for example). That's obviously something that works much better
> for the no-merge case, but could your script be changed to show (say)
> the "top 5 slabs". Showing all of them would probably be too messy,
> but "top 5" could be interesting.

OFFTOP: Capturing is not a problem; visualizing -- is. With a huge number of samples the graph quickly becomes impossible to read. We have different N `top' slabs after every measurement, and labeling them on a graph is a bit messy. So my script right now just picks the first slab (most Memory Used or biggest Loss value) per sample (e.g.
every second) and does something like this (in png):

(ASCII gnuplot sketch elided: per-sample SIZE and LOSS bars for the top slab, x-axis samples 1s 2s 3s 4s ... labeled slab1 slab2 slab3 slab1)

BACK to spikes. I modified the `slabinfo' tool to report the top N (5 in this case) slabs sorted by Memory usage and by Loss, along with Slab totals (+report everything in bytes, w/o the dynamic G/M/K scaling. Well, technically Loss is `Space - Objects * Objsize' and can be calculated from the existing output, but I'm lazy. Besides, the top N biggest slabs and the top N most fragmented ones do not necessarily overlap, so I print both sets).

Some of the spikes. Samples are separated by "Sample #d".

Test
===============================================================================================
Sample -- 1 second.
98828288 -> 107409408 -> 100171776 Sample #408 Slabcache Totals ---------------- Slabcaches : 140 Aliases : 0->0 Active: 105 Memory used: 98828288 # Loss : 3872736 MRatio: 4% # Objects : 329484 # PartObj: 484 ORatio: 0% Per Cache Average Min Max Total --------------------------------------------------------- #Objects 3137 16 92313 329484 #Slabs 93 1 2367 9766 #PartSlab 0 0 8 57 %PartSlab 2% 0% 58% 0% PartObjs 0 0 142 484 % PartObj 0% 0% 38% 0% Memory 941221 4096 35258368 98828288 Used 904338 4096 33622848 94955552 Loss 36883 0 1635520 3872736 Per Object Average Min Max --------------------------------------------- Memory 289 8 8192 User 288 8 8192 Loss 1 0 64 Slabs sorted by size (5) --------------------------------------------------------- Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg ext4_inode_cache 19368 1736 35258368 1072/0/4 18 3 0 95 a dentry 46200 288 13516800 1635/0/15 28 1 0 98 a inode_cache 12150 864 11059200 665/0/10 18 2 0 94 a buffer_head 92313 104 9695232 2363/0/4 39 0 0 99 a radix_tree_node 6832 576 3997696 240/0/4 28 2 0 98 a Slabs sorted by loss (5) --------------------------------------------------------- ext4_inode_cache 19368 1736 1635520 1072/0/4 18 3 0 95 a inode_cache 12150 864 561600 665/0/10 18 2 0 94 a dentry 46200 288 211200 1635/0/15 28 1 0 98 a biovec-256 46 4096 204800 7/7/5 8 3 58 47 A task_struct 174 4928 125568 19/3/11 6 3 10 87 Sample #409 Slabcache Totals ---------------- Slabcaches : 140 Aliases : 0->0 Active: 105 Memory used: 107409408 # Loss : 3782600 MRatio: 3% # Objects : 335908 # PartObj: 485 ORatio: 0% Per Cache Average Min Max Total --------------------------------------------------------- #Objects 3199 16 92742 335908 #Slabs 96 1 2378
10081 #PartSlab 0 0 39 67 %PartSlab 1% 0% 50% 0% PartObjs 0 0 274 485 % PartObj 0% 0% 38% 0% Memory 1022946 4096 35422208 107409408 Used 986921 4096 33779088 103626808 Loss 36024 0 1643120 3782600 Per Object Average Min Max --------------------------------------------- Memory 310 8 8192 User 308 8 8192 Loss 1 0 64 Slabs sorted by size (5) --------------------------------------------------------- Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg ext4_inode_cache 19458 1736 35422208 1077/0/4 18 3 0 95 a dentry 46620 288 13639680 1658/0/7 28 1 0 98 a inode_cache 12150 864 11059200 665/0/10 18 2 0 94 a buffer_head 92742 104 9740288 2367/0/11 39 0 0 99 a biovec-256 2128 4096 8749056 263/0/4 8 3 0 99 A Slabs sorted by loss (5) --------------------------------------------------------- ext4_inode_cache 19458 1736 1643120 1077/0/4 18 3 0 95 a inode_cache 12150 864 561600 665/0/10 18 2 0 94 a filp 2169 432 267216 134/39/13 18 1 26 77 A dentry 46620 288 213120 1658/0/7 28 1 0 98 a task_struct 165 4928 104384 18/2/10 6 3 7 88 Sample #410 Slabcache Totals ---------------- Slabcaches : 140 Aliases : 0->0 Active: 105 Memory used: 100171776 # Loss : 3975712 MRatio: 4% # Objects : 334759 # PartObj: 633 ORatio: 0% Per Cache Average Min Max Total --------------------------------------------------------- #Objects 3188 16 92859 334759 #Slabs 94 1 2381 9922 #PartSlab 0 0 12 74 %PartSlab 2% 0% 57% 0% PartObjs 0 0 209 633 % PartObj 0% 0% 38% 0% Memory 954016 4096 35618816 100171776 Used 916152 4096 33966576 96196064 Loss 37863 0 1652240 3975712 Per Object Average Min Max --------------------------------------------- Memory 289 8 8192 User 287 8 8192 Loss 1 0 64 Slabs sorted by size (5) --------------------------------------------------------- Name Objects Objsize Space Slabs/Part/Cpu O/S O %Fr %Ef Flg ext4_inode_cache 19566 1736 35618816 1083/0/4 18 3 0 95 a dentry 46788 288 13688832 1661/0/10 28 1 0 98 a inode_cache 12150 864 11059200 665/0/10 18 2 0 94 a buffer_head 92859 104 
9752576 2371/0/10 39 0 0 99 a
radix_tree_node     6888    576   4030464 242/0/4    28 2   0  98 a

Slabs sorted by loss (5)
---------------------------------------------------------
ext4_inode_cache   19566   1736   1652240 1083/0/4   18 3   0  95 a
inode_cache        12150    864    561600 665/0/10   18 2   0  94 a
biovec-256            54   4096    237568 8/8/6       8 3  57  48 A
dentry             46788    288    213888 1661/0/10  28 1   0  98 a
task_struct          169   4928    182976 20/5/11     6 3  16  81

Another test.
===============================================================================================
Sample -- 1 second. 251637760 -> 306782208 -> 252264448

Sample #426
Slabcache Totals
----------------
Slabcaches :    140   Aliases  : 0->0      Active:  107
Memory used: 251637760   # Loss : 11002192   MRatio: 4%
# Objects  : 528119      # PartObj: 6437     ORatio: 1%

Per Cache         Average       Min          Max        Total
---------------------------------------------------------
#Objects             4935        11       114582       528119
#Slabs                164         1         4718        17594
#PartSlab               3         0          141          394
%PartSlab              4%        0%          65%           2%
PartObjs                1         0         2422         6437
% PartObj              2%        0%          42%           1%
Memory            2351754      4096    154599424    251637760
Used              2248930      3584    147428064    240635568
Loss               102824         0      7171360     11002192

Per Object        Average       Min          Max
---------------------------------------------
Memory                457         8         8192
User                  455         8         8192
Loss                    2         0           64

Slabs sorted by size (5)
---------------------------------------------------------
Name              Objects Objsize     Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
ext4_inode_cache    84924    1736 154599424 4714/0/4       18  3   0  95 a
dentry             114408     288  33472512 4080/0/6       28  1   0  98 a
buffer_head        114582     104  12034048 2934/0/4       39  0   0  99 a
inode_cache         12186     864  11091968 667/0/10       18  2   0  94 a
radix_tree_node     10388     576   6078464 367/0/4        28  2   0  98 a

Slabs sorted by loss (5)
---------------------------------------------------------
ext4_inode_cache    84924    1736   7171360 4714/0/4       18  3   0  95 a
inode_cache         12186     864    563264 667/0/10       18  2   0  94 a
dentry             114408     288    523008 4080/0/6       28  1   0  98 a
kmalloc-128          4117     128    353664 160/141/55     32  0  65  59
kmalloc-2048         1421    2048    202752 80/27/15       16  3  28  93

Sample #427
Slabcache Totals
----------------
Slabcaches :    140   Aliases  : 0->0      Active:  107
Memory used: 306782208   # Loss : 11304176   MRatio: 3%
# Objects  : 569050      # PartObj: 6538     ORatio: 1%

Per Cache         Average       Min          Max        Total
---------------------------------------------------------
#Objects             5318        11       114777       569050
#Slabs                187         1         4725        20096
#PartSlab               3         0          141          391
%PartSlab              3%        0%          65%           1%
PartObjs                1         0         2422         6538
% PartObj              1%        0%          42%           1%
Memory            2867123      4096    154828800    306782208
Used              2761476      3584    147646800    295478032
Loss               105646         0      7182000     11304176

Per Object        Average       Min          Max
---------------------------------------------
Memory                521         8         8192
User                  519         8         8192
Loss                    2         0           64

Slabs sorted by size (5)
---------------------------------------------------------
Name              Objects Objsize     Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
ext4_inode_cache    85050    1736 154828800 4721/0/4       18  3   0  95 a
biovec-256          12416    4096  50954240 1550/3/5        8  3   0  99 A
dentry             114548     288  33513472 4075/0/16      28  1   0  98 a
buffer_head        114777     104  12054528 2939/0/4       39  0   0  99 a
inode_cache         12186     864  11091968 667/0/10       18  2   0  94 a

Slabs sorted by loss (5)
---------------------------------------------------------
ext4_inode_cache    85050    1736   7182000 4721/0/4       18  3   0  95 a
inode_cache         12186     864    563264 667/0/10       18  2   0  94 a
dentry             114548     288    523648 4075/0/16      28  1   0  98 a
kmalloc-128          4117     128    353664 160/141/55     32  0  65  59
bio-0               12852     176    244800 589/0/23       21  0   0  90 A

Sample #428
Slabcache Totals
----------------
Slabcaches :    140   Aliases  : 0->0      Active:  107
Memory used: 252264448   # Loss : 11537008   MRatio: 4%
# Objects  : 529408      # PartObj: 8649     ORatio: 1%

Per Cache         Average       Min          Max        Total
---------------------------------------------------------
#Objects             4947        11       115947       529408
#Slabs                165         1         4725        17655
#PartSlab               5         0          141          566
%PartSlab              5%        0%          65%           3%
PartObjs                1         0         2422         8649
% PartObj              2%        0%          42%           1%
Memory            2357611      4096    154828800    252264448
Used              2249789      3584    147646800    240727440
Loss               107822         0      7182000     11537008

Per Object        Average       Min          Max
---------------------------------------------
Memory                456         8         8192
User                  454         8         8192
Loss                    2         0           64

Slabs sorted by size (5)
---------------------------------------------------------
Name              Objects Objsize     Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
ext4_inode_cache    85050    1736 154828800 4721/0/4       18  3   0  95 a
dentry             114660     288  33546240 4075/0/20      28  1   0  98 a
buffer_head        115947     104  12177408 2942/0/31      39  0   0  99 a
inode_cache         12186     864  11091968 667/0/10       18  2   0  94 a
radix_tree_node     10444     576   6111232 369/0/4        28  2   0  98 a

Slabs sorted by loss (5)
---------------------------------------------------------
ext4_inode_cache    85050    1736   7182000 4721/0/4       18  3   0  95 a
inode_cache         12186     864    563264 667/0/10       18  2   0  94 a
dentry             114660     288    524160 4075/0/20      28  1   0  98 a
filp                 3572     432    447552 227/113/16     18  1  46  77 A
kmalloc-128          4117     128    353664 160/141/55     32  0  65  59

Attached some graphs for NOMERGE kernel. So far, I haven't seen those spikes for 'merge' kernel.

> - assuming the times are comparable, it looks like 'merge' really is
> noticeably faster. But that might just be noise too, so this may not
> be real data.
>
> - regardless of how meaningful the graphs are, and whether they
> really tell us anything, I do like the concept, and I'd love to see
> people do things like this more often. Visualization to show behavior
> is great.
>
> That last point in particular means that if you scripted this and your
> scripts aren't *too* ugly and not too tied to your particular setup, I
> think it would perhaps not be a bad idea to encourage plots like this
> by making those kinds of scripts available in the kernel tree. That's
> particularly true if you used something like the tools/testing/ktest/
> scripts to run these things automatically (which can be a *big* issue
> to show that something is actually stable across multiple boots, and
> see the variance).

Oh, that's a good idea. I didn't use tools/testing/ktest/, it's a bit too massive for my toy script.
I have some modifications to slabinfo and a rather ugly script to parse files and feed them to gnuplot (and yes, I use gnuplot for plotting). The slabinfo patches are not entirely dumb and close to being ready (well.. except that I need to clean up all those %6s sprintfs that worked fine for dynamically scaled sizes and do not work so nicely for sizes in bytes). I can send them out later. Less sure about the script (bash) tho. In a nutshell it's just a number of

	grep | awk > FOO; gnuplot ... FOO

So I'll finish some plotting improvements first (not ready yet) and then I'll take a look at how quickly I can land it (rewritten in perl) in tools/testing/ktest/.

> So maybe these graphs are meaningful, and maybe they aren't. But I'd
> still like to see more of them ;)

Thanks.

	-ss

[-- Attachment #2: nomerge-mm-loss-usage-1.png --]
[-- Type: image/png, Size: 12580 bytes --]

[-- Attachment #3: nomerge-mm-loss-usage-2.png --]
[-- Type: image/png, Size: 12283 bytes --]

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-07  8:44 ` Sergey Senozhatsky
@ 2015-09-08  0:22   ` Sergey Senozhatsky
  0 siblings, 0 replies; 42+ messages in thread
From: Sergey Senozhatsky @ 2015-09-08 0:22 UTC (permalink / raw)
To: Linus Torvalds
Cc: Sergey Senozhatsky, Christoph Lameter, Jesper Dangaard Brouer,
	Dave Chinner, Mike Snitzer, Pekka Enberg, Andrew Morton,
	David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
	Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
	Viresh Kumar, Heinz Mauelshagen, linux-mm, Sergey Senozhatsky

On (09/07/15 17:44), Sergey Senozhatsky wrote:
[...]
> Oh, that's a good idea. I didn't use tools/testing/ktest/, it's a bit too
> massive for my toy script. I have some modifications to slabinfo and a rather
> ugly script to parse files and feed them to gnuplot (and yes, I use gnuplot
> for plotting). slabinfo patches are not entirely dumb and close to being ready
> (well.. except that I need to clean up all those %6s sprintfs that worked fine
> for dynamically scalled sizes and do not work so nicely for sizes in bytes). I
> can send them out later. Less sure about the script (bash) tho. In a nutshell
> it's just a number of
> grep | awk > FOO; gnuplot ... FOO
>
> So I'll finish some plotting improvements first (not ready yet) and then
> I'll take a look how quickly I can land it (rewrite in perl) in
> tools/testing/ktest/.

Hi,

uploaded my scripts to https://github.com/sergey-senozhatsky/slabinfo

A set of very simple bash scripts. The README file contains some sort of documentation and a 'tutorial'.
==================================================================
To start collecting samples (record file name is NOMERGE; note sudo):

	sudo ./slabinfo-plotter.sh -r NOMERGE
	#^C or reboot

Pre-process the records file for gnuplot:

	./slabinfo-plotter.sh -p NOMERGE -b gnuplot
	File gnuplot_slabs-by-loss-NOMERGE
	File gnuplot_slabs-by-size-NOMERGE
	File gnuplot_totals-NOMERGE

Generate graphs from 'slabinfo totals':

	./gnuplot-totals.sh -f gnuplot_totals-NOMERGE
	Graph file name -- gnuplot_totals-NOMERGE.png
	...
==================================================================

Two things:
-- it wants a patched version of slabinfo (some sort of patches are in kernel_patches/ dir)
-- it wants slabinfo to be in PATH

For now it does what it does -- captures numbers and picks only the ones that are interesting to me, and generates plots. I'm doing this in my spare time, but I'm surely accepting improvement requests/ideas, pull requests, and everything that follows. I'll play around with the scripts for some time to make sure they are usable, and then we can decide if there is a place for something like this in the kernel or whether it's better done somehow differently.

	-ss

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 6:02 ` Dave Chinner 2015-09-03 6:13 ` Pekka Enberg 2015-09-03 10:29 ` Jesper Dangaard Brouer @ 2015-09-03 15:02 ` Linus Torvalds 2015-09-04 3:26 ` Dave Chinner 2 siblings, 1 reply; 42+ messages in thread From: Linus Torvalds @ 2015-09-03 15:02 UTC (permalink / raw) To: Dave Chinner Cc: Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm On Wed, Sep 2, 2015 at 11:02 PM, Dave Chinner <dchinner@redhat.com> wrote: > On Wed, Sep 02, 2015 at 06:21:02PM -0700, Linus Torvalds wrote: >> On Wed, Sep 2, 2015 at 5:51 PM, Mike Snitzer <snitzer@redhat.com> wrote: >> > >> > What I made possible with SLAB_NO_MERGE is for each subsystem to decide >> > if they would prefer to not allow slab merging. >> >> .. and why is that a choice that even makes sense at that level? >> >> Seriously. >> >> THAT is the fundamental issue here. > > It makes a lot more sense than you think, Linus. Not really. Even your argument isn't at all arguing for doing things at a per-subsystem level - it's an argument about the potential sanity of marking _individual_ slab caches non-mergable, not an argument for something clearly insane like "mark all slabs for subsystem X unmergable". Can you just admit that that was insane? There is *no* sense in that kind of behavior. > Right, it's not xyzzy-specific where 'xyzzy' is a subsystem. The > flag application is actually *object specific*. That is, the use of > the individual objects that determines whether it should be merged > or not. Yes. I do agree that something like SLAB_NO_MERGE can make sense on an actual object-specific level, if you have very specific allocation pattern knowledge and can show that the merging actually hurts. 
But making the subsystem decide that all its slab caches should be "no-merge" is just BS. You know that. It makes no sense, just admit it.

> e.g. Slab fragmentation levels are affected more than anything by
> mixing objects with different life times in the same slab. i.e. if
> we free all the short lived objects from a page but there is one
> long lived object on the page then that page is pinned and we free
> no memory. Do that to enough pages in the slab, and we end up with a
> badly fragmented slab.

The thing is, *if* you can show that kind of behavior for a particular slab, and have numbers for it, then mark that slab as no-merge, and document why you did it.

Even then, I'd personally probably prefer to name the bit differently: rather than talk about an internal implementation detail within slab ("don't merge") it would probably be better to try to frame it in the semantic difference you are looking for (ie in "I want a slab with private allocation patterns").

But aside from that kind of naming issue, that's very obviously not what the patch series discussed was doing.

And quite frankly, I don't actually think you have the numbers to show that theoretical bad behavior. In contrast, there really *are* numbers to show the advantages of merging.

So the fragmentation argument has been shown to generally be in favor of merging, _not_ in favor of that "no-merge" behavior. If you have an actual real load where that isn't the case, and can show it, then that would be interesting, but at no point is that "the subsystem just decided to mark all its slabs no-merge".

	Linus

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-03 15:02 ` Linus Torvalds @ 2015-09-04 3:26 ` Dave Chinner 2015-09-04 3:51 ` Linus Torvalds 2015-09-04 13:55 ` Christoph Lameter 0 siblings, 2 replies; 42+ messages in thread From: Dave Chinner @ 2015-09-04 3:26 UTC (permalink / raw) To: Linus Torvalds Cc: Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm On Thu, Sep 03, 2015 at 08:02:40AM -0700, Linus Torvalds wrote: > On Wed, Sep 2, 2015 at 11:02 PM, Dave Chinner <dchinner@redhat.com> wrote: > > On Wed, Sep 02, 2015 at 06:21:02PM -0700, Linus Torvalds wrote: > > Right, it's not xyzzy-specific where 'xyzzy' is a subsystem. The > > flag application is actually *object specific*. That is, the use of > > the individual objects that determines whether it should be merged > > or not. > > Yes. > > I do agree that something like SLAB_NO_MERGE can make sense on an > actual object-specific level, if you have very specific allocation > pattern knowledge and can show that the merging actually hurts. There are generic cases where it hurts, so no justification should be needed for those cases... > > e.g. Slab fragmentation levels are affected more than anything by > > mixing objects with different life times in the same slab. i.e. if > > we free all the short lived objects from a page but there is one > > long lived object on the page then that page is pinned and we free > > no memory. Do that to enough pages in the slab, and we end up with a > > badly fragmented slab. > > The thing is, *if* you can show that kind of behavior for a particular > slab, and have numbers for it, then mark that slab as no-merge, and > document why you did it. The double standard is the problem here. 
No notification, proof, discussion or review was needed to turn on slab merging for everyone, but you're setting a very high bar to jump if anyone wants to turn it off in their code.

> And quite frankly, I don't actually think you have the numbers to show
> that theoretical bad behavior.

I don't keep numbers close at hand. I've been dealing with these problems for ten years, so I just know what workloads demonstrate this "theoretical bad behaviour" within specific slabs and test them when relevant. I'll do a couple of quick "merging is better" verification tests this afternoon, but other than that I don't have time in the next couple of weeks...

But speaking of workloads, internal inode cache slab fragmentation is simple to reproduce on any filesystem. XFS just happens to be the only one that really actively manages it as a result of long term developer awareness of the problem. I first tripped over it in early 2005 with SpecSFS, and then with other similar NFS benchmarks like filebench. That's where Christoph Lameter was introduced to the problem, too:

https://lwn.net/Articles/371892/

"The problem is that sparse use of objects in slab caches can cause large amounts of memory to become unusable. The first ideas to address this were developed in 2005 by various people."

FYI, with appropriate manual "drop slab" hacks during the benchmark, we could get 20-25% higher throughput from the NFS server because dropping the entire slab cache before the measurement phase meant we avoided the slab fragmentation issue and had ~50% more free memory to use for the page cache during the measurement period...

Similar problems have been reported over the years by users with backup programs or scripts that used find, rsync and/or 'cp -R' on large filesystems. It used to be easy to cause these sorts of problems in the XFS inode cache. There's quite a few other workloads, but it is easy to reproduce inode slab fragmentation with find, bulkstat and cp.
Basically all you need to do is populate the inode cache, randomise the LRU order, then trigger combined inode cache and memory demand. It's that simple.

The biggest problem with using a workload like this to "prove" that slab merging degrades behaviour is that we don't know what slabs have been merged. Hence it's extremely hard to generate a workload definition that demonstrates it. Indeed, change kernel config options and structures change size, and the slab is merged with different objects, so the workload that generates problems has to be changed, too. And it doesn't even need to be a kernel with a different config - just a different set of modules loaded because the hardware and software config is different will change what slabs are merged. IOWs, what produces a problem on one kernel on one machine will not reproduce the same problem on a different kernel or machine. Numbers are a crapshoot here, especially as the cause of the problem is trivially easy to understand.

Linus, you always say that at some point you've just got to step back, read the code and understand the underlying issue that is being dealt with because some things are way too complex to reproduce reliably. This is one of those cases - it's obvious that slab merging does not fix or prevent internal slab cache fragmentation and that it only serves to minimise the impact of fragmentation by amortising it across multiple similar slabs. Really, this is the best we can do with passive slab caches where you can't control freeing patterns.

However, we also have actively managed slab caches, and they can and do work to prevent fragmentation and clear it quickly when it happens. Merging these actively managed slabs with other passive slabs is just a bad idea because the passive slab objects can only reduce the effectiveness of the active management algorithms. We don't need numbers to understand this - it's clear and obvious from an algorithmic point of view.
> In contrast, there really *are*
> numbers to show the advantages of merging.

I have never denied that. Please listen to what I'm saying.

> So the fragmentation argument has been shown to generally be in favor
> of merging, _not_ in favor of that "no-merge" behavior.

Yes, all the numbers and research I've seen have been on passive slab cache behaviour. I *agree* that passive slab caches should be merged, but I don't recall anyone documenting the behavioural distinction between active/passive slabs before now, even though it's been something I've had in my head for several years. Actively managed slabs are very different in their behaviour to passive slabs, and so what holds true for passive slabs is not necessarily true for actively managed slabs.

Really, we don't need some stupidly high bar to jump over here - whether merging should be allowed can easily be answered with a simple question: "Does the slab have a shrinker or does it back a mempool?" If the answer is yes then using SLAB_SHRINKER or SLAB_MEMPOOL to trigger the no-merge case doesn't need any more justification from subsystem maintainers at all.

Cheers,

Dave.
--
Dave Chinner
dchinner@redhat.com

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04  3:26 ` Dave Chinner
@ 2015-09-04  3:51   ` Linus Torvalds
  2015-09-05  0:36     ` Dave Chinner
  2015-09-07  9:30     ` Jesper Dangaard Brouer
  1 sibling, 2 replies; 42+ messages in thread
From: Linus Torvalds @ 2015-09-04 3:51 UTC (permalink / raw)
To: Dave Chinner
Cc: Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton,
	David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
	Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
	Viresh Kumar, Heinz Mauelshagen, linux-mm

On Thu, Sep 3, 2015 at 8:26 PM, Dave Chinner <dchinner@redhat.com> wrote:
>
> The double standard is the problem here. No notification, proof,
> discussion or review was needed to turn on slab merging for
> everyone, but you're setting a very high bar to jump if anyone wants
> to turn it off in their code.

Ehh. You realize that almost the only load that is actually seriously allocator-limited is networking?

And slub was beating slab on that? And slub has been doing the merging since day one. Slab was just changed to try to keep up with the winning strategy.

Really. You seem to think that this merging thing is new. It's really not. Where did you miss the part that it's been done since 2007? It's only new for slab, and the reason it was introduced for slab was that it was losing most relevant benchmarks to slub.

So do you now want a "SLAB_NO_MERGE_IF_NOT_SLUB" flag, which keeps the traditional behavior for slab and slub? Just because it's traditional? One that says "if the allocator is slub, then merge, but if the allocator is slab, then don't merge".

Really, Dave. You have absolutely nothing to back up your points with. Merging is *not* some kind of "new" thing that was silently enabled recently to take you by surprise.

That seems to be your *only* argument: that the behavior changed behind your back.

IT IS NOT TRUE.
It's only true since you don't seem to realize that a large portion of the world moved on to SLUB a long time ago.

Do you seriously believe that a "SLAB_NO_MERGE_IF_NOT_SLUB" flag is a good idea, just to justify your position of "let's keep the merging behavior the way it has been"? Or do you seriously think that it's a good idea to take the non-merging behavior from the allocator that was falling behind?

So no. The switch to merging behavior was not some kind of "no discussion" thing. It was very much part of the whole original _point_ of SLUB. And the point of having allocator choices was to see which one worked best. SLUB essentially won. We could have just deleted SLAB. I don't think that would necessarily have been a bad idea. Instead, slab was taught to try to do some of the same things that worked for slub.

At what point do you just admit that your arguments aren't holding water?

So the fact remains: if you can actually show that not merging is a good idea for particular slabs, then that's real data. But right now you are just ignoring the real data and the SLUB we've had over the years.

And if you continue to spout nonsense about "silent behavioral changes", the only thing you show is that you don't know what the hell you are talking about.

So your claim of "double standard" is pure and utter shit. Get over it.

	Linus

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04  3:51 ` Linus Torvalds
@ 2015-09-05  0:36   ` Dave Chinner
  0 siblings, 0 replies; 42+ messages in thread
From: Dave Chinner @ 2015-09-05 0:36 UTC (permalink / raw)
To: Linus Torvalds
Cc: Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton,
	David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
	Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
	Viresh Kumar, Heinz Mauelshagen, linux-mm

On Thu, Sep 03, 2015 at 08:51:09PM -0700, Linus Torvalds wrote:
> On Thu, Sep 3, 2015 at 8:26 PM, Dave Chinner <dchinner@redhat.com> wrote:
> >
> > The double standard is the problem here. No notification, proof,
> > discussion or review was needed to turn on slab merging for
> > everyone, but you're setting a very high bar to jump if anyone wants
> > to turn it off in their code.
>
> Ehh. You realize that almost the only load that is actually seriously
> allocator-limited is networking?

Of course I do - I've been following Jesper's work quite closely because we might be able to make use of the batch allocation mechanism in the XFS inode cache in certain workloads where we are burning through a million inode slab allocations a second...

But again, you're bringing up justifications for a change that were not documented in the commit message for the change. It didn't even mention performance (just fragmentation and memory savings). If this was such a critical factor in making this decision, then why weren't such workloads and numbers provided with the commit? And why didn't someone from networking actually review the change and ack/test that it did actually do what it was supposed to?

If you are going to make an assertion, then you damn well better provide numbers to go along with that assertion. What's your phrase, Linus? "Numbers talk and BS walks?" Where are the numbers, Linus? Hmmmm?
Indeed, with network slabs that hot, mixing them with random other slab caches could have a negative effect on performance by increasing contention on the slab over what the network load already brings. I learnt that lesson 12 years ago when optimising the mbuf slab allocator in the Irix network stack to scale to >1Mpps through 16 GbE cards: It worked just fine until we started doing something with the data that the network was delivering and created more load on the shared slab....

But, I digress. I've been trying to explain why we shouldn't be merging slabs with shrinkers and you've shifted the goal posts rather than addressing the discussion at hand.

> Really, Dave. You have absolutely nothing to back up your points with.
> Merging is *not* some kind of "new" thing that was silently enabled
> recently to take you by surprise.

The key slab that I monitor for fragmentation behaviour (the XFS inode slab) does not get merged. Ever. SLAB or SLUB. Because it has a *constructor*. Linus, if you bothered to read my previous comments in this discussion then you'd know this. I just want a flag to extend that behaviour to all the slab caches I actively manage with shrinkers, because slab merging does not benefit them the same way it does passive slabs. That's not hard to understand, nor is it a major issue for anyone.

From my perspective, Linus, you're way out of line. You are not engaging on a technical level - you're not even reading the arguments I've been presenting. You're just cherry-picking something mostly irrelevant to the problem being discussed and going off at a tangent ranting and swearing and trying your best to be abusive. Your behaviour and bluster does not intimidate me, so please try to be a bit more civil and polite and engage properly on a technical level.

-Dave.
--
Dave Chinner
dchinner@redhat.com

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04  3:51 ` Linus Torvalds
  2015-09-05  0:36   ` Dave Chinner
@ 2015-09-07  9:30   ` Jesper Dangaard Brouer
  2015-09-07 20:22     ` Linus Torvalds
  1 sibling, 1 reply; 42+ messages in thread
From: Jesper Dangaard Brouer @ 2015-09-07 9:30 UTC (permalink / raw)
To: Linus Torvalds
Cc: brouer, Dave Chinner, Mike Snitzer, Christoph Lameter,
	Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim,
	dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka,
	Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen,
	linux-mm, netdev

On Thu, 3 Sep 2015 20:51:09 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, Sep 3, 2015 at 8:26 PM, Dave Chinner <dchinner@redhat.com> wrote:
> >
> > The double standard is the problem here. No notification, proof,
> > discussion or review was needed to turn on slab merging for
> > everyone, but you're setting a very high bar to jump if anyone wants
> > to turn it off in their code.
>
> Ehh. You realize that almost the only load that is actually seriously
> allocator-limited is networking?
>
> And slub was beating slab on that? And slub has been doing the merging
> since day one. Slab was just changed to try to keep up with the
> winning strategy.

Sorry, I have to correct you on this. The slub allocator is not as fast as you might think. The slab allocator is actually faster for networking.

IP-forwarding, single CPU, single flow UDP (highly tuned):
 * Allocator slub: 2043575 pps
 * Allocator slab: 2088295 pps

Difference slab faster than slub:
 * +44720 pps and -10.48ns

The slub allocator has a faster "fastpath", if your workload is fast-reusing within the same per-cpu page-slab, but once the workload increases you hit the slowpath, and then slab catches up. Slub looks great in micro-benchmarking.

As you can see in patchset: [PATCH 0/3] Network stack, first user of SLAB/kmem_cache bulk free API.
http://thread.gmane.org/gmane.linux.kernel.mm/137469/focus=376625

I'm working on speeding up slub to the level of slab. And it seems like I have succeeded to within half a nanosecond: 2090522 pps (+2227 pps or 0.51 ns).

And with "slab_nomerge" I get even higher performance:
 * slub: bulk-free and slab_nomerge: 2121824 pps
 * Diff to slub: +78249 and -18.05ns

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3) 2015-09-07 9:30 ` Jesper Dangaard Brouer @ 2015-09-07 20:22 ` Linus Torvalds 0 siblings, 0 replies; 42+ messages in thread From: Linus Torvalds @ 2015-09-07 20:22 UTC (permalink / raw) To: Jesper Dangaard Brouer Cc: Dave Chinner, Mike Snitzer, Christoph Lameter, Pekka Enberg, Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm, netdev On Mon, Sep 7, 2015 at 2:30 AM, Jesper Dangaard Brouer <brouer@redhat.com> wrote: > > The slub allocator have a faster "fastpath", if your workload is > fast-reusing within the same per-cpu page-slab, but once the workload > increases you hit the slowpath, and then slab catches up. Slub looks > great in micro-benchmarking. > > And with "slab_nomerge" I get even high performance: I think those two are related. Not merging means that effectively the percpu caches end up being bigger (simply because there are more of them), and so it captures more of the fastpath cases. Obviously the percpu queue size is an easy tunable too, but there are real downsides to that too. I suspect your IP forwarding case isn't so different from some of the microbenchmarks, it just has more outstanding work.. And yes, the slow path (ie not hitting in the percpu cache) of SLUB could hopefully be optimizable too, although maybe the bulk patches are the way to go (and unrelated to this thread - at least part of your bulk patches actually got merged last Friday - they were part of Andrew's patch-bomb). Linus ^ permalink raw reply [flat|nested] 42+ messages in thread
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-07 20:22 ` Linus Torvalds (?)
@ 2015-09-07 21:17 ` Jesper Dangaard Brouer
  -1 siblings, 0 replies; 42+ messages in thread
From: Jesper Dangaard Brouer @ 2015-09-07 21:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Chinner, Mike Snitzer, Christoph Lameter, Pekka Enberg,
      Andrew Morton, David Rientjes, Joonsoo Kim, dm-devel,
      Alasdair G Kergon, Joe Thornber, Mikulas Patocka, Vivek Goyal,
      Sami Tolvanen, Viresh Kumar, Heinz Mauelshagen, linux-mm,
      netdev, brouer

On Mon, 7 Sep 2015 13:22:13 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, Sep 7, 2015 at 2:30 AM, Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
> >
> > The slub allocator has a faster "fastpath" if your workload is
> > fast-reusing within the same per-cpu page-slab, but once the workload
> > increases you hit the slowpath, and then slab catches up. Slub looks
> > great in micro-benchmarking.
> >
> > And with "slab_nomerge" I get even higher performance:
>
> I think those two are related.
>
> Not merging means that effectively the percpu caches end up being
> bigger (simply because there are more of them), and so it captures
> more of the fastpath cases.

Yes, that was also my theory, as manually tuning the percpu sizes gave
me almost the same boost.

> Obviously the percpu queue size is an easy tunable too, but there are
> real downsides to that too.

The easy fix is to introduce a subsystem-specific percpu cache that is
large enough for our use-case. That seems to be a trend. I'm hoping to
come up with something smarter that every subsystem can benefit from,
e.g. some heuristic that can dynamically adjust SLUB according to the
usage pattern. I can imagine something as simple as a counter for
every slowpath call that is only valid as long as the jiffies count
matches (reset to zero, and store the new jiffies count). (But I have
not thought this through...)

> I suspect your IP forwarding case isn't so
> different from some of the microbenchmarks, it just has more
> outstanding work..

Yes, I will admit that my testing is very close to micro-benchmarking,
and it is specifically designed to pressure the system to its
limits[1]. Especially the minimum frame size is evil and unrealistic,
but the real purpose is preparing the stack for increasing speeds like
100Gbit/s.

> And yes, the slow path (ie not hitting in the percpu cache) of SLUB
> could hopefully be optimizable too, although maybe the bulk patches
> are the way to go (and unrelated to this thread - at least part of
> your bulk patches actually got merged last Friday - they were part of
> Andrew's patch-bomb).

Cool. Yes, it is only part of the bulk patches. The real performance
boosters are not in yet (but I need to make them work correctly with
memory debugging enabled before they can get merged). At least the
main API is in, which allows me to implement the use-cases more easily
in other subsystems :-)

[1] http://netoptimizer.blogspot.dk/2014/09/packet-per-sec-measurements-for.html

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04  3:26 ` Dave Chinner
  2015-09-04  3:51 ` Linus Torvalds
@ 2015-09-04 13:55 ` Christoph Lameter
  2015-09-04 22:46 ` Dave Chinner
  1 sibling, 1 reply; 42+ messages in thread
From: Christoph Lameter @ 2015-09-04 13:55 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Linus Torvalds, Mike Snitzer, Pekka Enberg, Andrew Morton,
      David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
      Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
      Viresh Kumar, Heinz Mauelshagen, linux-mm

On Fri, 4 Sep 2015, Dave Chinner wrote:

> There are generic cases where it hurts, so no justification should
> be needed for those cases...

Inodes and dentries have constructors. These slabs are not mergeable,
and will never be, because they have cache-specific code to be
executed on the object.

> Really, we don't need some stupidly high bar to jump over here -
> whether merging should be allowed can easily be answered with a
> simple question: "Does the slab have a shrinker or does it back a
> mempool?" If the answer is yes then using SLAB_SHRINKER or
> SLAB_MEMPOOL to trigger the no-merge case doesn't need any more
> justification from subsystem maintainers at all.

The slab shrinkers do not use mergeable slab caches.
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04 13:55 ` Christoph Lameter
@ 2015-09-04 22:46 ` Dave Chinner
  2015-09-05  0:25 ` Christoph Lameter
  0 siblings, 1 reply; 42+ messages in thread
From: Dave Chinner @ 2015-09-04 22:46 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Mike Snitzer, Pekka Enberg, Andrew Morton,
      David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
      Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
      Viresh Kumar, Heinz Mauelshagen, linux-mm

On Fri, Sep 04, 2015 at 08:55:25AM -0500, Christoph Lameter wrote:
> On Fri, 4 Sep 2015, Dave Chinner wrote:
>
> > There are generic cases where it hurts, so no justification should
> > be needed for those cases...
>
> Inodes and dentries have constructors. These slabs are not mergeable
> and will never be because they have cache specific code to be executed
> on the object.

I know - I said as much early on in this discussion. That's one of the
generic cases I'm referring to.

I also said that the fact that they are not merged is really by
chance, not by good management. They are not being merged because of
the constructor, not because they have a shrinker. Hell, I even said
that if it comes down to it, we don't even need SLAB_NO_MERGE because
we can create dummy constructors to prevent merging....

> > Really, we don't need some stupidly high bar to jump over here -
> > whether merging should be allowed can easily be answered with a
> > simple question: "Does the slab have a shrinker or does it back a
> > mempool?" If the answer is yes then using SLAB_SHRINKER or
> > SLAB_MEMPOOL to trigger the no-merge case doesn't need any more
> > justification from subsystem maintainers at all.
>
> The slab shrinkers do not use mergeable slab caches.

Please, go back and read what I've already said. *Some* shrinkers act
on mergeable slabs because they have no constructor, e.g. the
xfs_dquot and xfs_buf shrinkers. I want to keep them separate, just
like the inode cache is kept separate, because they have
workload-based demand peaks in the millions of objects and LRU-based
shrinker reclaim, just like inode caches do.

That's what I want SLAB_SHRINKER for - to explicitly tell the slab
cache creation that I have a shrinker on this slab and so it should
not merge it with others. Every slab that has a shrinker should be
marked with this flag - we should not be relying on constructors to
prevent merging of critical slab caches with shrinkers....

I really don't see the issue here - explicitly encoding and
documenting the behaviour we've implicitly been relying on for years
is something we do all the time. Code clarity and documented behaviour
is a *good thing*.

Cheers,

Dave.
-- 
Dave Chinner
dchinner@redhat.com
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-04 22:46 ` Dave Chinner
@ 2015-09-05  0:25 ` Christoph Lameter
  2015-09-05  1:16 ` Dave Chinner
  0 siblings, 1 reply; 42+ messages in thread
From: Christoph Lameter @ 2015-09-05  0:25 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Linus Torvalds, Mike Snitzer, Pekka Enberg, Andrew Morton,
      David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
      Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
      Viresh Kumar, Heinz Mauelshagen, linux-mm

On Sat, 5 Sep 2015, Dave Chinner wrote:

> > Inodes and dentries have constructors. These slabs are not mergeable and
> > will never be because they have cache specific code to be executed on the
> > object.
>
> I also said that the fact that they are not merged is really by
> chance, not by good management. They are not being merged because of
> the constructor, not because they have a shrinker. Hell, I even said
> that if it comes down to it, we don't even need SLAB_NO_MERGE
> because we can create dummy constructors to prevent merging....

Right. There is no chance here, though. It's intentional to not merge
slabs where we could get into issues.

Would be interested to see how performance changes if the
inodes/dentries would become mergeable.

> *Some* shrinkers act on mergeable slabs because they have no
> constructor, e.g. the xfs_dquot and xfs_buf shrinkers. I want to
> keep them separate just like the inode cache is kept separate
> because they have workload-based demand peaks in the millions of
> objects and LRU-based shrinker reclaim, just like inode caches do.

But then we are not sure why we would do that. Certainly merging can
increase the stress on the per-node locks for a slab cache, as the
example by Jesper shows (and this can be dealt with by increasing
per-cpu resources). On the other hand, this also leads to rapid
defragmentation, because the free objects from partial pages produced
by the frees of one of the merged slabs can get reused quickly for
another purpose.

> I really don't see the issue here - explicitly encoding and
> documenting the behaviour we've implicitly been relying on for years
> is something we do all the time. Code clarity and documented
> behaviour is a *good thing*.

The question first has to be answered why keeping them separate is
such a good thing without also having an explicit way of telling the
allocator to keep certain objects in the same slab page if possible.
Otherwise we get this randomizing effect that nullifies the idea that
sequential freeing/allocation would avoid fragmentation.

I have in the past been in favor of adding such a flag to avoid
merging, but I am slowly getting to the point that this may not be
wise anymore. There is too much arguing from gut reactions here, and
relying on assumptions about internal operations of slabs (thinking
to be able to exploit the fact that linearly allocated objects come
from the same slab page, coming from you, is one of these).

Defragmentation IMHO requires a targeted approach where either objects
that are in the way can be moved out of the way, or there is some type
of lifetime marker on objects that allows the memory allocators to
know that these objects can be freed all at once when a certain
operation is complete.
* Re: slab-nomerge (was Re: [git pull] device mapper changes for 4.3)
  2015-09-05  0:25 ` Christoph Lameter
@ 2015-09-05  1:16 ` Dave Chinner
  0 siblings, 0 replies; 42+ messages in thread
From: Dave Chinner @ 2015-09-05  1:16 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Linus Torvalds, Mike Snitzer, Pekka Enberg, Andrew Morton,
      David Rientjes, Joonsoo Kim, dm-devel, Alasdair G Kergon,
      Joe Thornber, Mikulas Patocka, Vivek Goyal, Sami Tolvanen,
      Viresh Kumar, Heinz Mauelshagen, linux-mm

On Fri, Sep 04, 2015 at 07:25:48PM -0500, Christoph Lameter wrote:
> On Sat, 5 Sep 2015, Dave Chinner wrote:
>
> > I also said that the fact that they are not merged is really by
> > chance, not by good management. They are not being merged because of
> > the constructor, not because they have a shrinker. Hell, I even said
> > that if it comes down to it, we don't even need SLAB_NO_MERGE
> > because we can create dummy constructors to prevent merging....
>
> Right. There is no chance here, though. It's intentional to not merge
> slabs where we could get into issues.

The dentry cache does not have a constructor:

	/*
	 * A constructor could be added for stable state like the lists,
	 * but it is probably not worth it because of the cache nature
	 * of the dcache.
	 */
	dentry_cache = KMEM_CACHE(dentry,
		SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);

> Would be interested to see how performance changes if the
> inodes/dentries would become mergeable.

On my machines the dentry slab doesn't merge with any other slabs,
though, because there are no other slabs with the same size object.
That's one of the major crap-shoots with slab merging that I want to
fix.

> > *Some* shrinkers act on mergeable slabs because they have no
> > constructor, e.g. the xfs_dquot and xfs_buf shrinkers. I want to
> > keep them separate just like the inode cache is kept separate
> > because they have workload-based demand peaks in the millions of
> > objects and LRU-based shrinker reclaim, just like inode caches do.
>
> But then we are not sure why we would do that. Certainly merging can
> increase the stress on the per-node locks for a slab cache, as the
> example by Jesper shows (and this can be dealt with by increasing
> per-cpu resources). On the other hand, this also leads to rapid
> defragmentation, because the free objects from partial pages produced
> by the frees of one of the merged slabs can get reused quickly for
> another purpose.

We can't control the freeing of objects from other merged slabs,
unless they are also actively managed by a shrinker. So that page is
pinned until the slab object is freed by whatever subsystem owns it,
and no amount of memory pressure can cause that to happen.

> > I really don't see the issue here - explicitly encoding and
> > documenting the behaviour we've implicitly been relying on for years
> > is something we do all the time. Code clarity and documented
> > behaviour is a *good thing*.
>
> The question first has to be answered why keeping them separate is
> such a good thing without also having an explicit way of telling the
> allocator to keep certain objects in the same slab page if possible.
> Otherwise we get this randomizing effect that nullifies the idea that
> sequential freeing/allocation would avoid fragmentation.

I don't follow. Sequential alloc/free of objects from an unshared slab
does not alter fragmentation patterns of the slab. If it was
fragmented before the sequential run, it will be fragmented after.

If you are talking about merging dentry/inode objects into the same
slab and doing sequential allocation of them, that just does not work.
The relationship between dentries and inodes is an M:N relationship,
not a 1:1 relationship, so they will never have nice, neatly aligned
alloc/free patterns.

> I have in the past been in favor of adding such a flag to avoid
> merging, but I am slowly getting to the point that this may not be
> wise anymore. There is too much arguing from gut reactions here, and
> relying on assumptions about internal operations of slabs (thinking
> to be able to exploit the fact that linearly allocated objects come
> from the same slab page, coming from you, is one of these).

Wow. The only time I've ever mentioned that we could do some
interesting things if we knew certain objects were on the same backing
page was earlier this year at LCA, when we were talking about the
design of the proposed batch allocation interface. You said that it
probably couldn't be guaranteed, and so I haven't even thought about
that since. That's not an argument for preventing us from saying
"don't merge this slab, we actively manage its contents".

> Defragmentation IMHO requires a targeted approach where either objects
> that are in the way can be moved out of the way, or there is some type
> of lifetime marker on objects that allows the memory allocators to
> know that these objects can be freed all at once when a certain
> operation is complete.

Which, if we know that there is only one type of object in the slab,
is relatively easy to do and can be controlled by the subsystem
shrinker.... :)

Cheers,

Dave.
-- 
Dave Chinner
dchinner@redhat.com
end of thread, other threads:[~2015-09-08  0:21 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-02 23:13 slab-nomerge (was Re: [git pull] device mapper changes for 4.3) Linus Torvalds
2015-09-03  0:48 ` Andrew Morton
2015-09-03  0:53 ` Mike Snitzer
2015-09-03  0:51 ` Mike Snitzer
2015-09-03  0:51 ` Mike Snitzer
2015-09-03  1:21 ` Linus Torvalds
2015-09-03  2:31 ` Mike Snitzer
2015-09-03  3:10 ` Christoph Lameter
2015-09-03  4:55 ` Andrew Morton
2015-09-03  6:09 ` Pekka Enberg
2015-09-03  8:53 ` Dave Chinner
2015-09-03  3:11 ` Linus Torvalds
2015-09-03  6:02 ` Dave Chinner
2015-09-03  6:13 ` Pekka Enberg
2015-09-03 10:29 ` Jesper Dangaard Brouer
2015-09-03 16:19 ` Christoph Lameter
2015-09-04  9:10 ` Jesper Dangaard Brouer
2015-09-04 14:13 ` Christoph Lameter
2015-09-04  6:35 ` Sergey Senozhatsky
2015-09-04  7:01 ` Linus Torvalds
2015-09-04  7:59 ` Sergey Senozhatsky
2015-09-04  9:56 ` Sergey Senozhatsky
2015-09-04 14:05 ` Christoph Lameter
2015-09-04 14:11 ` Linus Torvalds
2015-09-05  2:09 ` Sergey Senozhatsky
2015-09-05  2:09 ` Sergey Senozhatsky
2015-09-05 20:33 ` Linus Torvalds
2015-09-07  8:44 ` Sergey Senozhatsky
2015-09-08  0:22 ` Sergey Senozhatsky
2015-09-03 15:02 ` Linus Torvalds
2015-09-04  3:26 ` Dave Chinner
2015-09-04  3:51 ` Linus Torvalds
2015-09-05  0:36 ` Dave Chinner
2015-09-05  0:36 ` Dave Chinner
2015-09-07  9:30 ` Jesper Dangaard Brouer
2015-09-07 20:22 ` Linus Torvalds
2015-09-07 20:22 ` Linus Torvalds
2015-09-07 21:17 ` Jesper Dangaard Brouer
2015-09-04 13:55 ` Christoph Lameter
2015-09-04 22:46 ` Dave Chinner
2015-09-05  0:25 ` Christoph Lameter
2015-09-05  1:16 ` Dave Chinner