* Slow DOWN, please!!!
@ 2008-04-30 2:03 David Miller
2008-04-30 4:03 ` David Newall ` (2 more replies)
0 siblings, 3 replies; 229+ messages in thread

From: David Miller @ 2008-04-30 2:03 UTC (permalink / raw)
To: linux-kernel

This is starting to get beyond frustrating for me.

Yesterday, I spent the whole day bisecting boot failures on my system due to the totally untested linux/bitops.h optimization, which I fully analyzed and debugged.

Today, I had hoped that I could get some work of my own done, but that's not the case. Yet another bootup regression got added within the last 24 hours.

I don't mind fixing a regression or two during the merge window, but THIS IS ABSOLUTELY, FUCKING, RIDICULOUS! The tree breaks every day, and it's becoming an extremely non-fun environment to work in.

We need to slow down the merging, we need to review things more, we need people to test their fucking changes!

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-04-30 2:03 Slow DOWN, please!!! David Miller
@ 2008-04-30 4:03 ` David Newall
2008-04-30 4:18 ` David Miller ` (2 more replies)
2008-04-30 14:48 ` Peter Teoh
2008-04-30 19:36 ` Rafael J. Wysocki
2 siblings, 3 replies; 229+ messages in thread

From: David Newall @ 2008-04-30 4:03 UTC (permalink / raw)
To: David Miller; +Cc: linux-kernel, Linus Torvalds

David Miller wrote:
> We need to slow down the merging, we need to review things
> more, we need people to test their fucking changes!

Yes. The Linux process is becoming unreliable. Newly "stable" versions have stability problems. The development process looks childish. Seasoned developers say not to worry, that the process works. I do worry. BSD seems more attractive, and it may even be worth the considerable effort to switch my entire client base.

Linux was lucky to gain the foothold that it did: traditionally, BSD had a better system with a less restrictive licence, so it is surprising that manufacturers chose to go with Linux. BSD still has a less restrictive licence, and when the mainstream press becomes interested in Linux's quality problems, its adoption will fall. BSD is still a good, maybe even better, option.

Linus, this is your baby and so it's your problem. Only you have the influence to change things.

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 4:03 ` David Newall @ 2008-04-30 4:18 ` David Miller 2008-04-30 13:04 ` David Newall 2008-04-30 7:11 ` Tarkan Erimer 2008-04-30 14:55 ` Russ Dill 2 siblings, 1 reply; 229+ messages in thread From: David Miller @ 2008-04-30 4:18 UTC (permalink / raw) To: davidn; +Cc: linux-kernel, torvalds From: David Newall <davidn@davidnewall.com> Date: Wed, 30 Apr 2008 13:33:29 +0930 > Yes. Please don't use my posting as an opportunity to portray BSD as the best thing since sliced bread. We're having ONE bad merge window, we're facing the problem head on, RIGHT NOW, to prevent it in the future. It's not a severe ongoing issue as you portray it to be. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 4:18 ` David Miller @ 2008-04-30 13:04 ` David Newall 2008-04-30 13:18 ` Michael Kerrisk 2008-04-30 14:51 ` Linus Torvalds 0 siblings, 2 replies; 229+ messages in thread From: David Newall @ 2008-04-30 13:04 UTC (permalink / raw) To: David Miller; +Cc: linux-kernel, torvalds David Miller wrote: > We're having ONE bad merge window, we're facing the problem > head on, RIGHT NOW, to prevent it in the future. It's > not a severe ongoing issue as you portray it to be. No. The problem is more than just a bad merge window. There is poor or non-existent review; frequent "regressions"; release of kernels as stable when they are not. There is resentment and resistance to even acknowledging these problems. Take, as an example, the desire to NOT record who gives good code and who gives bugs: that one clearly hit a nerve, which it should not have except from people who feel guilty. I don't claim BSD to be perfect, but it appears to have a consistently good quality. Old Linux kernels also have that; new ones not so. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-04-30 13:04 ` David Newall
@ 2008-04-30 13:18 ` Michael Kerrisk
2008-04-30 14:51 ` Linus Torvalds
1 sibling, 0 replies; 229+ messages in thread

From: Michael Kerrisk @ 2008-04-30 13:18 UTC (permalink / raw)
To: David Newall; +Cc: David Miller, linux-kernel, torvalds

> Take, as an example, the desire to NOT
> record who gives good code and who gives bugs: that one clearly hit a
> nerve, which it should not have except from people who feel guilty.

Speaking as someone who has found quite a few kernel bugs, but written few (because I've written little kernel code ;-))...

No. It hit a nerve because it's simply the wrong way of going about things. There is no use in assigning blame.

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-04-30 13:04 ` David Newall
2008-04-30 13:18 ` Michael Kerrisk
@ 2008-04-30 14:51 ` Linus Torvalds
2008-04-30 18:21 ` David Newall
1 sibling, 1 reply; 229+ messages in thread

From: Linus Torvalds @ 2008-04-30 14:51 UTC (permalink / raw)
To: David Newall; +Cc: David Miller, linux-kernel

On Wed, 30 Apr 2008, David Newall wrote:
>
> I don't claim BSD to be perfect, but it appears to have a consistently
> good quality.

Lol. You should try VMS. Now *there* was a stable system. Oh, but it didn't actually make any progress, did it?

The fact is, we're merging a lot. It comes from having a lot of development. If you don't want that, then you're a fool - because you aren't looking at the long term.

> Old Linux kernels also have that; new ones not so.

Can you point to any actual stability problem?

The problem under discussion is the fact that some people are unhappy because we had some merge trouble. The fact is, the problems got fixed in a few days. And yes, we will probably have to make Ingo follow the rules that pretty much everybody else also follows, and no, it's not going to solve all problems either - the fundamental issue is that we are just too damn good at development.

And that's not a big problem in my view, as long as we are also able to handle the _result_ of that flood of patches. Which, quite frankly, we are.

DavidN, you just have an agenda, and you think that mentioning BSD as some kind of shining example of goodness is a good way to reach that agenda. It isn't. It just shows that you don't understand the issue, and that you think that "threatening" developers by saying you'll switch is a great way to make PR.

But you know what? I really don't care one _whit_ what you do. You can switch to Vista for all I care, and I really don't mind. All I care about is doing a good job technically. And you just show that you don't have a clue what you are talking about.

If you want a stable kernel, don't follow the current -git tree.
Don't mind the fact that in two weeks we merge

6672 files changed, 373817 insertions(+), 285901 deletions(-)

and instead look at something like the enterprise kernels or another tree that lags the development tree by half a year or more, exactly _because_ they care about stability, not development.

In short: what do you think the git tree is? Is it something that should prioritize good development, or is it something that should worry about you making inane arguments?

Ask yourself that.

Linus

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-04-30 14:51 ` Linus Torvalds
@ 2008-04-30 18:21 ` David Newall
2008-04-30 18:27 ` Linus Torvalds
0 siblings, 1 reply; 229+ messages in thread

From: David Newall @ 2008-04-30 18:21 UTC (permalink / raw)
To: Linus Torvalds; +Cc: David Miller, linux-kernel

Linus Torvalds wrote:
> Can you point to any actual stability problem?

Well of course. So could you, because they are a matter of public record on the list. Don't pretend otherwise. Just to give you some recent, personal bugaboos, and not even drawing on the many hundreds of relevant messages on LKML each month:

1. Out of memory, caused by an apparent leak somewhere, resulting in the machine effectively hanging for a minute or two (massive disk i/o) culminating in termination of one or more processes. (For what it's worth: 512MB, no swap.) The problem takes a couple of days to develop (hence I suspect a leak.) This is running only Firefox, Thunderbird and Evince, plus whatever xubuntu wants. Restarting the killed application(s) causes the problem to recur. Restarting X doesn't help. Killing almost all processes also doesn't help. Reboot is required. This problem seems not to be in 2.6.17, but is in 2.6.22 (plus whatever patches xubuntu use) and 2.6.23. I'm still testing 2.6.25, but probably going to have to abandon it and go backwards, because...

2. Suspend to disk doesn't resume properly (two out of three times.) The system comes back but X has severe weirdness. It draws frames and title bar, but not window contents. Text-mode is just as bad: the screen is blank (erased font table, perhaps?) A subsequent suspend to disk doesn't resume at all.

Note the wide range of kernels exhibiting problem 1. I don't even want to think about problem 2 at this stage; I just want to stop having to reboot to reclaim memory, especially when a mate who does Windows training visits!

> the fundamental issue is that we are just
> too damn good at development.

Not so good. The process is flawed. Inadequate testing.
Inadequate review. This has been mentioned by others, so you know I'm not making it up. The real fundamental issue is that people are too keen to release and don't appear to care enough about correctness. > you think that mentioning BSD as some > kind of shining example of goodness is a good way to reach that agenda. Yes, BSD does seem to be a shining example of goodness, but I didn't mention it because I think people should switch. I did so to warn of competition, to say that the world does not owe Linux a second chance and isn't going to give it one. It's pointless to debate the relative merits of the two systems because, aside from the kernel, they are identical; and there's little that matters between the kernels, other than one appears to have a careful, robust and professional development process. Make no mistake about this point: I'm not saying that BSD is better, rather that Linux cannot lose credibility and survive. > But you know what? I really don't care one _whit_ what you do. You can > switch to Vista for all I care, and I really don't mind. All I care about > is doing a good job technically. > Sadly, you're doing a bad technical job in certain, important areas. You're pushing out buggy kernels and claiming that they're stable. This can't continue. Attrition to BSD is the risk, not some threat that I'm making. > And you just show that you don't have a clue what you are talking about. > If you want stable kernel, don't follow the current -git tree. Why are you bringing up git trees (which I don't use)? I'm presently plagued with a problem that's 2.6.22 or older, extending to at least 2.6.23 and maybe still current. I've said quite clearly that I'm talking about "stable" kernels, yet you presume I mean the git tree. Yet it's not the specifics of the problem I'm having that matters, it's the systemic problems in Linux's development process. I don't think I've anything to add unless the topic evolves in a direction that asks what should be changed. 
I'm posting this only because I want on record the answer to the question about actual stability problems. ^ permalink raw reply [flat|nested] 229+ messages in thread
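[A report like the one above is easier to act on with numbers attached. A minimal sketch, not from the thread, of substantiating a suspected slow leak: snapshot /proc/meminfo periodically and check whether free memory really trends downward. Field names follow /proc/meminfo; the snapshot values are invented, and a real diagnosis would also account for cache growth, which this naive check ignores.]

```python
def parse_meminfo(text):
    """Map field names to kB values from /proc/meminfo-formatted text."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if rest:
            fields[name.strip()] = int(rest.split()[0])  # value is in kB
    return fields

def is_monotonic_decline(samples, field="MemFree"):
    """True if the field strictly decreases across successive snapshots."""
    values = [parse_meminfo(s)[field] for s in samples]
    return all(a > b for a, b in zip(values, values[1:]))

# Three snapshots taken hours apart on a 512 MB box (values invented):
day1 = "MemTotal: 515000 kB\nMemFree: 210000 kB\nCached: 120000 kB"
day2 = "MemTotal: 515000 kB\nMemFree: 140000 kB\nCached: 118000 kB"
day3 = "MemTotal: 515000 kB\nMemFree:  60000 kB\nCached: 117000 kB"

print(is_monotonic_decline([day1, day2, day3]))  # → True
```

On a live system the snapshots would come from reading /proc/meminfo on a timer; the steady decline with roughly constant Cached is what distinguishes a leak from the page cache doing its job.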
* Re: Slow DOWN, please!!! 2008-04-30 18:21 ` David Newall @ 2008-04-30 18:27 ` Linus Torvalds 2008-04-30 18:55 ` David Newall 2008-04-30 19:06 ` Chris Friesen 0 siblings, 2 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 18:27 UTC (permalink / raw) To: David Newall; +Cc: David Miller, linux-kernel On Thu, 1 May 2008, David Newall wrote: > > Why are you bringing up git trees (which I don't use)? I'm presently > plagued with a problem that's 2.6.22 or older, extending to at least > 2.6.23 and maybe still current. Ok, *PLONK*. You're on an old kernel, don't know if your problem is fixed, and ask us to slow down development. That makes sense. Go away. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 18:27 ` Linus Torvalds @ 2008-04-30 18:55 ` David Newall 2008-04-30 19:08 ` Linus Torvalds 2008-04-30 19:06 ` Chris Friesen 1 sibling, 1 reply; 229+ messages in thread From: David Newall @ 2008-04-30 18:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Miller, linux-kernel Linus Torvalds wrote: > You're on an old kernel, don't know if your problem is fixed, and ask us > to slow down development. I just finished telling you that I'm currently trying 2.6.25. But you couldn't have read that with any care at all, because I also just finished telling you that it's not the specifics of the problem I'm having that matters, it's the systemic problems in Linux's development process. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 18:55 ` David Newall @ 2008-04-30 19:08 ` Linus Torvalds 2008-04-30 19:16 ` David Newall 0 siblings, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 19:08 UTC (permalink / raw) To: David Newall; +Cc: David Miller, linux-kernel On Thu, 1 May 2008, David Newall wrote: > > I just finished telling you that I'm currently trying 2.6.25. But you > couldn't have read that with any care at all, because I also just > finished telling you that it's not the specifics of the problem I'm > having that matters, it's the systemic problems in Linux's development > process. No. What you told us was nothing like that at all. What you told us was that you totally ignored the issue I brought up, namely that development happens, and that you have the choice of stagnating or accepting it. You point to it as some "systemic problem", and I told you that it's a sign of fast development. Things change. You didn't listen, or understand. If you want systemic problems, it is your kind of "bug report" that isn't anything like a bug report. Make a real report, don't whine. Push the _report_, not your inane agenda. Talk about *technology*, not about how you wish everything revolved around you and your wishes. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:08 ` Linus Torvalds @ 2008-04-30 19:16 ` David Newall 2008-04-30 19:25 ` Linus Torvalds 0 siblings, 1 reply; 229+ messages in thread From: David Newall @ 2008-04-30 19:16 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Miller, linux-kernel Linus Torvalds wrote: > On Thu, 1 May 2008, David Newall wrote: > >> I just finished telling you that I'm currently trying 2.6.25. But you >> couldn't have read that with any care at all, because I also just >> finished telling you that it's not the specifics of the problem I'm >> having that matters, it's the systemic problems in Linux's development >> process. >> > > No. What you told us was nothing like that at all. Don't be foolish, Linus. It was exactly like that, almost to the point of quoting myself. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-04-30 19:16 ` David Newall
@ 2008-04-30 19:25 ` Linus Torvalds
2008-05-01 4:31 ` David Newall
0 siblings, 1 reply; 229+ messages in thread

From: Linus Torvalds @ 2008-04-30 19:25 UTC (permalink / raw)
To: David Newall; +Cc: David Miller, linux-kernel

On Thu, 1 May 2008, David Newall wrote:
> >
> > No. What you told us was nothing like that at all.
>
> Don't be foolish, Linus. It was exactly like that, almost to the point
> of quoting myself.

You misunderstand.

I object to your _idiotic_ claim that there are "systemic problems", where your "solution" to them is apparently to stop making releases and stop making forward progress.

That's why I said what you told us was nothing like that. What you told us were your personal problems, not "systemic" issues.

Linus

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:25 ` Linus Torvalds @ 2008-05-01 4:31 ` David Newall 2008-05-01 4:37 ` David Miller ` (2 more replies) 0 siblings, 3 replies; 229+ messages in thread From: David Newall @ 2008-05-01 4:31 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Miller, linux-kernel Linus Torvalds wrote: > I object to your _idiotic_ claim that there are "systemic problems", where > your "solution" to them is apparently to stop making releases and stop > making forward progress. > I did not say to stop making releases or forward progress. You completely made that up! I said there are systemic problems, namely inadequate testing and review. Slow down; don't snatch up crap changes. Only accept them when they are properly tested and properly reviewed. > That's why I said you told us was nothing like that. What you told us were > your personal problems, no "systemic" issues. You asked me to give a specific problem, so I did, but I also said that the particulars of those problems weren't the point. You have ignored or twisted everything I said. Did you ask me for a specific problem purely to attack me with it? Perhaps you did. Linus Torvalds also wrote: > You complain how I don't release kernels that > are stable, but without any suggestions on what the issue might be You do release kernels that are unstable, and you call them "stable", but I'm sure I said that inadequate review and testing are causes, which I think counts as a suggestion on what the issue might be. It's been a repeating theme in this thread, and I'm talking about what everybody else is saying, not what I'm saying, so again, you know that I'm not making this up. Stop telling the world that 2.6.25 is ready for them when you know it's not. It's now ready for beta testing, and no more. Is 2.6.24 ready for the world yet? There are still problems being reported with it. > And yes, there is a solution: don't develop so much. Don't allow thousands > of developers to be involved. 
> Do a small core group, and make development
> so hard or inconvenient that you only have a few tens of people who write
> code, and vet them and force them to jump through hoops when adding new
> features (or fixing old ones, for that matter).

You're being absurd, even hysterical. How about you require test plans and test results? Is it possible to require serious, independent code review?

And let me talk about code review. When one puts one's name to a reviewed-by tag, one takes joint responsibility for the result. There needs to be some sort of balanced accounting. Presently it's all glory, where the records show who has contributed code that made it to mainline, but nobody counts who broke the system. There's no motive to do a good job; in fact, the opposite is true. The more crap you can sneak in, the more glory you get.

Don't you go and twist this into some sort of, "David wants to point fingers at people who regularly introduce bugs, which we don't want to do" and ignore the problem. There is a problem; this entire thread is testimony to that. You, Linus, are ultimately responsible for what goes in, so you have to acknowledge that there is a problem, you have to stop shooting the messenger, and you have to shepherd a solution.

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-05-01 4:31 ` David Newall
@ 2008-05-01 4:37 ` David Miller
2008-05-01 13:49 ` Lennart Sorensen
2008-05-01 15:28 ` Kasper Sandberg
2 siblings, 0 replies; 229+ messages in thread

From: David Miller @ 2008-05-01 4:37 UTC (permalink / raw)
To: davidn; +Cc: torvalds, linux-kernel

From: David Newall <davidn@davidnewall.com>
Date: Thu, 01 May 2008 14:01:43 +0930

> Stop telling the world that 2.6.25 is ready for them when you know it's
> not. It's now ready for beta testing, and no more. Is 2.6.24 ready for
> the world yet? There are still problems being reported with it.

This has an absurd presumption that something is only stable when there are zero problems with it. Fault-free software, except in extremely trivial examples, does not exist in nature.

BTW, this points out another BS aspect of your BSD fan-boy crap: the BSD userbase is only a tiny fraction of how many people use Linux. So you can't even compare the number of outstanding problem reports between the two.

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-05-01 4:31 ` David Newall
2008-05-01 4:37 ` David Miller
@ 2008-05-01 13:49 ` Lennart Sorensen
2008-05-01 15:28 ` Kasper Sandberg
2 siblings, 0 replies; 229+ messages in thread

From: Lennart Sorensen @ 2008-05-01 13:49 UTC (permalink / raw)
To: David Newall; +Cc: Linus Torvalds, David Miller, linux-kernel

On Thu, May 01, 2008 at 02:01:43PM +0930, David Newall wrote:
[snip]
> Stop telling the world that 2.6.25 is ready for them when you know it's
> not. It's now ready for beta testing, and no more. Is 2.6.24 ready for
> the world yet? There are still problems being reported with it.

If a kernel release works without problems on 9999 out of 10000 machines, is it stable? How few hardware combinations with problems are allowed before you can call it stable? How do you know a problem you see wasn't tested by 500 people, none of whom had any problems because none of them had the hardware you do?

--
Len Sorensen

^ permalink raw reply [flat|nested] 229+ messages in thread
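[Lennart's rhetorical question has straightforward arithmetic behind it. A back-of-envelope sketch, using the hypothetical numbers from his message: if a bug bites only one machine configuration in 10,000, the chance that 500 independent testers all happen to miss it is about 95%.]

```python
# Probability that no tester owns the one-in-10,000 bad configuration.
p_affected = 1 / 10_000          # fraction of machines with the bad combo
testers = 500                    # size of the hypothetical test pool

p_all_miss = (1 - p_affected) ** testers
print(f"{p_all_miss:.3f}")       # → 0.951
```

So even a diligent 500-person test pool would, more often than not, report zero problems with a kernel that still breaks on somebody's hardware.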
* Re: Slow DOWN, please!!!
2008-05-01 4:31 ` David Newall
2008-05-01 4:37 ` David Miller
2008-05-01 13:49 ` Lennart Sorensen
@ 2008-05-01 15:28 ` Kasper Sandberg
2008-05-01 17:49 ` Russ Dill
2 siblings, 1 reply; 229+ messages in thread

From: Kasper Sandberg @ 2008-05-01 15:28 UTC (permalink / raw)
To: David Newall; +Cc: Linus Torvalds, David Miller, linux-kernel

On Thu, 2008-05-01 at 14:01 +0930, David Newall wrote:
<snip>
>
> Linus Torvalds also wrote:
> > You complain how I don't release kernels that
> > are stable, but without any suggestions on what the issue might be
>
> You do release kernels that are unstable, and you call them "stable",
> but I'm sure I said that inadequate review and testing are causes, which
> I think counts as a suggestion on what the issue might be. It's been a
> repeating theme in this thread, and I'm talking about what everybody
> else is saying, not what I'm saying, so again, you know that I'm not
> making this up.

This is kind of bullshit. You can never be sure that something works perfectly for everyone; if there were such excessive testing that you would be willing to make such a bold claim, any "stable" kernel would be years in testing. Linux stability also seems to be okay, and people who want to lower the risk of problems can simply choose to use slightly older versions.

What I find more of a problem is the long-term effects and problems of changes. For instance, Linux has slowly and steadily been getting a lot more sensitive to IO, and A LOT more memory hungry. I recently found a system with a 2.6.4 kernel, and when I upgraded to 2.6.23, I saw memory usage increase from ~250 MB to around 500 MB. I upgraded to .25 to see if it was some weird bug, but it is the same.

Unfortunately I cannot investigate more, as I only had the box for a very short time, but this is a lot more concerning to me. Unfortunately I don't think I can easily reproduce this, as I am unsure how to actually get to test 2.6.0 through .24 easily.
> Stop telling the world that 2.6.25 is ready for them when you know it's
> not. It's now ready for beta testing, and no more. Is 2.6.24 ready for
> the world yet? There are still problems being reported with it.

Well... it's doing quite a nice job on my new workstation :)

<snip>

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-05-01 15:28 ` Kasper Sandberg
@ 2008-05-01 17:49 ` Russ Dill
2008-05-02 1:47 ` Kasper Sandberg
0 siblings, 1 reply; 229+ messages in thread

From: Russ Dill @ 2008-05-01 17:49 UTC (permalink / raw)
To: linux-kernel

> I recently found a system with a 2.6.4 kernel, and when I upgraded to
> 2.6.23, I saw memory usage increase from ~250 MB to around 500 MB. I
> upgraded to .25 to see if it was some weird bug, but it is the same.
>
> Unfortunately I cannot investigate more, as I only had the box for a
> very short time, but this is a lot more concerning to me.

Memory is not something that is difficult to track. It's likely one of two things:

a) Your card now has 3D support, hooray! and X is mapping more regions, which isn't really additional RAM usage.

b) Linux is caching more things, hooray! I'm not saying that you are one of those people who just look at the free number and don't think any further, but you might be.

Or c) the kernel has another 250MB of kernel data structures, which seems unlikely.

^ permalink raw reply [flat|nested] 229+ messages in thread
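[Russ's point (b) amounts to subtracting buffers and page cache before calling memory "used": cache is reclaimable, so counting it inflates the apparent footprint. A sketch, not from the thread, using /proc/meminfo field names with invented values:]

```python
def used_excluding_cache(meminfo_text):
    """Application footprint: total minus free, buffers, and page cache."""
    kb = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        kb[name] = int(rest.split()[0])  # values in kB
    return kb["MemTotal"] - kb["MemFree"] - kb["Buffers"] - kb["Cached"]

# A 1 GB machine where over half a gigabyte is reclaimable cache:
sample = """MemTotal: 1048576 kB
MemFree: 102400 kB
Buffers: 65536 kB
Cached: 524288 kB"""

print(used_excluding_cache(sample))  # → 356352
```

A naive MemTotal - MemFree on the same numbers would report 946176 kB "used", nearly three times the real application footprint - which is exactly the free-number misreading Russ describes.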
* Re: Slow DOWN, please!!!
2008-05-01 17:49 ` Russ Dill
@ 2008-05-02 1:47 ` Kasper Sandberg
2008-05-02 2:54 ` Russ Dill
0 siblings, 1 reply; 229+ messages in thread

From: Kasper Sandberg @ 2008-05-02 1:47 UTC (permalink / raw)
To: Russ Dill; +Cc: linux-kernel

On Thu, 2008-05-01 at 17:49 +0000, Russ Dill wrote:
> > I recently found a system with a 2.6.4 kernel, and when I upgraded to
> > 2.6.23, I saw memory usage increase from ~250 MB to around 500 MB. I
> > upgraded to .25 to see if it was some weird bug, but it is the same.
> >
> > Unfortunately I cannot investigate more, as I only had the box for a
> > very short time, but this is a lot more concerning to me.
>
> Memory is not something that is difficult to track. It's likely one of
> two things:
>
> a) Your card now has 3D support, hooray! and X is mapping more regions,
> which isn't really additional RAM usage.

No, that isn't it. I'm talking RAM USAGE. :)

> b) Linux is caching more things, hooray! I'm not saying that you are one
> of those people who just look at the free number and don't think any
> further, but you might be.

I'm afraid this theory also isn't the case; I know what the cache is, and I also know how to subtract :)

> Or c) the kernel has another 250MB of kernel data structures, which
> seems unlikely.

Well, yes, that seems somewhat big; however, the kernel was the ONLY change.

I can also say that I have noticed this on my own workstation; however, that's not really as valid a case, as I have also upgraded userspace and such over time. But it used to be that my box wouldn't use more than ~100 MB to boot into X with KDE open, and about ~300 MB at browsing/mail and such, but these days my workstation easily uses 1.5 GB of RAM for no apparent reason.
Something certainly is fishy around here; these days people just tend to fix it by throwing in 10 times more RAM than should really be necessary, which, I guess, is because RAM prices have dropped 10 times.

>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-05-02 1:47 ` Kasper Sandberg
@ 2008-05-02 2:54 ` Russ Dill
2008-05-02 7:01 ` Kasper Sandberg
2008-05-02 17:34 ` Lee Mathers (TCAFS)
0 siblings, 2 replies; 229+ messages in thread

From: Russ Dill @ 2008-05-02 2:54 UTC (permalink / raw)
To: Kasper Sandberg; +Cc: linux-kernel

> I can also say that I have noticed this on my own workstation; however,
> that's not really as valid a case, as I have also upgraded userspace and
> such over time. But it used to be that my box wouldn't use more than
> ~100 MB to boot into X with KDE open, and about ~300 MB at browsing/mail
> and such, but these days my workstation easily uses 1.5 GB of RAM for no
> apparent reason.
>
> Something certainly is fishy around here; these days people just tend to
> fix it by throwing in 10 times more RAM than should really be necessary,
> which, I guess, is because RAM prices have dropped 10 times.

So you aren't really contributing anything to the discussion. It could be userspace, it could be different types of pages you are visiting, it could be the kernel; you haven't really measured what is taking up the memory. And of course, it's all because developers are lazy. Thanks for the input.

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-05-02 2:54 ` Russ Dill
@ 2008-05-02 7:01 ` Kasper Sandberg
0 siblings, 0 replies; 229+ messages in thread

From: Kasper Sandberg @ 2008-05-02 7:01 UTC (permalink / raw)
To: Russ Dill; +Cc: linux-kernel

On Thu, 2008-05-01 at 19:54 -0700, Russ Dill wrote:
> > I can also say that I have noticed this on my own workstation; however,
> > that's not really as valid a case, as I have also upgraded userspace and
> > such over time. But it used to be that my box wouldn't use more than
> > ~100 MB to boot into X with KDE open, and about ~300 MB at browsing/mail
> > and such, but these days my workstation easily uses 1.5 GB of RAM for no
> > apparent reason.
> >
> > Something certainly is fishy around here; these days people just tend to
> > fix it by throwing in 10 times more RAM than should really be necessary,
> > which, I guess, is because RAM prices have dropped 10 times.
>
> So you aren't really contributing anything to the discussion. It could
> be userspace, it could be different types of pages you are visiting,
> it could be the kernel; you haven't really measured what is taking up
> the memory. And of course, it's all because developers are lazy. Thanks
> for the input.

I think you didn't read the first part of my message. And as I said, on my workstation I have no idea exactly WHAT has changed to cause it; I just came with some information.

And of course, it's all because the person writing the email forgets to include key details such as "however, that's not really as valid a case...". Thanks for the input.

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
2008-05-02 2:54 ` Russ Dill
2008-05-02 7:01 ` Kasper Sandberg
@ 2008-05-02 17:34 ` Lee Mathers (TCAFS)
2008-05-02 18:21 ` Andi Kleen
1 sibling, 1 reply; 229+ messages in thread

From: Lee Mathers (TCAFS) @ 2008-05-02 17:34 UTC (permalink / raw)
To: Russ Dill; +Cc: linux-kernel

Russ Dill wrote:
>> I can also say that I have noticed this on my own workstation; however,
>> that's not really as valid a case, as I have also upgraded userspace and
>> such over time. But it used to be that my box wouldn't use more than
>> ~100 MB to boot into X with KDE open, and about ~300 MB at browsing/mail
>> and such, but these days my workstation easily uses 1.5 GB of RAM for no
>> apparent reason.
>>
>> Something certainly is fishy around here; these days people just tend to
>> fix it by throwing in 10 times more RAM than should really be necessary,
>> which, I guess, is because RAM prices have dropped 10 times.
>
> So you aren't really contributing anything to the discussion. It could
> be userspace, it could be different types of pages you are visiting,
> it could be the kernel; you haven't really measured what is taking up
> the memory. And of course, it's all because developers are lazy. Thanks
> for the input.

If you're subscribing to and reading this list, at least have the courtesy to think about your posting before delivering it around the world. Not to be rude, but do you know how dumb you look right now? There are many programs that will provide detailed memory usage reports. That would be a first step. By "cc" to the list... *WTF*!!

For someone that's been lurking on the kernel lists on and off since the mid-90s, this has been one of the most stupid discussions and largest timesinks to date.
Is this really the LKML or the #linux channel on IRC? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-02 17:34 ` Lee Mathers (TCAFS) @ 2008-05-02 18:21 ` Andi Kleen 2008-05-02 21:34 ` Kasper Sandberg 0 siblings, 1 reply; 229+ messages in thread From: Andi Kleen @ 2008-05-02 18:21 UTC (permalink / raw) To: Lee Mathers (TCAFS); +Cc: Russ Dill, linux-kernel "Lee Mathers (TCAFS)" <Lee.Mathers@tcafs.org> writes: > to think about your posting before delivering it around the world. > Not to be rude but do you know how dumb you look right now. There are > many programs that while provide detailed memory usage reports. That > would be a first step. To be fair detailed memory analysis of user space can be tricky, especially if you consider shared pages etc. And the standard tools for it are actually not very good (would be a very interesting area for someone to work on I think) Still the poster could have done much more research before ranting, agreed. -Andi ^ permalink raw reply [flat|nested] 229+ messages in thread
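Andi's point about shared pages is worth unpacking: a process's RSS counts every resident page in full, even pages shared with other processes, so summing RSS across processes badly over-counts. Linux exposes per-mapping figures in /proc/&lt;pid&gt;/smaps, including Pss ("proportional set size"), which divides each shared page among its sharers. The sketch below is illustrative only — the smaps-style numbers are invented, not taken from a real process — but it shows how the two totals diverge:

```python
# Sketch: why naive RSS overstates memory use when pages are shared.
# The per-mapping figures below are INVENTED for illustration; on a real
# Linux system they would come from /proc/<pid>/smaps, one Rss/Pss pair
# per memory mapping.
SAMPLE_SMAPS = """\
Rss:        4096 kB
Pss:        4096 kB
Rss:       12288 kB
Pss:        3072 kB
Rss:        2048 kB
Pss:         512 kB
"""

def totals(smaps_text):
    """Sum the Rss and Pss lines (values are in kB)."""
    sums = {"Rss": 0, "Pss": 0}
    for line in smaps_text.splitlines():
        field, value = line.split(":")
        kb = int(value.strip().split()[0])
        if field in sums:
            sums[field] += kb
    return sums["Rss"], sums["Pss"]

rss, pss = totals(SAMPLE_SMAPS)
# Rss counts shared pages fully; Pss divides them among their sharers,
# so Pss is the more honest per-process figure.
print(rss, pss)  # -> 18432 7680
```

Summing Pss across all processes approximates total memory in use, which a sum of RSS values does not.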
* Re: Slow DOWN, please!!! 2008-05-02 18:21 ` Andi Kleen @ 2008-05-02 21:34 ` Kasper Sandberg 0 siblings, 0 replies; 229+ messages in thread From: Kasper Sandberg @ 2008-05-02 21:34 UTC (permalink / raw) To: Andi Kleen; +Cc: Lee Mathers (TCAFS), Russ Dill, linux-kernel On Fri, 2008-05-02 at 20:21 +0200, Andi Kleen wrote: > "Lee Mathers (TCAFS)" <Lee.Mathers@tcafs.org> writes: > > > to think about your posting before delivering it around the world. > > Not to be rude but do you know how dumb you look right now. There are > > many programs that while provide detailed memory usage reports. That > > would be a first step. > > To be fair detailed memory analysis of user space can be tricky, > especially if you consider shared pages etc. And the standard tools for it > are actually not very good (would be a very interesting area for someone > to work on I think) Still the poster could have done much > more research before ranting, agreed. I did not rant; I in fact said that I did not know enough to place any blame, or even say what is causing it, for whatever reason it might be. There is a difference; I merely provided some information. > > -Andi > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 18:27 ` Linus Torvalds 2008-04-30 18:55 ` David Newall @ 2008-04-30 19:06 ` Chris Friesen 2008-04-30 19:13 ` Linus Torvalds 1 sibling, 1 reply; 229+ messages in thread From: Chris Friesen @ 2008-04-30 19:06 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Newall, David Miller, linux-kernel Linus Torvalds wrote: > > On Thu, 1 May 2008, David Newall wrote: > >>Why are you bringing up git trees (which I don't use)? I'm presently >>plagued with a problem that's 2.6.22 or older, extending to at least >>2.6.23 and maybe still current. > > > Ok, *PLONK*. > > You're on an old kernel, don't know if your problem is fixed, and ask us > to slow down development. > > That makes sense. > > Go away. He did say that he was testing 2.6.25, and that suspend-to-disk was broken in 2.6.25. Chris ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:06 ` Chris Friesen @ 2008-04-30 19:13 ` Linus Torvalds 2008-04-30 19:22 ` David Newall 0 siblings, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 19:13 UTC (permalink / raw) To: Chris Friesen; +Cc: David Newall, David Miller, linux-kernel On Wed, 30 Apr 2008, Chris Friesen wrote: > > He did say that he was testing 2.6.25, and that suspend-to-disk was > broken in 2.6.25. Neither of which had anything to do with the whole "slow down" argument. If you have a bug, make a bug report, and push it, and make people aware of it. But don't make it an argument for development to slow down. Should we all stand around with our thumbs up our *ss because somebody has a bug? Should the other developers just stop, because suspend-to-disk is broken for somebody? Should everything come to a standstill because David Newall doesn't like how there are other things going on that are independent of _his_ problems? Do you really believe that? Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:13 ` Linus Torvalds @ 2008-04-30 19:22 ` David Newall 2008-04-30 19:42 ` Linus Torvalds 0 siblings, 1 reply; 229+ messages in thread From: David Newall @ 2008-04-30 19:22 UTC (permalink / raw) To: Linus Torvalds; +Cc: Chris Friesen, David Miller, linux-kernel Linus Torvalds wrote: > Should everything come to a standstill because David > Newall doesn't like how there are other things going on that are > independent of _his_ problems? > You're being a nasty piece of work this day, Linus, and you're fibbing by mischaracterising what I said which, by the way, included, "it's not the specifics of the problem I'm having that matters". You're taking this far too personally. Get a grip. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:22 ` David Newall @ 2008-04-30 19:42 ` Linus Torvalds 0 siblings, 0 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 19:42 UTC (permalink / raw) To: David Newall; +Cc: Chris Friesen, David Miller, linux-kernel On Thu, 1 May 2008, David Newall wrote: > > You're taking this far too personally. Umm. If you didn't want a personal opinion, why did you Cc me in the first place then, and ask for my input? I gave my input to you. I think your arguments are ludicrous, to the point of being totally idiotic. You complain how I don't release kernels that are stable, but without any suggestions on what the issue might be, apart from apparently me merging too much and making too many releases. But do you really expect me to stop merging, or hold up releases that fix hundreds of issues, just because there are other issues pending? Do you really think development can be stopped? Trust me, we've tried. Every time, it just leads to worse problems when the floodgates are then opened. And yes, there is a solution: don't develop so much. Don't allow thousands of developers to be involved. Do a small core group, and make development so hard or inconvenient that you only have a few tens of people who write code, and vet them and force them to jump through hoops when adding new features (or fixing old ones, for that matter). And yes, that *does* result in a "stable" system. Never mind that it's stable for all the wrong reasons, and generally doesn't actually work well across a dynamic environment (whether the hardware base below or user space above). See? This is why I think your arguments are so silly and misguided. But if you actually have real constructive ideas on things to actually *do*, please do mention them. We've changed our models over time, several times, exactly because we've searched for better ways to do things. But do realize that (a) we can't just stop, or even really slow down. 
We can only try to regulate and to some degree direct the flood, not hold it up for any particular issue. (b) We do have process in place, and it may not be perfect, but I doubt anything is, and what we do have actually has evolved over the years. And that's not just my process (ie "two-week merge window, followed by about 6-8 weeks of fixups"), but the whole process both before and after it (Andrew and now linux-next in front of it, and stable kernel tree and the vendors after it). (c) the "big picture" discussion is separate from individual issues. If you want your suspend-to-disk issue resolved, or a memory leak solved, you don't solve those by trying to complain about other parts of the system, that are totally separate. The global flow of patches and releases is not something that we can hold up for _any_ of your individual problems. I do end up delaying releases for really core things, so individual problems do obviously affect (for example) the release timing. But the solution to them is not in complaining about slowing down development, it is about actually trying to engage the developers of *that* feature in *that* particular bug. And finally, trust me, if you want to have people care about your problems, the last thing you want to do is say "I might switch to BSD". Because quite frankly, I really don't care. People who think that threats like that work in any productive way can go screw themselves. I'll flame idiots like that, and my likelihood of helping people because they think they hold a gun to my head is almost zero. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 4:03 ` David Newall 2008-04-30 4:18 ` David Miller @ 2008-04-30 7:11 ` Tarkan Erimer 2008-04-30 13:28 ` David Newall 2008-04-30 14:55 ` Russ Dill 2 siblings, 1 reply; 229+ messages in thread From: Tarkan Erimer @ 2008-04-30 7:11 UTC (permalink / raw) To: David Newall; +Cc: David Miller, linux-kernel, Linus Torvalds David Newall wrote: > Yes. The Linux process is becoming unreliable. Newly "stable" versions > have stability problems. The development process looks childish. > Seasoned developers say not to worry, that the process works. I do > worry. BSD seems more attractive, and it may even be worth the > considerable effort to switch my entire client-base. Linux was lucky to > gain the foothold that it did: traditionally, BSD had a better system > with a less restrictive licence, so it is surprising that manufacturers > chose to go with Linux. BSD still has a less restrictive licence and > when mainstream press becomes interested in Linux's quality problems > it's adoption will fall. BSD is still a good, maybe even better, option. > > Linus, this is your baby and so it's your problem. Only you have the > influence to change things. > I completely disagree with your foolish and nonsensical comments about the Linux kernel and the Linux OS. It's perfectly clear that you don't understand well enough how the Linux development process works. If you think that the recently released kernels are not stable, then you can wait for the 2.6.x.y series or you can use the distro kernels. All of your comments are pointless and have no basis. You are free to choose BSD or whatever you want to use. No one is putting a gun to your head to use Linux :-) I can very easily say, because of my experience, that the Linux kernel is PERFECTLY STABLE! I work at one of the largest ISPs in my country and I use Linux very intensively under very high loads, and I have NEVER NEVER faced any problems caused by a fault of the Linux kernel in my environments. 
For example, many of our mail servers run on Linux, and all day they process hundreds of thousands of emails without any downtime or trouble! Manufacturers mostly choose Linux instead of the BSD flavors simply because the Linux kernel is, technically, superior to the BSDs or others. When it comes to licenses: if the GPL is bad, the BSD license is far worse. The GPL protects your freedom and the openness of the code by forcing changes to the source code to be returned in open form. With BSD, it is the opposite: anyone is free to take someone else's code, and there is NO PROTECTION to prevent that code from becoming closed (proprietary) source. Can you imagine one company (like Microsoft) taking your whole kernel source code and creating a PROPRIETARY OS (like Windows!), making a fool of you? Why? Simply because the BSD license allows it! No need to return the code! Do you really think that the BSD license is more free when it makes a fool of you? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 7:11 ` Tarkan Erimer @ 2008-04-30 13:28 ` David Newall 2008-04-30 13:38 ` Mike Galbraith 2008-04-30 14:41 ` mws 0 siblings, 2 replies; 229+ messages in thread From: David Newall @ 2008-04-30 13:28 UTC (permalink / raw) To: Tarkan Erimer; +Cc: David Miller, linux-kernel, Linus Torvalds Tarkan Erimer wrote: > I completely disagree with your foolish and nonsense comments about > the Linux Kernel and the Linux OS. It's perfectly clear that you > didn't understand well enough how the linux development process works. > If you thought that the recently released kernels are not stable then, > you have to wait the 2.6.x.y series or you can use the distro kernels. > All of your comments are pointless and no base. You are free to choose > BSD or whatever you want to use. No one is putting a gun on your head > to use Linux :-) The problem is not exactly faults in recently released kernels, rather that introduction of faults is common when it should be rare, and kernels are released as stable when they are fragile. Ignoring a problem, and not caring if they migrate to BSD, is foolishness. Of course you don't want people to migrate to BSD, so don't pretend that you don't care. > Do you think really think that BSD license is more free as making a > fool of you ? It is a matter of transparent fact that BSD's licence is less restrictive than Linux's. Whether that is desirable is not something that need be discussed at this juncture. My point in raising BSD was that, from a commercial point of view, BSD is attractive in a way that Linux is not. The many commercial vendors who have been taken to task for not honouring their GPL obligations are strong demonstrations of that. Do not pretend that Linux is sacrosanct. BSD would be an easy swap for vendors should Linux gain a reputation for poor quality (and it already runs Linux applications.) Reputations snowball. 
By the time anybody notices that a good one has become tarnished it could be too late, and take too long, to rectify. I'm sure somebody else observed approximately this just yesterday, so it's not just me, is it? I won't champion this because it's unimportant to me. Linux's quality problems are not my problems. I do what I can to help Linux, but I'm not religious about operating systems and I know that good, free operating systems will continue to thrive, even if Linux dies, just as they did before Linux was born. Ignore the problem, even shoot the messenger, if you like; or be adult, consider the proposition dispassionately, and take steps from there. I've said my bit, in fact more than I wanted to, so I choose to stop here. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 13:28 ` David Newall @ 2008-04-30 13:38 ` Mike Galbraith 2008-04-30 14:41 ` mws 1 sibling, 0 replies; 229+ messages in thread From: Mike Galbraith @ 2008-04-30 13:38 UTC (permalink / raw) To: David Newall; +Cc: Tarkan Erimer, David Miller, linux-kernel, Linus Torvalds On Wed, 2008-04-30 at 22:58 +0930, David Newall wrote: > Ignore the problem, even shoot the messenger, if you like; BANG! #include <chicken_little_headstone.h> ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 13:28 ` David Newall 2008-04-30 13:38 ` Mike Galbraith @ 2008-04-30 14:41 ` mws 1 sibling, 0 replies; 229+ messages in thread From: mws @ 2008-04-30 14:41 UTC (permalink / raw) To: David Newall; +Cc: Tarkan Erimer, David Miller, linux-kernel, Linus Torvalds David Newall schrieb: > Tarkan Erimer wrote: >> I completely disagree with your foolish and nonsense comments about >> the Linux Kernel and the Linux OS. It's perfectly clear that you >> didn't understand well enough how the linux development process works. >> If you thought that the recently released kernels are not stable then, >> you have to wait the 2.6.x.y series or you can use the distro kernels. >> All of your comments are pointless and no base. You are free to choose >> BSD or whatever you want to use. No one is putting a gun on your head >> to use Linux :-) > > The problem is not exactly faults in recently released kernels, rather > that introduction of faults is common when it should be rare, and > kernels are released as stable when they are fragile. Ignoring a > problem, and not caring if they migrate to BSD, is foolishness. Of > course you don't want people to migrate to BSD, so don't pretend that > you don't care. > >> Do you think really think that BSD license is more free as making a >> fool of you ? > > It is a matter of transparent fact that BSD's licence is less > restrictive than Linux's. Whether that is desirable is not something > that need be discussed at this juncture. My point in raising BSD was > that, from a commercial point of view, BSD is attractive in a way that > Linux is not. The many commercial vendors who have been taken to task > for not honouring their GPL obligations are strong demonstrations of > that. Do not pretend that Linux is sacrosanct. BSD would be an easy > swap for vendors should Linux gain a reputation for poor quality (and it > already runs Linux applications.) > > Reputations snowball. 
By the time anybody notices that a good one has > become tarnished it could be too late, and take too long, to rectify. > I'm sure somebody else observed approximately this just yesterday, so > it's not just me, is it? > > I won't champion this because it's unimportant to me. Linux's quality > problems are not my problems. I do what I can to help Linux, but I'm > not religious about operating systems and I know that good, free > operating systems will continue to thrive, even if Linux's dies, just as > they did before Linux was born. > > Ignore the problem, even shoot the messenger, if you like; or be adult, > consider the proposition dispassionately, and take steps from there. > > I've said my bit, in fact more than I wanted to, so I choose to stop here. > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > In all the time you've spent up here discussing nonsense (from my POV it is), several bugfixes, regression fixes, new drivers, ... have been done or started. Let's concentrate back on what counts - shouldn't we? my 2ct marcel ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 4:03 ` David Newall 2008-04-30 4:18 ` David Miller 2008-04-30 7:11 ` Tarkan Erimer @ 2008-04-30 14:55 ` Russ Dill 2 siblings, 0 replies; 229+ messages in thread From: Russ Dill @ 2008-04-30 14:55 UTC (permalink / raw) To: linux-kernel David Newall <davidn <at> davidnewall.com> writes: > Yes. The Linux process is becoming unreliable. Newly "stable" versions > have stability problems. The development process looks childish. > Seasoned developers say not to worry, that the process works. I do > worry. BSD seems more attractive, and it may even be worth the > considerable effort to switch my entire client-base. Linux was lucky to > gain the foothold that it did: traditionally, BSD had a better system > with a less restrictive licence, so it is surprising that manufacturers > chose to go with Linux. BSD still has a less restrictive licence and > when mainstream press becomes interested in Linux's quality problems > it's adoption will fall. BSD is still a good, maybe even better, option. > > Linus, this is your baby and so it's your problem. Only you have the > influence to change things. > Can you please point me in the direction of the BSD kernel lists so that I can inject useless snarky flamebait anytime someone attempts a little process improvement. I don't do any BSD kernel development, but I like to stir things up. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 2:03 Slow DOWN, please!!! David Miller 2008-04-30 4:03 ` David Newall @ 2008-04-30 14:48 ` Peter Teoh 2008-04-30 19:36 ` Rafael J. Wysocki 2 siblings, 0 replies; 229+ messages in thread From: Peter Teoh @ 2008-04-30 14:48 UTC (permalink / raw) To: David Miller; +Cc: linux-kernel On Wed, Apr 30, 2008 at 10:03 AM, David Miller <davem@davemloft.net> wrote: > > This is starting to get beyond frustrating for me. > > Yesterday, I spent the whole day bisecting boot failures > on my system due to the totally untested linux/bitops.h > optimization, which I fully analyzed and debugged. > > Today, I had hoped that I could get some work done of my > own, but that's not the case. > > Yet another bootup regression got added within the last 24 > hours. > > I don't mind fixing the regression or two during the merge > window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS! > > The tree breaks every day, and it's becomming an extremely > non-fun environment to work in. > > We need to slow down the merging, we need to review things > more, we need people to test their fucking changes! > -- Just some comments: As on a football team, everyone has an important role to play. And you'd better let go of the ball as fast as you can, otherwise you are going to tire yourself out easily. So, in a development team, if you think there is some unequal distribution of workload, make noise. Or think of some means of automatically balancing the workload - specifically in the area of change review. (At other times, it is not easy to pass the load around..... e.g., what if the bug happens only on your machines and not on others?) 1. Generally, the more people have reviewed the work, the higher the chance the piece of work is OK. 2. The more variation in real testing, the better. "Variation" here means testing by users of different background skills, with different applications running, and most important - the base kernel version where the patch is applied and tested, etc. 3. 
Based on the two numbers above alone, we can immediately have some measure of confidence in the patch - correct? 4. So we could put all of this on a web page - the patches themselves, and the reviewers/testers that have worked on each. When someone comes in and reviews, the review counter increases by one; the tester counter increases by one after testing. And I suppose everyone would attempt to cover those patches that are less covered by others - automatic balancing of the workload, done in a distributed manner. Avoid having to fill in too much information, though... you will discourage people from taking up the work; let the participants spend their precious time on reviewing instead. So prior to consolidating the sources, just by looking at the numbers, you can see how successful the consolidation will be. If something is less tested, then avoid including it in the consolidation...... Please comment...... -- Regards, Peter Teoh ^ permalink raw reply [flat|nested] 229+ messages in thread
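Peter's counter idea can be made concrete. The sketch below is a hypothetical toy model, not any real kernel tooling - the class name, patch names, and confidence formula are all invented for illustration: each patch carries review and test counters, and contributors are steered toward the least-covered patch.

```python
# Hypothetical sketch of the proposal above: per-patch review/test
# counters, with a helper that points the next volunteer at the patch
# with the least coverage. All names here are made up.
class PatchTracker:
    def __init__(self):
        self.patches = {}  # patch id -> {"reviews": int, "tests": int}

    def add_patch(self, patch_id):
        self.patches[patch_id] = {"reviews": 0, "tests": 0}

    def reviewed(self, patch_id):
        self.patches[patch_id]["reviews"] += 1

    def tested(self, patch_id):
        self.patches[patch_id]["tests"] += 1

    def confidence(self, patch_id):
        # Crude measure: total independent eyes on the patch.
        c = self.patches[patch_id]
        return c["reviews"] + c["tests"]

    def least_covered(self):
        # Where the next reviewer's or tester's time helps most.
        return min(self.patches, key=self.confidence)

tracker = PatchTracker()
for pid in ("bitops-opt", "sched-fix", "new-driver"):
    tracker.add_patch(pid)
tracker.reviewed("bitops-opt")
tracker.tested("bitops-opt")
tracker.reviewed("sched-fix")
print(tracker.least_covered())  # -> new-driver (no reviews or tests yet)
```

Before "consolidation" (a merge), a maintainer could then require some minimum confidence score, which is essentially the self-balancing behavior the message describes.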
* Re: Slow DOWN, please!!! 2008-04-30 2:03 Slow DOWN, please!!! David Miller 2008-04-30 4:03 ` David Newall 2008-04-30 14:48 ` Peter Teoh @ 2008-04-30 19:36 ` Rafael J. Wysocki 2008-04-30 20:00 ` Andrew Morton ` (2 more replies) 2 siblings, 3 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 19:36 UTC (permalink / raw) To: David Miller; +Cc: linux-kernel, Andrew Morton, Linus Torvalds, Jiri Slaby On Wednesday, 30 of April 2008, David Miller wrote: > > This is starting to get beyond frustrating for me. > > Yesterday, I spent the whole day bisecting boot failures > on my system due to the totally untested linux/bitops.h > optimization, which I fully analyzed and debugged. > > Today, I had hoped that I could get some work done of my > own, but that's not the case. > > Yet another bootup regression got added within the last 24 > hours. > > I don't mind fixing the regression or two during the merge > window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS! > > The tree breaks every day, and it's becomming an extremely > non-fun environment to work in. > > We need to slow down the merging, we need to review things > more, we need people to test their fucking changes! Well, I must say I second that. I'm not seeing regressions myself this time (well, except for the one that Jiri fixed), but I did find a few of them during the post-2.6.24 merge window and I wouldn't like to repeat that experience, so to speak. IMO, the merge window is way too short for actually testing anything. I rebuild the kernel once or even twice a day and there's no way I can really test it. I can only check if it breaks right away. And if it does, there's no time to find out what broke it before the next few hundreds of commits land on top of that. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:36 ` Rafael J. Wysocki @ 2008-04-30 20:00 ` Andrew Morton 2008-04-30 20:20 ` Rafael J. Wysocki 2008-04-30 20:05 ` Linus Torvalds 2008-04-30 20:15 ` Andrew Morton 2 siblings, 1 reply; 229+ messages in thread From: Andrew Morton @ 2008-04-30 20:00 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: davem, linux-kernel, torvalds, jirislaby On Wed, 30 Apr 2008 21:36:57 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > On Wednesday, 30 of April 2008, David Miller wrote: > > > > This is starting to get beyond frustrating for me. > > > > Yesterday, I spent the whole day bisecting boot failures > > on my system due to the totally untested linux/bitops.h > > optimization, which I fully analyzed and debugged. > > > > Today, I had hoped that I could get some work done of my > > own, but that's not the case. > > > > Yet another bootup regression got added within the last 24 > > hours. > > > > I don't mind fixing the regression or two during the merge > > window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS! > > > > The tree breaks every day, and it's becomming an extremely > > non-fun environment to work in. > > > > We need to slow down the merging, we need to review things > > more, we need people to test their fucking changes! > > Well, I must say I second that. ooh, fun thread. One of the main reasons for -mm (probably _the_ main reason) is to weed out other-developer-impacting regressions before they hit mainline and, umm, affect developers. But there are implementation problems: a) developers aren't testing -mm enough b) -mm releases have become too slow, and (hence) too unstable c) people are slamming changes into mainline which have never been seen in -mm. Lots of changes. So here's how we're going to fix David's problem: - Everyone gets their stuff into linux-next. - Lots of people _test_ linux-next. Just once a week. Those two steps will improve the merge-window chaos a lot. Things will get better. 
The remaining open problem is what do we do about the shiny new code which is getting slammed into the merge window? Well, it's very easy to tell whether code which appears in the merge window was present in linux-next. Our first way of preventing people from shoving inadequately-cooked code into the merge window is suasion (aka flaming their titties off). If that proves insufficient and if we still have a sufficiently large problem that we need to do something about it then sure, let's reevaluate. But one thing at a time. For the 2.6.27 release let us concentrate on two things - get your stuff into linux-next - test linux-next. If merge-window stability is still a problem after that then let's revisit? ^ permalink raw reply [flat|nested] 229+ messages in thread
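The check Andrew calls easy - spotting merge-window commits that were never in linux-next - works because commits can be matched by patch content rather than by SHA-1 (in git, via `git patch-id` or `git cherry`, since rebased commits get new SHA-1s). Below is a minimal model of that set comparison; the patch IDs are made up, standing in for real `git patch-id` output:

```python
# Sketch of the linux-next check described above: which commits in the
# merge window have no content-equivalent in linux-next? Real tooling
# would compare patch IDs (`git patch-id` / `git cherry`); the short
# hex strings here are INVENTED placeholders.
def not_in_next(mainline_patch_ids, next_patch_ids):
    """Return merge-window patch IDs with no equivalent in linux-next,
    preserving mainline order."""
    seen_in_next = set(next_patch_ids)
    return [pid for pid in mainline_patch_ids if pid not in seen_in_next]

linux_next = ["a1f3", "b7c2", "9d04"]            # cooked in linux-next
merge_window = ["b7c2", "e5e5", "a1f3", "f00d"]  # landed in the merge window
print(not_in_next(merge_window, linux_next))     # -> ['e5e5', 'f00d']
```

Anything in the result list is a candidate for the "suasion" Andrew mentions: code that hit mainline without ever being exposed to linux-next testing.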
* Re: Slow DOWN, please!!! 2008-04-30 20:00 ` Andrew Morton @ 2008-04-30 20:20 ` Rafael J. Wysocki 0 siblings, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 20:20 UTC (permalink / raw) To: Andrew Morton; +Cc: davem, linux-kernel, torvalds, jirislaby On Wednesday, 30 of April 2008, Andrew Morton wrote: > On Wed, 30 Apr 2008 21:36:57 +0200 > "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > > > On Wednesday, 30 of April 2008, David Miller wrote: > > > > > > This is starting to get beyond frustrating for me. > > > > > > Yesterday, I spent the whole day bisecting boot failures > > > on my system due to the totally untested linux/bitops.h > > > optimization, which I fully analyzed and debugged. > > > > > > Today, I had hoped that I could get some work done of my > > > own, but that's not the case. > > > > > > Yet another bootup regression got added within the last 24 > > > hours. > > > > > > I don't mind fixing the regression or two during the merge > > > window but THIS IS ABSOLUTELY, FUCKING, REDICULIOUS! > > > > > > The tree breaks every day, and it's becomming an extremely > > > non-fun environment to work in. > > > > > > We need to slow down the merging, we need to review things > > > more, we need people to test their fucking changes! > > > > Well, I must say I second that. > > ooh, fun thread. > > One of the main reasons for -mm (probably _the_ main reason) is to weed out > other-developer-impacting regressions before they hit mainline and, umm, > affect developers. > > But there are implementation problems: > > a) developers aren't testing -mm enough > > b) -mm releases have become too slow, and (hence) too unstable > > c) people are slamming changes into mainline which have never been seen > in -mm. Lots of changes. Yeah. > So here's how we're going to fix David's problem: > > - Everyone gets their stuff into linux-next. > > - Lots of people _test_ linux-next. Just once a week. 
For this to happen, let's make mainline change less often than once a day after the merge window. > Those two steps will improve the merge-window chaos a lot. Things will get > better. Not until we make a rule that nothing that didn't go through linux-next is mergeable unless it's an obvious bugfix that has no side effects. > The remaining open problem is what do we do about the shiny new code which > is getting slammed into the merge window? > > Well, it's very easy to tell whether code which appears in the merge window > was present in linux-next. > > Our first way of preventing people from shoving inadequately-cooked code > into the merge window is suasion (aka flaming their titties off). If that > proves insufficient and if we still have a sufficiently large problem that > we need to do something about it then sure, let's reevaluate. OK > But one thing at a time. For the 2.6.27 release let us concentrte on two > things > > - get your stuff into linux-next > > - test linux-next. > > If merge-window stability is still a problem after that then let's revisit? I'll see you in the analogous thread during the next merge window. ;-) ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:36 ` Rafael J. Wysocki 2008-04-30 20:00 ` Andrew Morton @ 2008-04-30 20:05 ` Linus Torvalds 2008-04-30 20:14 ` Linus Torvalds ` (2 more replies) 2008-04-30 20:15 ` Andrew Morton 2 siblings, 3 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 20:05 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > IMO, the merge window is way too short for actually testing anything. That is largely on purpose. There's two choices: - have a longer and calmer merge window, spread out the joy, and have people test and fix their things during the merge window too. In other words, less black-and-white. - Really short merge window, and use the extra time *after* it to fix the issues. and I've obviously gone for the latter. In fact, I'd personally like to make it even shorter, because the problem with the long merge window can be summed up very simply: Long merge windows don't work - because rather than test more, it just means that people will use them to make more changes! So one of the major things about the short merge window is that it's hopefully encouraging people to have things ready by the time the merge window opens, because it's too late to do anything later. And yes, we could have some other way of enforcing that - allow the merge window to be longer, but have some other mechanism to make sure that I only merge old code. In fact, I'd personally *love* to have a hard rule that says "I will only pull from trees that were already 'done' by the time the window opened", and we've been kind-of moving in that direction. But that wish is counteracted by the fact that the merges themselves do need some development, so expecting everything to be ready before-hand is simply not realistic. 
Also, while I'd like trees to be ready when the window opens, at the same time I do think that it's good to spread out some of it, and get *some* basic testing - even if it's just a nightly build and a few tens of developers. > I rebuild the kernel once or even twice a day and there's no way I can > really test it. I can only check if it breaks right away. And really, that's all that we'd expect during the merge window. We want to find the *obvious* problems - build issues, and the things that hit everybody, but let's face it, the subtle ones will take time to find regardless. Then, the short merge window means that we have more time when we really don't have big changes going in to find the subtle ones. (And making the release cycle longer would *not* help - that would just make the next merge window more painful, so while it can, and does, work for some individual release with particular problems, it's not a solution in the long run). Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:05 ` Linus Torvalds @ 2008-04-30 20:14 ` Linus Torvalds 2008-04-30 20:56 ` Rafael J. Wysocki 2008-04-30 23:34 ` Greg KH 2008-04-30 20:45 ` Rafael J. Wysocki 2008-04-30 23:29 ` Paul Mackerras 2 siblings, 2 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 20:14 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, 30 Apr 2008, Linus Torvalds wrote: > > In fact, I'd personally like to make it even shorter Just to clarify: I'd actually like to make the merge window be just a week. If even that. With linux-next hopefully stepping up to be a place where the actual _conflicts_ (which are usually not the big problem, they are just inconvenient from a timing standpoint) can get found and handled early, a shorter merge window should be technically possible. HOWEVER. Even now, at two weeks, we do have issues where timing just doesn't fit some developer, because of conferences or vacations or just random personal issues or whatever. There are always people who grumble because the window didn't work for them. Of course, they should have had it all ready, but somehow that simply doesn't happen. I think it's against most human nature to be quite _that_ forward-looking. And maybe everything would be ok if we could also shorten the actual release cycle, so that if you miss one merge window for some random conference or other (or just a *really* bad hair-day and you didn't get your act together), you wouldn't mind too much and you'd just hit the next one instead. But that, in turn, is unrealistic because when bugs do happen, the latency you get between testers and developers is long enough that I really don't think we can shorten the after-merge-window thing much. Six weeks seems to be already pushing it. And as mentioned, a longer after-merge-window-stabilization phase is just going to aggravate the problem next time around. 
We could have staggered releases, but let's face it, that's what -mm and linux-next and stable is all about. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:14 ` Linus Torvalds @ 2008-04-30 20:56 ` Rafael J. Wysocki 2008-04-30 23:34 ` Greg KH 1 sibling, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 20:56 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Linus Torvalds wrote: > > > > In fact, I'd personally like to make it even shorter > > Just to clarify: I'd actually like to make the merge window be just a > week. If even that. > > With linux-next hopefully stepping up to be a place where the actual > _conflicts_ (which are usually not the big problem, they are just > inconvenient from a timing standpoint) can get found and handled early, a > shorter merge window should be technically possible. That might even be better, if there's less code merged as a result. > HOWEVER. Even now, at two weeks, we do have issues where timing just > doesn't fit some developer, because of conferences or vacations or just > random personal issues or whatever. There are always people who grumble > because the window didn't work for them. Well, where's it stated that you have to develop new code for each merge window? By making shorter merge windows with less code merged in each of them, we could actually improve things. > Of course, they should have had it all ready, but somehow that simply > doesn't happen. I think it's against most human nature to be quite _that_ > forward-looking. > > And maybe everything would be ok if we could also shorten the actual > release cycle, so that if you miss one merge window for some random > conference or other (or just a *really* bad hair-day and you didn't get > your act together), you wouldn't mind too much and you'd just hit the next > one instead. Exactly. 
> But that, in turn, is unrealistic because when bugs do happen, the latency > you get between testers and developers is long enough that I really don't > think we can shorten the after-merge-window thing much. Six weeks seems to > be already pushing it. That depends on the number of bugs introduced during the merge window. With shorter merge windows we may introduce fewer bugs per merge window, and the most subtle ones take more than six weeks to find anyway. > And as mentioned, a longer after-merge-window-stabilization phase is just > going to aggravate the problem next time around. > > We could have staggered releases, but let's face it, that's what -mm and > linux-next and stable is all about. Well, that's assuming that people test linux-next and -mm etc., but frankly I'm not seeing that happening. Hopefully, things are going to improve. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:14 ` Linus Torvalds 2008-04-30 20:56 ` Rafael J. Wysocki @ 2008-04-30 23:34 ` Greg KH 1 sibling, 0 replies; 229+ messages in thread From: Greg KH @ 2008-04-30 23:34 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, Apr 30, 2008 at 01:14:39PM -0700, Linus Torvalds wrote: > > > On Wed, 30 Apr 2008, Linus Torvalds wrote: > > > > In fact, I'd personally like to make it even shorter > > Just to clarify: I'd actually like to make the merge window be just a > week. If even that. I'd go for that. The only one with a possible problem might be Andrew due to his need to rebase his 1000+ individual patches before he sends them to you :) Everyone else should have things queued up and ready to go for you as it's not like we don't have some warning that the window is about to open up... thanks, greg k-h ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:05 ` Linus Torvalds 2008-04-30 20:14 ` Linus Torvalds @ 2008-04-30 20:45 ` Rafael J. Wysocki 2008-04-30 21:37 ` Linus Torvalds 2008-05-01 13:54 ` Stefan Richter 2008-04-30 23:29 ` Paul Mackerras 2 siblings, 2 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 20:45 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > > > IMO, the merge window is way too short for actually testing anything. > > That is largely on purpose. > > There's two choices: Oh well, I don't think it's really that simple. > - have a longer and calmer merge window, spread out the joy, and have > people test and fix their things during the merge window too. In other > words, less black-and-white. > > - Really short merge window, and use the extra time *after* it to fix the > issues. > > and I've obviously gone for the latter. In fact, I'd personally like to > make it even shorter, because the problem with the long merge window can > be summed up very simply: > > Long merge windows don't work - because rather than test more, it just > means that people will use them to make more changes! And what do you think is happening _after_ the merge window closes, when we're supposed to be fixing bugs? People work on new code. And, in fact, they have to, if they want to be ready for the next merge window. > So one of the major things about the short merge window is that it's > hopefully encouraging people to have things ready by the time the merge > window opens, because it's too late to do anything later. > > And yes, we could have some other way of enforcing that - allow the merge > window to be longer, but have some other mechanism to make sure that I > only merge old code. How about, instead, putting limits on the amount of stuff that's going to be merged during the next window? 
> In fact, I'd personally *love* to have a hard rule that says "I will only > pull from trees that were already 'done' by the time the window opened", > and we've been kind-of moving in that direction. Well, and when's the time for fixing bugs? Surely not during the merge window and also not after that, because otherwise people won't be ready for the next merge window with the new code. > But that wish is counteracted by the fact that the merges themselves do > need some development, so expecting everything to be ready before-hand is > simply not realistic. > > Also, while I'd like trees to be ready when the window opens, at the same > time I do think that it's good to spread out some of it, and get *some* > basic testing - even if it's just a nightly build and a few tens of > developers. > > > I rebuild the kernel once or even twice a day and there's no way I can > > really test it. I can only check if it breaks right away. > > And really, that's all that we'd expect during the merge window. We want > to find the *obvious* problems - build issues, and the things that hit > everybody, but let's face it, the subtle ones will take time to find > regardless. Exactly. Moreover, the code is now being merged at a pace that makes it physically impossible to review it given the human resources we have. > Then, the short merge window means that we have more time when we really > don't have big changes going in to find the subtle ones. Sorry to say that, but I don't think this is realistic. What happens after the merge window is people go and develop new stuff. They look at the already merged code only if they have to. Also, there are a _few_ people testing the kernel carefully enough to see the more subtle problems, let alone debugging and fixing them. 
> (And making the release cycle longer would *not* help - that would just > make the next merge window more painful, so while it can, and does, work > for some individual release with particular problems, it's not a solution > in the long run). My point is, given the width of the merge window, there's too much stuff going in during it. As far as I'm concerned, the window can be a week long or whatever, but let's make fewer commits over a unit of time. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:45 ` Rafael J. Wysocki @ 2008-04-30 21:37 ` Linus Torvalds 2008-04-30 22:23 ` Rafael J. Wysocki 2008-05-01 13:54 ` Stefan Richter 1 sibling, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 21:37 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > > > Long merge windows don't work - because rather than test more, it just > > means that people will use them to make more changes! > > And what do you think is happening _after_ the merge window closes, when > we're supposed to be fixing bugs? People work on new code. And, in fact, they > have to, if they want to be ready for the next merge window. Oh, I agree. But at that point, the issue you brought up - of testing and then having the code change under you wildly - has at least gone away. And I think you are missing a big issue: > Sorry to say that, but I don't think this is realistic. What happens after the merge > window is people go and develop new stuff. From a testing standpoint, the *developers* aren't ever even the main issue. Yes, we get test coverage that way too, but we should really aim for getting most of the non-obvious issues from the user community, and not primarily from developers. So the whole point of the merge window is *not* to have developers testing their code during the six subsequent weeks, but to have *users* able to use -rc1 and report issues! That's why the distro "testing" trees are so important. And that's why it's so important that -rc1 be timely. > My point is, given the width of the merge window, there's too much stuff > going in during it. As far as I'm concerned, the window can be a week long > or whatever, but let's make fewer commits over a unit of time. I'm not following that logic. A single merge will bring in easily thousands of commits. 
It doesn't matter if the merge window is a day or a week or two weeks, the merge will be one event. And there's no way to avoid the fact that during the merge window, we will get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was 9629 commits). So your "fewer commits over a unit of time" doesn't make sense. We have those ten thousand commits. They need to go in. They cannot take forever. Ergo, you *will* have a thousand commits a day during the merge window. We can spread it out a bit (and I do to some degree), but in many ways that is just going to be more painful. So it's actually easier if we can get about half of the merges done early, so that people like Andrew then have at least most of the base set for them by the first few days of the merge window. So here's the math: 3,500 commits per month. That's just the *average* speed, it's sometimes more. And we *cannot* merge them continuously, because we need to have a stabler period for testing. And remember: those 3,500 commits don't stop happening just because they aren't merged. You should think of them as a constant pressure. So 3,500 commits per month, but with a stable period (that is *longer* than the merge window) that means that the merge window needs to merge that constant stream of commits *faster* than they happen, so that we can then have that breather when we try to get users to test it. Let's say that we have a 1:3 ratio (which is fairly close to what we have), and that means that we need to merge 3,500 commits in a week. That's just simple *math*. So when you say "let's make fewer commits over a unit of time" I can only shake my head and wonder what the hell you are talking about. The merge window _needs_ to do those 3,500 commits per week. Otherwise they don't get merged! Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
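[Editorial aside: Linus's arithmetic above can be sketched in a few lines. The 3,500-commits-per-month figure and the 1:3 window-to-stabilization ratio come from his message; the helper function and the exact week counts are illustrative assumptions, not anything from the thread.]

```python
# Back-of-the-envelope sketch of the merge-window arithmetic: commits are
# produced at a roughly constant rate, but may only land during the merge
# window, so the window must absorb a whole cycle's worth of commits.

def commits_per_window_week(commits_per_month, window_weeks, cycle_weeks):
    """Commits that must be merged per merge-window week so that the
    window absorbs everything produced over the full release cycle."""
    per_week = commits_per_month * 12 / 52        # constant production rate
    produced_per_cycle = per_week * cycle_weeks   # backlog the window must take
    return produced_per_cycle / window_weeks

# A one-week window followed by three stable weeks (a 1:3 ratio), at
# 3,500 commits/month, comes out near Linus's "3,500 commits in a week":
rate = commits_per_window_week(3500, window_weeks=1, cycle_weeks=4)
```

Note the rate depends only on the ratio: a two-week window in an eight-week cycle demands the same per-week merge rate, which is why stretching the window doesn't relieve the pressure.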
* Re: Slow DOWN, please!!! 2008-04-30 21:37 ` Linus Torvalds @ 2008-04-30 22:23 ` Rafael J. Wysocki 2008-04-30 22:31 ` Linus Torvalds 2008-04-30 22:40 ` david 0 siblings, 2 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 22:23 UTC (permalink / raw) To: Linus Torvalds; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > > > > > Long merge windows don't work - because rather than test more, it just > > > means that people will use them to make more changes! > > > > And what do you think is happening _after_ the merge window closes, when > > we're supposed to be fixing bugs? People work on new code. And, in fact, they > > have to, if they want to be ready for the next merge window. > > Oh, I agree. But at that point, the issue you brought up - of testing and > then having the code change under you wildly - has at least gone away. > > And I think you are missing a big issue: > > > Sorry to say that, but I don't think this is realistic. What happens after the merge > > window is people go and develop new stuff. > > From a testing standpoint, the *developers* aren't ever even the main > issue. Yes, we get test coverage that way too, but we should really aim > for getting most of the non-obvious issues from the user community, and > not primarily from developers. > > So the whole point of the merge window is *not* to have developers testing > their code during the six subsequent weeks, but to have *users* able to > use -rc1 and report issues! > > That's why the distro "testing" trees are so important. And that's why > it's so important that -rc1 be timely. That's correct, but since developers are already working on new code at that point, the bug reports in fact distract them and make them go back to the "old" stuff, recall why they made those particular changes, etc. 
As a result, the developers often do not take the bug reports seriously enough, especially if they do not finger the "guilty" change. That, in turn, makes the users believe there's no point in testing and reporting bugs. > > My point is, given the width of the merge window, there's too much stuff > > going in during it. As far as I'm concerned, the window can be a week long > > or whatever, but let's make fewer commits over a unit of time. > > I'm not following that logic. > > A single merge will bring in easily thousands of commits. It doesn't > matter if the merge window is a day or a week or two weeks, the merge will > be one event. No, technically it doesn't. > And there's no way to avoid the fact that during the merge window, we will > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was > 9629 commits). Well, do we _have_ _to_ take that much? I know we _can_, but is this really necessary? > So your "fewer commits over a unit of time" doesn't make sense. Oh, yes it does. Equally well you could say that having brakes in a car didn't make sense, even if you could drive it as fast as the engine allowed you to. ;-) > We have those ten thousand commits. They need to go in. They cannot take > forever. But perhaps some of them can wait a bit longer. > Ergo, you *will* have a thousand commits a day during the merge window. That's only if you insist on handling everything that people push to you. > We can spread it out a bit (and I do to some degree), but in many ways > that is just going to be more painful. So it's actually easier if we can > get about half of the merges done early, so that people like Andrew then > have at least most of the base set for them by the first few days of the > merge window. > > So here's the math: 3,500 commits per month. That's just the *average* > speed, it's sometimes more. And we *cannot* merge them continuously, > because we need to have a stabler period for testing. 
And remember: those > 3,500 commits don't stop happening just because they aren't merged. You > should think of them as a constant pressure. > > So 3,500 commits per month, but with a stable period (that is *longer* > than the merge window) that means that the merge window needs to merge > that constant stream of commits *faster* than they happen, so that we can > then have that breather when we try to get users to test it. Let's say > that we have a 1:3 ratio (which is fairly close to what we have), and that > means that we need to merge 3,500 commits in a week. > > That's just simple *math*. So when you say "let's make fewer commits over > a unit of time" I can only shake my head and wonder what the hell you are > talking about. The merge window _needs_ to do those 3,500 commits per > week. Otherwise they don't get merged! Surely, they don't, but maybe they don't have to. You can technically handle merging even more, but what about quality? Do we have a quality assurance process in place? If we do, what is it? How is it able to handle the 3500 commits a week? Assuming it is, will it be able to handle more, and what's the limit? IMO, there has to be a limit somewhere, or we will end up in a spiral driving everybody mad. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:23 ` Rafael J. Wysocki @ 2008-04-30 22:31 ` Linus Torvalds 2008-04-30 22:41 ` Andrew Morton ` (2 more replies) 2008-04-30 22:40 ` david 1 sibling, 3 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 22:31 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > And there's no way to avoid the fact that during the merge window, we will > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was > > 9629 commits). > > Well, do we _have_ _to_ take that much? I know we _can_, but is this really > necessary? Do you want me to stop merging your code? Do you think anybody else does? Any suggestions on how to convince people that their code is not worth merging? Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:31 ` Linus Torvalds @ 2008-04-30 22:41 ` Andrew Morton 2008-04-30 23:23 ` Rafael J. Wysocki ` (2 more replies) 2008-04-30 22:46 ` Willy Tarreau 2008-04-30 23:03 ` Rafael J. Wysocki 2 siblings, 3 replies; 229+ messages in thread From: Andrew Morton @ 2008-04-30 22:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: rjw, davem, linux-kernel, jirislaby On Wed, 30 Apr 2008 15:31:22 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > Any suggestions on how to convince people that their code is not worth > merging? Raise the quality. Then the volume will automatically decrease. Which leads us to... the volume isn't a problem per se. The problem is quality. It's the fact that they vary inversely which makes us say "slow down". So David's Subject: should have been "Do Better, please". Slowing down is just a side-effect. And, we expect, a tool. We should be discussing how to raise the quality of our work. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:41 ` Andrew Morton @ 2008-04-30 23:23 ` Rafael J. Wysocki 2008-04-30 23:41 ` david 2008-05-01 0:57 ` Adrian Bunk 2008-05-01 12:31 ` Tarkan Erimer 2 siblings, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 23:23 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, davem, linux-kernel, jirislaby On Thursday, 1 of May 2008, Andrew Morton wrote: > On Wed, 30 Apr 2008 15:31:22 -0700 (PDT) > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > Any suggestions on how to convince people that their code is not worth > > merging? > > Raise the quality. Then the volume will automatically decrease. > > Which leads us to... the volume isn't a problem per se. The problem is > quality. It's the fact that they vary inversely which makes us say "slow > down". > > So David's Subject: should have been "Do Better, please". Slowing down is > just a side-effect. And, we expect, a tool. > > > We should be discussing how to raise the quality of our work. I violently agree. One of the (obvious?) ways in which we can raise the quality of the code overall is to spend more time reviewing other people's code and discussing it. It follows from my experience that the quality of patches improves dramatically if they are discussed while being developed. Of course, that requires time, but it's time well spent. For this reason, there should be a mechanism in place that will encourage people to review the existing code, even the code that hasn't changed for a long time, and to review and discuss patches submitted by other people instead of producing new code. Also, the patches that were thoroughly discussed during their development should be regarded as more trustworthy than the ones that were not discussed at all. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:23 ` Rafael J. Wysocki @ 2008-04-30 23:41 ` david 2008-04-30 23:51 ` Rafael J. Wysocki 0 siblings, 1 reply; 229+ messages in thread From: david @ 2008-04-30 23:41 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Andrew Morton, Linus Torvalds, davem, linux-kernel, jirislaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, Andrew Morton wrote: >> On Wed, 30 Apr 2008 15:31:22 -0700 (PDT) >> Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Also, the patches that were thoroughly discussed during their development > should be regarded as more trustworthy than the ones that were not discussed > at all. but you don't have any way of knowing how much discussion took place on any particular patch. that discussion could have taken place in many different places, and you don't have the ability to monitor them all. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:41 ` david @ 2008-04-30 23:51 ` Rafael J. Wysocki 0 siblings, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 23:51 UTC (permalink / raw) To: david; +Cc: Andrew Morton, Linus Torvalds, davem, linux-kernel, jirislaby On Thursday, 1 of May 2008, david@lang.hm wrote: > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > On Thursday, 1 of May 2008, Andrew Morton wrote: > >> On Wed, 30 Apr 2008 15:31:22 -0700 (PDT) > >> Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > > Also, the patches that were thoroughly discussed during their development > > should be regarded as more trustworthy than the ones that were not discussed > > at all. > > but you don't have any way of knowing how much discussion took place on > any particular patch. that discussion could have taken place in many > different places, and you don't have the ability to monitor them all. Not at the moment, but there may be a way to do that if we think of it more thoroughly. One idea may be to add a "Commented-by:" tag in which to place people who provided valuable comments to the patch author and/or maintainer (as a comma-separated list, for example, in analogy with the email Cc lists), especially if the patch has been changed as a result of the comments. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
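[Editorial aside: Rafael's "Commented-by:" tag is only a proposal in this thread, not an existing kernel convention; presumably it would be formatted like the Signed-off-by: trailers the kernel already uses. A hypothetical sketch of extracting such trailers from a commit message:]

```python
import re

def commented_by(message):
    """Return the people named in hypothetical Commented-by: trailers,
    allowing the comma-separated list Rafael suggests."""
    people = []
    for line in message.splitlines():
        m = re.match(r"Commented-by:\s*(.+)", line)
        if m:
            people.extend(p.strip() for p in m.group(1).split(","))
    return people

# Illustrative commit message (names and addresses are made up):
msg = """foo: fix locking on teardown

Commented-by: A Reviewer <a@example.org>, B Reviewer <b@example.org>
Signed-off-by: Some Author <author@example.org>
"""
names = commented_by(msg)
```

Tooling like this is what would let maintainers (or statistics scripts) weigh how much discussion a patch actually received, which is david's objection in the parent message.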
* Re: Slow DOWN, please!!! 2008-04-30 22:41 ` Andrew Morton 2008-04-30 23:23 ` Rafael J. Wysocki @ 2008-05-01 0:57 ` Adrian Bunk 2008-05-01 1:25 ` Linus Torvalds 2008-05-01 1:35 ` Theodore Tso 2008-05-01 12:31 ` Tarkan Erimer 2 siblings, 2 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 0:57 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, rjw, davem, linux-kernel, jirislaby On Wed, Apr 30, 2008 at 03:41:24PM -0700, Andrew Morton wrote: > On Wed, 30 Apr 2008 15:31:22 -0700 (PDT) > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > Any suggestions on how to convince people that their code is not worth > > merging? > > Raise the quality. Then the volume will automatically decrease. 100% ACK to "Raise the quality" (no matter whether it influences the volume). > Which leads us to... the volume isn't a problem per-se. The problem is > quality. It's the fact that they vary inversely which makes us say "slow > down". > > So David's Subject: should have been "Do Better, please". Slowing down is > just a side-effect. And, we expect, a tool. > > > We should be discussing how to raise the quality of our work. One big problem I see is Linus wanting to merge all drivers regardless of the quality. Linus said in [1]: "I'd really rather have the driver merged, and then *other* people can send patches!" The problem is that such "other people" do not exist (except perhaps Al) for non-trivial stuff. My favorite gem from this driver we merged in 2.6.25 is: grep -C4 volatile drivers/infiniband/hw/nes/nes_nic.c Fixing such stuff isn't "janitorial kind of things", and people are actually more motivated to fix their code for getting it into the kernel than to fix their code after it went into the kernel. I am not saying we shouldn't merge such a driver at all or set unrealistically high quality goals - I'm for merging all code of good quality that provides functionality not yet in the kernel. But we need some minimum quality level. 
cu Adrian [1] http://lkml.org/lkml/2008/2/21/334 -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:57 ` Adrian Bunk @ 2008-05-01 1:25 ` Linus Torvalds 2008-05-01 2:13 ` Adrian Bunk 2008-05-01 1:35 ` Theodore Tso 1 sibling, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 1:25 UTC (permalink / raw) To: Adrian Bunk; +Cc: Andrew Morton, rjw, davem, linux-kernel, jirislaby On Thu, 1 May 2008, Adrian Bunk wrote: > > One big problem I see is Linus wanting to merge all drivers regardless > of the quality. That's not what I said. What I said was that I think we get *better* quality by merging early. In other words, you're turning the whole argument on its head, and incorrectly so. I claim that you are the one that is arguing for *worse* quality, by arguing for a process that is KNOWN to tend to generate bad code (out-of-tree drivers) as opposed to one that tends to fix things over time (and note the "tends" in both cases - there are counter-examples, but the trend is so clear that anybody who disputes it would seem to be either blind or lying). So here's my challenge: give me *one* reason to believe that quality improves more out-of-tree than it does in-tree, and then you'll have a point. But you'd better be able to explain the ton of historical data we have that proves otherwise. Until you do that, your blathering is just that - total blathering. The process I advocate is the one that has historical data on its side. Yours is just a failed theory. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:25 ` Linus Torvalds @ 2008-05-01 2:13 ` Adrian Bunk 2008-05-01 2:30 ` Linus Torvalds 0 siblings, 1 reply; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 2:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, rjw, davem, linux-kernel, jirislaby On Wed, Apr 30, 2008 at 06:25:50PM -0700, Linus Torvalds wrote: > > > On Thu, 1 May 2008, Adrian Bunk wrote: > > > > One big problem I see is Linus wanting to merge all drivers regardless > > of the quality. > > That's not what I said. > > What I said was that I think we get *better* quality by merging early. > > In other words, you're turning the whole argument on its head, and > incorrectly so. > > I claim that you are the one that is arguing for *worse* quality, by > arguing for a process that is KNOWN to tend to generate bad code > (out-of-tree drivers) as opposed to one that tends to fix things over time > (and note the "tends" in both cases - there are counter-examples, but > the trend is so clear that anybody who disputes it would seem to be either > blind or lying). >... I am *not* saying it should have stayed out-of-tree. I am saying that it was merged too early, and that there are points that should have been addressed before the driver got merged. Get it submitted for review to linux-kernel. Give the maintainers some time to incorporate all comments. Even one month later it could still have made it into 2.6.25. The only problem with my suggestion is that it's currently pretty random whether someone takes the time to review such a driver on linux-kernel. And even if I'm drawing fire for this again (and unlike newbies running checkpatch on the kernel), for driver submissions it actually makes sense to tell the submitter to fix the checkpatch errors [1], and it would have made the driver better in this case (again, it could still have made it into 2.6.25). 
People are actually more motivated to fix their code for getting it into the kernel than to fix their code after it went into the kernel, so we might get better quality when merging a bit later. > Linus cu Adrian [1] not necessarily all checkpatch warnings -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 2:13 ` Adrian Bunk @ 2008-05-01 2:30 ` Linus Torvalds 2008-05-01 18:54 ` Adrian Bunk 2008-05-14 14:55 ` Pavel Machek 0 siblings, 2 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 2:30 UTC (permalink / raw) To: Adrian Bunk; +Cc: Andrew Morton, rjw, davem, linux-kernel, jirislaby On Thu, 1 May 2008, Adrian Bunk wrote: > > I am saying that it was merged too early, and that there are points that > should have been addressed before the driver got merged. > > Get it submitted for review to linux-kernel. > Give the maintainers some time to incorporate all comments. > Even one month later it could still have made it into 2.6.25. > > The only problem with my suggestion is that it's currently pretty random > whether someone takes the time to review such a driver on linux-kernel. Now, I do agree that we could/should have some more process in general. I really _would_ like to have a process in place that basically says: - everything must have gone through lkml at least once - after that point, it should have been in linux-next or the -mm queue - and then it can get merged (and if it didn't get any review by then, maybe it was because nobody was interested, and it simply won't be getting any until it oopses or catches people's interest some other way) HOWEVER. That process doesn't actually work for everything anyway (a lot of trivial fixes are really best not being so noisy, and various patches that are specific to some subsystem really _are_ better off just discussed on that subsystem's mailing list). And perhaps more pertinently, right now that kind of process is very inconvenient (to the point of effectively being impossible) for me to check. Obviously, if the patch comes from Andrew, I know it was in -mm, and I seldom drop those patches for obvious reasons anyway, but the last thing we want is some process that depends even _more_ on Andrew being a burnt-out-excuse-for-a-man in a few years (*). 
So I could ask for people to always have pointers to "it was discussed here" on patches they send (and I'd likely mostly trust them without even bothering to verify), the same way -git maintainers often talk about "most of this has been in -mm for the last two months". That might work. But then there would still be the patches that are obvious and don't need them. And then even the obvious patches do break. And people will complain. Even though requiring that kind of process for the stupid stuff would just slow everybody down, and would be really painful. So one of my _personal_ reasons I don't want to put too much process in place is that I don't think process is appropriate for everything, and yet even the stuff that obviously doesn't need or want process (speling fixes and build failures) _will_ cause problems, and then people will whine about them not being there. Linus (*) Andrew, no offense. I'm sure you'd be a magnificent burnt-out-excuse- for-a-man. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 2:30 ` Linus Torvalds @ 2008-05-01 18:54 ` Adrian Bunk 2008-05-14 14:55 ` Pavel Machek 1 sibling, 0 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 18:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, rjw, davem, linux-kernel, jirislaby On Wed, Apr 30, 2008 at 07:30:13PM -0700, Linus Torvalds wrote: > > > On Thu, 1 May 2008, Adrian Bunk wrote: > > > > I am saying that it was merged too early, and that there are points that > > should have been addressed before the driver got merged. > > > > Get it submitted for review to linux-kernel. > > Give the maintainers some time to incorporate all comments. > > Even one month later it could still have made it into 2.6.25. > > > > The only problem with my suggestion is that it's currently pretty random > > whether someone takes the time to review such a driver on linux-kernel. > > Now, I do agree that we could/should have some more process in general. I > really _would_ like to have a process in place that basically says: > > - everything must have gone through lkml at least once > > - after that point, it should have been in linux-next or the -mm queue > > - and then it can get merged (and if it didn't get any review by then, > maybe it was because nobody was interested, and it simply won't be > getting any until it oopses or catches peoples interest some other way) > > HOWEVER. > > That process doesn't actually work for everything anyway (a lot of trivial > fixes are really best not being so noisy, and various patches that are > specific to some subsystem really _are_ better off just discussed on that > subsystem mailing lists). Cc linux-kernel on 3 patches "specific to some subsystem" that add the word "select" to Kconfig files and I'll catch at least one bug before it enters your tree... > And perhaps more pertinently, right now that kind of process is very > inconvenient (to the point of effectively being impossible) for me to > check. 
Obviously, if the patch comes from Andrew, I know it was in -mm, > and I seldom drop those patches for obvious reasons anyway, but the last > thing we want is some process that depends even _more_ on Andrew being a > burnt-out-excuse-for-a-man in a few years (*). > > So I could ask for people to always have pointers to "it was discussed > here" on patches they send (and I'd likely mostly trust them without even > bothering to verify), the same way -git maintainers often talk about "most > of this has been in -mm for the last two months". It should be enough to trust maintainers that they follow the rules. And in the unlikely case someone didn't follow them you know whom you have to watch closely during his next merge requests... > That might work. But then there would still be the patches that are > obvious and don't need them. > > And then even the obvious patches do break. And people will complain. Even > though requiring that kind of process for the stupid stuff would just slow > everybody down, and would be really painful. There's a middle way. Requiring the submission of bigger changes and new drivers to be Cc'ed to linux-kernel can help and shouldn't cause real problems. And requiring this kind of patches to be in linux-next for some time should also be possible. Both can improve the quality of the kernel. Trivial patches and bugfixes might not have to follow these rules, but that's similar to e.g. the current merge window process also having exceptions for new drivers. >... > Linus >... cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
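Adrian's aside about patches that "add the word 'select' to Kconfig files" refers to a well-known failure mode: `select` forces a symbol on while ignoring that symbol's own dependencies. A minimal hypothetical fragment (symbol names invented for illustration, not from any real driver):

```kconfig
config DRIVER_FOO
	tristate "Hypothetical FOO driver"
	depends on PCI
	# Bug: "select" turns BAR_CORE on unconditionally, ignoring
	# BAR_CORE's "depends on I2C".  With I2C=n this yields an
	# inconsistent .config and a broken build of BAR_CORE.
	select BAR_CORE

config BAR_CORE
	tristate
	depends on I2C
```

This is exactly the class of bug that tends to show up only for reviewers or build bots trying unusual configurations, which is why a quick lkml pass over such patches catches it.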
* Re: Slow DOWN, please!!! 2008-05-01 2:30 ` Linus Torvalds 2008-05-01 18:54 ` Adrian Bunk @ 2008-05-14 14:55 ` Pavel Machek 1 sibling, 0 replies; 229+ messages in thread From: Pavel Machek @ 2008-05-14 14:55 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Andrew Morton, rjw, davem, linux-kernel, jirislaby Hi! > > I am saying that it was merged too early, and that there are points that > > should have been addressed before the driver got merged. > > > > Get it submitted for review to linux-kernel. > > Give the maintainers some time to incorporate all comments. > > Even one month later it could still have made it into 2.6.25. > > > > The only problem with my suggestion is that it's currently pretty random > > whether someone takes the time to review such a driver on linux-kernel. > > Now, I do agree that we could/should have some more process in general. I > really _would_ like to have a process in place that basically says: > > - everything must have gone through lkml at least once What about 'must go through lkml at least once *outside the merge window*'. Or is it just me? During the merge window, I'm totally overloaded by all those patches going in and related lkml traffic... -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:57 ` Adrian Bunk 2008-05-01 1:25 ` Linus Torvalds @ 2008-05-01 1:35 ` Theodore Tso 1 sibling, 0 replies; 229+ messages in thread From: Theodore Tso @ 2008-05-01 1:35 UTC (permalink / raw) To: Adrian Bunk Cc: Andrew Morton, Linus Torvalds, rjw, davem, linux-kernel, jirislaby On Thu, May 01, 2008 at 03:57:27AM +0300, Adrian Bunk wrote: > > > > We should be discussing how to raise the quality of our work. > > One big problem I see is Linus wanting to merge all drivers regardless > of the quality. > > Linus said in [1]: > "I'd really rather have the driver merged, and then *other* people can > send patches!" > > The problem is that such "other people" do not exist (except perhaps Al) > for non-trivial stuff. Sure, but that's not the cause of the problems that people like DavidN whine about, or problems that frustrate David Miller and/or Ingo Molnar. The problems that cause whining and/or frustration are when changes in core code break other maintainers. That is a TOTALLY DIFFERENT problem from lower-quality device drivers getting merged. In general, those device drivers don't cause problems for people who don't have the relevant hardware, and worst case, the device driver can just be CONFIG'ed out. So this is a totally different issue, and whether or not we merge new device drivers, and at what quality level (from "it compiles, ship it!", to every single checkpatch, sparse, and Christoph Hellwig nitpick has to be addressed *AND* then the submitter has to give a bottle of high-quality alcohol to a Maintainer :-) is completely orthogonal to the question of whether we can, in a King Canute fashion, compel developers to stop developing by commanding them not to send pull requests or by refusing to merge their work into mainline. If we don't merge their work, and it's really cool features that our end users are demanding, it will just flow into the distros via out-of-tree patches, much like it did during the 2.4/2.5 era. 
And maybe the current enterprise distros will try to hold it back, but if end users start saying things like "We want containers!!" and start voting with their feet for a distro that is willing to merge OpenVZ patches, it doesn't matter how much we try to tell the tide to stop flowing in. So yes, we can apply some amount of backpressure, but the real challenge is to figure out how we can work smarter and flush out the bugs faster. - Ted ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:41 ` Andrew Morton 2008-04-30 23:23 ` Rafael J. Wysocki 2008-05-01 0:57 ` Adrian Bunk @ 2008-05-01 12:31 ` Tarkan Erimer 2008-05-01 15:34 ` Stefan Richter 2 siblings, 1 reply; 229+ messages in thread From: Tarkan Erimer @ 2008-05-01 12:31 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, rjw, davem, linux-kernel, jirislaby Andrew Morton wrote: > So David's Subject: should have been "Do Better, please". Slowing down is > just a side-effect. And, we expect, a tool. > > To improve the quality of kernel releases, maybe we can create a special kernel testing tool. This tool should: - Check for known bugs, regressions, compile errors etc. - Have a modular design (plug-in support), so that checks for these regressions, known bugs etc. can be implemented easily. - Have git support, so that when it hits a bug it can bisect the commits, automating the search for the buggy commits. - Have both a console interface and an X interface, so that not just developers but also users who want to help find the issues can contribute easily. Just a few things that came to my mind when I thought about it. Any more ideas/suggestions welcomed :-) Also, we can create a web site for this project where we can identify the known regressions, bugs etc., so that whoever wants to contribute some code to this tool can easily find these issues. If anyone is interested in creating/leading a tool like this, I can host a website about this project on our system. Cheers Tarkan ^ permalink raw reply [flat|nested] 229+ messages in thread
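The bisect automation Tarkan sketches above already exists in stock git as `git bisect run`; any such tool would mostly wrap it. A self-contained sketch on a throwaway repository (commit messages and the trivial "test" are invented for the demonstration):

```shell
#!/bin/sh
# Sketch: let git hunt for the commit that broke things, automatically.
# The test command must exit 0 on a good tree and non-zero on a bad one.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email tester@example.com
git config user.name tester

echo ok > status && git add status && git commit -qm "1: works"
git commit -q --allow-empty -m "2: still works"
good=$(git rev-parse HEAD)
echo broken > status && git commit -qam "3: the regression"
git commit -q --allow-empty -m "4: unrelated follow-up"
bad=$(git rev-parse HEAD)

# Tell bisect the endpoints (bad first, then good)...
git bisect start "$bad" "$good" > /dev/null
# ...then let it run the test on each candidate commit.
result=$(git bisect run sh -c 'grep -q ok status')
git bisect reset > /dev/null
echo "$result" | grep "is the first bad commit"
```

The test command's exit code drives the search: 0 marks a revision good, 1-124 mark it bad, and 125 means "skip", so any build-and-boot check following that convention can be plugged in.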
* Re: Slow DOWN, please!!! 2008-05-01 12:31 ` Tarkan Erimer @ 2008-05-01 15:34 ` Stefan Richter 2008-05-02 14:05 ` Tarkan Erimer 0 siblings, 1 reply; 229+ messages in thread From: Stefan Richter @ 2008-05-01 15:34 UTC (permalink / raw) To: Tarkan Erimer Cc: Andrew Morton, Linus Torvalds, rjw, davem, linux-kernel, jirislaby Tarkan Erimer wrote: > To improve the quality of kernel releases, maybe we can create a special > kernel testing tool. A variety of bugs cannot be caught by automated tests. Notably those which happen with rare hardware, or due to very specific interaction with hardware, or with very special workloads. An interesting thing to investigate would be to start at the regression meta bugs at bugzilla.kernel.org, go through all bugs which are linked from there, and try to figure out - if these bugs could have been found by automated or at least semiautomatic tests on pre-merge code, and - what those tests would have had to look like, e.g. what equipment would have been necessary. Let's look back at the posting at the thread start: | On Wed, Apr 30, 2008 at 10:03 AM, David Miller <davem@davemloft.net> wrote: | > Yesterday, I spent the whole day bisecting boot failures | > on my system due to the totally untested linux/bitops.h | > optimization, which I fully analyzed and debugged. ... | > Yet another bootup regression got added within the last 24 | > hours. Bootup regressions can be automatically caught if the necessary machines are available, and candidate code gets exposure to test parks of those machines. I hear this is already being done, and increasingly so. But those test parks will only ever cover a tiny fraction of existing hardware and cannot be subjected to all code iterations and all possible .config permutations, hence will have limited coverage of bugs. And things like the bitops issue depend on review much more than on tests, AFAIU. 
-- Stefan Richter -=====-==--- -=-= ----= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 15:34 ` Stefan Richter @ 2008-05-02 14:05 ` Tarkan Erimer 0 siblings, 0 replies; 229+ messages in thread From: Tarkan Erimer @ 2008-05-02 14:05 UTC (permalink / raw) To: Stefan Richter Cc: Andrew Morton, Linus Torvalds, rjw, davem, linux-kernel, jirislaby Stefan Richter wrote: > Tarkan Erimer wrote: >> To improve the quality of kernel releases, maybe we can create a >> special kernel testing tool. > > A variety of bugs cannot be caught by automated tests. Notably those > which happen with rare hardware, or due to very specific interaction > with hardware, or with very special workloads. Of course, it's impossible to test all the things/scenarios. A tool of that kind should just allow us to minimize the issues that we will face. > > An interesting thing to investigate would be to start at the > regression meta bugs at bugzilla.kernel.org, go through all bugs on > which are linked from there, and try to figure out > - if these bugs could have been found by automated or at least > semiautomatic tests on pre-merge code, and > - how those tests had to have looked like, e.g. what equipment would > have been necessary. > > Let's look back at the posting at the thread start: > | On Wed, Apr 30, 2008 at 10:03 AM, David Miller <davem@davemloft.net> > wrote: > | > Yesterday, I spent the whole day bisecting boot failures > | > on my system due to the totally untested linux/bitops.h > | > optimization, which I fully analyzed and debugged. > ... > | > Yet another bootup regression got added within the last 24 > | > hours. > > Bootup regressions can be automatically caught if the necessary > machines are available, and candidate code gets exposure to test parks > of those machines. I hear this is already being done, and > increasingly so. 
But those test parks will ever only cover a tiny > fraction of existing hardware and cannot be subjected to all code > iterations and all possible .config permutations, hence will have > limited coverage of bugs. > > And things like the bitops issue depend on review much more than on > tests, AFAIU. My idea is also to hunt the bugs more easily via a tool like this that has a console/X interface and the ability to bisect. That way, users who have little or no knowledge about git/bisect can easily try to find the problematic commits/bugs. Tarkan ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:31 ` Linus Torvalds 2008-04-30 22:41 ` Andrew Morton @ 2008-04-30 22:46 ` Willy Tarreau 2008-04-30 22:52 ` Andrew Morton 2008-04-30 23:20 ` Linus Torvalds 2008-04-30 23:03 ` Rafael J. Wysocki 2 siblings, 2 replies; 229+ messages in thread From: Willy Tarreau @ 2008-04-30 22:46 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, Apr 30, 2008 at 03:31:22PM -0700, Linus Torvalds wrote: > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > And there's no way to avoid the fact that during the merge window, we will > > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was > > > 9629 commits). > > > > Well, do we _have_ _to_ take that much? I know we _can_, but is this really > > necessary? > > Do you want me to stop merging your code? > > Do you think anybody else does? > > Any suggestions on how to convince people that their code is not worth > merging? I think you're approaching a solution, Linus. If developers take a refusal as a punishment, maybe you can use that for trees which have too many unresolved regressions. This would be really unfair to subsystem maintainers who themselves merge a lot of work, but recursively they may apply the same principle to their own developers, so that everybody knows that it's not worth working on next code past a point where too many regressions are reported. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:46 ` Willy Tarreau @ 2008-04-30 22:52 ` Andrew Morton 2008-04-30 23:21 ` Willy Tarreau 2008-04-30 23:20 ` Linus Torvalds 1 sibling, 1 reply; 229+ messages in thread From: Andrew Morton @ 2008-04-30 22:52 UTC (permalink / raw) To: Willy Tarreau; +Cc: torvalds, rjw, davem, linux-kernel, jirislaby On Thu, 1 May 2008 00:46:10 +0200 Willy Tarreau <w@1wt.eu> wrote: > On Wed, Apr 30, 2008 at 03:31:22PM -0700, Linus Torvalds wrote: > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > And there's no way to avoid the fact that during the merge window, we will > > > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was > > > > 9629 commits). > > > > > > Well, do we _have_ _to_ take that much? I know we _can_, but is this really > > > necessary? > > > > Do you want me to stop merging your code? > > > > Do you think anybody else does? > > > > Any suggestions on how to convince people that their code is not worth > > merging? > > I think you're approaching a solution Linus. If developers take a refusal > as a punishment, maybe you can use that for trees which have too many > unresolved regressions. This would be really unfair to subsystem maintainers > which themselves merge a lot of work, but recursively they may apply the > same principle to their own developers, so that everybody knows that it's > not worth working on next code past a point where too many regressions are > reported. > Well. If we were good enough at tracking bug reports and regressions we could look at the status of subsystem X and say "no new features for you". That would be a drastic step even if we had the information to do it (which we don't). It would certainly put the pigeon amongst the cats tho. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:52 ` Andrew Morton @ 2008-04-30 23:21 ` Willy Tarreau 2008-04-30 23:38 ` Chris Shoemaker 0 siblings, 1 reply; 229+ messages in thread From: Willy Tarreau @ 2008-04-30 23:21 UTC (permalink / raw) To: Andrew Morton; +Cc: torvalds, rjw, davem, linux-kernel, jirislaby On Wed, Apr 30, 2008 at 03:52:52PM -0700, Andrew Morton wrote: > On Thu, 1 May 2008 00:46:10 +0200 > Willy Tarreau <w@1wt.eu> wrote: > > > On Wed, Apr 30, 2008 at 03:31:22PM -0700, Linus Torvalds wrote: > > > > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > And there's no way to avoid the fact that during the merge window, we will > > > > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was > > > > > 9629 commits). > > > > > > > > Well, do we _have_ _to_ take that much? I know we _can_, but is this really > > > > necessary? > > > > > > Do you want me to stop merging your code? > > > > > > Do you think anybody else does? > > > > > > Any suggestions on how to convince people that their code is not worth > > > merging? > > > > I think you're approaching a solution Linus. If developers take a refusal > > as a punishment, maybe you can use that for trees which have too many > > unresolved regressions. This would be really unfair to subsystem maintainers > > which themselves merge a lot of work, but recursively they may apply the > > same principle to their own developers, so that everybody knows that it's > > not worth working on next code past a point where too many regressions are > > reported. > > > > Well. If we were good enough at tracking bug reports and regressions we > could look at the status of subsytem X and say "no new features for you". > > That would be a drastic step even if we had the information to do it (which > we don't). We already have some information, Rafael is tracking this info. But we need other developers to look at others' bugs. 
If we considered that for each release, the *worst* subsystem does not get any new features merged, maybe the ones who really want to get theirs merged will quickly take a look at their not-so-friend coworkers' work to try to get their score up and avoid getting spotted. After all, that's what we want to achieve: better cross-testing. For 2.6.27, we would probably have Davem happy to report one hundred bugs brought by Ingo and ban him from the next merge. But if that's the only way to find 100 bugs in one release cycle, hey, that's quite efficient! And in turn, Ingo would have more time to fix (or deny) bugs assigned to him, then take a look at his accuser's code for the next release. Not very moral, but the kernel team has evolved from a small team of buddies to a large enterprise. And to survive this evolution, we may need to apply the immoral principles found in big companies. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:21 ` Willy Tarreau @ 2008-04-30 23:38 ` Chris Shoemaker 0 siblings, 0 replies; 229+ messages in thread From: Chris Shoemaker @ 2008-04-30 23:38 UTC (permalink / raw) To: Willy Tarreau Cc: Andrew Morton, torvalds, rjw, davem, linux-kernel, jirislaby On Thu, May 01, 2008 at 01:21:43AM +0200, Willy Tarreau wrote: > Not very moral, but the kernel team has evolved from a small team of > buddies to a large enterprise. And to survive this evolution, we may > need to apply the immoral principles found in big companies. On the contrary, I call this "keeping everybody else honest". -chris ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:46 ` Willy Tarreau 2008-04-30 22:52 ` Andrew Morton @ 2008-04-30 23:20 ` Linus Torvalds 2008-05-01 0:42 ` Rafael J. Wysocki 2008-05-01 1:30 ` Jeremy Fitzhardinge 1 sibling, 2 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 23:20 UTC (permalink / raw) To: Willy Tarreau Cc: Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Willy Tarreau wrote: > > > > Any suggestions on how to convince people that their code is not worth > > merging? > > I think you're approaching a solution Linus. If developers take a refusal > as a punishment, maybe you can use that for trees which have too many > unresolved regressions. Heh. It's been done. In fact, it's done all the time on a smaller scale. It's how I've enforced some cleanliness or process issues ("I won't pull that because it's too ugly"). I see similar messages floating around about individual patches. That said, I don't think it really works that well as "the solution": it works as a small part of the bigger picture, but no, we can't see punishment as the primary model for encouraging better behaviour. First off, and maybe this is not true, but I don't think it is a very healthy way to handle issues in general. I may come off as an opinionated bastard in discussions like these, and I am, but when it actually comes to maintaining code, I really prefer a much softer approach. I want to _trust_ people, and I really don't want to be a "you need to do 'xyz' or else" kind of guy. So I'll happily say "I can't merge this, because xyz", where 'xyz' is something that is related to the particular code that is actually merged. But quite frankly, holding up _unrelated_ fixes, because some other issue hasn't been resolved, I really try to not do that. So I'll say "I don't want to merge this, because quite frankly, we've had enough code for this merge window already, it can wait". 
That tends to happen at the end of the merge window, but it's not a threat, it's just me being tired of the worries of inevitable new issues at the end of the window. And I personally feel that this is important to keep people motivated. Being too stick-oriented isn't healthy. The other reason I don't believe in the "won't merge until you do 'xyz'" kind of thing as a main development model is that it traditionally hasn't worked. People simply disagree, the vendors will take the code that their customers need, the users will get the tree that works for them, and saying "I won't merge it" won't help anybody if it's actually useful. Finally, the people I work with may not be perfect, but most maintainers are pretty much experts within their own area. At some point you have to ask yourself: "Could I do better? Would I have the time? Could I find somebody else to do better?" and not just in a theoretical way. And if the answer is "no", then at that point, what else can you do? Yes, we have personalities that clash, and merge problems. And let's face it, as kernel developers, we aren't exactly a very "cuddly" group of people. People are opinionated and not afraid to speak their mind. But on the whole, I think the kernel development community is actually driven a lot more by _positive_ things than by the stick of "I won't get merged unless I shape up". So quite frankly, I'd personally much rather have a process that encourages people to have so much _pride_ in what they do that they want it to be seen as being really good (and hopefully then that pride means that they won't take crap!) than having a chain of fear that trickles down. So this is why, for example, I have so strongly encouraged git maintainers to think of their public trees as "releases". Because I think people act differently when they *think* of their code as a release than when they think of it as a random development tree. 
I do _not_ want to slow down development by setting some kind of "quality bar" - but I do believe that we should keep our quality high, not because of any hoops we need to jump through, but because we take pride in the thing we do. [ An example of this: I don't believe code review tends to much help in itself, but I *do* believe that the process of doing code review makes people more aware of the fact that others are looking at the code they produce, and that in turn makes the code often better to start with. And I think publicly announced git trees and -mm and linux-next are great partly because they end up doing that same thing. I heartily encourage submaintainers to always Cc: linux-kernel when they send me a "please pull" request - I don't know if anybody else ever really pulls that tree, but I do think that it's very healthy to write that message and think of it as a publication event. ] Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:20 ` Linus Torvalds @ 2008-05-01 0:42 ` Rafael J. Wysocki 2008-05-01 1:19 ` Linus Torvalds 2008-05-01 1:30 ` Jeremy Fitzhardinge 1 sibling, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 0:42 UTC (permalink / raw) To: Linus Torvalds Cc: Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Linus Torvalds wrote: > > On Thu, 1 May 2008, Willy Tarreau wrote: [--snip--] > I do _not_ want to slow down development by setting some kind of "quality > bar" - but I do believe that we should keep our quality high, not because > of any hoops we need to jump through, but because we take pride in the > thing we do. Well, we certainly should, but do we always remember about it? Honest, guv? > [ An example of this: I don't believe code review tends to much help in > itself, but I *do* believe that the process of doing code review makes > people more aware of the fact that others are looking at the code they > produce, and that in turn makes the code often better to start with. It may help directly, for example when people realize that they work on conflicting or just related changes. > And I think publicly announced git trees and -mm and linux-next are > great partly because they end up doing that same thing. I heartily > encourage submaintainers to always Cc: linux-kernel when they send me a > "please pull" request - I don't know if anybody else ever really pulls > that tree, but I do think that it's very healthy to write that message > and think of it as a publication event. ] I totally agree with that. Still, the issue at hand is that (1) The code merged during a merge window is somewhat opaque from the tester's point of view and if a regression is found, the only practical means to figure out what caused it is to carry out a bisection (which generally is unpleasant, to put it lightly). 
(2) Many regressions are introduced during merge windows (relative to the total amount of code merged they are a few, but the raw numbers are significant) and because of (1) the process of removing them is generally painful for the affected people. (3) The suspicion is that the number of regressions introduced during merge windows has something to do with the quality of code being below expectations, that in turn may be related to the fact that it's being developed very rapidly. My opinion is that we need to solve this issue sooner rather than later and so the question is how we are going to approach that. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:42 ` Rafael J. Wysocki @ 2008-05-01 1:19 ` Linus Torvalds 2008-05-01 1:31 ` Andrew Morton ` (2 more replies) 0 siblings, 3 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 1:19 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > I do _not_ want to slow down development by setting some kind of "quality > > bar" - but I do believe that we should keep our quality high, not because > > of any hoops we need to jump through, but because we take pride in the > > thing we do. > > Well, we certainly should, but do we always remeber about it? Honest, guv? Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like process generates quality? And I dislike how people try to conflate "quality" and "merging speed" as if there was any reason what-so-ever to believe that they are related. You (and Andrew) have tried to argue that slowing things down results in better quality, and I simply don't for a moment believe that. I believe the exact opposite. The way to get good quality is not to put barriers up in front of developers, but totally the reverse - by helping them. And yes, that help can quite possibly be in the form of "process" - by making things more streamlined, and by having people not have to waste time on wondering where they should send things etc. But the notion that we should even _try_ to aim to slow things down, that one I find unlikely to be true, and I don't even understand why anybody would find it a logical goal? Of course, you will have fewer new bugs if you have fewer changes. But that's not a goal, that's a tautology and totally uninteresting. A small program is likely to have fewer bugs, but that doesn't make something small "better" than something large that does more. Similarly, a stagnant development community will introduce new bugs more seldom. 
But does that make a stagnant one better than a vibrant one? Hell no. So what I'm arguing against here is not that we should aim for worse quality, but I'm arguing against the false dichotomy of believing that quality is incompatible with lots of change. So if we can get the discussion *away* from the "let's slow things down", then I'm interested. Because at that point we don't have to fight made-up arguments about something irrelevant. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:19 ` Linus Torvalds @ 2008-05-01 1:31 ` Andrew Morton 2008-05-01 1:43 ` Linus Torvalds 2008-05-01 1:40 ` Linus Torvalds 2008-05-01 5:50 ` Willy Tarreau 2 siblings, 1 reply; 229+ messages in thread From: Andrew Morton @ 2008-05-01 1:31 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Wed, 30 Apr 2008 18:19:56 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > You (and Andrew) have tried to argue that slowing things down results in > better quality, eh? I argued the opposite: that increasing quality will as a side-effect slow things down. If we simply throttled things, people would spend more time watching the shopping channel while merging smaller amounts of the same old crap. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:31 ` Andrew Morton @ 2008-05-01 1:43 ` Linus Torvalds 2008-05-01 10:59 ` Rafael J. Wysocki 0 siblings, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 1:43 UTC (permalink / raw) To: Andrew Morton Cc: Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Wed, 30 Apr 2008, Andrew Morton wrote: > > eh? I argued the opposite: that increasing quality will as a side-effect > slow things down. Yes, my bad, I realized that when I read through my message and already sent out a fix for my buggy email ;) > If we simply throttled things, people would spend more time watching the > shopping channel while merging smaller amounts of the same old crap. I agree totally. And although some of the time would probably _also_ be spent on the frustrating crap that was designed to do the throttling, that isn't much more productive than watching the shopping channel would be ... Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:43 ` Linus Torvalds @ 2008-05-01 10:59 ` Rafael J. Wysocki 2008-05-01 15:26 ` Linus Torvalds 0 siblings, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 10:59 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thursday, 1 of May 2008, Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > eh? I argued the opposite: that increasing quality will as a side-effect > > slow things down. > > Yes, my bad, I realized that when I read through my message and already > sent out a fix for my buggy email ;) > > > If we simply throttled things, people would spend more time watching the > > shopping channel while merging smaller amounts of the same old crap. > > I agree totally. And although some of the time would probably _also_ be > spent on the frustrating crap that was designed to do the throttling, that > isn't much more productive than watching the shopping channel would be ... Okay, so what exactly are we going to do to address the issue that I described in the part of my last message that you skipped? Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 10:59 ` Rafael J. Wysocki @ 2008-05-01 15:26 ` Linus Torvalds 2008-05-01 17:09 ` Rafael J. Wysocki 2008-05-01 18:35 ` Chris Frey 0 siblings, 2 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 15:26 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > Okay, so what exactly are we going to do to address the issue that I described > in the part of my last message that you skipped? Umm. I don't really see anything to say. You said: > Still, the issue at hand is that > (1) The code merged during a merge window is somewhat opaque from the tester's > point of view and if a regression is found, the only practical means to > figure out what caused it is to carry out a bisection (which generally is > unpleasant, to put it lightly). > (2) Many regressions are introduced during merge windows (relative to the > total amount of code merged they are few, but the raw numbers are > significant) and because of (1) the process of removing them is generally > painful for the affected people. > (3) The suspicion is that the number of regressions introduced during merge > windows has something to do with the quality of code being below > expectations, that in turn may be related to the fact that it's being > developed very rapidly. And quite frankly, (2) and (3) are both: "merge windows introduce new bugs", and that's such an uninteresting tautology that I'm left wordless. And (1) is just a result of merging lots of stuff. Of course the new bugs / regressions are introduced during the merge window. That's when we merge new code. New bugs don't generally happen when you don't get new code. And of course finding bugs is always painful to everybody involved. And of course the bugs indicate something about the quality of code being merged. Perfect code wouldn't have bugs. 
So what you are stating isn't interesting, and isn't even worthy of discussion. The way you state it, the only answer is: don't take new code, then. That's what your whole argument always seems to boil down to, and excuse me for (yet again) finding that argument totally pointless. So let me repeat: (1) we have new code. We always *will* have new code, hopefully. A few million lines per year. If you don't accept this, I don't have anything to say. (2) we need a merge window. That is a direct result not of wanting to have lots of code at the same time, but of the _reverse_ issue: we want to have times of relative calm. And again, if you continue to see the merge window as the "problem", rather than as the INEVITABLE result of wanting to have a calm period, there's no point in talking to you. (3) Ergo, there's a very fundamental and basic and inescapable result: we absolutely _will_ have times when we get lots and lots of new code. So these are not "problems". They are *facts*. Stating them as problems is stupid and pointless. I'm not going to discuss this with you if you cannot get over this. So please accept the facts. Once you accept the facts, you can state the things you can change. But the things you cannot change are the merge window, and the fact that we get a lot of new code at a high rate (where the merge window will inevitably compress that rate, so that we have _another_ window where the rate is lower). So stop arguing against facts, and start arguing about other things that can be argued about. That's all I'm saying. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 15:26 ` Linus Torvalds @ 2008-05-01 17:09 ` Rafael J. Wysocki 2008-05-01 17:41 ` Linus Torvalds 2008-05-01 18:35 ` Chris Frey 1 sibling, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 17:09 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thursday, 1 of May 2008, Linus Torvalds wrote: > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > Okay, so what exactly are we going to do to address the issue that I described > > in the part of my last message that you skipped? > > Umm. I don't really see anything to say. You said: > > > Still, the issue at hand is that > > (1) The code merged during a merge window is somewhat opaque from the tester's > > point of view and if a regression is found, the only practical means to > > figure out what caused it is to carry out a bisection (which generally is > > unpleasant, to put it lightly). > > (2) Many regressions are introduced during merge windows (relative to the > > total amount of code merged they are few, but the raw numbers are > > significant) and because of (1) the process of removing them is generally > > painful for the affected people. > > (3) The suspicion is that the number of regressions introduced during merge > > windows has something to do with the quality of code being below > > expectations, that in turn may be related to the fact that it's being > > developed very rapidly. > > And quite frankly, (2) and (3) are both: "merge windows introduce new > bugs", and that's such an uninteresting tautology that I'm left > wordless. Perhaps if they introduced fewer bugs, all of that would be less frustrating to people who get hit by them, especially by two or more at a time. Everyone seems to be fine with that until it happens to him personally (like it happened to David). > And (1) is just a result of merging lots of stuff. 
> > Of course the new bugs / regressions are introduced during the merge > window. That's when we merge new code. New bugs don't generally happen > when you don't get new code. I obviously agree with that. The question is, however, if we can decrease the number of bugs introduced during merge windows and you seem to be saying that no, we can't. Which is disappointing. > And of course finding bugs is always painful to everybody involved. > > And of course the bugs indicate something about the quality of code > being merged. Perfect code wouldn't have bugs. > > So what you are stating isn't interesting, and isn't even worthy of > discussion. The way you state it, the only answer is: don't take new > code, then. That's what your whole argument always seems to boild down > to, and excuse me for (yet again) finding that argument totally > pointless. I have never said you shouldn't take new code at all. That's not what I'm saying and please don't paint me this way. I see a problem in that you get patches that you shouldn't have got because they are unfinished and not well thought through. They introduce regressions which are only possible to find using bisection because of the amount of code merged at a time and that's frustrating. You seem to be regarding this as a necessity, but I'm really not convinced that you're right in that. > So let me repeat: > > (1) we have new code. We always *will* have new code, hopefully. A few > million lines pe year. > > If you don't accept this, I don't have anything to say. > > (2) we need a merge window. That is a direct result not of wanting to > have lots of code at the same time, but of the _reverse_ issue: we > want to have times of relative calm. > > And again, if you continue to see the merge window as the > "problem", rather than as the INEVITABLE result of wanting to have > a calm period, there's no point in talking to you. However, the width of the merge window is not a predetermined thing and might be adjusted, for example. 
Other things might be changed too. > (3) Ergo, there's a very fundamental and basic and inescapable result: > we absolutely _will_ have times when we get lots and lots of new > code. But that need not include obviously broken patches. > So these are not "problems". They are *facts*. Stating them as > problems is stupid and pointless. I'm not going to discuss this with > you if you cannot get over this. > > So please accept the facts. > > Once you accept the facts, you can state the things you can change. But > the things you cannot change is the merge window, and the fact that we > get a lot of new code at a high rate (where the merge window will > inevitably compress that rate, so that we have _another_ window where > the rate is lower). The problem is the (relatively small) fraction of patches pushed to you that is broken. Some patches are obviously broken, some of them are just not tested well enough. The result is pretty much the same in either case. Now, the question is if we can get rid of that fraction by adjusting the process somehow. You're arguing that we can't and so be it. [This is your opinion and BTW there's nothing allowing me to call that unreasonable or saying that you use made up arguments or something like this.] My opinion is that we could at least try to do something about it. linux-next is probably a step in the right direction, though time will tell. I'm afraid, though, that I personally can't do much more than I've been doing already to improve things. > So stop arguing against facts, and start arguing about other things that > can be argued about. That's all I'm saying. The message that started this whole thread was not from me and I believe it was sent for a reason. So the fact is that at least some people lose their patience over the current handling of merge windows. And I'm not sure that's necessary. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 17:09 ` Rafael J. Wysocki @ 2008-05-01 17:41 ` Linus Torvalds 2008-05-01 18:11 ` Al Viro ` (3 more replies) 0 siblings, 4 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 17:41 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > I obviously agree with that. The question is, however, if we can decrease the > number of bugs introduced during merge windows and you seem to be saying > that no, we can't. Which is disappointing. No, that's not what I'm saying. What I *am* saying is that as long as you concentrate on "merge window" and "lots of code", you're concentrating not on the problems, but on the facts of life. You can't change facts, and even trying is pointless. What you should concentrate on is not how many patches there are during the merge window (because we can't do anything about that) or the fact that they all happen in a short timeframe, but about quality of patches _regardless_ of merge window. So if you can make an argument that does not even *try* to change the fact that - we have lots of patches and - we have a merge window and - merging patches causes bugs but argues about quality from some other standpoint, then I can start to believe that you have a point. But as long as you argue about the fact that we merge a lot of stuff, and that bugs come in during the merge window, I'm not interested. Arguing about facts is totally non-productive. And as long as people keep saying "let's not merge broken patches" or "we should never have bugs", I'll just ignore those kinds of idiotic statements. They aren't even arguments, they are wishes, and they are unrealistic. If we knew they were broken and had bugs, of course we wouldn't merge them. In short - I'm simply not interested in what you _wish_ reality was. People need to first acknowledge reality, and _then_ they may have solutions. 
So the reality is: - we do have tons of patches, and they need to be merged (and furiously) - there *will* be bugs. And the number of bugs will inevitably be relative to the number of patches. There is no "perfect", and anybody who argues for a lower number of bugs by lowering the number of patches is an idiot in my book. - there *will* be releases, even in the presence of bugs, because holding everything up is simply not an option. Those are the things that we have to accept. Anything else is just dreaming. Now, what part _can_ we improve and still be realistic? We can try to improve average quality - the number of bugs will *still* be relative to the size of the changes (no getting away from that), but we may be able to lower the absolute number of bugs. But not to zero! And that "not to zero" is IMPORTANT. If you think you can aim for zero bugs, I'm simply not interested in discussing it with you. You live in a different universe, and we're not talking about the same reality. And if you're not being realistic, then why the hell would I believe that your solutions are realistic? I'd rather take some pills and talk to the little purple man living under the deck in my back yard, because at least he's amusing, even if he doesn't make much sense either. And I'm also not in the *least* interested in arguments like "We should just improve our quality of patches". Of course everybody wishes for that. Again, it's not an argument, it's just an unrealistic wish, unless you can actually give a suggestion of a process or other thing that would actually seem to reach it (without assuming other impossible things like "we need more time" or "we need more people who just spend their day looking for bugs"). Same goes for "we should all just spend time looking at each others patches and trying to find bugs in them". That's not a solution, that's a drug-induced dream you're living in. 
And again, if I want to discuss dreams, I'd rather talk about my purple guy, and the bad things he does to the hedgehog that lives next door. So do you have any productive *suggestions*? Some that involve more than "let's write less code" or "let's just review each others patches more". Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 17:41 ` Linus Torvalds @ 2008-05-01 18:11 ` Al Viro 2008-05-01 18:23 ` Linus Torvalds 2008-05-01 18:50 ` Willy Tarreau ` (2 subsequent siblings) 3 siblings, 1 reply; 229+ messages in thread From: Al Viro @ 2008-05-01 18:11 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, May 01, 2008 at 10:41:21AM -0700, Linus Torvalds wrote: > Of course everybody wishes for that. Again, it's not an argument, it's > just a unrealistic wish, unless you can actually give a suggestion of a > process or other thing that would actually seem to reach it (without > assuming other impossible things like "we need more time" or "we need > more people who just spend their day looking for bugs"). > > Same goes for "we should all just spend time looking at each others > patches and trying to find bugs in them". That's not a solution, that's a > drug-induced dream you're living in. As one of those obviously drug-addled freaks who _are_ looking for bugs... Thank you so fucking much ;-/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 18:11 ` Al Viro @ 2008-05-01 18:23 ` Linus Torvalds 2008-05-01 18:30 ` Linus Torvalds ` (2 more replies) 0 siblings, 3 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 18:23 UTC (permalink / raw) To: Al Viro Cc: Rafael J. Wysocki, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, 1 May 2008, Al Viro wrote: > On Thu, May 01, 2008 at 10:41:21AM -0700, Linus Torvalds wrote: > > > > Same goes for "we should all just spend time looking at each others > > patches and trying to find bugs in them". That's not a solution, that's a > > drug-induced dream you're living in. > > As one of those obviously drug-addled freaks who _are_ looking for bugs... > Thank you so fucking much ;-/ That's not what I meant, and I think you know it. Of course as many people as possible should look at other peoples patches and comment on them. But saying so won't _make_ it so. And it's also something that we have done since day #1 _anyway_, so anybody who thinks that it would improve code quality from where we already are, should explain how he thinks the increase would be caused, and how it would happen. So when we're looking at improvement suggestions, they should be real suggestions that have realistic goals, not just wishes. And they shouldn't be the things we *already* do, because then they wouldn't be improvements. In other words: do people have realistic ideas for how to make others spend _more_ time looking at patches? And not just _wishing_ people did that? Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 18:23 ` Linus Torvalds @ 2008-05-01 18:30 ` Linus Torvalds 2008-05-01 18:58 ` Willy Tarreau 2008-05-01 19:37 ` Al Viro 2 siblings, 0 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 18:30 UTC (permalink / raw) To: Al Viro Cc: Rafael J. Wysocki, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, 1 May 2008, Linus Torvalds wrote: > > In other words: do people have realistic ideas for how to make others > spend _more_ time looking at patches? And not just _wishing_ people did > that? Just to throw out an example: - make a "Random pending patch of the day" google gadget. I know that's a bit out there, and I'm not sure the google gadget thing is realistic, but I bet I'm not the only one who ends up using the google homepage all the time. A button that says "this patch looks ok", "this patch looks crap", or "I dunno, give me another one to look at" might be a fun game that would encourage people to look at a couple of patches a day. You get five thousand people doing that occasionally (not every day, but maybe when they are bored and look for something more rewarding than trying to find bad music videos on youtube), and maybe you'd actually get feedback on patches. Make it pick a random commit that is in linux-next but hasn't been merged into main -git yet. Crazy? Probably. But at least it fits my notion of "let's not just wish people did more patch commentary" thing. IOW, if people are really serious about coming up with ways to improve code quality, I really think it needs to be about _practical_ things that can fit in our flow or can be extensions to it, not just wishing for better quality. "If wishes were horses, beggars would ride" Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 18:23 ` Linus Torvalds 2008-05-01 18:30 ` Linus Torvalds @ 2008-05-01 18:58 ` Willy Tarreau 2008-05-01 19:37 ` Al Viro 2 siblings, 0 replies; 229+ messages in thread From: Willy Tarreau @ 2008-05-01 18:58 UTC (permalink / raw) To: Linus Torvalds Cc: Al Viro, Rafael J. Wysocki, Andrew Morton, David Miller, linux-kernel, Jiri Slaby our mails have crossed each other. Just to follow up in this thread just in case... On Thu, May 01, 2008 at 11:23:43AM -0700, Linus Torvalds wrote: > So when we're looking at improvement suggestions, they should be real > suggestions that have realistic goals, not just wishes. And they > shouldn't be the things we *already* do, because then they wouldn't > be improvements. as explained in last mail, I think that we're doing that far less than we used to because of the ease of "Linus, please pull from git://master...". > In other words: do people have realistic ideas for how to make others > spend _more_ time looking at patches? And not just _wishing_ people did > that? As explained, I have no problem hijacking pull requests asking for 1) code and 2) review if it's not explicitly stated in the message that it has been reviewed, or that it is an obvious fix. I have no problem trusting the poster, he should just care not to lie too often or will get a bad reputation of being a blatant liar. The only limit is that if I'm alone doing those raids, I'll quickly get into all developer's blacklist and nothing will change. *YOU* too have to enforce this policy. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 18:23 ` Linus Torvalds 2008-05-01 18:30 ` Linus Torvalds 2008-05-01 18:58 ` Willy Tarreau @ 2008-05-01 19:37 ` Al Viro 2008-05-01 19:58 ` Andrew Morton 2008-05-01 20:07 ` Joel Becker 2 siblings, 2 replies; 229+ messages in thread From: Al Viro @ 2008-05-01 19:37 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, May 01, 2008 at 11:23:43AM -0700, Linus Torvalds wrote: > On Thu, 1 May 2008, Al Viro wrote: > > On Thu, May 01, 2008 at 10:41:21AM -0700, Linus Torvalds wrote: > > > > > > Same goes for "we should all just spend time looking at each others > > > patches and trying to find bugs in them". That's not a solution, that's a > > > drug-induced dream you're living in. > > > > As one of those obviously drug-addled freaks who _are_ looking for bugs... > > Thank you so fucking much ;-/ > > That's not what I meant, and I think you know it. FWIW, the way I'd read that had been "face it, normal folks don't *do* that and if you hope for more people doing code review - put down your pipe, it's not even worth talking about". Which managed to get under my skin, and that's not something that happens often... Anyway, I'm glad it had been a misparsing; my apologies for the reaction. > So when we're looking at improvement suggestions, they should be real > suggestions that have realistic goals, not just wishes. And they > shouldn't be the things we *already* do, because then they wouldn't > be improvements. > > In other words: do people have realistic ideas for how to make others > spend _more_ time looking at patches? And not just _wishing_ people did > that? The obvious answer: amount of areas where one _can_ do that depends on some things that can be changed. Namely: * one needs to understand enough of the area or know where/how to get the information needed for that. 
I've got some experience with the latter and I suspect that most of the folks who do active reviews have their own set of tricks for getting into the unfamiliar area fast. Moreover, having such set of tricks is probably _the_ thing that makes us able to do that kind of work. Sharing such (i.e. "here's how one wades through unfamiliar area and gets a sense of what's going on there; here's what one looks out for; here's how to deal with data structures; here are the signs of problematic lifetime logics; here's how one formulates hypothesis about refcounting rules; here's how one verifies such and looks for possible bugs in that area; etc.) is a Good Idea(tm). Having the critical areas documented with easy to review in mind is another thing that would probably help. And yes, it won't happen overnight, it won't happen for all areas and it won't be mandatory for maintainers, etc. Previous part (i.e. which questions to ask about data structures, etc.) would help with that. FWIW, I'm trying to do that - right now I'm flipping between wading through Cthulhu-damned fs/locks.c and its friends and getting the notes I've got from the last month work into edible form (which includes translation into something that resembles normal English, among other things - more than half of that is in... well, let's call it idiom-rich Russian). * patches should be visible *when* *they* *can* *be* *changed*. If it's "Linus had pulled from linux-foo.git and that included a merge from linux-foobar.git, which is developed on foobar-wank@hell.knows.where", it's too late. It's not just that you don't revert; it's that you _can't_ realistically revert in such situation - not without very massive work. And I don't know what _can_ be done about that, other than making it socially discouraged. To some extent it's OK, but my impression is that some areas are as bad as CVS-based "communities" had been and switch to git has simply hidden the obvious signs of trouble... 
^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 19:37 ` Al Viro @ 2008-05-01 19:58 ` Andrew Morton 2008-05-01 20:07 ` Joel Becker 1 sibling, 0 replies; 229+ messages in thread From: Andrew Morton @ 2008-05-01 19:58 UTC (permalink / raw) To: Al Viro; +Cc: torvalds, rjw, w, davem, linux-kernel, jirislaby On Thu, 1 May 2008 20:37:14 +0100 Al Viro <viro@ZenIV.linux.org.uk> wrote: > * patches should be visible *when* *they* *can* *be* *changed*. > If it's "Linus had pulled from linux-foo.git and that included a merge > from linux-foobar.git, which is developed on foobar-wank@hell.knows.where", > it's too late. It's not just that you don't revert; it's that you _can't_ > realistically revert in such situation - not without very massive work. > And I don't know what _can_ be done about that, other than making it > socially discouraged. To some extent it's OK, but my impression is that > some areas are as bad as CVS-based "communities" had been and switch to > git has simply hidden the obvious signs of trouble... Yup. I think the only sane+scalable way of making this happen is to prevail upon the 100-odd subsystem maintainers to keep an eye out for code which should be exposed to additional eyes. There are of course many reasons _why_ such code needs the attention of others, and those reasons have varying strengths. Off the top of my head: - modifies stuff outside the designated subsystem (eg: lib/pcounter.c - thanks Pavel) - (having just spent an hour looking at drivers/net/sfc/ and having boggled at its bitmap.h): adds generic-looking infrastructure which should be in core kernel. Or already _is_ in core kernel. - Adds any kernel<->user interface which is not of the the most trivial&standard form - Futzes with memory management internals, adds pagefault handlers, etc. - Ditto vfs things, I guess - In any way attempts to work around _any_ shortcoming of any other part of the kernel! - Does anything RCU related. Every time I cc Paul on an rcu-using patch, he finds holes in it. 
- add your own here. But we won't find such code by going out and looking for it - we do need the recipients of that code to say "hey, others might want to see this". That's very low-effort for the hey-sayer, so I expect we can do better here quite easily. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 19:37 ` Al Viro 2008-05-01 19:58 ` Andrew Morton @ 2008-05-01 20:07 ` Joel Becker 1 sibling, 0 replies; 229+ messages in thread From: Joel Becker @ 2008-05-01 20:07 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Rafael J. Wysocki, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thu, May 01, 2008 at 08:37:14PM +0100, Al Viro wrote: > * one needs to understand enough of the area or know where/how > to get the information needed for that. I've got some experience with > the latter and I suspect that most of the folks who do active reviews > have their own set of tricks for getting into the unfamiliar area fast. > Moreover, having such set of tricks is probably _the_ thing that makes > us able to do that kind of work. > Sharing such (i.e. "here's how one wades through unfamiliar > area and gets a sense of what's going on there; here's what one looks > out for; here's how to deal with data structures; here are the signs > of problematic lifetime logics; here's how one formulates hypothesis > about refcounting rules; here's how one verifies such and looks for > possible bugs in that area; etc.) is a Good Idea(tm). <snip> > FWIW, I'm trying to do that - right now I'm flipping between > wading through Cthulhu-damned fs/locks.c and its friends and getting > the notes I've got from the last month work into edible form (which > includes translation into something that resembles normal English, > among other things - more than half of that is in... well, let's call > it idiom-rich Russian). I think you've just nailed one of the tricks right there. A long time ago, I just sat down and wrote up a "how the locking works in the vfs" document for myself and others. Wrote up the structures, what each member is for, where the structure appears and disappears, and all the call chains for all of the locks. When I was done, I had a pretty good idea of how everything interacted. 
I think this is a great trick for ramping up on a section of the code - documentation is good, but you understand self-written documentation better. Joel -- Life's Little Instruction Book #452 "Never compromise your integrity." Joel Becker Principal Software Developer Oracle E-mail: joel.becker@oracle.com Phone: (650) 506-8127 ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 17:41 ` Linus Torvalds 2008-05-01 18:11 ` Al Viro @ 2008-05-01 18:50 ` Willy Tarreau 2008-05-01 19:07 ` david 2008-05-01 22:17 ` Rafael J. Wysocki 2008-05-01 19:39 ` Friedrich Göpel 2008-05-01 21:59 ` Rafael J. Wysocki 3 siblings, 2 replies; 229+ messages in thread From: Willy Tarreau @ 2008-05-01 18:50 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Andrew Morton, David Miller, linux-kernel, Jiri Slaby On Thu, May 01, 2008 at 10:41:21AM -0700, Linus Torvalds wrote: > Same goes for "we should all just spend time looking at each others > patches and trying to find bugs in them". That's not a solution, that's a > drug-induced dream you're living in. "all" above is the wrong part. Encouraging each other to review code will definitely *help* (and I did not say fix the problem, OK?). There are people who regularly spend some time reviewing code. I'm thinking about Al, Andrew, Christoph, Arjan, and maybe many other ones I'm missing, just that I regularly see them give advice to people who post their patches on the list. And even if only for that, they deserve some respect, and their efforts must not be dismissed. Maybe they are more skilled than anyone else for this job. Maybe they're so used to doing it that it just takes them a few minutes each time, I don't know. I wish *more* people could be encouraged to do this work, which is very likely painful but instructive. If the current reviewers could give hints on how to save a lot of time, it may motivate more people to follow them. I suspect that insisting that developers post their less obvious work to the list(s) is a first step. Maybe at some point we're all responsible: when we see a mail entitled "[GIT] pull request for XXX", we should all jump on it and ask "when and where was this code reviewed?". Once again, it's not a fix. It's just one small step towards a saner process. > So do you have any productive *suggestions*? 
Some that involve more than > "let's write less code" or "let's just review each others patches more". It's not much about reviewing each others' patches, it's about showing one's work to others first. If our developers are encouraged to work alone in a cave late at night with itching eyes, and send their work at once every 2 months in a sealed envelope, we'll not solve anything. I also proposed a more repressive method inciting the ones with really bad scores to find crap in others' work in order to remain hidden behind them. You explained why it would not work. Fine. I also proposed to group merges by reduced overlapping areas, and to shorten the merge window and hold it (at least) twice as often. Rafael also proposed to merge core first, then archs, which is a refined variation on the same principle. I'm not sure I've seen your opinion on this. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 18:50 ` Willy Tarreau @ 2008-05-01 19:07 ` david 2008-05-01 19:28 ` Willy Tarreau 2008-05-01 22:17 ` Rafael J. Wysocki 1 sibling, 1 reply; 229+ messages in thread From: david @ 2008-05-01 19:07 UTC (permalink / raw) To: Willy Tarreau Cc: Linus Torvalds, Rafael J. Wysocki, Andrew Morton, David Miller, linux-kernel, Jiri Slaby On Thu, 1 May 2008, Willy Tarreau wrote: > I also proposed to group merges by reduced overlapping areas, and to > shorten the merge window and hold it (at least) twice as often. Rafael > also proposed to merge core first, then archs, which is a refined variation > on the same principle. I'm not sure I've seen your opinion on this. the problem with trying to make the cycle twice as fast is that it takes time to hunt down the hard bugs, even when you have some idea where they are. go back through the last few kernels and look at the bugs that were fixed in the last couple of -rc releases (and in final), would they have really been fixed faster if other changes hadn't taken place? I suspect that they would not have, and if I'm right the result of merging half as much wouldn't be twice as many releases, but rather approximately the same release schedule with more piling up for the next release. even individual git trees that do get a fair bit of testing (like networking for example) run into odd and hard to debug problems when exposed to a wider set of hardware and loads. having the networking changes go in every 4 months (with 4 months worth of changes) instead of every 2 months (with 2 months worth of changes) will just mean that there will be more problems in this area, and since they will be more concentrated in that area it will be harder to fix them all fast as the same group of people are needed for all of them. if several maintainers think that you are correct that doing a merge with far fewer changes will be a lot faster, they can test this in the real world by skipping one release. 
just send Linus a 'no changes this time' instead of a pull request. If you are right the stable release will happen significantly faster and they can say 'I told you so' and in the next release have a fair chance of convincing other maintainers to skip a release. it does worry me a bit that the release cycle seems to be slipping slightly each release, but I don't see a good way to fix this. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 19:07 ` david @ 2008-05-01 19:28 ` Willy Tarreau 2008-05-01 19:46 ` david 0 siblings, 1 reply; 229+ messages in thread From: Willy Tarreau @ 2008-05-01 19:28 UTC (permalink / raw) To: david Cc: Linus Torvalds, Rafael J. Wysocki, Andrew Morton, David Miller, linux-kernel, Jiri Slaby On Thu, May 01, 2008 at 12:07:53PM -0700, david@lang.hm wrote: > On Thu, 1 May 2008, Willy Tarreau wrote: > > >I also proposed to group merges by reduced overlapping areas, and to > >shorten the merge window and hold it (at least) twice as often. Rafael > >also proposed to merge core first, then archs, which is a refined variation > >on the same principle. I'm not sure I've seen your opinion on this. > > the problem with trying to make the cycle twice as fast is that it takes > time to hunt down the hard bugs, even when you have some idea where they > are. Of course, there'll always be bugs. They'll still slip past the release, as many are doing today. > go back through the last few kernels and look at the bugs that were fixed > in the last couple of -rc releases (and in final), would they have really > been fixed faster if other changes hadn't taken place? Don't know. However, I think that core bugs have more impact on the rest than other bugs. Reason to merge core first. > I suspect that they would not have, and if I'm right the result of merging > half as much wouldn't be twice as many releases, but rather approximately > the same release schedule with more piling up for the next release. no, this is exactly what *not* to do. Linus is right about the risk of getting more stuff at once. If we merge fewer things, we *must* be able to speed up the process. Half the patches to cross-check in half the time should be easier than all patches in full time. The time to fix a problem within N patches is O(N^2). 
> even individual git trees that do get a fair bit of testing (like > networking for example) run into odd and hard to debug problems when > exposed to a wider set of hardware and loads. having the networking > changes go in every 4 months (with 4 months worth of changes) instead of > every 2 months (with 2 months worth of changes) will just mean that there > will be more problems in this area, and since they will be more > concentrated in that area it will be harder to fix them all fast as the > same group of people are needed for all of them. You're perfectly right and that's exactly not what I'm proposing. BTW, having two halves will also get more of the merge job done on the side of developers, where testing is being done before submission. So in the end, we should also get *fewer* regressions caused by each submission. > if several maintainers think that you are correct that doing a merge with > far fewer changes will be a lot faster, they can test this in the real > world by skipping one release. just send Linus a 'no changes this time' > instead of a pull request. If you are right the stable release will happen > significantly faster and they can say 'I told you so' and in the next > release have a fair chance of convincing other maintainers to skip a > release. again, this cannot work because this would result in slowing them down, and it's not what I'm proposing. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
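Willy's O(N^2) estimate above is worth making concrete. If the cost of isolating a regression grows with the number of pairwise interactions between patches merged in the same window (an assumption of this sketch, not something measured in the thread), then splitting a window helps more than linearly. A toy calculation:

```python
def cross_check_cost(n_patches):
    """Pairwise interactions among n patches: n * (n - 1) / 2.

    This is the toy model behind an O(N^2) cost estimate: any two
    patches merged in the same window may interact to produce a bug
    that has to be tracked down across both of them.
    """
    return n_patches * (n_patches - 1) // 2

one_big_window = cross_check_cost(1000)       # 499500 interactions
two_half_windows = 2 * cross_check_cost(500)  # 2 * 124750 = 249500

# Splitting the same 1000 patches across two windows roughly halves
# the total cross-checking work, because each window only has to be
# debugged against itself.
print(one_big_window, two_half_windows)
```

Under this model, merging half as much per window does not merely spread the same work out; it removes the cross-window interactions entirely, which is the part of the argument that "half the patches in half the time should be easier" relies on.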
* Re: Slow DOWN, please!!! 2008-05-01 19:28 ` Willy Tarreau @ 2008-05-01 19:46 ` david 2008-05-01 19:53 ` Willy Tarreau 0 siblings, 1 reply; 229+ messages in thread From: david @ 2008-05-01 19:46 UTC (permalink / raw) To: Willy Tarreau Cc: Linus Torvalds, Rafael J. Wysocki, Andrew Morton, David Miller, linux-kernel, Jiri Slaby On Thu, 1 May 2008, Willy Tarreau wrote: > On Thu, May 01, 2008 at 12:07:53PM -0700, david@lang.hm wrote: >> On Thu, 1 May 2008, Willy Tarreau wrote: >> >> I suspect that they would not have, and if I'm right the result of merging >> half as much wouldn't be twice as many releases, but rather approximately >> the same release schedule with more piling up for the next release. > > no, this is exactly what *not* to do. Linus is right about the risk of > getting more stuff at once. If we merge fewer things, we *must* be able > to speed up the process. Half the patches to cross-check in half the > time should be easier than all patches in full time. The time to fix > a problem within N patches is O(N^2). in general you are correct, however I don't think that it's the general bugs that end up delaying the releases, I think it's the nasty, hard to identify and understand bugs that delay the releases, and I don't think that the debugging of those will speed up much. >> even individual git trees that do get a fair bit of testing (like >> networking for example) run into odd and hard to debug problems when >> exposed to a wider set of hardware and loads. having the networking >> changes go in every 4 months (with 4 months worth of changes) instead of >> every 2 months (with 2 months worth of changes) will just mean that there >> will be more problems in this area, and since they will be more >> concentrated in that area it will be harder to fix them all fast as the >> same group of people are needed for all of them. > > You're perfectly right and that's exactly not what I'm proposing. 
BTW, > having two halves will also get more of the merge job done on the side of > developers, where testing is being done before submission. So in the > end, we should also get *fewer* regressions caused by each submission. Ok, I guess I don't understand what you are proposing then. I thought that you were proposing going from 2 week merge + 6 week stabilize = release to 1 week merge half + 3 week stabilize = release it now sounds as if you are saying 1 week merge + x week stabilize + 1 week merge + x week stabilize = release can you clarify? >> if several maintainers think that you are correct that doing a merge with >> far fewer changes will be a lot faster, they can test this in the real >> world by skipping one release. just send Linus a 'no changes this time' >> instead of a pull request. If you are right the stable release will happen >> significantly faster and they can say 'I told you so' and in the next >> release have a fair chance of convincing other maintainers to skip a >> release. > > again, this cannot work because this would result in slowing them down, > and it's not what I'm proposing. if merging fewer categories of stuff doesn't speed up the release cycle then you are right, it would just slow things down. however I thought you were arguing that if we merged fewer categories of stuff each cycle we could speed up the cycle. I'm saying that maintainers can choose to test this experimentally and see if it works. if it works we can shift to doing more of it, if it doesn't they only delay things by a couple of months one time. you would need to have several maintainers decide to participate in the experiment or the difference in cycle time may not be noticeable. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 19:46 ` david @ 2008-05-01 19:53 ` Willy Tarreau 0 siblings, 0 replies; 229+ messages in thread From: Willy Tarreau @ 2008-05-01 19:53 UTC (permalink / raw) To: david Cc: Linus Torvalds, Rafael J. Wysocki, Andrew Morton, David Miller, linux-kernel, Jiri Slaby On Thu, May 01, 2008 at 12:46:41PM -0700, david@lang.hm wrote: > On Thu, 1 May 2008, Willy Tarreau wrote: > > >On Thu, May 01, 2008 at 12:07:53PM -0700, david@lang.hm wrote: > >>On Thu, 1 May 2008, Willy Tarreau wrote: > >> > >>I suspect that they would not have, and if I'm right the result of merging > >>half as much wouldn't be twice as many releases, but rather approximately > >>the same release schedule with more piling up for the next release. > > > >no, this is exactly what *not* to do. Linus is right about the risk of > >getting more stuff at once. If we merge fewer things, we *must* be able > >to speed up the process. Half the patches to cross-check in half the > >time should be easier than all patches in full time. The time to fix > >a problem within N patches is O(N^2). > > in general you are correct, however I don't think that it's the general > bugs that end up delaying the releases, I think it's the nasty, hard to > identify and understand bugs that delay the releases, and I don't think > that the debugging of those will speed up much. Indirectly, yes, it should. Who do you think is chasing those nasty bugs? More people than should be. While those people spend time on bugs caused or revealed by combining several trees, they don't work on fixing their own bugs. > >>even individual git trees that do get a fair bit of testing (like > >>networking for example) run into odd and hard to debug problems when > >>exposed to a wider set of hardware and loads. 
having the networking > >>changes go in every 4 months (with 4 months worth of changes) instead of > >>every 2 months (with 2 months worth of changes) will just mean that there > >>will be more problems in this area, and since they will be more > >>concentrated in that area it will be harder to fix them all fast as the > >>same group of people are needed for all of them. > > > >You're perfectly right and that's exactly not what I'm proposing. BTW, > >having two halves will also get more of the merge job done on the side of > >developers, where testing is being done before submission. So in the > >end, we should also get *fewer* regressions caused by each submission. > > Ok, I guess I don't understand what you are proposing then. > > I thought that you were proposing going from 2 week merge + 6 week > stabilize = release to 1 week merge half + 3 week stabilize = release > > it now sounds as if you are saying 1 week merge + x week stabilize + 1 > week merge + x week stabilize = release > > can you clarify? The latter: 1 week merge for core, 2-4 weeks to stabilize depending on the amount of changes and complexity of some bugs, release or not at this point (probably not), then 1 week merge for the rest, and 2-4 weeks stabilize. Drivers are different. Maybe we'll find it's better to merge them with the rest, maybe we'll find it wise to merge them all along, I don't know. > >>if several maintainers think that you are correct that doing a merge with > >>far fewer changes will be a lot faster, they can test this in the real > >>world by skipping one release. just send Linus a 'no changes this time' > >>instead of a pull request. If you are right the stable release will happen > >>significantly faster and they can say 'I told you so' and in the next > >>release have a fair chance of convincing other maintainers to skip a > >>release. > > > >again, this cannot work because this would result in slowing them down, > >and it's not what I'm proposing. 
> > if merging fewer categories of stuff doesn't speed up the release cycle > then you are right, it would just slow things down. however I thought you > were arguing that if we merged fewer categories of stuff each cycle we > could speed up the cycle. I'm saying that maintainers can choose to test > this experimentally and see if it works. if it works we can shift to doing > more of it, if it doesn't they only delay things by a couple of months one > time. we should not delay too much IMHO, especially for core changes. We risk getting huge piles of code that break a lot of other things. Also, core changes sometimes involve adjustments in every driver or so. So they should not get additional delay (unless we're really bored by the maintainer not respecting the process). > you would need to have several maintainers decide to participate in the > experiment or the difference in cycle time may not be noticeable. But it would require Linus to drive it first. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
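The two cadences being debated above can be laid out side by side. The week counts below are rough figures taken from the thread itself (the current 2+6 week cycle david describes, and the midpoint of Willy's "2-4 weeks" stabilization range); this is only an illustration of the cadence, not exact numbers anyone proposed:

```python
# Current cycle: one merge window for everything, then one long stabilization.
current = [("merge everything", 2), ("stabilize everything", 6)]

# Willy's split cycle: core first, then the rest, each with its own
# stabilization phase (3 weeks as the midpoint of his 2-4 week range).
split = [
    ("merge core", 1),
    ("stabilize core", 3),
    ("merge the rest", 1),
    ("stabilize the rest", 3),
]

def total_weeks(phases):
    """Sum the durations of a list of (name, weeks) phases."""
    return sum(weeks for _, weeks in phases)

# Both cadences take the same wall-clock time per release...
print(total_weeks(current), total_weeks(split))

# ...but in the split cycle each stabilization phase only has to chase
# bugs from the immediately preceding, smaller merge.
```

This makes the point of contention visible: the split does not shorten the release, it narrows what each stabilization phase has to debug, which is exactly what david questions and Willy defends.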
* Re: Slow DOWN, please!!! 2008-05-01 18:50 ` Willy Tarreau 2008-05-01 19:07 ` david @ 2008-05-01 22:17 ` Rafael J. Wysocki 1 sibling, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 22:17 UTC (permalink / raw) To: Willy Tarreau Cc: Linus Torvalds, Andrew Morton, David Miller, linux-kernel, Jiri Slaby On Thursday, 1 of May 2008, Willy Tarreau wrote: > On Thu, May 01, 2008 at 10:41:21AM -0700, Linus Torvalds wrote: > > Same goes for "we should all just spend time looking at each others > > patches and trying to find bugs in them". That's not a solution, that's a > > drug-induced dream you're living in. > > "all" above is the wrong part. Encouraging each other to review code > will definitely *help* (and I did not say fix the problem, OK?). There > are people who regularly spend some time reviewing code. I'm thinking > about Al, Andrew, Christoph, Arjan, and maybe many other ones I'm missing, > just that I regularly see them give advice to people who post their patches > on the list. And even if only for that, they deserve some respect, and their > efforts must not be dismissed. > > Maybe they are more skilled than anyone else for this job. Maybe they're > so used to doing it that it just takes them a few minutes each time, I > don't know. I wish *more* people could be encouraged to do this work, > which is very likely painful but instructive. If the current reviewers > could give hints on how to save a lot of time, it may motivate > more people to follow them. I suspect that insisting that developers post their > less obvious work to the list(s) is a first step. Maybe at some point we're > all responsible: when we see a mail entitled "[GIT] pull request for XXX", > we should all jump on it and ask "when and where was this code reviewed?". > > Once again, it's not a fix. It's just one small step towards a saner process. > > > So do you have any productive *suggestions*? 
Some that involve more than > > "let's write less code" or "let's just review each others patches more". > > It's not much about reviewing each others' patches, it's about showing > one's work to others first. If our developers are encouraged to work > alone in a cave late at night with itching eyes, and send their work > at once every 2 months in a sealed envelope, we'll not solve anything. > > I also proposed a more repressive method inciting the ones with really > bad scores to find crap in others' work in order to remain hidden behind > them. You explained why it would not work. Fine. > > I also proposed to group merges by reduced overlapping areas, and to > shorten the merge window and hold it (at least) twice as often. Rafael > also proposed to merge core first, then archs, which is a refined variation > on the same principle. That wasn't me, but the idea is also worth considering IMO. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 17:41 ` Linus Torvalds 2008-05-01 18:11 ` Al Viro 2008-05-01 18:50 ` Willy Tarreau @ 2008-05-01 19:39 ` Friedrich Göpel 2008-05-01 21:59 ` Rafael J. Wysocki 3 siblings, 0 replies; 229+ messages in thread From: Friedrich Göpel @ 2008-05-01 19:39 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On 10:41 Thu 01 May , Linus Torvalds wrote: > Same goes for "we should all just spend time looking at each others > patches and trying to find bugs in them". That's not a solution, that's a > drug-induced dream you're living in. But is it smarter to discourage people from doing code review, by saying that they won't be doing it anyway, or actively and publicly encourage people to do so, even on the chance that it might not lead to everyone doing it? It's kind of a self-fulfilling prophecy that way. Trying to force it through the process is another matter entirely. Cheers, Friedrich Göpel ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 17:41 ` Linus Torvalds ` (2 preceding siblings ...) 2008-05-01 19:39 ` Friedrich Göpel @ 2008-05-01 21:59 ` Rafael J. Wysocki 2008-05-02 12:17 ` Stefan Richter 3 siblings, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 21:59 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby On Thursday, 1 of May 2008, Linus Torvalds wrote: > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > I obviously agree with that. The question is, however, if we can decrease the > > number of bugs introduced during merge windows and you seem to be saying > > that no, we can't. Which is disappointing. > > No, that's not what I'm saying. > > What I *am* saying is that as long as you concentrate on "merge window" > and "lots of code", you're concentrating not on the problems, but on the > facts of life. You can't change facts, and even trying is pointless. > > What you should concentrate on is not how many patches there are during > the merge window (because we can't do anything about that) or the fact > that they all happen in a short timeframe, but about quality of patches > _regardless_ of merge window. > > So if you can make an argument that does not even *try* to change the fact > that > - we have lots of patches > and > - we have a merge window > and > - merging patches causes bugs > > but argues about quality from some other standpoint, then I can start to > believe that you have a point. > > But as long as you argue about the fact that we merge a lot of stuff, and > that bugs come in during the merge window, I'm not interested. Arguing > about facts is totally non-productive. > > And as long as people keep saying "let's not merge broken patches" or "we > should never have bugs", I'll just ignore those kinds of idiotic > statements. They aren't even arguments, they are wishes, and they are > unrealistic. 
If we knew they were broken and had bugs, of course we > wouldn't merge them. > > In short - I'm simply not interested in what you _wish_ reality was. > People need to first acknowledge reality, and _then_ they may have > solutions. > > So the reality is: > - we do have tons of patches, and they need to be merged (and furiously) > > - there *will* be bugs. And the number of bugs will inevitably be > relative to the number of patches. There is no "perfect", and anybody > who argues for a lower number of bugs by lowering the number of patches > is an idiot in my book. > > - there *will* be releases, even in the presence of bugs, because holding > everything up is simply not an option. > > Those are the things that we have to accept. Anything else is just > dreaming. > > Now, what part _can_ we improve and still be realistic? > > We can try to improve average quality - the number of bugs will *still* be > relative to the size of the changes (no getting away from that), but we > may be able to lower the absolute number of bugs. But not to zero! > > And that "not to zero" is IMPORTANT. If you think you can aim for zero > bugs, No, I don't. I've never said we can _eliminate_ bugs and please don't make things look as though I did. > I'm simply not interested in discussing it with you. You live in a > different universe, and we're not talking about the same reality. > > And if you're not being realistic, then why the hell would I believe that > your solutions are realistic? I'd rather take some pills and talk to the > little purple man living under the deck in my back yard, because at least > he's amusing, even if he doesn't make much sense either. That's not a level of discussion I'm used to, sorry. > And I'm also not in the *least* interested in arguments like "We should > just improve our quality of patches". > > Of course everybody wishes for that. 
Again, it's not an argument, it's > just an unrealistic wish, unless you can actually give a suggestion of a > process or other thing that would actually seem to reach it (without > assuming other impossible things like "we need more time" or "we need > more people who just spend their day looking for bugs"). > > Same goes for "we should all just spend time looking at each others > patches and trying to find bugs in them". Not necessarily trying to find bugs in them, but trying to understand how the patched code is supposed to work and if that's really what we want. I really think we should review each other's code more, but I do realize that people don't do it. Of course, I'm digressing. > That's not a solution, that's a drug-induced dream you're living in. And > again, if I want to discuss dreams, I'd rather talk about my purple guy, and > the bad things he does to the hedgehog that lives next door. > > So do you have any productive *suggestions*? Some that involve more than > "let's write less code" or "let's just review each others patches more". I'm not sure if you find it productive, but whatever. A general rule that the trees people want you to pull during a merge window should be tested in linux-next beforehand, with no additional last-minute changes, may help. For this to work, though, the people will have to know in advance when the merge window will start. Which may be helpful anyway. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 21:59 ` Rafael J. Wysocki @ 2008-05-02 12:17 ` Stefan Richter 0 siblings, 0 replies; 229+ messages in thread From: Stefan Richter @ 2008-05-02 12:17 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, Andrew Morton, Willy Tarreau, David Miller, linux-kernel, Jiri Slaby Rafael J. Wysocki wrote: > A general rule that the trees people want you to pull during a merge window > should be tested in linux-next beforehand, with no additional last-minute changes, > may help. > > For this to work, though, the people will have to know in advance when the > merge window will start. Which may be helpful anyway. If I only release into my tree's for-next branch what I would release into my tree's for-linus branch if I were to send a merge request to Linus right at this moment, then I won't need advance notice of a merge window. IOW, treat -next every day as if the merge window was open right now. I'm sure it is not that easy for the larger subsystems or the infrastructure trees. However, Linus' late -rc announcements are plenty of advance notice, at least for a merge period as long as two weeks. -- Stefan Richter -=====-==--- -=-= ---=- http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 15:26 ` Linus Torvalds 2008-05-01 17:09 ` Rafael J. Wysocki @ 2008-05-01 18:35 ` Chris Frey 2008-05-02 13:22 ` Enrico Weigelt 1 sibling, 1 reply; 229+ messages in thread From: Chris Frey @ 2008-05-01 18:35 UTC (permalink / raw) To: linux-kernel On Thu, May 01, 2008 at 08:26:27AM -0700, Linus Torvalds wrote: > So let me repeat: > > (1) we have new code. We always *will* have new code, hopefully. A few > million lines per year. Pardon this comment from an inexperienced kernel hacker, but it seems to me that one of the main problems is subsystems stomping on each other during the merge window, and a general confusion as to who is responsible for what bugs that appear. Perhaps a shorter merge window, using a round-robin approach, based on subsystem, would help alleviate these issues? This would: - give people a "known" tree to base their subsystem patches on, when their turn comes around - give a rough schedule if the round-robin was always consistent in order, or made known in advance - a shorter window would keep people from waiting too long for their turn - give those responsible for the currently merged subsystem motivation and clarity to fix bugs that do appear during their merge window Problems I see with this approach: - those at the end of the cycle get the shaft, if previous changes affect their work - political issues with determining the order of the round-robin schedule If I'm overlooking something, I'm sure someone will correct me. :-) - Chris ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 18:35 ` Chris Frey @ 2008-05-02 13:22 ` Enrico Weigelt 0 siblings, 0 replies; 229+ messages in thread From: Enrico Weigelt @ 2008-05-02 13:22 UTC (permalink / raw) To: linux kernel list Hi folks, <big_snip> Just a few naive thoughts: a) What about reducing code size? Some parts, IMHO, don't necessarily need to be in the kernel, eg. certain filesystems. Less code, fewer patches to review, less chance of kernel bugs. Of course this might also cause other impacts (eg. performance), so those decisions require great care. b) Multi-tier trees / patchlines IMHO, a major problem is conflicting patches (eg. a core change causes some driver to break). In measurement instrumentation (eg. timesync), there's typically one primary reference point (eg. atomic clock) as tier-0, where (a limited set of) tier-1's are synchronized against, tier-2 syncs against tier-1 and so on. So for the linux kernel, we perhaps could have something like: * tier-0: core * tier-1: arch * tier-2: hw drivers * tier-3: sw drivers * tier-4: userland interfaces If a change from a lower tier wants to go to its upper tier, it first MUST fit its current mainline and be carefully checked. Of course this introduces longer times for an individual change to go into a release (since it has to pass several tiers), but IMHO the chance of new bugs in a release should be reduced this way. Of course there might be changes in a lower tier, which obviously won't affect several intermediate tiers. Those could skip some tiers. For example, I'm currently working on a /proc interface for changing process privileges. In my model, this had to be settled in #4, but shouldn't touch drivers (#2,#3), but maybe arch (#1). So these changes could be kicked directly to #2. What do you think about this? 
cu -- --------------------------------------------------------------------- Enrico Weigelt == metux IT service - http://www.metux.de/ --------------------------------------------------------------------- Please visit the OpenSource QM Taskforce: http://wiki.metux.de/public/OpenSource_QM_Taskforce Patches / Fixes for a lot dozens of packages in dozens of versions: http://patches.metux.de/ --------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:19 ` Linus Torvalds 2008-05-01 1:31 ` Andrew Morton @ 2008-05-01 1:40 ` Linus Torvalds 2008-05-01 1:51 ` David Miller ` (4 more replies) 2008-05-01 5:50 ` Willy Tarreau 2 siblings, 5 replies; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 1:40 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, 30 Apr 2008, Linus Torvalds wrote: > > You (and Andrew) have tried to argue that slowing things down results in > better quality, Sorry, not Andrew. DavidN. Andrew argued the other way (quality->slower), which I also happen to not necessarily believe in, but that's a separate argument. Nobody should ever argue against raising quality. The question could be about "at what cost"? (although I think that's not necessarily a good argument, since I personally suspect that good quality code comes from _lowering_ costs, not raising them). But what's really relevant is "how?" Now, we do know that open-source code tends to be higher quality (along a number of metrics) than closed source code, and my argument is that it's not because of bike-shedding (aka code review), but simply because the code is out there and available and visible. And as a result of that, my personal belief is that the best way to raise quality of code is to distribute it. Yes, as patches for discussion, but even more so as a part of a cohesive whole - as _merged_ patches! The thing is, the quality of individual patches isn't what matters! What matters is the quality of the end result. And people are going to be a lot more involved in looking at, testing, and working with code that is merged, rather than code that isn't. So _my_ answer to the "how do we raise quality" is actually the exact reverse of what you guys seem to be arguing. IOW, I argue that the high speed of merging very much is a big part of what gives us quality in the end. 
It may result in bugs along the way, but it also results in fixes, and lots of people looking at the result (and looking at it in *context*, not just as a patch flying around). And yes, maybe that sounds counter-intuitive. But hey, people thought open source was counter-intuitive. I spent years explaining why it should work at all! Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:40 ` Linus Torvalds @ 2008-05-01 1:51 ` David Miller 2008-05-01 2:01 ` Linus Torvalds 2008-05-01 2:21 ` Al Viro ` (3 subsequent siblings) 4 siblings, 1 reply; 229+ messages in thread From: David Miller @ 2008-05-01 1:51 UTC (permalink / raw) To: torvalds; +Cc: rjw, w, linux-kernel, akpm, jirislaby From: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed, 30 Apr 2008 18:40:39 -0700 (PDT) > IOW, I argue that the high speed of merging very much is a big part of > what gives us quality in the end. It may result in bugs along the way, but > it also results in fixes, and lots of people looking at the result (and > looking at it in *context*, not just as a patch flying around). This is a huge burden to put on people. The more broken stuff you merge, the more people are forced to track these problems down so that they can get their own work done. It punishes people who do put forth the effort to let new changes cook properly, before pushing, and thus avoid putting turds into the tree. You really have to think about the ramifications of this system. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:51 ` David Miller @ 2008-05-01 2:01 ` Linus Torvalds 2008-05-01 2:17 ` David Miller 0 siblings, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 2:01 UTC (permalink / raw) To: David Miller; +Cc: rjw, w, linux-kernel, akpm, jirislaby On Wed, 30 Apr 2008, David Miller wrote: > From: Linus Torvalds <torvalds@linux-foundation.org> > Date: Wed, 30 Apr 2008 18:40:39 -0700 (PDT) > > > IOW, I argue that the high speed of merging very much is a big part of > > what gives us quality in the end. It may result in bugs along the way, but > > it also results in fixes, and lots of people looking at the result (and > > looking at it in *context*, not just as a patch flying around). > > This is a huge burdon to put on people. > > The more broken stuff you merge, the more people are forced to track > these problems down so that they can get their own work done. I'm not saying we should merge crap. You can take any argument too far, and clearly it doesn't mean that we should just accept *anything*, because it will magically be gilded by its mere inclusion into the kernel. No, I'm not going to argue that. But I do want to argue against the notion that the only way to raise quality is to do it before it gets merged. It's often better to merge early, and fix the issues the merge brings up early too! Release early, release often. That was the watch-word early in Linux kernel development, and there was a reason for it. And it _worked_. Did it mean "release crap, release anything"? No. But it did mean that things got lots more exposure - even if those "things" were sometimes bugs. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 2:01 ` Linus Torvalds @ 2008-05-01 2:17 ` David Miller 0 siblings, 0 replies; 229+ messages in thread From: David Miller @ 2008-05-01 2:17 UTC (permalink / raw) To: torvalds; +Cc: rjw, w, linux-kernel, akpm, jirislaby From: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed, 30 Apr 2008 19:01:12 -0700 (PDT) > I'm not saying we should merge crap. That's exactly what's been happening this merge window though. And throughout this, Andrew Morton has been the only person with the balls and lack of ego problems to revert regression-causing changes he introduced. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:40 ` Linus Torvalds 2008-05-01 1:51 ` David Miller @ 2008-05-01 2:21 ` Al Viro 2008-05-01 5:19 ` david 2008-05-04 3:26 ` Rene Herman 2008-05-01 2:31 ` Nigel Cunningham ` (2 subsequent siblings) 4 siblings, 2 replies; 229+ messages in thread From: Al Viro @ 2008-05-01 2:21 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, Apr 30, 2008 at 06:40:39PM -0700, Linus Torvalds wrote: > Now, we do know that open-source code tends to be higher quality (along a > number of metrics) than closed source code, and my argument is that it's > not because of bike-shedding (aka code review), but simply because the > code is out there and available and visible. Really? And how, pray tell, being out there will magically improve the code? "With enough eyes all bugs are shallow" stuff out of ESR's arse? FWIW, after the last month's flamefests I decided to actually do something about review density of code in the areas I'm theoretically responsible for. Namely, do systematic review of core data structure handling (starting with the place where most of the codepaths get into VFS - descriptor tables and struct file), doing both a blow-by-blow writeup on how that sort of thing is done and documentation of the life cycle/locking rules/assertions made by code/etc. I made one bad mistake that held the things back for quite a while - sending a heads-up for one of the worse bugs found in process to never-sufficiently-damned vendor-sec. The last time I'm doing that, TYVM... Anyway, I'm going to get the notes on that stuff in order and put them in the open. I really hope that other folks will join the fun afterwards. The goal is to get a coherent braindump that would be sufficient for people new to the area wanting to understand and review VFS-related code - both in the tree and in new patches. 
files_struct/fdtable handling is mostly dealt with, struct file is only partially done - unfortunately, struct file_lock has to be dealt with before that and it's a (predictable) nightmare. On the other end of things, fs_struct is not really started, vfsmount review is partially done, dentry/superblock/inode not even touched. Even with what little had been covered... well, let's just say that it caught quite a few fun turds. With typical age around 3-4 years. And VFS is not the messiest part of the tree... ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 2:21 ` Al Viro @ 2008-05-01 5:19 ` david 2008-05-04 3:26 ` Rene Herman 1 sibling, 0 replies; 229+ messages in thread From: david @ 2008-05-01 5:19 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Al Viro wrote: > FWIW, after the last month's flamefests I decided to actually do something > about review density of code in the areas I'm theoretically responsible > for. Namely, do systematic review of core data structure handling (starting > with the place where most of the codepaths get into VFS - descriptor tables > and struct file), doing both blow-by-blow writeup on how that sort of things > is done and documentation of the life cycle/locking rules/assertions made > by code/etc. I made one bad mistake that held the things back for quite > a while - sending heads-up for one of the worse bugs found in process to > never-sufficiently-damned vendor-sec. The last time I'm doing that, TYVM... > > Anyway, I'm going to get the notes on that stuff in order and put them in > the open. I really hope that other folks will join the fun afterwards. > The goal is to get a coherent braindump that would be sufficient for > people new to the area wanting to understand and review VFS-related code - > both in the tree and in new patches. thank you, the lack of good documentation on the intent of the code has been a significant barrier for new people. it's (relatively) easy for a good programmer to look at the code and figure out how it does things, a bit harder to figure out what it does, but why it does it (and what it was actually _intended_ to do) is very hard to track down > files_struct/fdtable handling is mostly dealt with, struct file is only > partially done - unfortunately, struct file_lock has to be dealt with > before that and it's a (predictable) nightmare. 
On the other end of > things, fs_struct is not really started, vfsmount review is partially > done, dentry/superblock/inode not even touched. > > Even with what little had been covered... well, let's just say that it > caught quite a few fun turds. With typical age around 3-4 years. And > VFS is not the messiest part of the tree... it may not be the messiest part of the tree, but it's definitely one of the hardest to figure out the intent of. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 2:21 ` Al Viro 2008-05-01 5:19 ` david @ 2008-05-04 3:26 ` Rene Herman 1 sibling, 0 replies; 229+ messages in thread From: Rene Herman @ 2008-05-04 3:26 UTC (permalink / raw) To: Al Viro Cc: Linus Torvalds, Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On 01-05-08 04:21, Al Viro wrote: > Really? And how, pray tell, being out there will magically improve the > code? "With enough eyes all bugs are shallow" stuff out of ESR's arse? In the same way that ESR's arse would improve if he'd not wear pants: by him going to the gym more to avoid at least a few of the many disgusted stares. ie, the magic would be in the quality of the code being greater simply due to the developer being aware of the openness. The effect probably wears off after enough time though... Rene. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:40 ` Linus Torvalds 2008-05-01 1:51 ` David Miller 2008-05-01 2:21 ` Al Viro @ 2008-05-01 2:31 ` Nigel Cunningham 2008-05-01 18:32 ` Stephen Clark 2008-05-01 3:53 ` Frans Pop 2008-05-01 11:38 ` Rafael J. Wysocki 4 siblings, 1 reply; 229+ messages in thread From: Nigel Cunningham @ 2008-05-01 2:31 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Hi. On Wed, 2008-04-30 at 18:40 -0700, Linus Torvalds wrote: > The thing is, the quality of individual patches isn't what matters! What > matters is the quality of the end result. And people are going to be a lot > more involved in looking at, testing, and working with code that is > merged, rather than code that isn't. No. People generally expect that code that has been merged does work, so they don't look at it unless they're forced to (by a bug or the desire to make further modifications in that code) and they don't explicitly seek to test it. They just seek to use it. When it doesn't work, some of us will go and seek to find the cause, others (most?) will simply roll back to whatever they last found to be reliable. Out of tree code has the same issues. The only time code really gets looked at and tested is when there's a problem, or when people are explicitly choosing to inspect it (pre-merge reviews, eg). So my answer to the "how do we raise quality" question would be that when writing the code, we put time and effort into properly analysing the problem and developing a solution, we put time and effort into carefully testing the solution, and we put code in that will help the end-user help us to debug issues later (without them necessarily needing to git-bisect). After all, good software isn't the result of random (or semi-random), unconsidered modifications, but of planning, thought and attention to detail. In other words, I'm arguing that the speed of merging should be irrelevant. 
What's relevant is the quality of the work done in the first place. If you want better quality code, penalise the people who get buggy code merged. Give them a reason to get it in a better state before they try to merge. Of course Linus alone can't do that. Nigel ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 2:31 ` Nigel Cunningham @ 2008-05-01 18:32 ` Stephen Clark 0 siblings, 0 replies; 229+ messages in thread From: Stephen Clark @ 2008-05-01 18:32 UTC (permalink / raw) To: Nigel Cunningham Cc: Linus Torvalds, Rafael J. Wysocki, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Nigel Cunningham wrote: > Hi. > > On Wed, 2008-04-30 at 18:40 -0700, Linus Torvalds wrote: >> The thing is, the quality of individual patches isn't what matters! What >> matters is the quality of the end result. And people are going to be a lot >> more involved in looking at, testing, and working with code that is >> merged, rather than code that isn't. > > No. People generally expect that code that has been merged does work, so > they don't look at it unless they're forced to (by a bug or the desire > to make further modifications in that code) and they don't explicitly > seek to test it. They just seek to use it. > > When it doesn't work, some of us will go and seek to find the cause, > others (most?) will simply roll back to whatever they last found to be > reliable. > > Out of tree code has the same issues. > > The only time code really gets looked at and tested is when there's a > problem, or when people are explicitly choosing to inspect it (pre-merge > reviews, eg). > > So my answer to the "how do we raise quality" question would be that > when writing the code, we put time and effort into properly analysing > the problem and developing a solution, we put time and effort into > carefully testing the solution, and we put code in that will help the > end-user help us to debug issues later (without them necessarily needing > to git-bisect). After all, good software isn't the result of random (or > semi-random), unconsidered modifications, but of planning, thought and > attention to detail. > > In other words, I'm arguing that the speed of merging should be > irrelevant. 
What's relevant is the quality of the work done in the first > place. > > If you want better quality code, penalise the people who get buggy code > merged. Give them a reason to get it in a better state before they try > to merge. Of course Linus alone can't do that. > > Nigel > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > Amen! -- "They that give up essential liberty to obtain temporary safety, deserve neither liberty nor safety." (Ben Franklin) "The course of history shows that as a government grows, liberty decreases." (Thomas Jefferson) ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:40 ` Linus Torvalds ` (2 preceding siblings ...) 2008-05-01 2:31 ` Nigel Cunningham @ 2008-05-01 3:53 ` Frans Pop 2008-05-01 11:38 ` Rafael J. Wysocki 4 siblings, 0 replies; 229+ messages in thread From: Frans Pop @ 2008-05-01 3:53 UTC (permalink / raw) To: Linus Torvalds; +Cc: rjw, w, davem, linux-kernel, akpm, jirislaby Linus Torvalds wrote: > IOW, I argue that the high speed of merging very much is a big part of > what gives us quality in the end. It may result in bugs along the way, but > it also results in fixes, and lots of people looking at the result (and > looking at it in *context*, not just as a patch flying around). The main problem as I see it is with the huge number of hard, confirmed bugs that are *not* getting fixed. With the current development model, developers only really care about current regressions. In a large part this is due to the excellent work of Rafael with his tracking of regressions since the previous release. But it does mean older regressions fall by the wayside, even if they've been confirmed, bisected and the submitter is responsive. For a while Natalie Protasevich did some work on trying to get attention for older regressions, but that effort seems to have died out. Two concrete examples from my personal experience: - http://bugzilla.kernel.org/show_bug.cgi?id=9749; the error: sysctl table check failed: /dev/parport/parport0/devices/ppdev0/timeslice Sysctl already exists First reported for 2.6.24-rc5, just now confirmed with 2.6.25 Acknowledged by maintainer, but no follow-up [1]. - http://bugzilla.kernel.org/show_bug.cgi?id=9310; the error: completely blank console with FRAMEBUFFER_CONSOLE_DETECT_PRIMARY set when framebuffer is active, but no VGA=xxx parameter is passed First reported for 2.6.23, confirmed for 2.6.24-rc6, almost certainly still present in 2.6.25 Acknowledged by maintainer, but no follow-up despite later pings. 
Another issue is that sometimes developers really are too eager to get their changes into mainline even when there are known issues or when they know in their heart that the changes have not received enough testing. Example is a scheduler change [2] that causes a completely reproducible regression (music skips and key repeats) on my box with one specific workload. Ingo and Peter have been great doing debugging after I reported it for 2.6.25-rc8 and it was reverted just before the release, but I was very surprised to see the patch resubmitted for 2.6.26 without the regression being resolved first. It is now confirmed to still be there and there has been additional effort on it, but so far without result. This really is nothing against Ingo (in fact he is in my experience one of the most responsive developers when issues are reported), but in this case I personally do feel the patch should not have been reintroduced into mainline before the regression had been sorted out. Cheers, FJP [1] Update: Eric just added a nice reply in Bugzilla. [2] http://bugzilla.kernel.org/show_bug.cgi?id=10428 http://lkml.org/lkml/2008/4/19/181 ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:40 ` Linus Torvalds ` (3 preceding siblings ...) 2008-05-01 3:53 ` Frans Pop @ 2008-05-01 11:38 ` Rafael J. Wysocki 2008-04-30 14:28 ` Arjan van de Ven 4 siblings, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 11:38 UTC (permalink / raw) To: Linus Torvalds Cc: Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Linus Torvalds wrote: > > > > You (and Andrew) have tried to argue that slowing things down results in > > better quality, > > Sorry, not Andrew. DavidN. > > Andrew argued the other way (quality->slower), which I also happen to not > necessarily believe in, but that's a separate argument. > > Nobody should ever argue against raising quality. > > The question could be about "at what cost"? (although I think that's not > necessarily a good argument, since I personally suspect that good quality > code comes from _lowering_ costs, not raising them). > > But what's really relevant is "how?" > > Now, we do know that open-source code tends to be higher quality (along a > number of metrics) than closed source code, and my argument is that it's > not because of bike-shedding (aka code review), but simply because the > code is out there and available and visible. > > And as a result of that, my personal belief is that the best way to raise > quality of code is to distribute it. Yes, as patches for discussion, but > even more so as a part of a cohesive whole - as _merged_ patches! > > The thing is, the quality of individual patches isn't what matters! What > matters is the quality of the end result. And people are going to be a lot > more involved in looking at, testing, and working with code that is > merged, rather than code that isn't. > > So _my_ answer to the "how do we raise quality" is actually the exact > reverse of what you guys seem to be arguing. 
> > IOW, I argue that the high speed of merging very much is a big part of > what gives us quality in the end. It may result in bugs along the way, but > it also results in fixes, and lots of people looking at the result (and > looking at it in *context*, not just as a patch flying around). And we introduce bugs that nobody sees until they appear in a CERT advisory. IMnsHO, the quick merging results in lots of code that nobody looked at, except for the author, nobody is looking at and nobody will _ever_ look at. Simply, because there's no time for looking at that code, since we're supposed to be working on preparing new code for the next merge window, testing the already merged code etc., around the clock. Now, you may hope that this not-looked-at-by-anyone code is of high quality nevertheless, but I somehow doubt it. [Note that it's not directly related to the issue at hand, which is the fact that people affected by regressions are heavily punished by our current process. Never mind, though.] And that's not to mention bugs that appear in the code everybody looked at and happily reach the mainline because that code has not been tested well enough before merging. Take SLUB as an example, if you wish. The fact is, we're merging stuff with minimal-to-no review and with minimal testing reasonably possible. Is _that_ supposed to produce the high quality? Also, I'm not buying the argument that the quality of code improves over time just because it's open and available to everyone. That only happens to the code which is actually looked at by someone or attempted to modify. This obviously doesn't apply to the whole kernel code. For this reason, IMO, we should do our best to ensure that the code being merged is of high quality already at the moment we merge it. How to achieve that is a separate issue. BTW, we seem to underestimate testing in this discussion. 
In fact, the vast majority of kernel bugs are discovered by testing, so perhaps the way to go is to make regular testing of the new code a part of the process. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 11:38 ` Rafael J. Wysocki @ 2008-04-30 14:28 ` Arjan van de Ven 2008-05-01 12:41 ` Rafael J. Wysocki 0 siblings, 1 reply; 229+ messages in thread From: Arjan van de Ven @ 2008-04-30 14:28 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008 13:38:33 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > BTW, we seem to underestimate testing in this discussion. In fact, > the vast majority of kernel bugs are discovered by testing, so > perhaps the way to go is to make regular testing of the new code a > part of the process. well.. -rc1 to -rc8 are doing that already, somewhat. Can we do better? Always. The more testing the better, and the more testers the better. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 14:28 ` Arjan van de Ven @ 2008-05-01 12:41 ` Rafael J. Wysocki 2008-04-30 15:06 ` Arjan van de Ven 0 siblings, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 12:41 UTC (permalink / raw) To: Arjan van de Ven Cc: Linus Torvalds, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wednesday, 30 of April 2008, Arjan van de Ven wrote: > On Thu, 1 May 2008 13:38:33 +0200 > "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > > > BTW, we seem to underestimate testing in this discussion. In fact, > > the vast majority of kernel bugs are discovered by testing, so > > perhaps the way to go is to make regular testing of the new code a > > part of the process. > > well.. -rc1 to -rc8 are doing that already, somewhat. Somewhat. > Can we do better? Always. The more testing the better, and the more > testers the better. The testing is not really a part of the process right now, though. We somehow hope that the kernel will be tested sufficiently before a major release, but we don't measure the testing coverage, for example. Of course, that will involve more work independent of the code writing, but at one point it'll just become a necessity. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 12:41 ` Rafael J. Wysocki @ 2008-04-30 15:06 ` Arjan van de Ven 0 siblings, 0 replies; 229+ messages in thread From: Arjan van de Ven @ 2008-04-30 15:06 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, Willy Tarreau, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008 14:41:05 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > > The testing is not really a part of the process right now, though. > We somehow hope that the kernel will be tested sufficiently before a > major release, but we don't measure the testing coverage, for > example. Well. Take 2.6.25.. we know Fedora shipped it in their alphas and betas (and in rawhide). Those are used by a lot of people; so for me that's a whole bunch of coverage right there. Is it perfect? No. But in a way it's in the spirit of open source: the people who care about a stable release the most (distros) [1] helped us get this tested. The other people on this thread who care greatly at least also help us test in general. [1] Not trying to say no single person wouldn't care; but a distro tends to care more due to the sheer number of users... ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:19 ` Linus Torvalds 2008-05-01 1:31 ` Andrew Morton 2008-05-01 1:40 ` Linus Torvalds @ 2008-05-01 5:50 ` Willy Tarreau 2008-05-01 11:53 ` Rafael J. Wysocki 2 siblings, 1 reply; 229+ messages in thread From: Willy Tarreau @ 2008-05-01 5:50 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, Apr 30, 2008 at 06:19:56PM -0700, Linus Torvalds wrote: > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > I do _not_ want to slow down development by setting some kind of "quality > > > bar" - but I do believe that we should keep our quality high, not because > > > of any hoops we need to jump through, but because we take pride in the > > > thing we do. > > > > Well, we certainly should, but do we always remeber about it? Honest, guv? > > Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like > process generates quality? > > And I dislike how people try to conflate "quality" and "merging speed" as > if there was any reason what-so-ever to believe that they are related. > > You (and Andrew) have tried to argue that slowing things down results in > better quality, and I simply don't for a moment believe that. I believe > the exact opposite. Note that I'm not necessarily arguing for slowing down, but for reduced functional conflicts (which slow down may help but it's not the only solution). I think that refining the time resolution might achieve the same goal. Instead of merging 10000 changes which each have 1% chance of breaking any other area, and have all developers try to hunt bugs caused by unrelated changes, I think we could do that in steps. To illustrate, instead of changing 100 areas with one of them causing breaking in the other ones, and having 100 victims try to hunt the bug in 99 other areas, then theirs, and finally insult the faulty author, we could merge 50 areas in version X and 50 in X+1 (or 3*33 or 4*25, etc...). 
That way, we would only have 50 victims trying to find the bug in 49 other areas (or 32 or 24). Less people wasting their time will mean faster validation of changes, and possibly faster release cycle with better quality. People send you their crap every two months. If you accept half of it every month, they don't have to sleep on their code, and at the same time at most half of them are in trouble during half the time (since bugs are found faster). > So if we can get the discussion *away* from the "let's slow things down", > then I'm interested. Because at that point we don't have to fight made-up > arguments about something irrelevant. well, is "let's split changes" ok ? > Linus Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
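Willy's arithmetic can be sketched in a few lines of Python. Everything here is an illustration of his argument, not anything from the thread: the cost model simply assumes each merged change independently breaks something with some probability, and that hunting a bug means ruling out every other change merged in the same batch.

```python
# Sketch of Willy's batching argument (numbers and cost model are
# illustrative assumptions): splitting one big merge window into
# smaller batches does not reduce the expected number of bugs, but
# it shrinks the pool of suspect changes each victim must search.

def expected_cost(n_changes, p_break, batches):
    """Return (expected bugs overall, suspect changes per bug-hunt)."""
    per_batch = n_changes / batches
    expected_bugs = n_changes * p_break   # same total either way
    suspects_per_bug = per_batch - 1      # other changes in the batch
    return expected_bugs, suspects_per_bug

# One big merge window: 100 areas at once, 1% breakage each.
one_shot = expected_cost(100, 0.01, 1)   # -> (1.0, 99.0)

# Willy's alternative: the same changes in two windows of 50.
split = expected_cost(100, 0.01, 2)      # -> (1.0, 49.0)
```

The total bug count is unchanged; what halves is the search space each affected developer faces, which is exactly the "50 victims trying to find the bug in 49 other areas" point above.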
* Re: Slow DOWN, please!!! 2008-05-01 5:50 ` Willy Tarreau @ 2008-05-01 11:53 ` Rafael J. Wysocki 2008-05-01 12:11 ` Will Newton ` (2 more replies) 0 siblings, 3 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 11:53 UTC (permalink / raw) To: Willy Tarreau Cc: Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Willy Tarreau wrote: > On Wed, Apr 30, 2008 at 06:19:56PM -0700, Linus Torvalds wrote: > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > I do _not_ want to slow down development by setting some kind of "quality > > > > bar" - but I do believe that we should keep our quality high, not because > > > > of any hoops we need to jump through, but because we take pride in the > > > > thing we do. > > > > > > Well, we certainly should, but do we always remeber about it? Honest, guv? > > > > Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like > > process generates quality? > > > > And I dislike how people try to conflate "quality" and "merging speed" as > > if there was any reason what-so-ever to believe that they are related. > > > > You (and Andrew) have tried to argue that slowing things down results in > > better quality, and I simply don't for a moment believe that. I believe > > the exact opposite. > > Note that I'm not necessarily arguing for slowing down, but for reduced > functional conflicts (which slow down may help but it's not the only > solution). I think that refining the time resolution might achieve the > same goal. Instead of merging 10000 changes which each have 1% chance > of breaking any other area, and have all developers try to hunt bugs > caused by unrelated changes, I think we could do that in steps. 
> > To illustrate, instead of changing 100 areas with one of them causing > breaking in the other ones, and having 100 victims try to hunt the > bug in 99 other areas, then theirs, and finally insult the faulty > author, we could merge 50 areas in version X and 50 in X+1 (or 3*33 > or 4*25, etc...). That way, we would only have 50 victims trying to > find the bug in 49 other areas (or 32 or 24). Less people wasting > their time will mean faster validation of changes, and possibly > faster release cycle with better quality. > > People send you their crap every two months. If you accept half of > it every month, they don't have to sleep on their code, and at the > same time at most half of them are in trouble during half the time > (since bugs are found faster). Well, as far as I'm concerned, that will work too. > > So if we can get the discussion *away* from the "let's slow things down", > > then I'm interested. Because at that point we don't have to fight made-up > > arguments about something irrelevant. > > well, is "let's split changes" ok ? How about: (1) Merge a couple of trees at a time (one tree at a time would be ideal, but that's impossible due to the total number of trees). (2) After (1) give testers some time to report problems introduced by the merge. (3) Wait until the most urgent problems are resolved. Revert the offending changes if there's no solution within given time. (4) Repeat for another couple of trees. (5) Arrange things so that every tree gets merged once every two months. This would also give us an idea of which trees introduce more problems. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
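Rafael's point (5) — every tree merged once per two-month cycle, a couple of trees at a time — amounts to a round-robin schedule. A toy sketch, with made-up tree names purely for illustration:

```python
# Toy round-robin scheduler for Rafael's proposal (tree names are
# hypothetical examples): assign each subsystem tree to one merge
# slot so that every tree gets merged exactly once per cycle, and
# breakage can be attributed to a small set of trees per slot.

def merge_schedule(trees, slots):
    """Map slot index -> list of trees merged in that slot."""
    schedule = {s: [] for s in range(slots)}
    for i, tree in enumerate(trees):
        schedule[i % slots].append(tree)
    return schedule

trees = ["net", "mm", "sched", "vfs", "x86", "drivers"]
plan = merge_schedule(trees, 3)
# plan[0] -> ["net", "vfs"], plan[1] -> ["mm", "x86"], ...
```

With only a couple of trees per slot, a regression reported during a slot's stabilization period (steps 2-3 above) has far fewer candidate trees to blame, which is the attribution benefit Rafael is after.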
* Re: Slow DOWN, please!!! 2008-05-01 11:53 ` Rafael J. Wysocki @ 2008-05-01 12:11 ` Will Newton 2008-05-01 13:16 ` Bartlomiej Zolnierkiewicz 2008-05-01 19:36 ` Valdis.Kletnieks 2 siblings, 0 replies; 229+ messages in thread From: Will Newton @ 2008-05-01 12:11 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Willy Tarreau, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, May 1, 2008 at 12:53 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > On Thursday, 1 of May 2008, Willy Tarreau wrote: > > On Wed, Apr 30, 2008 at 06:19:56PM -0700, Linus Torvalds wrote: > > > > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > I do _not_ want to slow down development by setting some kind of "quality > > > > > bar" - but I do believe that we should keep our quality high, not because > > > > > of any hoops we need to jump through, but because we take pride in the > > > > > thing we do. > > > > > > > > Well, we certainly should, but do we always remeber about it? Honest, guv? > > > > > > Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like > > > process generates quality? > > > > > > And I dislike how people try to conflate "quality" and "merging speed" as > > > if there was any reason what-so-ever to believe that they are related. > > > > > > You (and Andrew) have tried to argue that slowing things down results in > > > better quality, and I simply don't for a moment believe that. I believe > > > the exact opposite. > > > > Note that I'm not necessarily arguing for slowing down, but for reduced > > functional conflicts (which slow down may help but it's not the only > > solution). I think that refining the time resolution might achieve the > > same goal. Instead of merging 10000 changes which each have 1% chance > > of breaking any other area, and have all developers try to hunt bugs > > caused by unrelated changes, I think we could do that in steps. 
> > > > To illustrate, instead of changing 100 areas with one of them causing > > breaking in the other ones, and having 100 victims try to hunt the > > bug in 99 other areas, then theirs, and finally insult the faulty > > author, we could merge 50 areas in version X and 50 in X+1 (or 3*33 > > or 4*25, etc...). That way, we would only have 50 victims trying to > > find the bug in 49 other areas (or 32 or 24). Less people wasting > > their time will mean faster validation of changes, and possibly > > faster release cycle with better quality. > > > > People send you their crap every two months. If you accept half of > > it every month, they don't have to sleep on their code, and at the > > same time at most half of them are in trouble during half the time > > (since bugs are found faster). > > Well, as far as I'm concerned, that will work too. > > > > > So if we can get the discussion *away* from the "let's slow things down", > > > then I'm interested. Because at that point we don't have to fight made-up > > > arguments about something irrelevant. > > > > well, is "let's split changes" ok ? > > How about: > > (1) Merge a couple of trees at a time (one tree at a time would be ideal, but > that's impossible due to the total number of trees). > (2) After (1) give testers some time to report problems introduced by the > merge. > (3) Wait until the most urgent problems are resolved. Revert the offending > changes if there's no solution within given time. > (4) Repeat for another couple of trees. > (5) Arrange things so that every tree gets merged once every two months. > > This would also give us an idea of which trees introduce more problems. Perhaps it would make sense to split the merge window into 2 - first week kernel/net/mm/lib etc., second week arch/drivers/fs? 
Obviously some changes are going to span those two areas but it might help in pinpointing where breakage was introduced as well as quietening the thundering herd of pull requests at the start of a merge window and thereby allow review to happen over a longer period. Or I could just be dreaming... ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 11:53 ` Rafael J. Wysocki 2008-05-01 12:11 ` Will Newton @ 2008-05-01 13:16 ` Bartlomiej Zolnierkiewicz 2008-05-01 13:53 ` Rafael J. Wysocki 2008-05-01 15:29 ` Ray Lee 2008-05-01 19:36 ` Valdis.Kletnieks 2 siblings, 2 replies; 229+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-05-01 13:16 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Willy Tarreau, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday 01 May 2008, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, Willy Tarreau wrote: > > On Wed, Apr 30, 2008 at 06:19:56PM -0700, Linus Torvalds wrote: > > > > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > I do _not_ want to slow down development by setting some kind of "quality > > > > > bar" - but I do believe that we should keep our quality high, not because > > > > > of any hoops we need to jump through, but because we take pride in the > > > > > thing we do. > > > > > > > > Well, we certainly should, but do we always remeber about it? Honest, guv? > > > > > > Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like > > > process generates quality? > > > > > > And I dislike how people try to conflate "quality" and "merging speed" as > > > if there was any reason what-so-ever to believe that they are related. > > > > > > You (and Andrew) have tried to argue that slowing things down results in > > > better quality, and I simply don't for a moment believe that. I believe > > > the exact opposite. > > > > Note that I'm not necessarily arguing for slowing down, but for reduced > > functional conflicts (which slow down may help but it's not the only > > solution). I think that refining the time resolution might achieve the > > same goal. Instead of merging 10000 changes which each have 1% chance > > of breaking any other area, and have all developers try to hunt bugs > > caused by unrelated changes, I think we could do that in steps. 
> > > > To illustrate, instead of changing 100 areas with one of them causing > > breaking in the other ones, and having 100 victims try to hunt the > > bug in 99 other areas, then theirs, and finally insult the faulty > > author, we could merge 50 areas in version X and 50 in X+1 (or 3*33 > > or 4*25, etc...). That way, we would only have 50 victims trying to > > find the bug in 49 other areas (or 32 or 24). Less people wasting > > their time will mean faster validation of changes, and possibly > > faster release cycle with better quality. > > > > People send you their crap every two months. If you accept half of > > it every month, they don't have to sleep on their code, and at the > > same time at most half of them are in trouble during half the time > > (since bugs are found faster). > > Well, as far as I'm concerned, that will work too. > > > > So if we can get the discussion *away* from the "let's slow things down", > > > then I'm interested. Because at that point we don't have to fight made-up > > > arguments about something irrelevant. > > > > well, is "let's split changes" ok ? > > How about: > > (1) Merge a couple of trees at a time (one tree at a time would be ideal, but > that's impossible due to the total number of trees). > (2) After (1) give testers some time to report problems introduced by the > merge. > (3) Wait until the most urgent problems are resolved. Revert the offending > changes if there's no solution within given time. > (4) Repeat for another couple of trees. > (5) Arrange things so that every tree gets merged once every two months. > > This would also give us an idea of which trees introduce more problems. ...and what would you do with such information? 
I'm not actually worried about my tree, but if (theoretically) it happened to
be amongst the "problematic" ones I would be a bit pissed by the blame
shifting, especially given that it is very difficult to compare different
trees, as they (usually) deal with quite different areas of the code (some
are messy and problematic, yet critical, while others can be more forgiving).

Also, slowing things down to focus on quality is really a bad idea. You can
trust me on this one: I've tried it once on a smaller scale and it was a big
disaster, because people won't focus on quality just because you want them
to. They'll continue to operate in the usual way and try to work around you
instead (which in turn causes extra tension that may turn into quiet
warfare). In the end you will have a lot more problems to deal with...

The same goes for any other kind of "improvement" that incorporates
punishment as part of the process. You are much better off helping people
and trying to get them to understand that they should change the way they
work because it would also be beneficial for _them_, not only for _you_.
Now, regarding the development model: I think that there is really no need
for a revolution yet; instead we should focus on refining the current process
(which works great IMO). Just to summarize the various ideas people have
given:

- try to persuade the few black sheep that skipping linux-next completely for
  whole patch series is a really bad idea, and that they should spend a bit
  more time planning for the merge instead of last-minute assembly+push
  (by doing it right they could spend the time after the merge preparing for
  the next one or fixing old bugs instead of chasing new regressions;
  overall they should have _more_ time for development by doing it right)

- encourage flattening of merges during the merge window, so that instead of
  1-2 big merges per tree at the beginning of the window you have a few
  smaller ones (the majority of maintainers do it this way already)

- more testing for linux-next; distros may be of great help here (-mm and
  -next often catch bugs that you wouldn't ever have imagined in the first
  place, and they get fixed before the problem propagates into Linus' tree)

- more documentation, to lower the entry barrier for people who would like
  to review the code (what Al has mentioned in this thread is a great idea,
  so no need for me to repeat it here)

- more co-operation between people from different areas of the code
  (i.e. testing linux-next instead of your own tree)

And just not to forget: changes happen by people actually putting the work
into them, not by endless discussions.

Thanks,
Bart

^ permalink raw reply	[flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 13:16 ` Bartlomiej Zolnierkiewicz @ 2008-05-01 13:53 ` Rafael J. Wysocki 2008-05-01 14:35 ` Bartlomiej Zolnierkiewicz 2008-05-01 15:29 ` Ray Lee 1 sibling, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 13:53 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: Willy Tarreau, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Bartlomiej Zolnierkiewicz wrote: > On Thursday 01 May 2008, Rafael J. Wysocki wrote: > > On Thursday, 1 of May 2008, Willy Tarreau wrote: > > > On Wed, Apr 30, 2008 at 06:19:56PM -0700, Linus Torvalds wrote: > > > > > > > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > > > I do _not_ want to slow down development by setting some kind of "quality > > > > > > bar" - but I do believe that we should keep our quality high, not because > > > > > > of any hoops we need to jump through, but because we take pride in the > > > > > > thing we do. > > > > > > > > > > Well, we certainly should, but do we always remeber about it? Honest, guv? > > > > > > > > Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like > > > > process generates quality? > > > > > > > > And I dislike how people try to conflate "quality" and "merging speed" as > > > > if there was any reason what-so-ever to believe that they are related. > > > > > > > > You (and Andrew) have tried to argue that slowing things down results in > > > > better quality, and I simply don't for a moment believe that. I believe > > > > the exact opposite. > > > > > > Note that I'm not necessarily arguing for slowing down, but for reduced > > > functional conflicts (which slow down may help but it's not the only > > > solution). I think that refining the time resolution might achieve the > > > same goal. 
Instead of merging 10000 changes which each have 1% chance > > > of breaking any other area, and have all developers try to hunt bugs > > > caused by unrelated changes, I think we could do that in steps. > > > > > > To illustrate, instead of changing 100 areas with one of them causing > > > breaking in the other ones, and having 100 victims try to hunt the > > > bug in 99 other areas, then theirs, and finally insult the faulty > > > author, we could merge 50 areas in version X and 50 in X+1 (or 3*33 > > > or 4*25, etc...). That way, we would only have 50 victims trying to > > > find the bug in 49 other areas (or 32 or 24). Less people wasting > > > their time will mean faster validation of changes, and possibly > > > faster release cycle with better quality. > > > > > > People send you their crap every two months. If you accept half of > > > it every month, they don't have to sleep on their code, and at the > > > same time at most half of them are in trouble during half the time > > > (since bugs are found faster). > > > > Well, as far as I'm concerned, that will work too. > > > > > > So if we can get the discussion *away* from the "let's slow things down", > > > > then I'm interested. Because at that point we don't have to fight made-up > > > > arguments about something irrelevant. > > > > > > well, is "let's split changes" ok ? > > > > How about: > > > > (1) Merge a couple of trees at a time (one tree at a time would be ideal, but > > that's impossible due to the total number of trees). > > (2) After (1) give testers some time to report problems introduced by the > > merge. > > (3) Wait until the most urgent problems are resolved. Revert the offending > > changes if there's no solution within given time. > > (4) Repeat for another couple of trees. > > (5) Arrange things so that every tree gets merged once every two months. > > > > This would also give us an idea of which trees introduce more problems. > > ...and what would you do with such information? 
>
> I'm not actually worried about my tree but if (theoretically) it happens to
> be amongst the "problematic" ones I would be a bit pissed by blame shifting,
> especially given that it is very difficult to compare different trees as
> they (usually) deal with quite different areas of the code (some are messy
> and problematic, yet critical while others can be more forgiving).
>
> Also slowing down things to focus on quality is really a bad idea. You can
> trust me on this one, I've tried it once on the smaller scale and it was a
> big disaster cause people won't focus on quality just because you want them
> to. They'll continue to operate in the usual way and try to workaround you
> instead (which in turn causes extra tensions which may become quiet warfare).
> In the end you will have a lot more problems to deal with...

Well, I won't argue with your experience.

> Same goes for any other kind of improvement by incorporating "punishment" as
> the part of the process. You are much better helping people and trying them
> to understand that they should apply some changes to their way of work because
> it would be also beneficial for _them_, not only for _you_.

I agree.
> Now regarding the development model - I think that there is really no need > for a revolution yet, instead we should focus on refining the current process > (which works great IMO), just to summarize various ideas given by people: > > - try to persuade few black sheeps that skipping linux-next completely for > whole patch series is a really bad idea and that they should try to spend > a bit more time on planning for merge instead of LastMinute assembly+push > (by doing it right they could spend more time after merge to prepare for > the next one or fixing old bugs instead of chasing new regressions, overall > they should have _more_ time for development by doing it right) > > - encourage flatting of merges during the merge window so instead of 1-2 big > merges per tree at the beginning of the merge you have few smaller ones > (majority of maintainers do it this way already) > > - more testing for linux-next, distros may be of a great help here (-mm and > -next often catches bugs that you wouldn't have ever imagined in the first > place and they get fixed before the problem propagates into Linus' tree) There still are too many bugs of this kind that make it to the Linus' tree and they are the source of this thread. > - more documentation for lowering the entry barrier for people who would like > to review the code (what Al has mentioned in this thread is a great idea > so no need for me to repeat it here) Agreed. > - more co-operation between people from different areas of the code > (i.e. testing linux-next instead of your own tree) Agreed. > and just not to forget - changes happen by people actually putting the work > into them not by endless discussions. Well, I'm not sure what that's supposed to mean, so I won't comment. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 13:53 ` Rafael J. Wysocki @ 2008-05-01 14:35 ` Bartlomiej Zolnierkiewicz 0 siblings, 0 replies; 229+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-05-01 14:35 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Willy Tarreau, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday 01 May 2008, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, Bartlomiej Zolnierkiewicz wrote: > > On Thursday 01 May 2008, Rafael J. Wysocki wrote: > > > On Thursday, 1 of May 2008, Willy Tarreau wrote: > > > > On Wed, Apr 30, 2008 at 06:19:56PM -0700, Linus Torvalds wrote: > > > > > > > > > > > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > > > > > I do _not_ want to slow down development by setting some kind of "quality > > > > > > > bar" - but I do believe that we should keep our quality high, not because > > > > > > > of any hoops we need to jump through, but because we take pride in the > > > > > > > thing we do. > > > > > > > > > > > > Well, we certainly should, but do we always remeber about it? Honest, guv? > > > > > > > > > > Hey, guv, do you _honestly_ believe that some kind of ISO-9000-like > > > > > process generates quality? > > > > > > > > > > And I dislike how people try to conflate "quality" and "merging speed" as > > > > > if there was any reason what-so-ever to believe that they are related. > > > > > > > > > > You (and Andrew) have tried to argue that slowing things down results in > > > > > better quality, and I simply don't for a moment believe that. I believe > > > > > the exact opposite. > > > > > > > > Note that I'm not necessarily arguing for slowing down, but for reduced > > > > functional conflicts (which slow down may help but it's not the only > > > > solution). I think that refining the time resolution might achieve the > > > > same goal. 
Instead of merging 10000 changes which each have 1% chance > > > > of breaking any other area, and have all developers try to hunt bugs > > > > caused by unrelated changes, I think we could do that in steps. > > > > > > > > To illustrate, instead of changing 100 areas with one of them causing > > > > breaking in the other ones, and having 100 victims try to hunt the > > > > bug in 99 other areas, then theirs, and finally insult the faulty > > > > author, we could merge 50 areas in version X and 50 in X+1 (or 3*33 > > > > or 4*25, etc...). That way, we would only have 50 victims trying to > > > > find the bug in 49 other areas (or 32 or 24). Less people wasting > > > > their time will mean faster validation of changes, and possibly > > > > faster release cycle with better quality. > > > > > > > > People send you their crap every two months. If you accept half of > > > > it every month, they don't have to sleep on their code, and at the > > > > same time at most half of them are in trouble during half the time > > > > (since bugs are found faster). > > > > > > Well, as far as I'm concerned, that will work too. > > > > > > > > So if we can get the discussion *away* from the "let's slow things down", > > > > > then I'm interested. Because at that point we don't have to fight made-up > > > > > arguments about something irrelevant. > > > > > > > > well, is "let's split changes" ok ? > > > > > > How about: > > > > > > (1) Merge a couple of trees at a time (one tree at a time would be ideal, but > > > that's impossible due to the total number of trees). > > > (2) After (1) give testers some time to report problems introduced by the > > > merge. > > > (3) Wait until the most urgent problems are resolved. Revert the offending > > > changes if there's no solution within given time. > > > (4) Repeat for another couple of trees. > > > (5) Arrange things so that every tree gets merged once every two months. 
> > > > > > This would also give us an idea of which trees introduce more problems. > > > > ...and what would you do with such information? > > > > I'm not actually worried about my tree but if (theoretically) it happens to > > be amongst the "problematic" ones I would be a bit pissed by blame shifting, > > especially given that it is very difficult to compare different trees as > > they (usually) deal with quite different areas of the code (some are messy > > and problematic, yet critical while others can be more forgiving). > > > > Also slowing down things to focus on quality is really a bad idea. You can > > trust me on this one, I've tried it once on the smaller scale and it was a > > big disaster cause people won't focus on quality just because you want them > > to. They'll continue to operate in the usual way and try to workaround you > > instead (which in turn causes extra tensions which may become quiet warfare). > > In the end you will have a lot more problems to deal with... > > Well, I won't discuss with your experience. > > > Same goes for any other kind of improvement by incorporating "punishment" as > > the part of the process. You are much better helping people and trying them > > to understand that they should apply some changes to their way of work because > > it would be also beneficial for _them_, not only for _you_. > > I agree. 
>
> > Now regarding the development model - I think that there is really no need
> > for a revolution yet, instead we should focus on refining the current process
> > (which works great IMO), just to summarize various ideas given by people:
> >
> > - try to persuade few black sheeps that skipping linux-next completely for
> > whole patch series is a really bad idea and that they should try to spend
> > a bit more time on planning for merge instead of LastMinute assembly+push
> > (by doing it right they could spend more time after merge to prepare for
> > the next one or fixing old bugs instead of chasing new regressions, overall
> > they should have _more_ time for development by doing it right)
> >
> > - encourage flatting of merges during the merge window so instead of 1-2 big
> > merges per tree at the beginning of the merge you have few smaller ones
> > (majority of maintainers do it this way already)
> >
> > - more testing for linux-next, distros may be of a great help here (-mm and
> > -next often catches bugs that you wouldn't have ever imagined in the first
> > place and they get fixed before the problem propagates into Linus' tree)
>
> There still are too many bugs of this kind that make it to the Linus' tree and
> they are the source of this thread.

Agreed, but if you trace the path of these bugs into Linus' tree, many of
them follow one of two patterns:

 * -mm / -next skipped completely
 * short time in -mm / -next (< 2 weeks)

[ disclaimer: this is based on my observations, no hard data to prove it ]

Please also remember that the linux-next concept is still quite _fresh_,
with _plenty_ of room for enhancements like having kernel-du-jour packages
for the most popular distros, doing more automated testing + searching for
error strings in logs, etc.

> > - more documentation for lowering the entry barrier for people who would like
> > to review the code (what Al has mentioned in this thread is a great idea
> > so no need for me to repeat it here)
>
> Agreed.
> > > - more co-operation between people from different areas of the code > > (i.e. testing linux-next instead of your own tree) > > Agreed. > > > and just not to forget - changes happen by people actually putting the work > > into them not by endless discussions. > > Well, I'm not sure what that's supposed to mean, so I won't comment. This was not directed at you (you are doing great work BTW) but rather at some people trolling the thread. Thanks, Bart ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
  2008-05-01 13:16 ` Bartlomiej Zolnierkiewicz
  2008-05-01 13:53 ` Rafael J. Wysocki
@ 2008-05-01 15:29 ` Ray Lee
  2008-05-01 19:03 ` Willy Tarreau
  1 sibling, 1 reply; 229+ messages in thread
From: Ray Lee @ 2008-05-01 15:29 UTC (permalink / raw)
To: Bartlomiej Zolnierkiewicz
Cc: Rafael J. Wysocki, Willy Tarreau, Linus Torvalds, David Miller,
    linux-kernel, Andrew Morton, Jiri Slaby

On Thu, May 1, 2008 at 6:16 AM, Bartlomiej Zolnierkiewicz
<bzolnier@gmail.com> wrote:
>
> On Thursday 01 May 2008, Rafael J. Wysocki wrote:
> > How about:
> >
> > (1) Merge a couple of trees at a time (one tree at a time would be ideal, but
> >     that's impossible due to the total number of trees).
> > (2) After (1) give testers some time to report problems introduced by the
> >     merge.
> > (3) Wait until the most urgent problems are resolved. Revert the offending
> >     changes if there's no solution within given time.
> > (4) Repeat for another couple of trees.
> > (5) Arrange things so that every tree gets merged once every two months.
> >
> > This would also give us an idea of which trees introduce more problems.
>
> ...and what would you do with such information?
[...]
> Same goes for any other kind of improvement by incorporating "punishment" as
> the part of the process.

When a teacher assigns grades in a class, it's not punishment, it's
feedback.

I don't think anyone *intends* to push crap into the tree. However, with
the barrier to getting things into the tree so low, some may feel there's
less incentive to try to get things right the first (or second) time. It
would be nice to provide that incentive.

Normally, it'd be peer review of the uncommitted patches. We don't have a
lot of that going on here, though. So, peer review after the fact, i.e.,
"who placed this massive turd in the tree?", and everyone swivels an eye
over there and asks what went wrong, and how do we prevent it in the
future. Those conversations seem to be happening already, from time to
time.
And as a policy suggestion, if we're past rc1 and someone has identified a commit as the root of a regression/bug, then the policy should be just to revert it immediately, no questions asked. Let the original author work with the person who identified the problem and resend a fixed commit later. We lose testers in the meantime, and perhaps the extra effort involved in having the author work out the issues and redo the patch will help prevent drive-by patching in the future. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
  2008-05-01 15:29 ` Ray Lee
@ 2008-05-01 19:03 ` Willy Tarreau
  0 siblings, 0 replies; 229+ messages in thread
From: Willy Tarreau @ 2008-05-01 19:03 UTC (permalink / raw)
To: Ray Lee
Cc: Bartlomiej Zolnierkiewicz, Rafael J. Wysocki, Linus Torvalds,
    David Miller, linux-kernel, Andrew Morton, Jiri Slaby

On Thu, May 01, 2008 at 08:29:18AM -0700, Ray Lee wrote:
> And as a policy suggestion, if we're past rc1 and someone has
> identified a commit as the root of a regression/bug, then the policy
> should be just to revert it immediately, no questions asked. Let the
> original author work with the person who identified the problem and
> resend a fixed commit later. We lose testers in the meantime, and
> perhaps the extra effort involved in having the author work out the
> issues and redo the patch will help prevent drive-by patching in the
> future.

You make a valid point here: "we lose testers in the meantime". Maybe it
would help if -rc2 were released a few days after -rc1, with the first,
most obvious showstoppers fixed (often build issues). The most problematic
ones are often fixed within an hour or so, but for most testers, it still
means they have to wait for -rc2.

Most external testers might then only try -rc2 first, but that's not a
problem. What we really want is for them to test widely and not fall back
to an older kernel at the first problem. If only 20% of testers try -rc1,
and the remaining 80% actively wait for -rc2 three days later, then we'll
get broader testing in the first two weeks.

Willy

^ permalink raw reply	[flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 11:53 ` Rafael J. Wysocki 2008-05-01 12:11 ` Will Newton 2008-05-01 13:16 ` Bartlomiej Zolnierkiewicz @ 2008-05-01 19:36 ` Valdis.Kletnieks 2 siblings, 0 replies; 229+ messages in thread From: Valdis.Kletnieks @ 2008-05-01 19:36 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Willy Tarreau, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby [-- Attachment #1: Type: text/plain, Size: 897 bytes --] On Thu, 01 May 2008 13:53:18 +0200, "Rafael J. Wysocki" said: > How about: > > (1) Merge a couple of trees at a time (one tree at a time would be ideal, but > that's impossible due to the total number of trees). > (2) After (1) give testers some time to report problems introduced by the > merge. > (3) Wait until the most urgent problems are resolved. Revert the offending > changes if there's no solution within given time. > (4) Repeat for another couple of trees. > (5) Arrange things so that every tree gets merged once every two months. You can't get there from here (at least not very easily). If you have 60 trees, and want a merge for each one every 2 months, you have to average 1 tree a day. How big a delay you want in step (2) directly impacts how many trees you merge at once - if you want a week of cook time, you have to merge 7 trees every Monday, and so on... [-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --] ^ permalink raw reply [flat|nested] 229+ messages in thread
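[Aside: Valdis's scheduling constraint is easy to make concrete. A sketch; `merge_schedule` is a made-up helper, and the 60-tree count is his hypothetical figure, not a census of real subsystem trees.]

```python
def merge_schedule(trees, cycle_days, soak_days):
    """If `trees` subsystem trees must each be merged once per
    `cycle_days`-day cycle, and each batch needs `soak_days` of cook
    time before the next, return (batches per cycle, trees per batch)."""
    batches = cycle_days // soak_days   # soak periods that fit in a cycle
    per_batch = -(-trees // batches)    # ceiling division
    return batches, per_batch

# Valdis's numbers: 60 trees, a two-month (~60-day) cycle.
print(merge_schedule(60, 60, 1))   # (60, 1): one tree a day on average
print(merge_schedule(60, 60, 7))   # (8, 8): a week of cook time forces
                                   # batches of 7-8 trees, as he says
```

The point the arithmetic makes: cook time and batch size trade off directly, so you cannot have both long soak periods and small batches without stretching the cycle.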
* Re: Slow DOWN, please!!!
  2008-04-30 23:20 ` Linus Torvalds
  2008-05-01 0:42 ` Rafael J. Wysocki
@ 2008-05-01 1:30 ` Jeremy Fitzhardinge
  2008-05-01 5:35 ` Willy Tarreau
  1 sibling, 1 reply; 229+ messages in thread
From: Jeremy Fitzhardinge @ 2008-05-01 1:30 UTC (permalink / raw)
To: Linus Torvalds
Cc: Willy Tarreau, Rafael J. Wysocki, David Miller, linux-kernel,
    Andrew Morton, Jiri Slaby

Linus Torvalds wrote:
> And I think publicly announced git trees and -mm and linux-next are
> great partly because they end up doing that same thing. I heartily
> encourage submaintainers to always Cc: linux-kernel when they send me a
> "please pull" request - I don't know if anybody else ever really pulls
> that tree, but I do think that it's very healthy to write that message
> and think of it as a publication event.

And, ideally, they would have posted the changes as patches to the list
for review anyway, so there shouldn't be anything surprising in that
pull...

    J

^ permalink raw reply	[flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
  2008-05-01 1:30 ` Jeremy Fitzhardinge
@ 2008-05-01 5:35 ` Willy Tarreau
  0 siblings, 0 replies; 229+ messages in thread
From: Willy Tarreau @ 2008-05-01 5:35 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Linus Torvalds, Rafael J. Wysocki, David Miller, linux-kernel,
    Andrew Morton, Jiri Slaby

On Wed, Apr 30, 2008 at 06:30:43PM -0700, Jeremy Fitzhardinge wrote:
> Linus Torvalds wrote:
> > And I think publicly announced git trees and -mm and linux-next are
> > great partly because they end up doing that same thing. I heartily
> > encourage submaintainers to always Cc: linux-kernel when they send me a
> > "please pull" request - I don't know if anybody else ever really pulls
> > that tree, but I do think that it's very healthy to write that message
> > and think of it as a publication event.
>
> And, ideally, they would have posted the changes as patches to the list
> for review anyway, so there shouldn't be anything surprising in that pull...

Yes, it's something which has been disappearing since the switch to bk and
then git. It would be impractical and useless to post everything during the
merge window now, but if we can get everyone to pass through linux-next,
the posts will be evenly distributed and it would make sense to require
everyone to post their changes to the list at the same time.

Right now, some developers already always post their changes. Jeff, Greg
and Bartlomiej come to mind, and I must say that I'm always interested in
taking a quick look, just in case something really obvious catches my
attention (which never happens).

Willy

^ permalink raw reply	[flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!!
  2008-04-30 22:31 ` Linus Torvalds
  2008-04-30 22:41 ` Andrew Morton
  2008-04-30 22:46 ` Willy Tarreau
@ 2008-04-30 23:03 ` Rafael J. Wysocki
  2 siblings, 0 replies; 229+ messages in thread
From: Rafael J. Wysocki @ 2008-04-30 23:03 UTC (permalink / raw)
To: Linus Torvalds
Cc: David Miller, linux-kernel, Andrew Morton, Jiri Slaby, Greg KH

On Thursday, 1 of May 2008, Linus Torvalds wrote:
>
> On Thu, 1 May 2008, Rafael J. Wysocki wrote:
> >
> > > And there's no way to avoid the fact that during the merge window, we will
> > > get something on the order of ten thousand commits (eg 2.6.24->25-rc1 was
> > > 9629 commits).
> >
> > Well, do we _have_ _to_ take that much? I know we _can_, but is this really
> > necessary?
>
> Do you want me to stop merging your code?

Well, no, but actually there are only a few of my patches in this merge
window. :-) Moreover, if the maintainers who took them told me they would
be scheduled for the next merge window, I wouldn't mind. That actually
happened to some of my patches that are in Greg's tree at the moment, and
that's fine (although I consider the patches important).

IMO, this is a question of balance. Of course, a maintainer can take
everything from everyone, but at the same time he can have a look at the
patches and say "Well, I have lots of stuff scheduled for this merge
window already, this stuff of yours will wait for the next merge window.
Please improve the code or review the others' patches in the meantime".
The only thing is to give everyone fair treatment, which may be a
challenge.

> Do you think anybody else does?

I think the majority of developers would understand if you told them you
could only merge a limited amount of changes in a single merge window,
provided that they would be treated fairly. When you take everything from
everyone, you actually reward the people who are able to develop more code
between merge windows, not necessarily those who spend time on other
important activities, such as reviewing others' code, bug tracking etc.

> Any suggestions on how to convince people that their code is not worth
> merging?

That shouldn't be necessary. :-) The point is to tell people to develop
the code less rapidly, so to speak. Or maybe more carefully.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:23 ` Rafael J. Wysocki 2008-04-30 22:31 ` Linus Torvalds @ 2008-04-30 22:40 ` david 2008-04-30 23:45 ` Rafael J. Wysocki 1 sibling, 1 reply; 229+ messages in thread From: david @ 2008-04-30 22:40 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > On Wednesday, 30 of April 2008, Linus Torvalds wrote: >> >> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: >> So your "fewer commits over a unit of time" doesn't make sense. > > Oh, yes it does. Equally well you could say that having brakes in a car > didn't make sense, even if you could drive it as fast as the engine allowed > you to. ;-) > >> We have those ten thousand commits. They need to go in. They cannot take >> forever. > > But perhaps some of them can wait a bit longer. not really, if patches are produced at a rate of 1000/week and you decide to only accept 2000 of them this month, a month later you have 6000 patches to deal with. history has shown that developers do not stop developing if their patches are not accepted, they just fork and go their own way. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
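David's backlog arithmetic is easy to check in a few lines of shell (the rates below are his hypothetical figures from the message above, not measured kernel statistics):

```shell
# Hypothetical rates from the message above: patches produced at
# 1000/week (~4000 per month), but only 2000 accepted in the first month.
produced_per_week=1000
accepted_first_month=2000

produced_first_month=$((produced_per_week * 4))               # ~4000 produced
carried_over=$((produced_first_month - accepted_first_month)) # 2000 left queued
queue_next_month=$((carried_over + produced_per_week * 4))    # plus a new month's worth

echo "$queue_next_month"   # prints 6000
```

So a one-month throttle does not shrink the work; it just moves 2000 patches into next month's pile, exactly as David says.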
* Re: Slow DOWN, please!!! 2008-04-30 22:40 ` david @ 2008-04-30 23:45 ` Rafael J. Wysocki 2008-04-30 23:57 ` david 2008-05-01 0:38 ` Adrian Bunk 0 siblings, 2 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 23:45 UTC (permalink / raw) To: david Cc: Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, david@lang.hm wrote: > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > On Wednesday, 30 of April 2008, Linus Torvalds wrote: > >> > >> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > >> So your "fewer commits over a unit of time" doesn't make sense. > > > > Oh, yes it does. Equally well you could say that having brakes in a car > > didn't make sense, even if you could drive it as fast as the engine allowed > > you to. ;-) > > > >> We have those ten thousand commits. They need to go in. They cannot take > >> forever. > > > > But perhaps some of them can wait a bit longer. > > not really, if patches are produced at a rate of 1000/week and you decide > to only accept 2000 of them this month, a month later you have 6000 > patches to deal with. Well, I think you know how TCP works. The sender can only send as much data as the receiver lets it, no matter how much data there are to send. I'm thinking about an analogous approach. If the developers who produce those patches know in advance about the rate limit and are promised to be treated fairly, they should be able to organize their work in a different way. > history has shown that developers do not stop developing if their patches are > not accepted, they just fork and go their own way. That's mostly when they feel that they are treated unfairly. OTOH, insisting that your patches should be merged at the same rate that you're able to develop them is unreasonable to me. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:45 ` Rafael J. Wysocki @ 2008-04-30 23:57 ` david 2008-05-01 0:01 ` Chris Shoemaker 2008-05-01 0:38 ` Adrian Bunk 1 sibling, 1 reply; 229+ messages in thread From: david @ 2008-04-30 23:57 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, david@lang.hm wrote: >> On Thu, 1 May 2008, Rafael J. Wysocki wrote: >> >>> On Wednesday, 30 of April 2008, Linus Torvalds wrote: >>>> >>>> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: >>>> So your "fewer commits over a unit of time" doesn't make sense. >>> >>> Oh, yes it does. Equally well you could say that having brakes in a car >>> didn't make sense, even if you could drive it as fast as the engine allowed >>> you to. ;-) >>> >>>> We have those ten thousand commits. They need to go in. They cannot take >>>> forever. >>> >>> But perhaps some of them can wait a bit longer. >> >> not really, if patches are produced at a rate of 1000/week and you decide >> to only accept 2000 of them this month, a month later you have 6000 >> patches to deal with. > > Well, I think you know how TCP works. The sender can only send as much > data as the receiver lets it, no matter how much data there are to send. > I'm thinking about an analogous approach. > > If the developers who produce those patches know in advance about the rate > limit and are promised to be treated fairly, they should be able to organize > their work in a different way. they will make the patches bigger to get the changes in a smaller number of patches. arbitrary limits produce gaming of the system :-) >> history has shown that developers do not stop developing if their patches are >> not accepted, they just fork and go their own way. > > That's mostly when they feel that they are treated unfairly. 
> > OTOH, insisting that your patches should be merged at the same rate that you're > able to develop them is unreasonable to me. it's not necessarily the individuals that fork, it's the distros who want to include the fixes and other changes from the individuals that create the fork. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:57 ` david @ 2008-05-01 0:01 ` Chris Shoemaker 2008-05-01 0:14 ` david 0 siblings, 1 reply; 229+ messages in thread From: Chris Shoemaker @ 2008-05-01 0:01 UTC (permalink / raw) To: david Cc: Rafael J. Wysocki, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, Apr 30, 2008 at 04:57:38PM -0700, david@lang.hm wrote: >>> history has shown that developers do not stop developing if their patches are >>> not accepted, they just fork and go their own way. >> >> That's mostly when they feel that they are treated unfairly. >> >> OTOH, insisting that your patches should be merged at the same rate that you're >> able to develop them is unreasonable to me. > > it's not necessarily the individuals that fork, it's the distros who want > to include the fixes and other changes from the individuals that create the > fork. Is that really bad? Isn't that effectively equivalent to "increased testing of earlier integrations"? -chris ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:01 ` Chris Shoemaker @ 2008-05-01 0:14 ` david 2008-05-01 0:38 ` Linus Torvalds 0 siblings, 1 reply; 229+ messages in thread From: david @ 2008-05-01 0:14 UTC (permalink / raw) To: Chris Shoemaker Cc: Rafael J. Wysocki, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, 30 Apr 2008, Chris Shoemaker wrote: > On Wed, Apr 30, 2008 at 04:57:38PM -0700, david@lang.hm wrote: >>>> history has shown that developers do not stop developing if their patches are >>>> not accepted, they just fork and go their own way. >>> >>> That's mostly when they feel that they are treated unfairly. >>> >>> OTOH, insisting that your patches should be merged at the same rate that you're >>> able to develop them is unreasonable to me. >> >> it's not necessarily the individuals that fork, it's the distros who want >> to include the fixes and other changes from the individuals that create the >> fork. > > Is that really bad? Isn't that effectively equivalent to "increased testing of > earlier integrations"? not if there are so many changes that the testing isn't really relevant to mainline. not if the changes don't get into mainline. look at the mess of the distro kernels in the 2.5 and earlier days. having them maintain a large body of patches didn't work for them or for the mainline kernel. David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:14 ` david @ 2008-05-01 0:38 ` Linus Torvalds 2008-05-01 1:39 ` Jeremy Fitzhardinge 0 siblings, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 0:38 UTC (permalink / raw) To: david Cc: Chris Shoemaker, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Wed, 30 Apr 2008, david@lang.hm wrote: > > look at the mess of the distro kernels in the 2.5 and earlier days. having > them maintain a large body of patches didn't work for them or for the mainline > kernel. Exactly. I do think Rafael's TCP analogy is somewhat germane, but it misses the point that the longer the queue gets, the *worse* the quality gets. It gets worse because the queued-up patches don't actually get tested any more during their queueing, and because everybody else who isn't intimately involved with production of said patches just gets *less* inclined to look at big patch-queue than a small one. So having a long queue and trying to manage it (by some kind of negative feedback) is counter-productive, because by the time that situation happens, you're basically screwed already. That's what we largely had with the Xen merge, for example. A lot of the code had been around for basically _forever_, and the people involved in reviewing it got really tired of it, and there was no way in *hell* a new person would ever start reviewing the huge backlog. Once it is massive, it's just too massive. So trying to push back from the destination is really painful. It's also aggravating for everybody else. When people were complaining about me not scaling (remember those flame-wars? Now the complaint is basically the reverse), it was very painful for everybody, and most of all me. 
So I really really hope that if we need throttling (and I do want to point out that I'm not entirely sure we do - I think the issue is not "number of commits", but "quality of code", and I do _not_ agree that the two are directly related in any way), it should be source-based. Trying to make sure that the source throttles, and not by making developers feel unproductive. And quite frankly, most things that throttle the source are of the annoying and non-productive kind. The classic source throttle tends to be to make it very "expensive" to do development, by introducing various barriers. The barriers are usually "you need to have <n> other people look at it", or "you need to pass this five-hour test-suite", and almost invariably, the big issue is not code quality, but literally to slow things down. And call me crazy, but I think that a process that is designed to not primarily get quality, but slow things down, is likely to generate not just bad feelings, but actually much worse code too! And the thing is, I don't even think our main problem is "lots of changes". I think we've actually been very successful at managing lots of change. Our problems are elsewhere. So I think our primary problems are: - making mistakes is inevitable and cannot be avoided, but we can still add more layers to make it less likely. But these should *not* be aimed at being cumbersome to slow things down - they should basically pipeline perfectly, so that there is no frustrating ping-pong latency. And linux-next falls into this kind of category: it doesn't really slow down development, but it would be another "pipeline stage" in the process. 
(In contrast, requiring every patch to have <n> reviewed-by etc would add huge latencies and slow down things hugely, and just generally be make-believe work once everybody started gaming the system because it's so irritating) - we do want more testing as part of the pipeline (but again, not synchronously - but to speed up feedback for when things go wrong. So it wouldn't get rid of the errors, but if it happens quickly enough, maybe we'd catch things early in the development pipeline before it even hits my tree) Having more linux-next testing would be great. - Regular *user* debuggability and reporting. Quite frankly, I think the reason a lot of people really like being able to bisect bugs is not that "git bisect" is such an inherently cool program, but because it is a really great tool for *users* to participate in the debugging, in ways oops reports etc were not. Similarly, I love the oops/warning report statistics that Arjan sends out. With vendor support users help debugging and reporting without even necessarily knowing about it. Things like *that* matter a lot. Notice how none of the above are about slowing down development. I don't think quality and speed of development are related. In fact, I think quality and speed often go hand-in-hand: the same way some of the best programmers are also the most productive, I think some of the most productive flows are likely to generate the best code! Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:38 ` Linus Torvalds @ 2008-05-01 1:39 ` Jeremy Fitzhardinge 0 siblings, 0 replies; 229+ messages in thread From: Jeremy Fitzhardinge @ 2008-05-01 1:39 UTC (permalink / raw) To: Linus Torvalds Cc: david, Chris Shoemaker, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Linus Torvalds wrote: > That's what we largely had with the Xen merge, for example. A lot of the > code had been around for basically _forever_, and the people involved in > reviewing it got really tired of it, and there was no way in *hell* a new > person would ever start reviewing the huge backlog. Once it is massive, > it's just too massive. > Heh. The Xen code in the kernel now is a complete rewrite, with only trace elements from the original patchset. And yes, that's partly because the original patches were unreviewable. J ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:45 ` Rafael J. Wysocki 2008-04-30 23:57 ` david @ 2008-05-01 0:38 ` Adrian Bunk 2008-05-01 0:56 ` Rafael J. Wysocki 1 sibling, 1 reply; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 0:38 UTC (permalink / raw) To: Rafael J. Wysocki Cc: david, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, May 01, 2008 at 01:45:38AM +0200, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, david@lang.hm wrote: > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > >> > > >> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > >> So your "fewer commits over a unit of time" doesn't make sense. > > > > > > Oh, yes it does. Equally well you could say that having brakes in a car > > > didn't make sense, even if you could drive it as fast as the engine allowed > > > you to. ;-) > > > > > >> We have those ten thousand commits. They need to go in. They cannot take > > >> forever. > > > > > > But perhaps some of them can wait a bit longer. > > > > not really, if patches are produced at a rate of 1000/week and you decide > > to only accept 2000 of them this month, a month later you have 6000 > > patches to deal with. > > Well, I think you know how TCP works. The sender can only send as much > data as the receiver lets it, no matter how much data there are to send. > I'm thinking about an analogous approach. > > If the developers who produce those patches know in advance about the rate > limit and are promised to be treated fairly, they should be able to organize > their work in a different way. >... We cannot control who develops what. When someone wants some feature or wants to get Linux running on his hardware he will always develop the code. We can only control what we merge. And the main rationale for the 2.6 development model was that we do no longer want distributions to ship kernels with insane amounts of patches. 
> Thanks, > Rafael cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:38 ` Adrian Bunk @ 2008-05-01 0:56 ` Rafael J. Wysocki 2008-05-01 1:25 ` Adrian Bunk 0 siblings, 1 reply; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 0:56 UTC (permalink / raw) To: Adrian Bunk Cc: david, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Adrian Bunk wrote: > On Thu, May 01, 2008 at 01:45:38AM +0200, Rafael J. Wysocki wrote: > > On Thursday, 1 of May 2008, david@lang.hm wrote: > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > > >> > > > >> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > > >> So your "fewer commits over a unit of time" doesn't make sense. > > > > > > > > Oh, yes it does. Equally well you could say that having brakes in a car > > > > didn't make sense, even if you could drive it as fast as the engine allowed > > > > you to. ;-) > > > > > > > >> We have those ten thousand commits. They need to go in. They cannot take > > > >> forever. > > > > > > > > But perhaps some of them can wait a bit longer. > > > > > > not really, if patches are produced at a rate of 1000/week and you decide > > > to only accept 2000 of them this month, a month later you have 6000 > > > patches to deal with. > > > > Well, I think you know how TCP works. The sender can only send as much > > data as the receiver lets it, no matter how much data there are to send. > > I'm thinking about an analogous approach. > > > > If the developers who produce those patches know in advance about the rate > > limit and are promised to be treated fairly, they should be able to organize > > their work in a different way. > >... > > We cannot control who develops what. We don't need to. > When someone wants some feature or wants to get Linux running on his > hardware he will always develop the code. > > We can only control what we merge. To be exact, we control what we merge and when. 
There's no rule saying that every patch has to be merged as soon as it appears to be ready for merging, or during the nearest merge window, AFAICS. > And the main rationale for the 2.6 development model was that we do no > longer want distributions to ship kernels with insane amounts of > patches. This was an argument against starting a separate development branch in analogy with 2.5, IIRC, and I agree with that. Still, I think we don't need to merge patches at the current rate and it might help improve their overall quality if we didn't. Of course, the latter is only a speculation, although it's based on my experience. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:56 ` Rafael J. Wysocki @ 2008-05-01 1:25 ` Adrian Bunk 2008-05-01 12:05 ` Rafael J. Wysocki 0 siblings, 1 reply; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 1:25 UTC (permalink / raw) To: Rafael J. Wysocki Cc: david, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, May 01, 2008 at 02:56:23AM +0200, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, Adrian Bunk wrote: > > On Thu, May 01, 2008 at 01:45:38AM +0200, Rafael J. Wysocki wrote: > > > On Thursday, 1 of May 2008, david@lang.hm wrote: > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > > > >> > > > > >> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > > > >> So your "fewer commits over a unit of time" doesn't make sense. > > > > > > > > > > Oh, yes it does. Equally well you could say that having brakes in a car > > > > > didn't make sense, even if you could drive it as fast as the engine allowed > > > > > you to. ;-) > > > > > > > > > >> We have those ten thousand commits. They need to go in. They cannot take > > > > >> forever. > > > > > > > > > > But perhaps some of them can wait a bit longer. > > > > > > > > not really, if patches are produced at a rate of 1000/week and you decide > > > > to only accept 2000 of them this month, a month later you have 6000 > > > > patches to deal with. > > > > > > Well, I think you know how TCP works. The sender can only send as much > > > data as the receiver lets it, no matter how much data there are to send. > > > I'm thinking about an analogous approach. > > > > > > If the developers who produce those patches know in advance about the rate > > > limit and are promised to be treated fairly, they should be able to organize > > > their work in a different way. > > >... > > > > We cannot control who develops what. > > We don't need to. 
> > > When someone wants some feature or wants to get Linux running on his > > hardware he will always develop the code. > > > > We can only control what we merge. > > To be exact, we control what we merge and when. There's no rule saying that > every patch has to be merged as soon as it appears to be ready for merging, > or during the nearest merge window, AFAICS. What currently gets applied to the kernel are between two and three million lines changed per year. We can discuss when and how to apply them. But unless we want to create an ever-growing backlog we have to change roughly 200,000 lines per month on average. Even with higher quality criteria that might result in some code not being merged we will still be > 100,000 lines per month on average. > > And the main rationale for the 2.6 development model was that we do no > > longer want distributions to ship kernels with insane amounts of > > patches. > > This was an argument against starting a separate development branch in analogy > with 2.5, IIRC, and I agree with that. > > Still, I think we don't need to merge patches at the current rate and it might > help improve their overall quality if we didn't. Of course, the latter is only > a speculation, although it's based on my experience. See above - what do you want to do if we'd merge less and have a backlog of let's say one million lines to change after one year, much of it already in distribution kernels? I also don't like this situation, but we have to cope with it. > Thanks, > Rafael cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:25 ` Adrian Bunk @ 2008-05-01 12:05 ` Rafael J. Wysocki 0 siblings, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 12:05 UTC (permalink / raw) To: Adrian Bunk Cc: david, Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Adrian Bunk wrote: > On Thu, May 01, 2008 at 02:56:23AM +0200, Rafael J. Wysocki wrote: > > On Thursday, 1 of May 2008, Adrian Bunk wrote: > > > On Thu, May 01, 2008 at 01:45:38AM +0200, Rafael J. Wysocki wrote: > > > > On Thursday, 1 of May 2008, david@lang.hm wrote: > > > > > On Thu, 1 May 2008, Rafael J. Wysocki wrote: > > > > > > > > > > > On Wednesday, 30 of April 2008, Linus Torvalds wrote: > > > > > >> > > > > > >> On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > > > > >> So your "fewer commits over a unit of time" doesn't make sense. > > > > > > > > > > > > Oh, yes it does. Equally well you could say that having brakes in a car > > > > > > didn't make sense, even if you could drive it as fast as the engine allowed > > > > > > you to. ;-) > > > > > > > > > > > >> We have those ten thousand commits. They need to go in. They cannot take > > > > > >> forever. > > > > > > > > > > > > But perhaps some of them can wait a bit longer. > > > > > > > > > > not really, if patches are produced at a rate of 1000/week and you decide > > > > > to only accept 2000 of them this month, a month later you have 6000 > > > > > patches to deal with. > > > > > > > > Well, I think you know how TCP works. The sender can only send as much > > > > data as the receiver lets it, no matter how much data there are to send. > > > > I'm thinking about an analogous approach. > > > > > > > > If the developers who produce those patches know in advance about the rate > > > > limit and are promised to be treated fairly, they should be able to organize > > > > their work in a different way. > > > >... > > > > > > We cannot control who develops what. 
> > We don't need to. > > > > > When someone wants some feature or wants to get Linux running on his > > > hardware he will always develop the code. > > > > > > We can only control what we merge. > > > > To be exact, we control what we merge and when. There's no rule saying that > > every patch has to be merged as soon as it appears to be ready for merging, > > or during the nearest merge window, AFAICS. > > What currently gets applied to the kernel are between two and three > million lines changed per year. > > We can discuss when and how to apply them. > > But unless we want to create an ever-growing backlog we have to change > roughly 200,000 lines per month on average. > > Even with higher quality criteria that might result in some code not > being merged we will still be > 100,000 lines per month on average. > > > > And the main rationale for the 2.6 development model was that we do no > > > longer want distributions to ship kernels with insane amounts of > > > patches. > > > > This was an argument against starting a separate development branch in analogy > > with 2.5, IIRC, and I agree with that. > > > > Still, I think we don't need to merge patches at the current rate and it might > > help improve their overall quality if we didn't. Of course, the latter is only > > a speculation, although it's based on my experience. > > See above - what do you want to do if we'd merge less and have a backlog > of let's say one million lines to change after one year, much of it > already in distribution kernels? > > I also don't like this situation, but we have to cope with it. Well, I'm feeling that's what Linus is trying to say too. :-) I, for one, don't really want to cope with a situation I don't feel comfortable in, because in the long run that leads to growing frustration. It seems pretty obvious to me that people generally get more and more frustrated with the current development process and it will have to be addressed somehow anyway. 
If there's a problem, and I think that there really _is_ one, we should at least try to _address_ it instead of just trying to duck it. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:45 ` Rafael J. Wysocki 2008-04-30 21:37 ` Linus Torvalds @ 2008-05-01 13:54 ` Stefan Richter 2008-05-01 14:06 ` Rafael J. Wysocki 1 sibling, 1 reply; 229+ messages in thread From: Stefan Richter @ 2008-05-01 13:54 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Rafael J. Wysocki wrote: > And what do you think is happening _after_ the merge window closes, when > we're supposed to be fixing bugs? People work on new code. That's not correct. People work on new code before, during, and after the merge window. They also fix bugs before, during, and after it. > And, in fact, they have to, if they want to be ready for the next merge > window. To be ready for the next merge window just means to know which code is sufficiently reviewed and tested, and to have it queued up and if necessary synchronized with other pending code. -- Stefan Richter -=====-==--- -=-= ----= http://arcgraph.de/sr/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 13:54 ` Stefan Richter @ 2008-05-01 14:06 ` Rafael J. Wysocki 0 siblings, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 14:06 UTC (permalink / raw) To: Stefan Richter Cc: Linus Torvalds, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thursday, 1 of May 2008, Stefan Richter wrote: > Rafael J. Wysocki wrote: > > And what do you think is happening _after_ the merge window closes, when > > we're supposed to be fixing bugs? People work on new code. > > That's not correct. People work on new code before, during, and after > the merge window. They also fix bugs before, during, and after it. I'm not quite sure if really all of them do. Well, I should have said "some people" instead of just "people" to be fair. > > And, in fact, they have to, if they want to be ready for the next merge > > window. > > To be ready for the next merge window just means to know which code is > sufficiently reviewed and tested, and to have it queued up and if > necessary synchronized with other pending code. Of course it _should_ mean that, but the fact is that unreviewed and untested patches are pushed to Linus, at least from time to time. [Even some known broken patches were pushed to Linus in the past, but we can't prevent that from happening by any process changes.] Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:05 ` Linus Torvalds 2008-04-30 20:14 ` Linus Torvalds 2008-04-30 20:45 ` Rafael J. Wysocki @ 2008-04-30 23:29 ` Paul Mackerras 2008-05-01 1:57 ` Jeff Garzik 2008-05-01 3:47 ` Linus Torvalds 2 siblings, 2 replies; 229+ messages in thread From: Paul Mackerras @ 2008-04-30 23:29 UTC (permalink / raw) To: Linus Torvalds Cc: Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Linus Torvalds writes: > So one of the major things about the short merge window is that it's > hopefully encouraging people to have things ready by the time the merge > window opens, because it's too late to do anything later. Having things ready by the time the merge window opens is difficult when you don't know when the merge window is going to open. OK, after you release a -rc6 or -rc7, we know it's close, but it could still be three weeks off at that point. Or it could be tomorrow. That's mitigated at the moment by having the merge window be two weeks long. So if you open the merge window at a point where I, or someone downstream of me, thought we still had two weeks to go, we can hurry up and try to get stuff finished within the first week and still get it merged. But if you made a really hard and fast rule that only stuff that is in linux-next at the point where the merge window opens can be merged, AND the point at which the merge window opens is unknown and unpredictable within a period of about 4 weeks, then that makes it really tough for those of us downstream of you to plan our work. By the way, if you do want to make that rule, then there's a really easy way to do it - just pull linux-next, and make that one pull be the entire merge window. :) But please give us at least a week's notice that you're going to do that. Paul. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:29 ` Paul Mackerras @ 2008-05-01 1:57 ` Jeff Garzik 2008-05-01 2:52 ` Frans Pop 2008-05-01 3:47 ` Linus Torvalds 1 sibling, 1 reply; 229+ messages in thread From: Jeff Garzik @ 2008-05-01 1:57 UTC (permalink / raw) To: Paul Mackerras Cc: Linus Torvalds, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Paul Mackerras wrote: > By the way, if you do want to make that rule, then there's a really > easy way to do it - just pull linux-next, and make that one pull be > the entire merge window. :) That's a unique and interesting idea... Jeff ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 1:57 ` Jeff Garzik @ 2008-05-01 2:52 ` Frans Pop 0 siblings, 0 replies; 229+ messages in thread From: Frans Pop @ 2008-05-01 2:52 UTC (permalink / raw) To: Jeff Garzik; +Cc: paulus, torvalds, rjw, davem, linux-kernel, akpm, jirislaby Jeff Garzik wrote: > Paul Mackerras wrote: >> By the way, if you do want to make that rule, then there's a really >> easy way to do it - just pull linux-next, and make that one pull be >> the entire merge window. :) > > That's a unique and interesting idea... Full ack. Especially if there was some kind of "pre-merge linux-next freeze" where people (arch maintainers, kernel testers) would be actively invited to do pre-merge testing. During that period only changes that fix reported issues (be it build issues or regressions) would be allowed: - either a revert of the problematic commit - or a targeted fix This could even hugely improve the bisectability of mainline after the merge as such changes could be merged/rebased into the subsystem tree _before_ Linus pulls them into mainline. Currently I avoid -next and -mm and I also don't do any merge window testing. Why? Too much flux, too many issues, too much energy required. But if there was some sort of pre-merge call for testing of an identifiable and relatively stable tree, I would definitely participate in that and be willing to spend time to bisect the hell out of any issues I'd find. Cheers, FJP ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:29 ` Paul Mackerras 2008-05-01 1:57 ` Jeff Garzik @ 2008-05-01 3:47 ` Linus Torvalds 2008-05-01 4:17 ` Jeff Garzik 1 sibling, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 3:47 UTC (permalink / raw) To: Paul Mackerras Cc: Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Paul Mackerras wrote: > > Having things ready by the time the merge window opens is difficult > when you don't know when the merge window is going to open. OK, after > you release a -rc6 or -rc7, we know it's close, but it could still be > three weeks off at that point. Or it could be tomorrow. Well, if the tree is ready, you shouldn't need to care ;) That said: > By the way, if you do want to make that rule, then there's a really > easy way to do it - just pull linux-next, and make that one pull be > the entire merge window. :) But please give us at least a week's > notice that you're going to do that. I'm not going to pull linux-next, because I hate how it gets rebuilt every time it gets done, so I would basically have to pick one at random, and then that would be it. I also do actually try to spread the early pulls out a _bit_, so that if/when problems happen, there's some amount of information in the fact that something started showing up between -git2 and -git3. HOWEVER. One thing that was discussed when linux-next was starting up was whether I would maintain a next branch myself, that people could actually depend on (unlike linux-next, which gets rebuilt). And while I could do that for really core infrastructure changes, I really would hate to see something like that become part of the flow - because I'd hope things that really require it should be so rare that it's not worth it for me to maintain a separate branch for it. 
But there could be some kind of carrot here - maybe I could maintain a "next" branch myself, not for core infrastructure, but for stuff where the maintainer says "hey, I'm ready early, you can pull me into 'next' already". In other words, it wouldn't be "core infrastructure", it would simply be stuff that you already know you'd send to me on the first day of the merge window. And if by maintaining a "next" branch I could encourage people to go early, _and_ let others perhaps build on it and sort out merge conflicts (which you can't do well on linux-next, exactly because it's a bit of a quick-sand and you cannot depend on merging the same order or even the same base in the end), maybe me having a 'next' branch would be worth it. But it would have to be low-maintenance. Something I might open after -rc4, say, and something where I'd expect people to only ask me to pull _once_ (because they really are mostly ready, and can sort out the rest after the merge window), and if they have no open regressions (again, the "carrot" for good behaviour). I'm not saying it's a great idea, but if that kind of flow makes sense to people, maybe it should be on the table as an idea or at least see if it might work. But let's see how linux-next works out. Maybe all the subsystem maintainers can just get their tree in shape, see that it merges in linux-next, and not even need anything else. Then, when the merge window opens, if you're ready, just let me know. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 3:47 ` Linus Torvalds @ 2008-05-01 4:17 ` Jeff Garzik 2008-05-01 4:46 ` Linus Torvalds 2008-05-01 9:17 ` Alan Cox 0 siblings, 2 replies; 229+ messages in thread From: Jeff Garzik @ 2008-05-01 4:17 UTC (permalink / raw) To: Linus Torvalds Cc: Paul Mackerras, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Linus Torvalds wrote: > But there could be some kind of carrot here - maybe I could maintain a > "next" branch myself, not for core infrastructure, but for stuff where the > maintainer says "hey, I'm ready early, you can pull me into 'next' > already". > > In other words, it wouldn't be "core infrastructure", it would simply be > stuff that you already know you'd send to me on the first day of the merge > window. And if by maintaining a "next" branch I could encourage people to > go early, _and_ let others perhaps build on it and sort out merge > conflicts (which you can't do well on linux-next, exactly because it's a > bit of a quick-sand and you cannot depend on merging the same order or > even the same base in the end), maybe me having a 'next' branch would be > worth it. linux-next is _supposed_ to be solely the stuff that is ready to be sent to you upon window-open. The only thing that isn't reliable are the commit ids -- and that's at the request of a large majority of maintainers, who noted to Stephen R that the branch he was pulling from them might get rebased -- thus necessitating the daily tree regeneration. So, I think a 'next' branch from you would open cans o worms: - one more tree to test, and judging from linux-next and -mm it's tough to get developers to test more than just upstream - is the value of holy penguin pee great enough to overcome this another-tree-to-test obstacle? - opens all the debates about running parallel branches, such as, would it be better to /branch/ for 2.6.X-rc, and then keep going full steam on the trunk? 
After all, the primary logic behind 2.6.X-rc is to only take bug fixes, theoretically focusing developers more on that task. But now we are slowly undoing that logic, or at least openly admitting that has been the reality all along. Jeff ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 4:17 ` Jeff Garzik @ 2008-05-01 4:46 ` Linus Torvalds 2008-05-04 13:47 ` Krzysztof Halasa 2008-05-01 9:17 ` Alan Cox 1 sibling, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-01 4:46 UTC (permalink / raw) To: Jeff Garzik Cc: Paul Mackerras, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby On Thu, 1 May 2008, Jeff Garzik wrote: > > linux-next is _supposed_ to be solely the stuff that is ready to be sent to > you upon window-open. Yes, the "stuff" may be supposed to be stable. But the trees feeding it certainly are not. People are rebasing them etc, and it doesn't matter because I think linux-next starts largely from scratch next time around. > So, I think a 'next' branch from you would open cans o worms: > > - one more tree to test, and judging from linux-next and -mm it's tough to get > developers to test more than just upstream > > - is the value of holy penguin pee great enough to overcome this > another-tree-to-test obstacle? > > - opens all the debates about running parallel branches, such as, would it be > better to /branch/ for 2.6.X-rc, and then keep going full steam on the trunk? I do agree. And maybe I should have made it clear that I think it's worth it to me only if it then means that the merge window can shrink. If I'd have both a 'next' branch _and_ a full 2-week merge window, there's no upside. Btw, it wouldn't be another tree to test, since it would presumably be what 'linux-next' starts out from - so it would purely be something that doesn't have the constant re-merging of the more wild-and-crazy 'linux-next' tree. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 4:46 ` Linus Torvalds @ 2008-05-04 13:47 ` Krzysztof Halasa 2008-05-04 15:05 ` Jacek Luczak 0 siblings, 1 reply; 229+ messages in thread From: Krzysztof Halasa @ 2008-05-04 13:47 UTC (permalink / raw) To: Linus Torvalds Cc: Jeff Garzik, Paul Mackerras, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Personally I think the current process works reasonably well, though as always we should try to improve it further... Linus Torvalds <torvalds@linux-foundation.org> writes: > On Thu, 1 May 2008, Jeff Garzik wrote: >> - opens all the debates about running parallel branches, such as, would it be >> better to /branch/ for 2.6.X-rc, and then keep going full steam on >> the trunk? I think you could branch at ~ rc3 (strictly critical fixes only from this point). This way 'next' wouldn't be low-maintenance but the release branch would be. I.e., the merge window would open at ~ rc3. At 'final', the merge window would probably be already closed :-) Something like: - 2.6.26-rc3: 2.6.27 merge window opens, 2.6.26 - fixes only - 1 week later: no core changes for 2.6.27 except fixes (drivers only?) 2.6.26* would receive backports from 2.6.27 (cherry-picking? applying on 2.6.26 and merging?). The "no open regressions" rule would make sense certainly - unless in a specific case agreed otherwise. Perhaps if needed you could let other people do the final release ("stable" extension) and concentrate on the trunk. > If I'd have both a 'next' branch _and_ a full 2-week merge window, there's > no upside. Shorter cycle is the big upside. Perhaps we could start branching later at first - say at 2.6.26-rc5, and see how it works. -- Krzysztof Halasa ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-04 13:47 ` Krzysztof Halasa @ 2008-05-04 15:05 ` Jacek Luczak 0 siblings, 0 replies; 229+ messages in thread From: Jacek Luczak @ 2008-05-04 15:05 UTC (permalink / raw) To: Krzysztof Halasa Cc: Linus Torvalds, Jeff Garzik, Paul Mackerras, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby Krzysztof Halasa wrote: > Personally I think the current process works reasonably well, though > as always we should try to improve it further... > > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> On Thu, 1 May 2008, Jeff Garzik wrote: >>> - opens all the debates about running parallel branches, such as, would it be >>> better to /branch/ for 2.6.X-rc, and then keep going full steam on >>> the trunk? > > I think you could branch at ~ rc3 (strictly critical fixes only from > this point). This way 'next' wouldn't be low-maintenance but the > release branch would be. > > I.e., the merge window would open at ~ rc3. At 'final', the merge window > would probably be already closed :-) > > Something like: > - 2.6.26-rc3: 2.6.27 merge window opens, 2.6.26 - fixes only > - 1 week later: no core changes for 2.6.27 except fixes (drivers only?) Yep, that sounds pretty interesting. But it would be better to start something like a "slow merge window" (explained below) around -rc4 where things really slow down (or used to). The idea of a "slow merge window" would look like: - merge only *obvious* (long-awaited) changes; - merge stuff (fixes) which comes to -rc releases; - merge non-core changes from -mm; After releasing the stable kernel, the old-style merge window opens. > 2.6.26* would receive backports from 2.6.27 (cherry-picking? applying > on 2.6.26 and merging?). > The "no open regressions" rule would make sense certainly - unless in > a specific case agreed otherwise. > > Perhaps if needed you could let other people do the final release > ("stable" extension) and concentrate on the trunk.
> >> If I'd have both a 'next' branch _and_ a full 2-week merge window, there's >> no upside. > > Shorter cycle is the big upside. > > Perhaps we could start branching later at first - say at 2.6.26-rc5, > and see how it works. -Jacek ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 4:17 ` Jeff Garzik 2008-05-01 4:46 ` Linus Torvalds @ 2008-05-01 9:17 ` Alan Cox 1 sibling, 0 replies; 229+ messages in thread From: Alan Cox @ 2008-05-01 9:17 UTC (permalink / raw) To: Jeff Garzik Cc: Linus Torvalds, Paul Mackerras, Rafael J. Wysocki, David Miller, linux-kernel, Andrew Morton, Jiri Slaby > - opens all the debates about running parallel branches, such as, would > it be better to /branch/ for 2.6.X-rc, and then keep going full steam on > the trunk? After all, the primary logic behind 2.6.X-rc is to only take That encourages developers to continue ignoring that stabilizing work. The stall does have a side effect of refocussing them. A branch for -rc and a monthly cycle would be interesting as it would mean that the pushback for not fixing stability problems would be not getting your work pulled for the main tree if you didn't fix the bugs first - and could be both a sufficient incentive and not too vicious, as it would be with a 2-month cycle. Alan ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 19:36 ` Rafael J. Wysocki 2008-04-30 20:00 ` Andrew Morton 2008-04-30 20:05 ` Linus Torvalds @ 2008-04-30 20:15 ` Andrew Morton 2008-04-30 20:31 ` Linus Torvalds 2 siblings, 1 reply; 229+ messages in thread From: Andrew Morton @ 2008-04-30 20:15 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: davem, linux-kernel, torvalds, jirislaby On Wed, 30 Apr 2008 21:36:57 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > IMO, the merge window is way too short for actually testing anything. <jumps up and down> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! _anything_ which appears in 2.6.x-rc1 and which wasn't in 2.6.x-mm1 was snuck in too late (OK, apart from trivia and bugfixes). If we decide that we need to fix the oh-shit-lets-slam-this-in-and-hope problem then I expect we can do so, via fairly reliable means. But the first attempt at solving it should be to ask people to not do that. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:15 ` Andrew Morton @ 2008-04-30 20:31 ` Linus Torvalds 2008-04-30 20:47 ` Dan Noe ` (3 more replies) 0 siblings, 4 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 20:31 UTC (permalink / raw) To: Andrew Morton; +Cc: Rafael J. Wysocki, davem, linux-kernel, jirislaby On Wed, 30 Apr 2008, Andrew Morton wrote: > > <jumps up and down> > > There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! The problem I see with both -mm and linux-next is that they tend to be better at finding the "physical conflict" kind of issues (ie the merge itself fails) than the "code looks ok but doesn't actually work" kind of issue. Why? The tester base is simply too small. Now, if *that* could be improved, that would be wonderful, but I'm not seeing it as very likely. I think we have fairly good penetration these days with the regular -git tree, but I think that one is quite frankly a *lot* less scary than -mm or -next are, and there it has been an absolutely huge boon to get the kernel into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also started something like that). So I'm very pessimistic about getting a lot of test coverage before -rc1. Maybe too pessimistic, who knows? Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:31 ` Linus Torvalds @ 2008-04-30 20:47 ` Dan Noe 2008-04-30 20:59 ` Andrew Morton 2008-04-30 20:54 ` Andrew Morton ` (2 subsequent siblings) 3 siblings, 1 reply; 229+ messages in thread From: Dan Noe @ 2008-04-30 20:47 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby On 4/30/2008 16:31, Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Andrew Morton wrote: >> <jumps up and down> >> >> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > The problem I see with both -mm and linux-next is that they tend to be > better at finding the "physical conflict" kind of issues (ie the merge > itself fails) than the "code looks ok but doesn't actually work" kind of > issue. > > Why? > > The tester base is simply too small. > > Now, if *that* could be improved, that would be wonderful, but I'm not > seeing it as very likely. Perhaps we should be clear and simple about what potential testers should be running at any given point in time. With -mm, linux-next, linux-2.6, etc, as a newcomer I find it difficult to know where my testing time and energy is best directed. Is linux-next the right thing to be running at this point? Is there a need for testing in a particular tree (netdev, x86, etc)? Cheers, Dan -- /--------------- - - - - - - | Dan Noe | http://isomerica.net/~dpn/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:47 ` Dan Noe @ 2008-04-30 20:59 ` Andrew Morton 2008-04-30 21:30 ` Rafael J. Wysocki 2008-04-30 22:53 ` Mariusz Kozlowski 0 siblings, 2 replies; 229+ messages in thread From: Andrew Morton @ 2008-04-30 20:59 UTC (permalink / raw) To: Dan Noe; +Cc: torvalds, rjw, davem, linux-kernel, jirislaby On Wed, 30 Apr 2008 16:47:00 -0400 Dan Noe <dpn@isomerica.net> wrote: > On 4/30/2008 16:31, Linus Torvalds wrote: > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > >> <jumps up and down> > >> > >> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > > > The problem I see with both -mm and linux-next is that they tend to be > > better at finding the "physical conflict" kind of issues (ie the merge > > itself fails) than the "code looks ok but doesn't actually work" kind of > > issue. > > > > Why? > > > > The tester base is simply too small. > > > > Now, if *that* could be improved, that would be wonderful, but I'm not > > seeing it as very likely. > > Perhaps we should be clear and simple about what potential testers > should be running at any given point in time. With -mm, linux-next, > linux-2.6, etc, as a newcomer I find it difficult to know where my > testing time and energy is best directed. -mm consists of the sum of a) the ~80 subsystem maintainers' trees (git and quilt) b) the ~100 subsystem trees which are hosted only in -mm. linux-next consists of only a) Soon I shall remove a) from -mm and will replace it with linux-next (this should be a no-op). Later, I shall start feeding those 100 random subsystems into linux-next as well (somehow). > Is linux-next the right thing to be running at this point? yes. 
If you hit problems then, as part of the problem resolving process a developer _might_ ask you to test one tree specifically, but that would be a pretty unusual circumstance. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:59 ` Andrew Morton @ 2008-04-30 21:30 ` Rafael J. Wysocki 2008-04-30 21:37 ` Andrew Morton 2008-04-30 22:08 ` Linus Torvalds 2008-04-30 22:53 ` Mariusz Kozlowski 1 sibling, 2 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 21:30 UTC (permalink / raw) To: Andrew Morton Cc: Dan Noe, torvalds, davem, linux-kernel, jirislaby, Stephen Rothwell On Wednesday, 30 of April 2008, Andrew Morton wrote: > On Wed, 30 Apr 2008 16:47:00 -0400 > Dan Noe <dpn@isomerica.net> wrote: > > > On 4/30/2008 16:31, Linus Torvalds wrote: > > > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > >> <jumps up and down> > > >> > > >> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > > > > > The problem I see with both -mm and linux-next is that they tend to be > > > better at finding the "physical conflict" kind of issues (ie the merge > > > itself fails) than the "code looks ok but doesn't actually work" kind of > > > issue. > > > > > > Why? > > > > > > The tester base is simply too small. > > > > > > Now, if *that* could be improved, that would be wonderful, but I'm not > > > seeing it as very likely. > > > > Perhaps we should be clear and simple about what potential testers > > should be running at any given point in time. With -mm, linux-next, > > linux-2.6, etc, as a newcomer I find it difficult to know where my > > testing time and energy is best directed. > > -mm consists of the sum of > > a) the ~80 subsystem maintainers' trees (git and quilt) > > b) the ~100 subsystem trees which are hosted only in -mm. > > > linux-next consists of only a) > > Soon I shall remove a) from -mm and will replace it with linux-next (this > should be a no-op). > > Later, I shall start feeding those 100 random subsystems into linux-next > as well (somehow). > > > Is linux-next the right thing to be running at this point? > > yes. 
85% of the code which goes into Linux goes via the ~80 subsystem > maintainers' trees and is (or should be) in linux-next. The other 15% > is the hosted-in-mm work. > > > Is there a > > need for testing in a particular tree (netdev, x86, etc)? > > No, please test the sum-of-all-trees in linux-next. If you hit problems > then, as part of the problem resolving process a developer _might_ ask you > to test one tree specifically, but that would be a pretty unusual > circumstance. How bisectable is linux-next, BTW? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:30 ` Rafael J. Wysocki @ 2008-04-30 21:37 ` Andrew Morton 2008-04-30 22:08 ` Linus Torvalds 1 sibling, 0 replies; 229+ messages in thread From: Andrew Morton @ 2008-04-30 21:37 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: dpn, torvalds, davem, linux-kernel, jirislaby, sfr On Wed, 30 Apr 2008 23:30:20 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > > No, please test the sum-of-all-trees in linux-next. If you hit problems > > then, as part of the problem resolving process a developer _might_ ask you > > to test one tree specifically, but that would be a pretty unusual > > circumstance. > > How bisectable is linux-next, BTW? don't know. Fully, one hopes. Laurent Riffard did a successful bisection last month; I don't see many other signs on the linux-next list. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:30 ` Rafael J. Wysocki 2008-04-30 21:37 ` Andrew Morton @ 2008-04-30 22:08 ` Linus Torvalds 1 sibling, 0 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 22:08 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Andrew Morton, Dan Noe, davem, linux-kernel, jirislaby, Stephen Rothwell On Wed, 30 Apr 2008, Rafael J. Wysocki wrote: > > How bisectable is linux-next, BTW? Each _individual_ release will be entirely bisectable, since it's all git trees, and at no point does anything collapse individual commits together like -mm does. HOWEVER. Due to the way linux-next works, each individual release will be basically unrelated to the previous one, so it gets a bit more exciting indeed when you say "the last linux-next version worked for me, but the current one does not". Git can actually do this - you can make the previous (good) linux-next version be one branch, and the not-directly-related next linux-next build be another, and then "git bisect" will _technically_ work, but: - it will not necessarily be as efficient (because the linux-next trees will have re-done all the merges, so there will be new commits and patterns in between them) - but much more distressingly, if the individual git trees that got merged into linux-next were also using rebasing etc, now even all the *base* commits will be different, and saying that the old release was good tells you almost nothing about the new release! (The good news is that if only a couple of trees do that, the bisection information from the other trees that don't do it will still be valid and useful and help bisection) - also, while it's very easy for somebody who knows and understands git branches, it's technically still quite a bit more challenging than just following a single tree that never rebases (ie mine) and just bisecting within that one. 
So yes, git bisect will work in linux-next, and the fundamental nature of git-bisect will not change at all, but it's going to be a bit weaker "between different versions" of linux-next than it would be for the normal git tree that doesn't do the "merge different trees all over again" thing that linux-next does. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:59 ` Andrew Morton 2008-04-30 21:30 ` Rafael J. Wysocki @ 2008-04-30 22:53 ` Mariusz Kozlowski 2008-04-30 23:11 ` Andrew Morton 2008-05-02 10:20 ` Andi Kleen 1 sibling, 2 replies; 229+ messages in thread From: Mariusz Kozlowski @ 2008-04-30 22:53 UTC (permalink / raw) To: Andrew Morton; +Cc: Dan Noe, torvalds, rjw, davem, linux-kernel, jirislaby Hello, > > Perhaps we should be clear and simple about what potential testers > > should be running at any given point in time. With -mm, linux-next, > > linux-2.6, etc, as a newcomer I find it difficult to know where my > > testing time and energy is best directed. Speaking of energy and time of a tester. I'd like to know where these resources should be directed from the arch point of view. Once I had a plan to buy as many arches as I could get and run a farm of test boxes 8-) But that's hard because of various reasons (money, time, room, energy). What arches need more attention? Which are forgotten? Which are going away? For example, does buying an alphaserver DS 20 (hey - it's cheap) and running tests on it make sense these days? Mariusz ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:53 ` Mariusz Kozlowski @ 2008-04-30 23:11 ` Andrew Morton 2008-05-12 9:27 ` Ben Dooks 2008-05-02 10:20 ` Andi Kleen 1 sibling, 1 reply; 229+ messages in thread From: Andrew Morton @ 2008-04-30 23:11 UTC (permalink / raw) To: Mariusz Kozlowski; +Cc: dpn, torvalds, rjw, davem, linux-kernel, jirislaby On Thu, 1 May 2008 00:53:31 +0200 Mariusz Kozlowski <m.kozlowski@tuxland.pl> wrote: > Hello, > > > > Perhaps we should be clear and simple about what potential testers > > > should be running at any given point in time. With -mm, linux-next, > > > linux-2.6, etc, as a newcomer I find it difficult to know where my > > > testing time and energy is best directed. > > Speaking of energy and time of a tester. I'd like to know where these resources > should be directed from the arch point of view. Once I had a plan to buy as > many arches as I could get and run a farm of test boxes 8-) But that's hard > because of various reasons (money, time, room, energy). What arches need more > attention? Which are forgotten? Which are going away? For example, does buying > an alphaserver DS 20 (hey - it's cheap) and running tests on it make sense > these days? > gee. I think to a large extent this problem solves itself - the "more important" architectures have more people using them, so they get more testing and more immediate testing. However there are gaps. I'd say that arm is one of the more important architectures, but many people who are interested in arm tend to shy away from bleeding-edge kernels for various reasons. Mainly because they have real products to get out the door, rather than dinking around with mainline kernel development. So testing bleeding-edge on some arm systems would be good, I expect. otoh, the platform we break most often is surely plain-old-PCs. If it's bugs you're looking for, I expect that dumpster-diving for as many different PCs as you can and trying to get them to boot (let alone suspend and resume!) 
would keep you entertained ;) ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:11 ` Andrew Morton @ 2008-05-12 9:27 ` Ben Dooks 0 siblings, 0 replies; 229+ messages in thread From: Ben Dooks @ 2008-05-12 9:27 UTC (permalink / raw) To: Andrew Morton Cc: Mariusz Kozlowski, dpn, torvalds, rjw, davem, linux-kernel, jirislaby On Wed, Apr 30, 2008 at 04:11:30PM -0700, Andrew Morton wrote: > On Thu, 1 May 2008 00:53:31 +0200 > Mariusz Kozlowski <m.kozlowski@tuxland.pl> wrote: > > > Hello, > > > > > > Perhaps we should be clear and simple about what potential testers > > > > should be running at any given point in time. With -mm, linux-next, > > > > linux-2.6, etc, as a newcomer I find it difficult to know where my > > > > testing time and energy is best directed. > > > > Speaking of energy and time of a tester. I'd like to know where these resources > > should be directed from the arch point of view. Once I had a plan to buy as > > many arches as I could get and run a farm of test boxes 8-) But that's hard > > because of various reasons (money, time, room, energy). What arches need more > > attention? Which are forgotten? Which are going away? For example, does buying > > an alphaserver DS 20 (hey - it's cheap) and running tests on it make sense > > these days? > > > > gee. > > I think to a large extent this problem solves itself - the "more important" > architectures have more people using them, so they get more testing and > more immediate testing. > > However there are gaps. I'd say that arm is one of the more important > architectures, but many people who are interested in arm tend to shy away > from bleeding-edge kernels for various reasons. Mainly because they have > real products to get out the door, rather than dinking around with mainline > kernel development. So testing bleeding-edge on some arm systems would be > good, I expect. 
Both personally and as a matter of my employer's policy, we try to ensure we can offer our customers at least the previous 'stable' kernel release and ensure that our development process tracks the kernel -rcX candidates. We also run an autobuilder[1] which runs all -git releases through an automated build (no auto-test yet) to ensure that we can detect any build or configuration errors in the releases. ARM is a fast-moving area due to the number of silicon vendors out there who seem intent on doing their own thing, and often forking hardware blocks they use during differing development branches. I am currently looking at merging support for the S3C6400 (new) and finishing S3C2443 (similar to 6400) and the S3C24A0... this means that I have a lot of code to look through before each release and having a stall will just keep the backlog building, making my job a lot more difficult. [1] http://armlinux.simtec.co.uk/kautobuild/ -- Ben (ben@fluff.org, http://www.fluff.org/) 'a smiley only costs 4 bytes' ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:53 ` Mariusz Kozlowski 2008-04-30 23:11 ` Andrew Morton @ 2008-05-02 10:20 ` Andi Kleen 2008-05-02 15:33 ` Mariusz Kozlowski 1 sibling, 1 reply; 229+ messages in thread From: Andi Kleen @ 2008-05-02 10:20 UTC (permalink / raw) To: Mariusz Kozlowski Cc: Andrew Morton, Dan Noe, torvalds, rjw, davem, linux-kernel, jirislaby Mariusz Kozlowski <m.kozlowski@tuxland.pl> writes: > > Speaking of energy and time of a tester. I'd like to know where these resources > should be directed from the arch point of view. Once I had a plan to buy as > many arches as I could get and run a farm of test boxes 8-) But that's hard > because of various reasons (money, time, room, energy). What arches need more > attention? Which are forgotten? Which are going away? For example, does buying > an alphaserver DS 20 (hey - it's cheap) and running tests on it make sense > these days? A lot of bugs are not architecture specific. Or when they are architecture specific they only affect some specific machines in that architecture. But really a lot of bugs should happen on most architectures. Just focussing on lots of boxes is not necessarily productive. My recommendation would be to concentrate on deeper testing (more coverage) on the architectures you have. An interesting project, for example, would be to play with the kernel gcov patch that was recently reposted (I hope it makes mainline eventually). Apply that patch, run all the test suites and tests you usually run on your favourite test box and check how much of the code that is compiled into your kernel was really tested using the coverage information. Then think: what additional tests can you do to get more coverage? Write tests then? Or just write descriptions on what is not tested and send them to the list, as a project for others looking to contribute to the kernel. -Andi ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-02 10:20 ` Andi Kleen @ 2008-05-02 15:33 ` Mariusz Kozlowski 0 siblings, 0 replies; 229+ messages in thread From: Mariusz Kozlowski @ 2008-05-02 15:33 UTC (permalink / raw) To: Andi Kleen Cc: Andrew Morton, Dan Noe, torvalds, rjw, davem, linux-kernel, jirislaby Hello, > > Speaking of energy and time of a tester. I'd like to know where these resources > > should be directed from the arch point of view. Once I had a plan to buy as > > many arches as I could get and run a farm of test boxes 8-) But that's hard > > because of various reasons (money, time, room, energy). What arches need more > > attention? Which are forgotten? Which are going away? For example does buying > > an alphaserver DS 20 (hey - it's cheap) and running tests on it makes sense > > these days? > > A lot of bugs are not architecture specific. Or when they are architecture > specific they only affect some specific machines in that architecture. Yes, there are some bugs that I see only on a specific architecture. Those which are reproducible or have an easy test case I do report to LKML, but there are also bugs I see rarely or just once and they never come back and sometimes as a bonus leave no trace - and these I usually don't report. Providing a test case is a challenge and one can really learn a lot. > But really a lot of bugs should happen on most architectures. Just focussing > on lots of boxes is not necessarily productive. What I meant was one box per architecture, preferably an SMP one where possible - so the number of required boxes is limited. This way instead of just cross-compiling I could actually _run_ the kernel. On the other hand if some arch is close to dead and has no foreseeable future then there is no point in testing it. Also my thinking was that sometimes bugs from other (than x86) architectures can point to some more generic problems. 
Well - I'll buy just a few more and that's it ;) > My recommendation would be to concentrate on deeper testing (more coverage) > on the architectures you have. Can do. > An interesting project for example would be to play with the kernel gcov patch that > was recently reposted (I hope it makes mainline eventually). Apply that patch, > run all the test suites and tests you usually run on your favourite test box > and check how much of the code that is compiled into your kernel was really tested > using the coverage information. Then think: what additional tests can you do to get > more coverage? Write tests then? Or just write descriptions on what is not tested > and send them to the list, as a project for others looking to contribute to the > kernel. Sounds like a plan - will look into that. Mariusz aka arch'aeologist ;) ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:31 ` Linus Torvalds 2008-04-30 20:47 ` Dan Noe @ 2008-04-30 20:54 ` Andrew Morton 2008-04-30 21:21 ` David Miller ` (2 more replies) 2008-04-30 21:52 ` H. Peter Anvin 2008-05-01 0:31 ` RFC: starting a kernel-testers group for newbies Adrian Bunk 3 siblings, 3 replies; 229+ messages in thread From: Andrew Morton @ 2008-04-30 20:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: rjw, davem, linux-kernel, jirislaby On Wed, 30 Apr 2008 13:31:08 -0700 (PDT) Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > <jumps up and down> > > > > There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > The problem I see with both -mm and linux-next is that they tend to be > better at finding the "physical conflict" kind of issues (ie the merge > itself fails) than the "code looks ok but doesn't actually work" kind of > issue. > > Why? > > The tester base is simply too small. > > Now, if *that* could be improved, that would be wonderful, but I'm not > seeing it as very likely. > > I think we have fairly good penetration these days with the regular -git > tree, but I think that one is quite frankly a *lot* less scary than -mm or > -next are, and there it has been an absolutely huge boon to get the kernel > into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also > started something like that). > > So I'm very pessimistic about getting a lot of test coverage before -rc1. > > Maybe too pessimistic, who knows? > Well. We'll see. linux-next is more than another-tree-to-test. It is (or will be) a change in our processes and culture. For a start, subsystem maintainers can no longer whack away at their own tree as if the rest of us don't exist. They now have to be more mindful of merge issues. Secondly, linux-next is more accessible than -mm: more releases, more stable, better tested by he-who-releases it, available via git:// etc. 
It should be very easy for developers to do their weekly "does linux-next boot" test. Plus, of course, people who complain about merge-window breakage only to find that the breakage was already in linux-next except they didn't test it will not have a leg to stand on. I feared that linux-next wouldn't work: that Stephen would stomp off in disgust at all the crap people send at him. But in fact it seems to be going very well from that POV. I get the impression that we're seeing very little non-Stephen testing of linux-next at this stage. I hope we can ramp that up a bit, initially by having core developers do at least some basic sanity testing. linux-next does little to address our two largest (IMO) problems: inadequate review and inadequate response to bug and regression reports. But those problems are harder to fix.. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:54 ` Andrew Morton @ 2008-04-30 21:21 ` David Miller 2008-04-30 21:47 ` Rafael J. Wysocki ` (3 more replies) 2008-04-30 21:42 ` Dmitri Vorobiev 2008-05-09 9:28 ` Jiri Kosina 2 siblings, 4 replies; 229+ messages in thread From: David Miller @ 2008-04-30 21:21 UTC (permalink / raw) To: akpm; +Cc: torvalds, rjw, linux-kernel, jirislaby From: Andrew Morton <akpm@linux-foundation.org> Date: Wed, 30 Apr 2008 13:54:05 -0700 > linux-next does little to address our two largest (IMO) problems: > inadequate review and inadequate response to bug and regression reports. > But those problems are harder to fix.. This is all about positive and negative reinforcement. The people who sit and git bisect their lives away to get the regressions fixed need more positive reinforcement. And the people who stick these regressions into the tree need more negative reinforcement. The current way of dealing with folks who stick broken crud into the tree results in zero change in behavior. People who insert the bum changes into the tree only really have one core thing that they are sensitive to, their reputation. That's why there is an enormous reluctance to even suggest reverts, it looks bad for them and it also makes more work for them in the end. I guess what these folks are truly afraid of is that someone will start tracking reverts and post their results in some presentation at some big conference. I say that would be a good thing. To be honest, hitting the revert button more aggressively and putting the fear of being the "revert king" into everyone's minds might really help with this problem. Currently there is no sufficient negative pushback on people who insert broken crud into the tree. So it should be no surprise that it continues. ^ permalink raw reply [flat|nested] 229+ messages in thread
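[Editorial note: hitting "the revert button" is mechanically trivial, which is part of Miller's point - the cost is reputational, not technical. A toy sketch (throwaway repository; file and commit names are invented) of why a revert, unlike a quiet fix-up, leaves a trackable record in history:]

```shell
set -e
# Throwaway repository standing in for the kernel tree (contents invented).
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email tester@example.com
git config user.name tester

echo "stable code" > driver.c
git add driver.c && git commit -qm "good: base driver"

echo "broken change" >> driver.c
git commit -qam "bad: untested optimization"

# Revert instead of rewriting history: the revert commit stays visible,
# which is exactly what would let someone track reverts per maintainer.
git revert --no-edit HEAD > /dev/null
cat driver.c                 # the bad hunk is gone again
git rev-list --count HEAD    # good + bad + revert = 3 commits
```

A later `git log --oneline` still shows both the bad commit and its revert, so the mistake stays on the public record rather than being silently erased.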
* Re: Slow DOWN, please!!! 2008-04-30 21:21 ` David Miller @ 2008-04-30 21:47 ` Rafael J. Wysocki 2008-04-30 22:02 ` Dmitri Vorobiev ` (2 subsequent siblings) 3 siblings, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 21:47 UTC (permalink / raw) To: David Miller; +Cc: akpm, torvalds, linux-kernel, jirislaby On Wednesday, 30 of April 2008, David Miller wrote: > From: Andrew Morton <akpm@linux-foundation.org> > Date: Wed, 30 Apr 2008 13:54:05 -0700 > > > linux-next does little to address our two largest (IMO) problems: > > inadequate review and inadequate response to bug and regression reports. > > But those problems are harder to fix.. > > This is all about positive and negative reinforcement. > > The people who sit and git bisect their lives away to get the > regressions fixed need more positive reinforcement. And the people > who stick these regressions into the tree need more negative > reinforcement. > > The current way of dealing with folks who stick broken crud into the > tree results in zero change in behavior. > > People who insert the bum changes into the tree only really have one > core thing that they are sensitive to, their reputation. That's why > there is an enormous reluctance to even suggest reverts, it looks bad > for them and it also makes more work for them in the end. > > I guess what these folks are truly afraid of is that someone will > start tracking reverts and post their results in some presentation > at some big conference. I say that would be a good thing. To > be honest, hitting the revert button more aggressively and putting > the fear of being the "revert king" into everyone's minds might > really help with this problem. Well, probably ... > Currently there is no sufficient negative pushback on people who > insert broken crud into the tree. So it should be no surprise that it > continues. ... but that should also point at the trees through which the bugs are introduced. 
I mean, the maintainers should be more careful about what they take into their trees and push upstream. If that happens, they'll (hopefully) put some more pressure on patch submitters. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:21 ` David Miller 2008-04-30 21:47 ` Rafael J. Wysocki @ 2008-04-30 22:02 ` Dmitri Vorobiev 2008-04-30 22:19 ` Ingo Molnar 2008-05-02 13:37 ` Helge Hafting 3 siblings, 0 replies; 229+ messages in thread From: Dmitri Vorobiev @ 2008-04-30 22:02 UTC (permalink / raw) To: David Miller; +Cc: akpm, torvalds, rjw, linux-kernel, jirislaby, Ingo Molnar David Miller writes: > From: Andrew Morton <akpm@linux-foundation.org> > Date: Wed, 30 Apr 2008 13:54:05 -0700 > >> linux-next does little to address our two largest (IMO) problems: >> inadequate review and inadequate response to bug and regression reports. >> But those problems are harder to fix.. > > This is all about positive and negative reinforcement. > > The people who sit and git bisect their lives away to get the > regressions fixed need more positive reinforcement. And the people > who stick these regressions into the tree need more negative > reinforcement. > > The current way of dealing with folks who stick broken crud into the > tree results in zero change in behavior. > > People who insert the bum changes into the tree only really have one > core thing that they are sensitive to, their reputation. That's why > there is an enormous reluctance to even suggest reverts, it looks bad > for them and it also makes more work for them in the end. > > I guess what these folks are truly afraid of is that someone will > start tracking reverts and post their results in some presentation > at some big conference. I say that would be a good thing. To > be honest, hitting the revert button more aggressively and putting > the fear of being the "revert king" into everyone's minds might > really help with this problem. > > Currently there is no sufficient negative pushback on people who > insert broken crud into the tree. So it should be no surprise that it > continues. I'm not a frequent poster to this mailing list, but I do spend a good portion of my life reading it. 
Please excuse me for expressing my very personal opinion, but I thought you might be interested in a detached view of the situation. I think that many have guessed that I would like to talk about the attacks on Ingo and those going in the other direction. Believe me, this fight looks childish: it has obviously gone beyond the purely technical disputes which Linus is so rightfully keen on writing about. In no case am I implying any kind of offense, but I do believe that bad emotions do hinder the community from advancing with the technical things. Dmitri ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:21 ` David Miller 2008-04-30 21:47 ` Rafael J. Wysocki 2008-04-30 22:02 ` Dmitri Vorobiev @ 2008-04-30 22:19 ` Ingo Molnar 2008-04-30 22:22 ` David Miller ` (2 more replies) 2008-05-02 13:37 ` Helge Hafting 3 siblings, 3 replies; 229+ messages in thread From: Ingo Molnar @ 2008-04-30 22:19 UTC (permalink / raw) To: David Miller; +Cc: akpm, torvalds, rjw, linux-kernel, jirislaby * David Miller <davem@davemloft.net> wrote: > > linux-next does little to address our two largest (IMO) problems: > > inadequate review and inadequate response to bug and regression > > reports. But those problems are harder to fix.. > > This is all about positive and negative reinforcement. > > The people who sit and git bisect their lives away to get the > regressions fixed need more positive reinforcement. And the people > who stick these regressions into the tree need more negative > reinforcement. What we need is not 'negative reinforcement'. That is just nasty, open warfare between isolated parties, expressed in a politically correct way. The core problem is that every maintainer has his own subjective, asymmetric view and experience about this matter: to him his own tree is almost problem-free and most problems are very easy to fix, while problems in other trees are a nuisance that should never have been put upstream. Also, people get defensive when their regressions get pointed out in anything but the most respectful and casual manner. For example, how on earth do i tell you that during the v2.6.24 merge window, half of all x86 test-machines for me and others were broken because they had no networking, for more than a week in a row? Are you surprised about this (true) experience we had? Do you feel insulted? Do you feel unfairly handled and slandered? 
The same goes in the other direction as well - you were just hit by scheduler tree related regressions that were only triggered on your 128-way sparc64, but not on our 64way x86 and smaller boxes. The thing is, what we really need is more cooperation and earlier integration - more people actually testing linux-next occasionally to see how things will look in the next merge window. linux-next doing build tests is fine, but the nasty regressions that will hit your box can only be solved if _you_ boot linux-next at least once before the merge window opens. The regressions that will hit my box can only be avoided if i test your tree. hm? And can we please somehow talk about this without flaming each other in the process? Ingo ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:19 ` Ingo Molnar @ 2008-04-30 22:22 ` David Miller 2008-04-30 22:39 ` Rafael J. Wysocki 2008-04-30 22:35 ` Ingo Molnar 2008-05-05 3:04 ` Rusty Russell 2 siblings, 1 reply; 229+ messages in thread From: David Miller @ 2008-04-30 22:22 UTC (permalink / raw) To: mingo; +Cc: akpm, torvalds, rjw, linux-kernel, jirislaby From: Ingo Molnar <mingo@elte.hu> Date: Thu, 1 May 2008 00:19:36 +0200 > The same goes in the other direction as well - you were just hit by > scheduler tree related regressions that were only triggered on your > 128-way sparc64, but not on our 64way x86 and smaller boxes. You keep saying this over and over again, but the powerpc folks hit this stuff too. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:22 ` David Miller @ 2008-04-30 22:39 ` Rafael J. Wysocki 2008-04-30 22:54 ` david 2008-04-30 23:12 ` Willy Tarreau 0 siblings, 2 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 22:39 UTC (permalink / raw) To: David Miller; +Cc: mingo, akpm, torvalds, linux-kernel, jirislaby On Thursday, 1 of May 2008, David Miller wrote: > From: Ingo Molnar <mingo@elte.hu> > Date: Thu, 1 May 2008 00:19:36 +0200 > > > The same goes in the other direction as well - you were just hit by > > scheduler tree related regressions that were only triggered on your > > 128-way sparc64, but not on our 64way x86 and smaller boxes. > > You keep saying this over and over again, but the powerpc folks hit > this stuff too. Well, I think that some changes need some wider testing anyway. They may be correct from the author's point of view and even from the knowledge and point of view of the maintainer who takes them into his tree. That's because no one knows everything and it'll always be like this. Still, with the current process such "suspicious" changes go in as parts of large series of commits and need to be "rediscovered" by the affected testers with the help of bisection. Moreover, many changes of this kind may go in from many different sources at the same time and that's really problematic. In fact, so many changes go in at a time during a merge window that we often can't really say which of them causes the breakage observed by testers, and bisection, which IMO should really be a last-resort tool, is used as the main debugging technique. ^ permalink raw reply [flat|nested] 229+ messages in thread
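[Editorial note: the bisection being leaned on here can be sketched on an invented toy history - no kernel involved, and every commit, tag, and file below is made up. These are the same mechanics testers in this thread were running against the real tree, automated with `git bisect run`:]

```shell
set -e
# Invented toy history: ten commits, with the "regression" landing in commit 7.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email tester@example.com
git config user.name tester
for i in $(seq 1 10); do
    echo "change $i" > work.c
    if [ "$i" -ge 7 ]; then echo BROKEN > status; else echo OK > status; fi
    git add work.c status
    git commit -qm "commit $i"
    git tag "c$i"
done

# Automated bisect: the test command exits non-zero on broken commits.
git bisect start c10 c1 > /dev/null
git bisect run sh -c '! grep -q BROKEN status' > /dev/null
first_bad=$(git show -s --format=%s refs/bisect/bad)
echo "$first_bad"
git bisect reset > /dev/null
```

With a real kernel, the test command would be a build-and-boot script instead of a grep, which is exactly why each bisection step is so expensive and why it is better used as a last resort.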
* Re: Slow DOWN, please!!! 2008-04-30 22:39 ` Rafael J. Wysocki @ 2008-04-30 22:54 ` david 2008-04-30 23:12 ` Willy Tarreau 1 sibling, 0 replies; 229+ messages in thread From: david @ 2008-04-30 22:54 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Miller, mingo, akpm, torvalds, linux-kernel, jirislaby On Thu, 1 May 2008, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, David Miller wrote: >> From: Ingo Molnar <mingo@elte.hu> >> Date: Thu, 1 May 2008 00:19:36 +0200 >> >>> The same goes in the other direction as well - you were just hit by >>> scheduler tree related regressions that were only triggered on your >>> 128-way sparc64, but not on our 64way x86 and smaller boxes. >> >> You keep saying this over and over again, but the powerpc folks hit >> this stuff too. > > Well, I think that some changes need some wider testing anyway. > > They may be correct from the author's point of view and even from the knowledge > and point of view of the maintainer who takes them into his tree. That's > because no one knows everything and it'll always be like this. I think this is a very important point to keep in mind > Still, with the current process such "suspicious" changes go in as parts of > large series of commits and need to be "rediscovered" by the affected testers > with the help of bisection. Moreover, many changes of this kind may go in from > many different sources at the same time and that's really problematic. git makes it easy to have many branches that get merged upstream, would it really help much if these changes were initially done as separate branches and then merged in? 
if so, there are two ways to do this: either have Ingo (and others) create a small forest of branches that get merged into linux-next, or have Ingo (and others) create a small forest of branches that get merged into one 'please pull' branch that gets merged into linux-next. the second has the advantage that merge conflicts between the different branches will be resolved before they go upstream, and there's less work to be done upstream (as the upstream doesn't need to keep adding branches to pull); the first may have an advantage in terms of making the different branches more visible. > In fact, so many changes go in at a time during a merge window, that we often > can't really say which of them causes the breakage observed by testers, and > bisection, which IMO should really be a last-resort tool, is used as the main > debugging technique. there are always going to be cases where the problem can only be found by bisecting it, but I agree that there seems to be a little too much reliance on bisecting (but that was a heated topic a few weeks ago, let's not re-hash it now) David Lang ^ permalink raw reply [flat|nested] 229+ messages in thread
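[Editorial note: the second option above, topic branches collected into a single 'please pull' integration branch, might look like the following on an invented toy repository - all branch, file, and commit names are made up for illustration:]

```shell
set -e
# Invented toy repository; branch and file names are made up.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email tester@example.com
git config user.name tester
git commit -q --allow-empty -m "base"
base=$(git symbolic-ref --short HEAD)

# Two independent topic branches, as a maintainer might keep them:
git checkout -qb topic/sched "$base"
echo "sched work" > sched.c && git add sched.c && git commit -qm "sched: tweak"
git checkout -qb topic/net "$base"
echo "net work" > net.c && git add net.c && git commit -qm "net: tweak"

# One integration branch merging all topics; conflicts between the topics
# get resolved here, before anything is sent upstream:
git checkout -qb for-next "$base"
git merge --no-edit topic/sched topic/net > /dev/null
ls    # both sched.c and net.c are present on for-next
```

If the topics conflict, the merge fails on the integration branch rather than upstream, which is the advantage David Lang describes.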
* Re: Slow DOWN, please!!! 2008-04-30 22:39 ` Rafael J. Wysocki 2008-04-30 22:54 ` david @ 2008-04-30 23:12 ` Willy Tarreau 2008-04-30 23:59 ` Rafael J. Wysocki 2008-05-01 0:15 ` Chris Shoemaker 1 sibling, 2 replies; 229+ messages in thread From: Willy Tarreau @ 2008-04-30 23:12 UTC (permalink / raw) To: Rafael J. Wysocki Cc: David Miller, mingo, akpm, torvalds, linux-kernel, jirislaby On Thu, May 01, 2008 at 12:39:01AM +0200, Rafael J. Wysocki wrote: > On Thursday, 1 of May 2008, David Miller wrote: > > From: Ingo Molnar <mingo@elte.hu> > > Date: Thu, 1 May 2008 00:19:36 +0200 > > > > > The same goes in the other direction as well - you were just hit by > > > scheduler tree related regressions that were only triggered on your > > > 128-way sparc64, but not on our 64way x86 and smaller boxes. > > > > You keep saying this over and over again, but the powerpc folks hit > > this stuff too. > > Well, I think that some changes need some wider testing anyway. > > They may be correct from the author's point of view and even from the knowledge > and point of view of the maintainer who takes them into his tree. That's > because no one knows everything and it'll always be like this. > > Still, with the current process such "suspicious" changes go in as parts of > large series of commits and need to be "rediscovered" by the affected testers > with the help of bisection. Moreover, many changes of this kind may go in from > many different sources at the same time and that's really problematic. That's very true IMHO and is the thing which has been progressively appearing since we merge large amounts of code at once. In the "good old days", something did not work, the first one to discover it could quickly report it on LKML : "hey, my 128-way sparc64 does not boot anymore, anybody has any clue", and another one immediately found this mail (better signal/noise ratio on LKML at this time) and say "oops, I suspect that change, try to revert it". Now, it's close to impossible. 
Maintainers frequently ask for bisection, in part because nobody knows what code is merged, and they have to pull Linus' tree to know when their changes have been pulled. That may be part of the "fun" aspect that Davem is seeing going away in exchange for more administrative relations. But if we agree that nobody knows all the changes, we must agree that we need tools to track them, and tools are fundamentally incompatible with smart human relations. > In fact, so many changes go in at a time during a merge window, that we often > can't really say which of them causes the breakage observed by testers, and > bisection, which IMO should really be a last-resort tool, is used as the main > debugging technique. Maybe we could slightly improve the process by releasing more often, but based on topics. Small sets of minimally-overlapping topics would get merged in each release, and other topics would only be allowed to pull fixes. That way everybody still gets some work merged, everybody tests and problems are more easily spotted. I know this is in part what Andrew tries to do when proposing to integrate trees, but maybe some approximate rules should be proposed in order for developers to organize their work. This would begin with announcing topics to be considered for the next branch very early. This would also make it more natural for developers to have creation and bug-tracking phases. Willy ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:12 ` Willy Tarreau @ 2008-04-30 23:59 ` Rafael J. Wysocki 2008-05-01 0:15 ` Chris Shoemaker 1 sibling, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-04-30 23:59 UTC (permalink / raw) To: Willy Tarreau Cc: David Miller, mingo, akpm, torvalds, linux-kernel, jirislaby On Thursday, 1 of May 2008, Willy Tarreau wrote: > On Thu, May 01, 2008 at 12:39:01AM +0200, Rafael J. Wysocki wrote: > > On Thursday, 1 of May 2008, David Miller wrote: > > > From: Ingo Molnar <mingo@elte.hu> > > > Date: Thu, 1 May 2008 00:19:36 +0200 > > > > > > > The same goes in the other direction as well - you were just hit by > > > > scheduler tree related regressions that were only triggered on your > > > > 128-way sparc64, but not on our 64way x86 and smaller boxes. > > > > > > You keep saying this over and over again, but the powerpc folks hit > > > this stuff too. > > > > Well, I think that some changes need some wider testing anyway. > > > > They may be correct from the author's point of view and even from the knowledge > > and point of view of the maintainer who takes them into his tree. That's > > because no one knows everything and it'll always be like this. > > > > Still, with the current process such "suspicious" changes go in as parts of > > large series of commits and need to be "rediscovered" by the affected testers > > with the help of bisection. Moreover, many changes of this kind may go in from > > many different sources at the same time and that's really problematic. > > That's very true IMHO and is the thing which has been progressively > appearing since we merge large amounts of code at once. 
In the "good > old days", something did not work, the first one to discover it could > quickly report it on LKML : "hey, my 128-way sparc64 does not boot > anymore, anybody has any clue", and another one immediately found > this mail (better signal/noise ratio on LKML at this time) and say > "oops, I suspect that change, try to revert it". > > Now, it's close to impossible. Maintainers frequently ask for bisection, > in part because nobody knows what code is merged, and they have to pull > Linus' tree to know when their changes have been pulled. That may be > part of the "fun" aspect that Davem is seeing going away in exchange > for more administrative relations. But if we agree that nobody knows > all the changes, we must agree that we need tools to track them, and > tools are fundamentally incompatible with smart human relations. > > > In fact, so many changes go in at a time during a merge window, that we often > > can't really say which of them causes the breakage observed by testers, and > > bisection, which IMO should really be a last-resort tool, is used as the main > > debugging technique. > > Maybe we could slightly improve the process by releasing more often, but > based on topics. Small sets of minimally-overlapping topics would get > merged in each release, and other topics would only be allowed to pull > fixes. That way everybody still gets some work merged, everybody tests > and problems are more easily spotted. I like this idea. > I know this is in part what Andrew tries to do when proposing to > integrate trees, but maybe some approximate rules should be proposed > in order for developers to organize their work. This would begin > with announcing topics to be considered for the next branch very early. > This would also make it more natural for developers to have creation > and bug-tracking phases. Yes, that's reasonable. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:12 ` Willy Tarreau 2008-04-30 23:59 ` Rafael J. Wysocki @ 2008-05-01 0:15 ` Chris Shoemaker 2008-05-01 5:09 ` Willy Tarreau 1 sibling, 1 reply; 229+ messages in thread From: Chris Shoemaker @ 2008-05-01 0:15 UTC (permalink / raw) To: Willy Tarreau Cc: Rafael J. Wysocki, David Miller, mingo, akpm, torvalds, linux-kernel, jirislaby On Thu, May 01, 2008 at 01:12:21AM +0200, Willy Tarreau wrote: > On Thu, May 01, 2008 at 12:39:01AM +0200, Rafael J. Wysocki wrote: > > In fact, so many changes go in at a time during a merge window, that we often > > can't really say which of them causes the breakage observed by testers, and > > bisection, which IMO should really be a last-resort tool, is used as the main > > debugging technique. > > Maybe we could slightly improve the process by releasing more often, but > based on topics. Small sets of minimally-overlapping topics would get > merged in each release, and other topics would only be allowed to pull > fixes. That way everybody still gets some work merged, everybody tests > and problems are more easily spotted. > > I know this is in part what Andrew tries to do when proposing to > integrate trees, but maybe some approximate rules should be proposed > in order for developers to organize their works. This would begin > with announcing topics to be considered for next branch very early. > This would also make it more natural for developers to have creation > and bug-tracking phases. What would this look like, notionally? Say the releases were twice as frequent with Stage A and Stage B. How could the topics be grouped into the stages? Could bugfixes of any type be merged in either window? Would this only apply to "new" features, API changes, etc.? Or would maintenance-type changes have to be assigned to a stage, too? -chris ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 0:15 ` Chris Shoemaker @ 2008-05-01 5:09 ` Willy Tarreau 0 siblings, 0 replies; 229+ messages in thread From: Willy Tarreau @ 2008-05-01 5:09 UTC (permalink / raw) To: Chris Shoemaker Cc: Rafael J. Wysocki, David Miller, mingo, akpm, torvalds, linux-kernel, jirislaby On Wed, Apr 30, 2008 at 08:15:00PM -0400, Chris Shoemaker wrote: > On Thu, May 01, 2008 at 01:12:21AM +0200, Willy Tarreau wrote: > > On Thu, May 01, 2008 at 12:39:01AM +0200, Rafael J. Wysocki wrote: > > > In fact, so many changes go in at a time during a merge window, that we often > > > can't really say which of them causes the breakage observed by testers, and > > > bisection, which IMO should really be a last-resort tool, is used as the main > > > debugging technique. > > > > Maybe we could slightly improve the process by releasing more often, but > > based on topics. Small sets of minimally-overlapping topics would get > > merged in each release, and other topics would only be allowed to pull > > fixes. That way everybody still gets some work merged, everybody tests > > and problems are more easily spotted. > > > > I know this is in part what Andrew tries to do when proposing to > > integrate trees, but maybe some approximate rules should be proposed > > in order for developers to organize their works. This would begin > > with announcing topics to be considered for next branch very early. > > This would also make it more natural for developers to have creation > > and bug-tracking phases. > > What would this look like, notionally? Say the releases were twice as > frequent with Stage A and Stage B. How could the topic be grouped > into the stages? Could bugfixes of any type be merged in either > window? Would this only apply to "new" features, API changes, etc? or > would maintenance-type changes have to be assigned to a stage, too? bug fixes are of course always possible, just that we limit important changes, i.e. 
the ones which randomly break and that take a lot of time to track down because everyone has changed something. > -chris willy ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:19 ` Ingo Molnar 2008-04-30 22:22 ` David Miller @ 2008-04-30 22:35 ` Ingo Molnar 2008-04-30 22:49 ` Andrew Morton 2008-04-30 22:51 ` David Miller 2008-05-05 3:04 ` Rusty Russell 2 siblings, 2 replies; 229+ messages in thread From: Ingo Molnar @ 2008-04-30 22:35 UTC (permalink / raw) To: David Miller Cc: akpm, torvalds, rjw, linux-kernel, jirislaby, Thomas Gleixner * Ingo Molnar <mingo@elte.hu> wrote: > What we need is not 'negative reinforcement'. That is just nasty, open > warfare between isolated parties, expressed in a politically correct > way. in more detail: any "negative reinforcement" should be on the _technical_ level, i.e. when changes are handled - not at the broad tree level. Sure, there are exceptions, etc. - but by the time stuff goes upstream it's too late and we've got to fix stuff instead of trying to push back on each other. by earlier integration (= linux-next) we can do the pushback much earlier, in a much more granular, much more technical and much less personal way: "hey Ingo, your new sched-dizzy-blah patch broke stuff here, zap it" or "hey Dave, that socket-foo rewrite just broke things here, zap it". git-revert _kind of_ makes that possible too, but people still feel too personal about reverts - they take it as an intrusion into their subsystem and regard it as an attack against their competence as a maintainer. and this is all so typical btw.: the most effective measure against human warfare is for people to see each other and to talk to each other. [ That's one reason why i am so worried about mailing list isolation. People get more distant, they mean less to each other, work less with each other => Linux suffers. I do accept that for some people lkml is simply too noisy - but i think the cure is worse than the disease. ] Ingo ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:35 ` Ingo Molnar @ 2008-04-30 22:49 ` Andrew Morton 2008-04-30 22:51 ` David Miller 1 sibling, 0 replies; 229+ messages in thread From: Andrew Morton @ 2008-04-30 22:49 UTC (permalink / raw) To: Ingo Molnar; +Cc: davem, torvalds, rjw, linux-kernel, jirislaby, tglx On Thu, 1 May 2008 00:35:09 +0200 Ingo Molnar <mingo@elte.hu> wrote: > git-revert _kind of_ makes that possible too, but people still feel too > personal about reverts - they take it as intrusion into their subsystem > and regard it as an attack against their competence as a maintainer. I'd question this. People often seem pretty happy to yank their stuff out of there - it relieves ongoing embarrassment and it relieves time pressure - they can have another go and get it right at their leisure. Of course, reverting is easy. The hard part is often finding the thing which needs to be reverted. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:35 ` Ingo Molnar 2008-04-30 22:49 ` Andrew Morton @ 2008-04-30 22:51 ` David Miller 2008-05-01 1:40 ` Ingo Molnar 2008-05-01 2:48 ` Adrian Bunk 1 sibling, 2 replies; 229+ messages in thread From: David Miller @ 2008-04-30 22:51 UTC (permalink / raw) To: mingo; +Cc: akpm, torvalds, rjw, linux-kernel, jirislaby, tglx From: Ingo Molnar <mingo@elte.hu> Date: Thu, 1 May 2008 00:35:09 +0200 > > * Ingo Molnar <mingo@elte.hu> wrote: > > > What we need is not 'negative reinforcement'. That is just nasty, open > > warfare between isolated parties, expressed in a politically correct > > way. > > in more detail: any "negative reinforcement" should be on the > _technical_ level, i.e. when changes are handled - not at the broad tree > level. Sure, and I'll provide some right here. Ingo, let me know what I need to do to change your behavior in situations like the one I'm about to describe, ok? Today, you merged in this bogus "regression fix". commit ae3a0064e6d69068b1c9fd075095da062430bda9 Author: Ingo Molnar <mingo@elte.hu> Date: Wed Apr 30 00:15:31 2008 +0200 inlining: do not allow gcc below version 4 to optimize inlining fix the condition to match intention: always use the old inlining behavior on all gcc versions below 4. this should solve the UML build problem. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Did you actually read the UML build failure report? Adrian Bunk specifically stated that the UML build failure regression occurs with GCC version 4.3 Next, did you test this regression fix? Next, if you could not test this regression fix, did you wait patiently for the bug reporter to validate your fix? Adrian responded that it didn't fix the problem, but that was after you queued this up to Linus already. This proves my main beef with you Ingo. You're way too trigger happy, you merge things in too quickly, without checks and without verifications. 
To an arbitrary person reading the commit logs, the above looks like you fixed something, when you actually didn't fix anything. And let's address this specific inlining optimization and all the fallout it's generating. You said you merged this thing in because you didn't want to "wait a year for such a useful feature." In hindsight, that's exactly what we should have done, waited until we could sort out all of these issues. Yes, even if it would take a year. Now we're forced to sort it out somehow, unless you can get beyond your pride and revert the original change. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:51 ` David Miller @ 2008-05-01 1:40 ` Ingo Molnar 2008-05-01 2:48 ` Adrian Bunk 1 sibling, 0 replies; 229+ messages in thread From: Ingo Molnar @ 2008-05-01 1:40 UTC (permalink / raw) To: David Miller; +Cc: akpm, torvalds, rjw, linux-kernel, jirislaby, tglx * David Miller <davem@davemloft.net> wrote: > Ingo, let me know what I need to do to change your behavior in > situations like the one I'm about to describe, ok? > > Today, you merged in this bogus "regression fix". the motivation of that fix wasnt UML - that was just an (indeed incorrect) after-thought when i wrote up the commit log. The fix is obviously right - although it doesnt fix UML. btw., did you see my stream of fixes about UML? > To an arbitrary person reading the commit logs, the above looks like > you fixed something, when you actually didn't fix anything. it is wrong that it "doesnt fix anything". Look at the change itself: - * Force always-inline if the user requests it so via the .config: + * Force always-inline if the user requests it so via the .config, + * or if gcc is too old: */ #if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \ - !defined(CONFIG_OPTIMIZE_INLINING) && (__GNUC__ >= 4) + !defined(CONFIG_OPTIMIZE_INLINING) || (__GNUC__ < 4) before the change it was only possible to disable the optimization on gcc 4 and above. The intended (and now implemented) condition is to only change anything on gcc 4 and above. I.e. on gcc3x the config option has no effect at all - and that's what we want. Ingo ^ permalink raw reply [flat|nested] 229+ messages in thread
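Editorial aside: Ingo's point turns on C operator precedence — in the old test, `&&` binds tighter than `||`, so the `__GNUC__ >= 4` check only ever gated the `CONFIG_OPTIMIZE_INLINING` half. A minimal sketch re-encoding both preprocessor conditions as plain booleans makes the behavioral difference checkable (the function names are illustrative, not from the kernel):

```python
# Re-encode the two preprocessor conditions as booleans so the
# behavioral difference is easy to verify.
#   arch_ok  -> CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING is defined
#   opt_on   -> CONFIG_OPTIMIZE_INLINING is defined
#   gcc4plus -> __GNUC__ >= 4

def old_forces_inline(arch_ok: bool, opt_on: bool, gcc4plus: bool) -> bool:
    # In C, '&&' binds tighter than '||', hence this grouping.
    return (not arch_ok) or ((not opt_on) and gcc4plus)

def new_forces_inline(arch_ok: bool, opt_on: bool, gcc4plus: bool) -> bool:
    return (not arch_ok) or (not opt_on) or (not gcc4plus)

# gcc 3.x with the option enabled: the old test did NOT force
# always-inline; the fixed test does. That is the entire change.
assert old_forces_inline(True, True, False) is False
assert new_forces_inline(True, True, False) is True

# On gcc 4+ the two tests agree, so there the config option alone decides.
for opt_on in (True, False):
    assert (old_forces_inline(True, opt_on, True)
            == new_forces_inline(True, opt_on, True))
```

This matches Ingo's stated intent: on gcc 3.x the config option has no effect and old-style inlining is always forced; only gcc 4+ behavior depends on `CONFIG_OPTIMIZE_INLINING`.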
* Re: Slow DOWN, please!!! 2008-04-30 22:51 ` David Miller 2008-05-01 1:40 ` Ingo Molnar @ 2008-05-01 2:48 ` Adrian Bunk 1 sibling, 0 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 2:48 UTC (permalink / raw) To: David Miller; +Cc: mingo, akpm, torvalds, rjw, linux-kernel, jirislaby, tglx On Wed, Apr 30, 2008 at 03:51:49PM -0700, David Miller wrote: > From: Ingo Molnar <mingo@elte.hu> > Date: Thu, 1 May 2008 00:35:09 +0200 > > > > > * Ingo Molnar <mingo@elte.hu> wrote: > > > > > What we need is not 'negative reinforcement'. That is just nasty, open > > > warfare between isolated parties, expressed in a politically correct > > > way. > > > > in more detail: any "negative reinforcement" should be on the > > _technical_ level, i.e. when changes are handled - not at the broad tree > > level. > > Sure, and I'll provide some right here. > > Ingo, let me know what I need to do to change your behavior in > situations like the one I'm about to describe, ok? > > Today, you merged in this bogus "regression fix". > > commit ae3a0064e6d69068b1c9fd075095da062430bda9 > Author: Ingo Molnar <mingo@elte.hu> > Date: Wed Apr 30 00:15:31 2008 +0200 > > inlining: do not allow gcc below version 4 to optimize inlining > > fix the condition to match intention: always use the old inlining > behavior on all gcc versions below 4. > > this should solve the UML build problem. > > Signed-off-by: Ingo Molnar <mingo@elte.hu> > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > > Did you actually read the UML build failure report? > > Adrian Bunk specifically stated that the UML build failure regression > occurs with GCC version 4.3 > > Next, did you test this regression fix? > > Next, if you could not test this regression fix, did you wait > patiently for the bug reporter to validate your fix? Adrian > responded that it didn't fix the problem, but that was after > you queued this up to Linus already. >... 
You got the facts wrong, it is even worse: It was Ingo himself who reported this bug. [1] Ingo managed to send an untested and not working patch for a bug he reported himself... cu Adrian BTW: I finally figured out what is behind the problems on UML, and this is not related to any recent kernel changes. Patch comes when I'm awake again. [1] http://lkml.org/lkml/2008/4/26/151 -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:19 ` Ingo Molnar 2008-04-30 22:22 ` David Miller 2008-04-30 22:35 ` Ingo Molnar @ 2008-05-05 3:04 ` Rusty Russell 2 siblings, 0 replies; 229+ messages in thread From: Rusty Russell @ 2008-05-05 3:04 UTC (permalink / raw) To: Ingo Molnar; +Cc: David Miller, akpm, torvalds, rjw, linux-kernel, jirislaby On Thursday 01 May 2008 08:19:36 Ingo Molnar wrote: > * David Miller <davem@davemloft.net> wrote: > > And the people who stick these regressions into the tree need more > > negative reinforcement. > > What we need is not 'negative reinforcement'. Over time as patches succeed more I reduce testing so I can "get things done faster". Eventually I screw up, and get more cautious on checking. It's a dynamic balance. With reduced review comes sloppier code. If we can't increase review, we can at least increase the penalty for screwing up when I do get caught. If vger dropped all my emails for a week after I broke the kernel, I'd be far more careful OR I'd find efficient ways to avoid doing that (like increasing review, or automated testing). Either way, it's a win. But I'm sure everyone else is far more disciplined than I... Rusty. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:21 ` David Miller ` (2 preceding siblings ...) 2008-04-30 22:19 ` Ingo Molnar @ 2008-05-02 13:37 ` Helge Hafting 3 siblings, 0 replies; 229+ messages in thread From: Helge Hafting @ 2008-05-02 13:37 UTC (permalink / raw) To: David Miller; +Cc: akpm, torvalds, rjw, linux-kernel, jirislaby David Miller wrote: [...] > I guess what these folks are truly afraid of is that someone will > start tracking reverts and post their results in some presentation > at some big conference. I say that would be a good thing. To > be honest, hitting the revert button more aggressively and putting > the fear of being the "revert king" into everyone's minds might > really help with this problem. > You will probably want to sort by "revert percentage" then. The absolute number of reverts might make the biggest contributor "revert king", even if his average patch quality is better than most. > Currently there is no sufficient negative pushback on people who > insert broken crud into the tree. So it should be no surprise that it > continues. Helge Hafting ^ permalink raw reply [flat|nested] 229+ messages in thread
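Editorial aside: Helge's distinction between absolute revert counts and a revert percentage can be sketched with toy numbers (the names and figures below are invented for illustration, not measured from any tree):

```python
# Toy per-developer commit/revert counts -- invented for illustration.
stats = {
    "alice": {"commits": 900, "reverts": 9},   # 1% revert rate
    "bob":   {"commits": 50,  "reverts": 4},   # 8% revert rate
}

def revert_rate(s: dict) -> float:
    return s["reverts"] / s["commits"]

# "Revert king" by absolute count vs. by percentage.
by_count = max(stats, key=lambda d: stats[d]["reverts"])
by_rate = max(stats, key=lambda d: revert_rate(stats[d]))

# Sorting by absolute reverts crowns the biggest contributor, even
# though her average patch quality is far better -- Helge's point.
assert by_count == "alice"
assert by_rate == "bob"
```

The same normalization argument applies to any "name and shame" metric derived from commit logs: the denominator matters as much as the numerator.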
* Re: Slow DOWN, please!!! 2008-04-30 20:54 ` Andrew Morton 2008-04-30 21:21 ` David Miller @ 2008-04-30 21:42 ` Dmitri Vorobiev 2008-04-30 22:06 ` Jiri Slaby 2008-04-30 22:10 ` Andrew Morton 2008-05-09 9:28 ` Jiri Kosina 2 siblings, 2 replies; 229+ messages in thread From: Dmitri Vorobiev @ 2008-04-30 21:42 UTC (permalink / raw) To: Andrew Morton Cc: Linus Torvalds, rjw, davem, linux-kernel, jirislaby, Ingo Molnar Andrew Morton wrote: > On Wed, 30 Apr 2008 13:31:08 -0700 (PDT) > Linus Torvalds <torvalds@linux-foundation.org> wrote: > >> >> On Wed, 30 Apr 2008, Andrew Morton wrote: >>> <jumps up and down> >>> >>> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! >> The problem I see with both -mm and linux-next is that they tend to be >> better at finding the "physical conflict" kind of issues (ie the merge >> itself fails) than the "code looks ok but doesn't actually work" kind of >> issue. >> >> Why? >> >> The tester base is simply too small. >> >> Now, if *that* could be improved, that would be wonderful, but I'm not >> seeing it as very likely. >> >> I think we have fairly good penetration these days with the regular -git >> tree, but I think that one is quite frankly a *lot* less scary than -mm or >> -next are, and there it has been an absolutely huge boon to get the kernel >> into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also >> started something like that). >> >> So I'm very pessimistic about getting a lot of test coverage before -rc1. >> >> Maybe too pessimistic, who knows? >> > > Well. We'll see. > > linux-next is more than another-tree-to-test. It is (or will be) a change > in our processes and culture. For a start, subsystem maintainers can no > longer whack away at their own tree as if the rest of use don't exist. > They now have to be more mindful of merge issues. > > Secondly, linux-next is more accessible than -mm: more releases, more > stable, better tested by he-who-releases it, available via git:// etc. 
Andrew, the latter thing is a very good point. For me personally, the fact that -mm is not available via git is the major obstacle for trying your tree more frequently than just a few times per year. How difficult would it be for you to switch to git? I guess there are good reasons for still using the source code management system from the last century; please correct me if I'm wrong, but I believe that using a modern SCM system could make life easier for you and your testers, no? > > I get the impression that we're seeing very little non-Stephen testing of > linux-next at this stage. I hope we can ramp that up a bit, initially by > having core developers doing at least some basic sanity testing. > For busy (or lazy) people like myself, the big problem with linux-next is the frequent merge breakages, when pulling the tree stops with "you are in the middle of a merge conflict". Perhaps there is a better way to resolve this without just removing the whole repo and cloning it once again - this is what I'm doing; please flame me for stupidity or ignorance if I simply am not aware of some git feature that could be useful in such cases. Finally, while the list is at it, I'd like to make another technical comment. My development zoo is a pretty fast 4-way Xeon server, where I keep a handful of trees, a few cross-toolchains, Qemu, etc. The network setup in our organization is such that I can use git only over http from that server. This cannot be changed, it's the company policy. In view of that, it's a pity that quite a few tree owners don't make sure that http access to their trees works (I added Ingo to the Cc: list in the hope that this will be corrected soon for the x86 tree, which I am using quite extensively), and I have to use a much slower machine (a two and a half year old laptop) for these trees. 
Please see this: <<<<<<< [dmitri.vorobiev@amber ~]$ git clone http://www.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git Initialized empty Git repository in /home/dmitri.vorobiev/linux-2.6-x86/.git/ Getting alternates list for http://www.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git Also look at http://www.kernel.org/home/ftp/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ Getting pack list for http://www.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git Getting index for pack ded7039bef9c148e5bb991a1b61da1d67c0ad3c2 Getting pack list for http://www.kernel.org/home/ftp/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/ error: Unable to find 08acd4f8af42affd8cbed81cc1b69fa12ddb213f under http://www.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86.git Cannot obtain needed object 08acd4f8af42affd8cbed81cc1b69fa12ddb213f [dmitri.vorobiev@amber ~]$ <<<<<<< Thanks, Dmitri ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:42 ` Dmitri Vorobiev @ 2008-04-30 22:06 ` Jiri Slaby 2008-04-30 22:10 ` Andrew Morton 1 sibling, 0 replies; 229+ messages in thread From: Jiri Slaby @ 2008-04-30 22:06 UTC (permalink / raw) To: Dmitri Vorobiev Cc: Andrew Morton, Linus Torvalds, rjw, davem, linux-kernel, Ingo Molnar On 04/30/2008 11:42 PM, Dmitri Vorobiev wrote: > For busy (or lazy) people like myself, the big problem with linux-next are > the frequent merge breakages, when pulling the tree stops with "you are in > the middle of a merge conflict". Perhaps, there is a better way to resolve > this without just removing the whole repo and cloning it once again - this If this is still an issue with -next, I would say we won't get too many testers. I gave up after the first time I was attacked by that and got back to pure -mm. I think greg-kh asked why this happens (Stephen rebases?); if you search the archives, I'm sure you'll find it. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:42 ` Dmitri Vorobiev 2008-04-30 22:06 ` Jiri Slaby @ 2008-04-30 22:10 ` Andrew Morton 2008-04-30 22:19 ` Linus Torvalds ` (2 more replies) 1 sibling, 3 replies; 229+ messages in thread From: Andrew Morton @ 2008-04-30 22:10 UTC (permalink / raw) To: Dmitri Vorobiev; +Cc: torvalds, rjw, davem, linux-kernel, jirislaby, mingo On Thu, 01 May 2008 01:42:59 +0400 Dmitri Vorobiev <dmitri.vorobiev@gmail.com> wrote: > Andrew Morton wrote: > > On Wed, 30 Apr 2008 13:31:08 -0700 (PDT) > > Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > >> > >> On Wed, 30 Apr 2008, Andrew Morton wrote: > >>> <jumps up and down> > >>> > >>> There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > >> The problem I see with both -mm and linux-next is that they tend to be > >> better at finding the "physical conflict" kind of issues (ie the merge > >> itself fails) than the "code looks ok but doesn't actually work" kind of > >> issue. > >> > >> Why? > >> > >> The tester base is simply too small. > >> > >> Now, if *that* could be improved, that would be wonderful, but I'm not > >> seeing it as very likely. > >> > >> I think we have fairly good penetration these days with the regular -git > >> tree, but I think that one is quite frankly a *lot* less scary than -mm or > >> -next are, and there it has been an absolutely huge boon to get the kernel > >> into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also > >> started something like that). > >> > >> So I'm very pessimistic about getting a lot of test coverage before -rc1. > >> > >> Maybe too pessimistic, who knows? > >> > > > > Well. We'll see. > > > > linux-next is more than another-tree-to-test. It is (or will be) a change > > in our processes and culture. For a start, subsystem maintainers can no > > longer whack away at their own tree as if the rest of use don't exist. > > They now have to be more mindful of merge issues. 
> > > > Secondly, linux-next is more accessible than -mm: more releases, more > stable, better tested by he-who-releases it, available via git:// etc. > > Andrew, the latter thing is a very good point. For me personally, the fact > that -mm is not available via git is the major obstacle for trying your > tree more frequently than just a few times per year. Every -mm release is available via git://, as described in the release announcements. The scripts which do this are a bit cantankerous but I believe they do work. <tests it> yup, 2.6.25-mm1 is there. > How difficult it > would be to switch to git for you? Fatal, I expect. A tool which manages source-code files is just the wrong paradigm. I manage _changes_ against someone else's source files. > I guess there are good reasons for still > using the source code management system from the last century; please > correct me if I'm wrong, but I believe that using a modern SCM system could > make life easier for you and your testers, no? > > > > > I get the impression that we're seeing very little non-Stephen testing of > > linux-next at this stage. I hope we can ramp that up a bit, initially by > > having core developers doing at least some basic sanity testing. > > > > For busy (or lazy) people like myself, the big problem with linux-next are > the frequent merge breakages, when pulling the tree stops with "you are in > the middle of a merge conflict". Really? Doesn't Stephen handle all those problems? It should be a clean fetch each time? > Perhaps, there is a better way to resolve > this without just removing the whole repo and cloning it once again - this > is what I'm doing, please flame me for stupidity or ignorance if I simply > am not aware of some git feature that could be useful in such cases. > > Finally, while the list is at it, I'd like to make another technical comment. > My development zoo is a pretty fast 4-way Xeon server, where I keep a handful > of trees, a few cross-toolchains, Qemu, etc. 
The network setup in our > organization is such that I can use git only over http from that server. Don't know what to do about that, sorry. An off-site git->http proxy might work, but I doubt if anyone has written the code. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:10 ` Andrew Morton @ 2008-04-30 22:19 ` Linus Torvalds 2008-04-30 22:28 ` Dmitri Vorobiev 2008-05-01 23:06 ` Kevin Winchester 2008-04-30 23:04 ` Dmitri Vorobiev 2008-05-01 6:15 ` Jan Engelhardt 2 siblings, 2 replies; 229+ messages in thread From: Linus Torvalds @ 2008-04-30 22:19 UTC (permalink / raw) To: Andrew Morton; +Cc: Dmitri Vorobiev, rjw, davem, linux-kernel, jirislaby, mingo On Wed, 30 Apr 2008, Andrew Morton wrote: > > For busy (or lazy) people like myself, the big problem with linux-next are > > the frequent merge breakages, when pulling the tree stops with "you are in > > the middle of a merge conflict". > > Really? Doesn't Stephen handle all those problems? It should be a clean > fetch each time? It should indeed be a clean fetch, but I wonder if Dmitri perhaps does a "git pull" - which will do the fetch, but then try to _merge_ that fetched state into whatever the last base Dmitri happened to have. Dmitry: you cannot just "git pull" on linux-next, because each version of linux-next is independent of the next one. What you should do is basically # Set this up just once.. git remote add linux-next git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git and then after that, you keep on just doing git fetch linux-next git checkout linux-next/master which will get you the actual objects and check out the state of that remote (and then you'll normally never be on a local branch on that tree, git will end up using a so-called "detached head" for this). IOW, you should never need to do any merges, because Stephen did all those in linux-next already. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:19 ` Linus Torvalds @ 2008-04-30 22:28 ` Dmitri Vorobiev 2008-05-01 16:26 ` Diego Calleja 2008-05-01 23:06 ` Kevin Winchester 1 sibling, 1 reply; 229+ messages in thread From: Dmitri Vorobiev @ 2008-04-30 22:28 UTC (permalink / raw) To: Linus Torvalds; +Cc: Andrew Morton, rjw, davem, linux-kernel, jirislaby, mingo Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Andrew Morton wrote: >>> For busy (or lazy) people like myself, the big problem with linux-next are >>> the frequent merge breakages, when pulling the tree stops with "you are in >>> the middle of a merge conflict". >> Really? Doesn't Stephen handle all those problems? It should be a clean >> fetch each time? > > It should indeed be a clean fetch, but I wonder if Dmitri perhaps does a > "git pull" - which will do the fetch, but then try to _merge_ that fetched > state into whatever the last base Dmitri happened to have. > > Dmitry: you cannot just "git pull" on linux-next, because each version of > linux-next is independent of the next one. What you should do is basically > > # Set this up just once.. > git remote add linux-next git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git > > and then after that, you keep on just doing > > git fetch linux-next > git checkout linux-next/master > > which will get you the actual objects and check out the state of that > remote (and then you'll normally never be on a local branch on that tree, > git will end up using a so-called "detached head" for this). > > IOW, you should never need to do any merges, because Stephen did all those > in linux-next already. Linus, thanks a lot for the detailed explanation. Indeed, it seems that I foolishly tried to duplicate Stephen's work. In the future I'll do as you suggest here. Dmitri > > Linus > ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:28 ` Dmitri Vorobiev @ 2008-05-01 16:26 ` Diego Calleja 2008-05-01 16:31 ` Dmitri Vorobiev 2008-05-02 1:48 ` Stephen Rothwell 0 siblings, 2 replies; 229+ messages in thread From: Diego Calleja @ 2008-05-01 16:26 UTC (permalink / raw) To: Dmitri Vorobiev Cc: Linus Torvalds, Andrew Morton, rjw, davem, linux-kernel, jirislaby, mingo, Stephen Rothwell On Thu, 01 May 2008 02:28:33 +0400, Dmitri Vorobiev <dmitri.vorobiev@gmail.com> wrote: > Linus, thanks a lot for the detailed explanation. Indeed, it seems that I foolishly > tried to duplicate Stephen's work. In the future I'll do as you suggest here. That "howto" should probably be added to the linux-next announcements... (CC'ing Stephen) ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 16:26 ` Diego Calleja @ 2008-05-01 16:31 ` Dmitri Vorobiev 2008-05-02 1:48 ` Stephen Rothwell 1 sibling, 0 replies; 229+ messages in thread From: Dmitri Vorobiev @ 2008-05-01 16:31 UTC (permalink / raw) To: Diego Calleja Cc: Linus Torvalds, Andrew Morton, rjw, davem, linux-kernel, jirislaby, mingo, Stephen Rothwell Diego Calleja wrote: > On Thu, 01 May 2008 02:28:33 +0400, Dmitri Vorobiev <dmitri.vorobiev@gmail.com> wrote: > >> Linus, thanks a lot for the detailed explanation. Indeed, it seems that I foolishly >> tried to duplicate Stephen's work. In the future I'll do as you suggest here. > > That "howto" should probably be added to the linux-next announcements... > (CC'ing Stephen) > Excellent idea. Thanks, Diego! Dmitri ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-01 16:26 ` Diego Calleja 2008-05-01 16:31 ` Dmitri Vorobiev @ 2008-05-02 1:48 ` Stephen Rothwell 1 sibling, 0 replies; 229+ messages in thread From: Stephen Rothwell @ 2008-05-02 1:48 UTC (permalink / raw) To: Diego Calleja Cc: Dmitri Vorobiev, Linus Torvalds, Andrew Morton, rjw, davem, linux-kernel, jirislaby, mingo [-- Attachment #1: Type: text/plain, Size: 723 bytes --] On Thu, 1 May 2008 18:26:58 +0200 Diego Calleja <diegocg@gmail.com> wrote: > > On Thu, 01 May 2008 02:28:33 +0400, Dmitri Vorobiev <dmitri.vorobiev@gmail.com> wrote: > > > Linus, thanks a lot for the detailed explanation. Indeed, it seems that I foolishly > > tried to duplicate Stephen's work. In the future I'll do as you suggest here. > > That "howto" should probably be added to the linux-next announcements... > (CC'ing Stephen) This is already mentioned in the linux-next wiki (http://linux.f-seidel.de/linux-next/pmwiki/) in the FAQ. I will add a link to the wiki to the announcements. -- Cheers, Stephen Rothwell sfr@canb.auug.org.au http://www.canb.auug.org.au/~sfr/ [-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:19 ` Linus Torvalds 2008-04-30 22:28 ` Dmitri Vorobiev @ 2008-05-01 23:06 ` Kevin Winchester 1 sibling, 0 replies; 229+ messages in thread From: Kevin Winchester @ 2008-05-01 23:06 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Dmitri Vorobiev, rjw, davem, linux-kernel, jirislaby, mingo Linus Torvalds wrote: > > On Wed, 30 Apr 2008, Andrew Morton wrote: >>> For busy (or lazy) people like myself, the big problem with linux-next are >>> the frequent merge breakages, when pulling the tree stops with "you are in >>> the middle of a merge conflict". >> Really? Doesn't Stephen handle all those problems? It should be a clean >> fetch each time? > > It should indeed be a clean fetch, but I wonder if Dmitri perhaps does a > "git pull" - which will do the fetch, but then try to _merge_ that fetched > state into whatever the last base Dmitri happened to have. > > Dmitry: you cannot just "git pull" on linux-next, because each version of > linux-next is independent of the next one. What you should do is basically > > # Set this up just once.. > git remote add linux-next git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git > > and then after that, you keep on just doing > > git fetch linux-next > git checkout linux-next/master > > which will get you the actual objects and check out the state of that > remote (and then you'll normally never be on a local branch on that tree, > git will end up using a so-called "detached head" for this). > > IOW, you should never need to do any merges, because Stephen did all those > in linux-next already. > Just to add some emphasis here - this is something that took me a long time to figure out, and since it is the pattern for dealing with the x86 trees and with the mm git tree and with linux-next, it would help if it were documented somewhere (not that I can imagine where). 
Once you know it, it becomes obvious, but try staring at a merge conflict for a while trying to figure out what to do, and it gets frustrating. I wonder if we can guess how many testers abandon the mm git tree or the linux-next tree because of this. It might be nice if git supported a command like git-remote-help or something that would fetch a predefined help file from a remote tree that describes the workflow for that tree. But at least with an extra reply to this mail, it might creep higher in the google search results when looking for merge conflicts with linux-next. -- Kevin Winchester ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:10 ` Andrew Morton 2008-04-30 22:19 ` Linus Torvalds @ 2008-04-30 23:04 ` Dmitri Vorobiev 2008-05-01 15:19 ` Jim Schutt 2008-05-01 6:15 ` Jan Engelhardt 2 siblings, 1 reply; 229+ messages in thread From: Dmitri Vorobiev @ 2008-04-30 23:04 UTC (permalink / raw) To: Andrew Morton; +Cc: torvalds, rjw, davem, linux-kernel, jirislaby, mingo Andrew Morton wrote: [skipped] >> Finally, while the list is at it, I'd like to make another technical comment. >> My development zoo is a pretty fast 4-way Xeon server, where I keep a handful >> of trees, a few cross-toolchains, Qemu, etc. The network setup in our >> organization is such that I can use git only over http from that server. > > Don't know what to do about that, sorry. An off-site git->http proxy might > work, but I doubt if anyone has written the code. But there is another solution, which I believe is straightforward: have the tree maintainer set up his tree properly. Dmitri > > ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 23:04 ` Dmitri Vorobiev @ 2008-05-01 15:19 ` Jim Schutt 0 siblings, 0 replies; 229+ messages in thread From: Jim Schutt @ 2008-05-01 15:19 UTC (permalink / raw) To: linux-kernel Dmitri Vorobiev <dmitri.vorobiev <at> gmail.com> writes: > > Andrew Morton wrote: > > >> The network setup in our >> organization is such that I can use git only over http from that server. > > > > Don't know what to do about that, sorry. An off-site git->http proxy might > > work, but I doubt if anyone has written the code. Maybe your organization's http proxy will let you tunnel the git protocol through it? GIT_PROXY_COMMAND as described in http://www.gelato.unsw.edu.au/archives/git/0605/20509.html works for me, except I substitute "nc" for "socket" in the proxy script. I.e.: export GIT_PROXY_COMMAND=/usr/local/bin/proxy-cmd.sh where proxy-cmd.sh is: #! /bin/bash (echo "CONNECT $1:$2 HTTP/1.0"; echo; cat ) | \ nc my.proxy.com proxy_port | (read a; read a; cat ) In .git/config there's also gitproxy = /usr/local/bin/proxy-cmd.sh -- Jim ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 22:10 ` Andrew Morton 2008-04-30 22:19 ` Linus Torvalds 2008-04-30 23:04 ` Dmitri Vorobiev @ 2008-05-01 6:15 ` Jan Engelhardt 2 siblings, 0 replies; 229+ messages in thread From: Jan Engelhardt @ 2008-05-01 6:15 UTC (permalink / raw) To: Andrew Morton Cc: Dmitri Vorobiev, torvalds, rjw, davem, linux-kernel, jirislaby, mingo On Thursday 2008-05-01 00:10, Andrew Morton wrote: >> >> Andrew, the latter thing is a very good point. For me personally, the fact >> that -mm is not available via git is the major obstacle for trying your >> tree more frequently than just a few times per year. > >Every -mm release if available via git://, as described in the release >announcements. [...] >> How difficult it >> would be to switch to git for you? > >Fatal, I expect. A tool which manages source-code files is just the wrong >paradigm. I manage _changes_ against someone else's source files. Would you mind using stgit? That way you have the patch queue functionality, yet a simple git-push -f will send the whole patch stack over to a repo (without the stgit bits, that is), leaving what looks like a regular tree with just lots of recent commits. Does not even need extra scripts to do a patchset->git conversion. >> For busy (or lazy) people like myself, the big problem with linux-next are >> the frequent merge breakages, when pulling the tree stops with "you are in >> the middle of a merge conflict". > >Really? Doesn't Stephen handle all those problems? It should be a clean >fetch each time? Indeed, assuming the remote is set up and you have a local branch, `git reset --hard mm/master` after a fetch is the thing. But be sure not to have any changed files. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:54 ` Andrew Morton 2008-04-30 21:21 ` David Miller 2008-04-30 21:42 ` Dmitri Vorobiev @ 2008-05-09 9:28 ` Jiri Kosina 2008-05-09 15:00 ` Jeff Garzik 2 siblings, 1 reply; 229+ messages in thread From: Jiri Kosina @ 2008-05-09 9:28 UTC (permalink / raw) To: Andrew Morton; +Cc: Linus Torvalds, rjw, davem, linux-kernel, jirislaby On Wed, 30 Apr 2008, Andrew Morton wrote: > I get the impression that we're seeing very little non-Stephen testing > of linux-next at this stage. I hope we can ramp that up a bit, > initially by having core developers doing at least some basic sanity > testing. Probably it would make sense also for distro vendors to make linux-next snapshots available in their development distro branches (redhat's rawhide, opensuse's factory, etc), to make it easier to test by those users who are willing to test if it works in their environment, but don't want to compile kernels themselves. -- Jiri Kosina ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-05-09 9:28 ` Jiri Kosina @ 2008-05-09 15:00 ` Jeff Garzik 0 siblings, 0 replies; 229+ messages in thread From: Jeff Garzik @ 2008-05-09 15:00 UTC (permalink / raw) To: Jiri Kosina Cc: Andrew Morton, Linus Torvalds, rjw, davem, linux-kernel, jirislaby Jiri Kosina wrote: > On Wed, 30 Apr 2008, Andrew Morton wrote: > >> I get the impression that we're seeing very little non-Stephen testing >> of linux-next at this stage. I hope we can ramp that up a bit, >> initially by having core developers doing at least some basic sanity >> testing. I try to test linux-next on a few SATA test boxes, but it's definitely not a daily thing. > Probably it would make sense also for distro vendors to make linux-next > snapshots available in their development distro branches (redhat's > rawhide, opensuse's factory, etc), to make it easier to test by those > users who are willing to test if it works in their environment, but don't > want to compile kernels themselves. Agreed... any lead time on linux-next testing would be great. Jeff ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 20:31 ` Linus Torvalds 2008-04-30 20:47 ` Dan Noe 2008-04-30 20:54 ` Andrew Morton @ 2008-04-30 21:52 ` H. Peter Anvin 2008-05-01 3:24 ` Bob Tracy 2008-05-01 16:39 ` Valdis.Kletnieks 2008-05-01 0:31 ` RFC: starting a kernel-testers group for newbies Adrian Bunk 3 siblings, 2 replies; 229+ messages in thread From: H. Peter Anvin @ 2008-04-30 21:52 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby Linus Torvalds wrote: > > The tester base is simply too small. > > Now, if *that* could be improved, that would be wonderful, but I'm not > seeing it as very likely. > One thing is that we keep fragmenting the tester base by adding new confidence levels: we now have -mm, -next, mainline -git, mainline -rc, mainline release, stable, distro testing, and distro release (and some distros even have aggressive versus conservative tracks.) Furthermore, thanks to craniorectal immersion on the part of graphics vendors, a lot of users have to run proprietary drivers on their "main work" systems, which means they can't even test newer releases even if they would dare. This fragmentation is largely intentional, of course -- everyone can pick a risk level appropriate for them -- but it does mean: a) The lag for a patch to ride through the pipeline is pretty long. b) The section of people who are going to use the more aggressive trees for "real work" testing is going to be small. -hpa ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:52 ` H. Peter Anvin @ 2008-05-01 3:24 ` Bob Tracy 2008-05-01 16:39 ` Valdis.Kletnieks 1 sibling, 0 replies; 229+ messages in thread From: Bob Tracy @ 2008-05-01 3:24 UTC (permalink / raw) To: H. Peter Anvin Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby H. Peter Anvin wrote: > Linus Torvalds wrote: > > > > The tester base is simply too small. > > > > Now, if *that* could be improved, that would be wonderful, but I'm not > > seeing it as very likely. > > > > One thing is that we keep fragmenting the tester base by adding new > confidence levels: we now have -mm, -next, mainline -git, mainline -rc, > mainline release, stable, distro testing, and distro release (and some > distros even have aggressive versus conservative tracks.) Furthermore, > thanks to craniorectal immersion on the part of graphics vendors, a lot > of users have to run proprietary drivers on their "main work" systems, > which means they can't even test newer releases even if they would dare. Since I poke my head out of the foxhole every once in a while with a relatively late-breaking bug report, I thought I should chime in... Mr. Anvin has pretty much nailed it... As the kernel development process has evolved, which "confidence level" I select has evolved as well. The thing that *hasn't* changed through the years is, I tend to pick a "confidence level" that is appropriately close to "mainline" and has an update release schedule roughly compatible with my ability to keep up with it. Specifically, if it takes me several hours to download a patch set, apply it, build the new kernel, and test on multiple platforms/architectures, then the update release schedule is probably going to have to be no more often than twice a week if I'm going to be at all interested in even trying to keep up with it. In 2008, the "-rcX" updates are a good fit. In the not-too-distant past, keeping up with 2.5.X.Y was no problem. 
Yes, I realize I don't *have* to test every revision level in every major tree, but I don't have to think about which one to pick for testing if I can keep up with the update release schedule :-). -- ------------------------------------------------------------------------ Bob Tracy | "I was a beta tester for dirt. They never did rct@frus.com | get all the bugs out." - Steve McGrew on /. ------------------------------------------------------------------------ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! 2008-04-30 21:52 ` H. Peter Anvin 2008-05-01 3:24 ` Bob Tracy @ 2008-05-01 16:39 ` Valdis.Kletnieks 1 sibling, 0 replies; 229+ messages in thread From: Valdis.Kletnieks @ 2008-05-01 16:39 UTC (permalink / raw) To: H. Peter Anvin Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby [-- Attachment #1: Type: text/plain, Size: 808 bytes --] On Wed, 30 Apr 2008 14:52:44 PDT, "H. Peter Anvin" said: > This fragmentation is largely intentional, of course -- everyone can > pick a risk level appropriate for them -- but it does mean: > > a) The lag for a patch to ride through the pipeline is pretty long. > b) The section of people who are going to use the more aggressive trees > for "real work" testing is going to be small. And another problem is that often, it's hard to get good "real work" coverage over the whole tree. I just discovered an apparent borkage somewhere in the networking/wireless area that seems to have gotten into Linus's tree somewhere between 24-rc8 and 24-final, just because I haven't beaten on my wireless card in the last few weeks, so I didn't notice a regression in 'ip link show' related to the rfkill switch... [-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --] ^ permalink raw reply [flat|nested] 229+ messages in thread
* RFC: starting a kernel-testers group for newbies 2008-04-30 20:31 ` Linus Torvalds ` (2 preceding siblings ...) 2008-04-30 21:52 ` H. Peter Anvin @ 2008-05-01 0:31 ` Adrian Bunk 2008-04-30 7:03 ` Arjan van de Ven 2008-05-01 0:41 ` David Miller 3 siblings, 2 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 0:31 UTC (permalink / raw) To: Linus Torvalds Cc: Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Wed, Apr 30, 2008 at 01:31:08PM -0700, Linus Torvalds wrote: > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > <jumps up and down> > > > > There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > The problem I see with both -mm and linux-next is that they tend to be > better at finding the "physical conflict" kind of issues (ie the merge > itself fails) than the "code looks ok but doesn't actually work" kind of > issue. > > Why? > > The tester base is simply too small. > > Now, if *that* could be improved, that would be wonderful, but I'm not > seeing it as very likely. > > I think we have fairly good penetration these days with the regular -git > tree, but I think that one is quite frankly a *lot* less scary than -mm or > -next are, and there it has been an absolutely huge boon to get the kernel > into the Fedora test-builds etc (and I _think_ Ubuntu and SuSE also > started something like that). > > So I'm very pessimistic about getting a lot of test coverage before -rc1. > > Maybe too pessimistic, who knows? First of all: I 100% agree with Andrew that our biggest problems are in reviewing code and resolving bugs, not in finding bugs (we already have far too many unresolved bugs). But although testing mustn't replace code reviews it is a great help, especially for identifying regressions early. Finding testers should actually be relatively easy since it doesn't require much knowledge from the testers. 
And it could even solve a second problem: It could be a way for getting newbies into kernel development. We actually only rarely have tasks suitable as janitor tasks for newbies, and the results of people who neither know the kernel nor know C running checkpatch on files in the kernel have already been discussed extensively... I'll try to do this:

- create some Wiki page
- get a mailing list at vger
- point newbies to this mailing list
- tell people there which kernels to test
- figure out and document stuff like how to bisect between -next kernels
- help them to do whatever is required for a proper bug report

> Linus cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
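Bisecting between two -next snapshots, one of the items on the list above, follows the standard `git bisect` flow. A sketch, assuming a linux-next checkout; the snapshot tag names and the `build-and-boot-test.sh` script are illustrative placeholders, not real artifacts from this thread:

```shell
#!/bin/sh
# Bisect a regression between two linux-next snapshot tags.
cd linux-next
git bisect start next-20080502 next-20080425   # <bad> <good>
# The test script must exit 0 when the kernel under test is good,
# and 1-124 when it is bad; bisect then drives the build/test cycle.
git bisect run ./build-and-boot-test.sh
git bisect reset
```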
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 0:31 ` RFC: starting a kernel-testers group for newbies Adrian Bunk @ 2008-04-30 7:03 ` Arjan van de Ven 2008-05-01 8:13 ` Andrew Morton 2008-05-01 11:30 ` Adrian Bunk 2008-05-01 0:41 ` David Miller 1 sibling, 2 replies; 229+ messages in thread From: Arjan van de Ven @ 2008-04-30 7:03 UTC (permalink / raw) To: Adrian Bunk Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, 1 May 2008 03:31:25 +0300 Adrian Bunk <bunk@kernel.org> wrote: > On Wed, Apr 30, 2008 at 01:31:08PM -0700, Linus Torvalds wrote: > > > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > > > <jumps up and down> > > > > > > There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > > > The problem I see with both -mm and linux-next is that they tend to > > be better at finding the "physical conflict" kind of issues (ie the > > merge itself fails) than the "code looks ok but doesn't actually > > work" kind of issue. > > > > Why? > > > > The tester base is simply too small. > > > > Now, if *that* could be improved, that would be wonderful, but I'm > > not seeing it as very likely. > > > > I think we have fairly good penetration these days with the regular > > -git tree, but I think that one is quite frankly a *lot* less scary > > than -mm or -next are, and there it has been an absolutely huge > > boon to get the kernel into the Fedora test-builds etc (and I > > _think_ Ubuntu and SuSE also started something like that). > > > > So I'm very pessimistic about getting a lot of test coverage before > > -rc1. > > > > Maybe too pessimistic, who knows? > > First of all: > I 100% agree with Andrew that our biggest problems are in reviewing > code and resolving bugs, not in finding bugs (we already have far too > many unresolved bugs). I would argue instead that we don't know which bugs to fix first. We're never going to fix all bugs, and to be honest, that's ok. 
As long as we fix the important bugs, we're doing really well. And at least for the kerneloops.org reported issues, we're doing quite ok. For me, 'important' is a combination of the effect of the bug and the number of people it'll hit. A compiler warning on parisc is less important than easy-to-trigger filesystem corruption in ext3 that way; more people will hit it and the effect is more grave. For oopses and WARN_ON()s we're getting the hang of this now with kerneloops.org, at least for the oopses that aren't really hard fatal. One thing I learned at least is that lkml is a poor representation of what people actually hit; it's a very very selective audience. oopses/warnons are only a subset of the bugs of course... but still. So there are a few things we (and you / janitors) can do over time to get better data on what issues people hit:

1) Get automated collection of issues more widespread. The wider our net, the better we know which issues get hit a lot, and plainly the more data we have on when things start, when they stop, etc etc. Especially if you get a lot of testers in your project, I'd like them to install the client for easy reporting of issues.

2) We should add more WARN_ON()s on "known bad" conditions. If it WARN_ON()s, we can learn about it via the automated collection. And we can then do the statistics to figure out which ones happen a lot.

3) We need to get persistent-across-reboot oops saving going; there are some venues for this

^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-04-30 7:03 ` Arjan van de Ven @ 2008-05-01 8:13 ` Andrew Morton 2008-04-30 14:15 ` Arjan van de Ven 2008-05-01 9:16 ` RFC: starting a kernel-testers group for newbies Frans Pop 2008-05-01 11:30 ` Adrian Bunk 1 sibling, 2 replies; 229+ messages in thread From: Andrew Morton @ 2008-05-01 8:13 UTC (permalink / raw) To: Arjan van de Ven Cc: Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Wed, 30 Apr 2008 00:03:38 -0700 Arjan van de Ven <arjan@infradead.org> wrote: > > First of all: > > I 100% agree with Andrew that our biggest problems are in reviewing > > code and resolving bugs, not in finding bugs (we already have far too > > many unresolved bugs). > > I would argue instead that we don't know which bugs to fix first. <boggle> How about "a bug which we just added"? One which is repeatable. Repeatable by a tester who is prepared to work with us on resolving it. Those bugs. Rafael has a list of them. We release kernels when that list still has tens of unfixed regressions dating back up to a couple of months. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 8:13 ` Andrew Morton @ 2008-04-30 14:15 ` Arjan van de Ven 2008-05-01 12:42 ` David Woodhouse 2008-05-04 12:45 ` Rene Herman 1 sibling, 2 replies; 229+ messages in thread From: Arjan van de Ven @ 2008-04-30 14:15 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, 1 May 2008 01:13:46 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > On Wed, 30 Apr 2008 00:03:38 -0700 Arjan van de Ven > <arjan@infradead.org> wrote: > > > > First of all: > > > I 100% agree with Andrew that our biggest problems are in > > > reviewing code and resolving bugs, not in finding bugs (we > > > already have far too many unresolved bugs). > > > > I would argue instead that we don't know which bugs to fix first. > > <boggle> > > How about "a bug which we just added"? One which is repeatable. > Repeatable by a tester who is prepared to work with us on resolving > it. Those bugs. > > Rafael has a list of them. We release kernels when that list still > has tens of unfixed regressions dating back up to a couple of months. > I know he does. But I will still argue that if that is all we work from, and treat all of those equally, we're doing the wrong thing. I'm sorry, but I really do not consider "ext4 doesn't compile on m68k" which is on that list to be as relevant as an "i915 drm driver crashes" bug which has been with us for a while and is not on that list, just based on the total user base for either of those. Does that mean nobody should fix the m68k bug? Someone who cares about m68k for sure should work on it, or if it's easy for an ext4 developer, sure. But if the ext4 person has to spend 8 hours on it figuring out cross compilers, I say we're doing something very wrong here. 
(no offense to the m68k people, but there's just a few of you; maybe I should have picked voyager instead) Maybe that's a "boggle" for you; but for me that's symptomatic of where we are today: We don't make (effective) prioritization decisions. Such decisions are hard, because it effectively means telling people "I'm sorry but your bug is not yet important". That's unpopular, especially if the reporter is very motivated on lkml. And it will involve a certain amount of non-quantifiable judgement calls, which also means we won't always be right. Another hard thing is that lkml is a very self-selective audience. A bug may be reported three times there, but never hit otherwise, while another bug might not be reported at all (or only once) while thousands and thousands of people are hitting it. Not that we're doing all that badly, we ARE fixing the bugs (at least the oopses/warnings) that are frequently hit. So I wouldn't blindly say we're doing a bad job at prioritizing. I would rather say that if we focus only on what is left afterwards without doing a reality check, we'll *always* have a negative view of quality, since there will *always* be bugs we don't fix. Linux has well over ten million users (many more if you count embedded devices). A lot of them will have "standard" hardware, and a bunch of them will have "weird" stuff. Cosmic rays happen. As do overclocking and bad DIMMs. And some BIOSes are just weird etc etc. If we do not prioritize effectively we'll be stuck forever chasing ghosts, or we'll be stuck saying "our quality sucks" forever without making progress. Another trap is to only look at what goes wrong, not at what goes right... we tend to only see what goes wrong on lkml and it's an easy trap to fall into doomthinking that way. Are we doing worse on quality? My (subjective) opinion is that we are doing better than last year. We are focused more on quality. We are fixing the bugs that people hit most. We are fixing most of the regressions (yes, not all). 
Subsystems are seeing flat or lower bugcounts/bugrates. Take ACPI, the number of outstanding bugs *halved* over the last year. Of course you can pick a single bug and say "but this one did not get fixed", but that just loses the big picture (and proves the point :). All of this with a growing userbase and a rate of development that's a bit faster than last year as well. Can we do better? Always. More testing will help. Both to detect things early, and by letting us figure out which bugs are important. Just saying "more testing is not relevant because we're not even fixing the bugs we have now" is just incorrect. Sorry. More testers helps. Wider range of hardware/usages allows us to find better patterns in the hard to track down bugs. More testers means more people willing to see if they can diagnose the bugs at least somewhat themselves, via bisection or otherwise. That's important, because that's the part of the problem that scales well with a growing userbase. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-04-30 14:15 ` Arjan van de Ven @ 2008-05-01 12:42 ` David Woodhouse 2008-04-30 15:02 ` Arjan van de Ven 2008-05-05 10:03 ` Benny Halevy 2008-05-04 12:45 ` Rene Herman 1 sibling, 2 replies; 229+ messages in thread From: David Woodhouse @ 2008-05-01 12:42 UTC (permalink / raw) To: Arjan van de Ven Cc: Andrew Morton, Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Wed, 2008-04-30 at 07:15 -0700, Arjan van de Ven wrote: > Maybe that's a "boggle" for you; but for me that's symptomatic of > where we are today: We don't make (effective) prioritization > decisions. Such decisions are hard, because it effectively means > telling people "I'm sorry but your bug is not yet important". It's not that clear-cut, either. Something which manifests itself as a build failure or an immediate test failure on m68k alone, might actually turn out to cause subtle data corruption on other platforms. You can't always know that it isn't important, just because it only shows up in some esoteric circumstances. You only really know how important it was _after_ you've fixed it. That obviously doesn't help us to prioritise. -- dwmw2 ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 12:42 ` David Woodhouse @ 2008-04-30 15:02 ` Arjan van de Ven 2008-05-05 10:03 ` Benny Halevy 1 sibling, 0 replies; 229+ messages in thread From: Arjan van de Ven @ 2008-04-30 15:02 UTC (permalink / raw) To: David Woodhouse Cc: Andrew Morton, Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, 01 May 2008 13:42:44 +0100 David Woodhouse <dwmw2@infradead.org> wrote: > On Wed, 2008-04-30 at 07:15 -0700, Arjan van de Ven wrote: > > Maybe that's a "boggle" for you; but for me that's symptomatic of > > where we are today: We don't make (effective) prioritization > > decisions. Such decisions are hard, because it effectively means > > telling people "I'm sorry but your bug is not yet important". > > It's not that clear-cut, either. Something which manifests itself as a > build failure or an immediate test failure on m68k alone, might > actually turn out to cause subtle data corruption on other platforms. > > You can't always know that it isn't important, just because it only > shows up in some esoteric circumstances. You only really know how > important it was _after_ you've fixed it. > > That obviously doesn't help us to prioritise. absolutely. I'm not going to argue that prioritization is easy. Or that we'll be able to get it right all the time. Doesn't mean we shouldn't try at least somewhat.. > ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 12:42 ` David Woodhouse 2008-04-30 15:02 ` Arjan van de Ven @ 2008-05-05 10:03 ` Benny Halevy 1 sibling, 0 replies; 229+ messages in thread From: Benny Halevy @ 2008-05-05 10:03 UTC (permalink / raw) To: David Woodhouse Cc: Arjan van de Ven, Andrew Morton, Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On May. 01, 2008, 15:42 +0300, David Woodhouse <dwmw2@infradead.org> wrote: > On Wed, 2008-04-30 at 07:15 -0700, Arjan van de Ven wrote: >> Maybe that's a "boggle" for you; but for me that's symptomatic of >> where we are today: We don't make (effective) prioritization >> decisions. Such decisions are hard, because it effectively means >> telling people "I'm sorry but your bug is not yet important". > > It's not that clear-cut, either. Something which manifests itself as a > build failure or an immediate test failure on m68k alone, might actually > turn out to cause subtle data corruption on other platforms. > > You can't always know that it isn't important, just because it only > shows up in some esoteric circumstances. You only really know how > important it was _after_ you've fixed it. > > That obviously doesn't help us to prioritise. > Ideally, you'd do an analysis first and then prioritize, based on the severity of the bug, its exposure, how easy it is to fix, etc. If while doing that you already have a fix at hand, you're almost done :) Recursively, there's the problem of which bugs you analyze first. I'm inclined to say that you want to analyze most if not all bug reports at higher priority than working on fixing non-critical bugs. Benny ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-04-30 14:15 ` Arjan van de Ven 2008-05-01 12:42 ` David Woodhouse @ 2008-05-04 12:45 ` Rene Herman 2008-05-04 13:00 ` Pekka Enberg 1 sibling, 1 reply; 229+ messages in thread From: Rene Herman @ 2008-05-04 12:45 UTC (permalink / raw) To: Arjan van de Ven Cc: Andrew Morton, Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On 30-04-08 16:15, Arjan van de Ven wrote: > Does that mean nobody should fix the m68k bug? Someone who cares about > m68k for sure should work on it, or if it's easy for an ext4 developer, > sure. But if the ext4 person has to spend 8 hours on it figuring cross > compilers, I say we're doing something very wrong here. (no offense to > the m68k people, but there's just a few of you; maybe I should have > picked voyager instead) On that note, I'd really like to see better binary availability of cross compilers. While it's improved over the last few years mostly due to the crossgcc stuff it's still a pain. Ideally, they would be available through the distribution package manager even but failing that some dedicated place on kernel.org with x86->lots and some of the more widely used other combinations would quite definitely be good. Perhaps not really directly relevant to this thread as such, but still good. Andrew maintain{s,ed} a number of them at http://userweb.kernel.org/~akpm/cross-compilers/ But as you see, most of the stuff there is really old again... Rene ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-04 12:45 ` Rene Herman @ 2008-05-04 13:00 ` Pekka Enberg 2008-05-04 13:19 ` Rene Herman 2008-05-05 13:13 ` crosscompiler [WAS: RFC: starting a kernel-testers group for newbies] Enrico Weigelt 0 siblings, 2 replies; 229+ messages in thread From: Pekka Enberg @ 2008-05-04 13:00 UTC (permalink / raw) To: Rene Herman Cc: Arjan van de Ven, Andrew Morton, Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, Vegard Nossum On Sun, May 4, 2008 at 3:45 PM, Rene Herman <rene.herman@keyaccess.nl> wrote: > On that note, I'd really like to see better binary availability of cross > compilers. While it's improved over the last few years mostly due to the > crossgcc stuff it's still a pain. Ideally, they would be available through > the distribution package manager even but failing that some dedicated place > on kernel.org with x86->lots and some of the more widely used other > combinations would quite definitely be good. Perhaps not really directly > relevant to this thread as such, but still good. > > Andrew maintain{s,ed} a number of them at > > http://userweb.kernel.org/~akpm/cross-compilers/ > > But as you see, most of the stuff there is really old again... You're most welcome to help out Vegard to do this: http://www.kernel.org/pub/tools/crosstool/ ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-04 13:00 ` Pekka Enberg @ 2008-05-04 13:19 ` Rene Herman 2008-05-05 13:13 ` crosscompiler [WAS: RFC: starting a kernel-testers group for newbies] Enrico Weigelt 1 sibling, 0 replies; 229+ messages in thread From: Rene Herman @ 2008-05-04 13:19 UTC (permalink / raw) To: Pekka Enberg Cc: Arjan van de Ven, Andrew Morton, Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, Vegard Nossum On 04-05-08 15:00, Pekka Enberg wrote: > On Sun, May 4, 2008 at 3:45 PM, Rene Herman <rene.herman@keyaccess.nl> wrote: >> On that note, I'd really like to see better binary availability of cross >> compilers. While it's improved over the last few years mostly due to the >> crossgcc stuff it's still a pain. Ideally, they would be available through >> the distribution package manager even but failing that some dedicated place >> on kernel.org with x86->lots and some of the more widely used other >> combinations would quite definitely be good. Perhaps not really directly >> relevant to this thread as such, but still good. >> >> Andrew maintain{s,ed} a number of them at >> >> http://userweb.kernel.org/~akpm/cross-compilers/ >> >> But as you see, most of the stuff there is really old again... > > You're most welcome to help out Vegard to do this: > > http://www.kernel.org/pub/tools/crosstool/ Ah, thanks, lovely, just new I see (and yes, I meant s/grossgcc/crosstool/). Good thing. I'll check it out and see if there's anything to add. Rene. ^ permalink raw reply [flat|nested] 229+ messages in thread
* crosscompiler [WAS: RFC: starting a kernel-testers group for newbies] 2008-05-04 13:00 ` Pekka Enberg 2008-05-04 13:19 ` Rene Herman @ 2008-05-05 13:13 ` Enrico Weigelt 1 sibling, 0 replies; 229+ messages in thread From: Enrico Weigelt @ 2008-05-05 13:13 UTC (permalink / raw) To: linux kernel list * Pekka Enberg <penberg@cs.helsinki.fi> wrote: > You're most welcome to help out Vegard to do this: > > http://www.kernel.org/pub/tools/crosstool/ You could also use ct-ng: http://ymorin.is-a-geek.org/dokuwiki/projects/crosstool Works excellent for me :) cu -- --------------------------------------------------------------------- Enrico Weigelt == metux IT service - http://www.metux.de/ --------------------------------------------------------------------- Please visit the OpenSource QM Taskforce: http://wiki.metux.de/public/OpenSource_QM_Taskforce Patches / Fixes for a lot dozens of packages in dozens of versions: http://patches.metux.de/ --------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 8:13 ` Andrew Morton 2008-04-30 14:15 ` Arjan van de Ven @ 2008-05-01 9:16 ` Frans Pop 2008-05-01 10:30 ` Enrico Weigelt 1 sibling, 1 reply; 229+ messages in thread From: Frans Pop @ 2008-05-01 9:16 UTC (permalink / raw) To: Andrew Morton Cc: arjan, bunk, torvalds, rjw, davem, linux-kernel, jirislaby, rostedt Andrew Morton wrote: > On Wed, 30 Apr 2008 00:03:38 -0700 Arjan van de Ven <arjan@infradead.org> > wrote: >> I would argue instead that we don't know which bugs to fix first. > > How about "a bug which we just added"? And leave unfixed all the regressions introduced in earlier kernel versions and known at the time of the release of that version but still present in the current version? Not to mention all the other bugs reported by users of recent stable versions? > One which is repeatable. > Repeatable by a tester who is prepared to work with us on resolving it. That can be true for not-so-recently introduced bugs too. There are so many bugs out there and developers tend to focus on new ones leaving a lot of others unattended, both important and not so important ones. Which ones should someone focus on? Maybe on the ones that someone (helped) introduce him/herself. Maybe that should even sometimes be prioritized over introducing new bugs^W^W^Wdoing new development. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 9:16 ` RFC: starting a kernel-testers group for newbies Frans Pop @ 2008-05-01 10:30 ` Enrico Weigelt 2008-05-01 13:02 ` Adrian Bunk 0 siblings, 1 reply; 229+ messages in thread From: Enrico Weigelt @ 2008-05-01 10:30 UTC (permalink / raw) To: linux kernel list <big_snip /> Hi folks, what do you think about Gentoo's "bug-wrangler" concept? Maybe we could do something similar: A tester group (which e.g. should be the entry point for newbies) is responsible for receiving bug reports from users (maybe even distro maintainers who're not directly involved in kernel dev.). They try to reproduce the bugs and find out as much as they can, then file a report to the actual kernel devs (just critical bugs are directly kicked to the devs with high priority). Maybe this group could also keep users informed about fixes and give some upgrade advice, etc. This way we can build good technical support (independent from distributors ;-P), newbies can learn on the job, and the load on kernel devs is reduced, so they can better concentrate on their core competences. What do you think about this? cu -- --------------------------------------------------------------------- Enrico Weigelt == metux IT service - http://www.metux.de/ --------------------------------------------------------------------- Please visit the OpenSource QM Taskforce: http://wiki.metux.de/public/OpenSource_QM_Taskforce Patches / Fixes for a lot dozens of packages in dozens of versions: http://patches.metux.de/ --------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 10:30 ` Enrico Weigelt @ 2008-05-01 13:02 ` Adrian Bunk 0 siblings, 0 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 13:02 UTC (permalink / raw) To: Enrico Weigelt; +Cc: linux kernel list On Thu, May 01, 2008 at 12:30:00PM +0200, Enrico Weigelt wrote: > > <big_snip /> > > Hi folks, > > > what do you think about Gentoo's "bug-wrangler" concept ? > Maybe we could do something similar: > > A tester group (which, e.g., should be the entry point for newbies) > is responsible for receiving bug reports from users (maybe even > distro maintainers who're not directly involved in kernel dev.). > They try to reproduce the bugs and find out as much as they can, > then file a report to the actual kernel devs (just critical bugs > are directly kicked to the devs with high priority). Maybe this > group could also keep users informed about fixes and give some > upgrade advice, etc. > > This way we can build good technical support (independent > from distributors ;-P), newbies can learn on the job, and the > load on kernel devs is reduced, so they can better concentrate > on their core competences. > > What do you think about this ? Andrew already does more or less this. The problems are: - kernel bugs tend to very quickly reach the state where you need expert knowledge in some area, and there's definitely not much room for newbies in bug handling - "try to reproduce the bugs" works for much software, but in the kernel bugs often tend to depend on some specific hardware > cu cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-04-30 7:03 ` Arjan van de Ven 2008-05-01 8:13 ` Andrew Morton @ 2008-05-01 11:30 ` Adrian Bunk 2008-04-30 14:20 ` Arjan van de Ven 1 sibling, 1 reply; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 11:30 UTC (permalink / raw) To: Arjan van de Ven Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Wed, Apr 30, 2008 at 12:03:38AM -0700, Arjan van de Ven wrote: > On Thu, 1 May 2008 03:31:25 +0300 > Adrian Bunk <bunk@kernel.org> wrote: > > > On Wed, Apr 30, 2008 at 01:31:08PM -0700, Linus Torvalds wrote: > > > > > > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > > > > > <jumps up and down> > > > > > > > > There should be nothing in 2.6.x-rc1 which wasn't in 2.6.x-mm1! > > > > > > The problem I see with both -mm and linux-next is that they tend to > > > be better at finding the "physical conflict" kind of issues (ie the > > > merge itself fails) than the "code looks ok but doesn't actually > > > work" kind of issue. > > > > > > Why? > > > > > > The tester base is simply too small. > > > > > > Now, if *that* could be improved, that would be wonderful, but I'm > > > not seeing it as very likely. > > > > > > I think we have fairly good penetration these days with the regular > > > -git tree, but I think that one is quite frankly a *lot* less scary > > > than -mm or -next are, and there it has been an absolutely huge > > > boon to get the kernel into the Fedora test-builds etc (and I > > > _think_ Ubuntu and SuSE also started something like that). > > > > > > So I'm very pessimistic about getting a lot of test coverage before > > > -rc1. > > > > > > Maybe too pessimistic, who knows? > > > > First of all: > > I 100% agree with Andrew that our biggest problems are in reviewing > > code and resolving bugs, not in finding bugs (we already have far too > > many unresolved bugs). > > I would argue instead that we don't know which bugs to fix first. 
> We're never going to fix all bugs, and to be honest, that's ok. >... That might be OK. But our current status quo is not OK: Check Rafael's regressions lists asking yourself "How many regressions are older than two weeks?" The kernel Bugzilla currently knows about 212 open regression bugs. (And many more have not made it into Bugzilla.) We have unmaintained and de facto unmaintained parts of the kernel where even issues that might be easy to fix don't get fixed. >... > So there's a few things we (and you / janitors) can do over time to get better data on what issues > people hit: > 1) Get automated collection of issues more wide spread. The wider our net the better we know which > issues get hit a lot, and plain the more data we have on when things start, when they stop, etc etc. > Especially if you get a lot of testers in your project, I'd like them to install the client for easy reporting > of issues. > 2) We should add more WARN_ON()s on "known bad" conditions. If it WARN_ON()'s, we can learn about it via > the automated collection. And we can then do the statistics to figure out which ones happen a lot. > 3) We need to get persistent-across-reboot oops saving going; there's some venues for this No disagreement on this, it's just a different issue than our bug fixing problem. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 11:30 ` Adrian Bunk @ 2008-04-30 14:20 ` Arjan van de Ven 2008-05-01 12:53 ` Rafael J. Wysocki 2008-05-01 13:21 ` Adrian Bunk 0 siblings, 2 replies; 229+ messages in thread From: Arjan van de Ven @ 2008-04-30 14:20 UTC (permalink / raw) To: Adrian Bunk Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, 1 May 2008 14:30:38 +0300 Adrian Bunk <bunk@kernel.org> wrote: > On Wed, Apr 30, 2008 at 12:03:38AM -0700, Arjan van de Ven wrote: > > On Thu, 1 May 2008 03:31:25 +0300 > > Adrian Bunk <bunk@kernel.org> wrote: > > > > > On Wed, Apr 30, 2008 at 01:31:08PM -0700, Linus Torvalds wrote: > > > > > > > > > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > > > > > > > <jumps up and down> > > > > > > > > > > There should be nothing in 2.6.x-rc1 which wasn't in > > > > > 2.6.x-mm1! > > > > > > > > The problem I see with both -mm and linux-next is that they > > > > tend to be better at finding the "physical conflict" kind of > > > > issues (ie the merge itself fails) than the "code looks ok but > > > > doesn't actually work" kind of issue. > > > > > > > > Why? > > > > > > > > The tester base is simply too small. > > > > > > > > Now, if *that* could be improved, that would be wonderful, but > > > > I'm not seeing it as very likely. > > > > > > > > I think we have fairly good penetration these days with the > > > > regular -git tree, but I think that one is quite frankly a > > > > *lot* less scary than -mm or -next are, and there it has been > > > > an absolutely huge boon to get the kernel into the Fedora > > > > test-builds etc (and I _think_ Ubuntu and SuSE also started > > > > something like that). > > > > > > > > So I'm very pessimistic about getting a lot of test coverage > > > > before -rc1. > > > > > > > > Maybe too pessimistic, who knows? 
> > > > > > First of all: > > > I 100% agree with Andrew that our biggest problems are in > > > reviewing code and resolving bugs, not in finding bugs (we > > > already have far too many unresolved bugs). > > > > I would argue instead that we don't know which bugs to fix first. > > We're never going to fix all bugs, and to be honest, that's ok. > >... > > That might be OK. > > But our current status quo is not OK: > > Check Rafael's regressions lists asking yourself > "How many regressions are older than two weeks?" "ext4 doesn't compile on m68k". YAWN. Wrong question... "How many bugs that a sizable portion of users will hit in reality are there?" is the right question to ask... > > We have unmaintained and de facto unmaintained parts of the kernel > where even issues that might be easy to fix don't get fixed. And how many people are hitting those issues? If a part of the kernel is really important to enough people, there tends to be someone who stands up to either fix the issue or start de-facto maintaining that part. And yes I know there's parts where that doesn't hold. But to be honest, there's not that many of them that have active development (and thus get the biggest share of regressions) > > >... > > So there's a few things we (and you / janitors) can do over time to > > get better data on what issues people hit: > > 1) Get automated collection of issues more wide spread. The wider > > our net the better we know which issues get hit a lot, and plain > > the more data we have on when things start, when they stop, etc > > etc. Especially if you get a lot of testers in your project, I'd > > like them to install the client for easy reporting of issues. 2) We > > should add more WARN_ON()s on "known bad" conditions. If it > > WARN_ON()'s, we can learn about it via the automated collection. > > And we can then do the statistics to figure out which ones happen a > > lot. 
3) We need to get persistent-across-reboot oops saving going; > > there's some venues for this > > No disagreement on this, it's just a different issue than our bug > fixing problem. No it's not! Knowing earlier and better which bugs get hit is NOT different from our bug fixing "problem", it's in fact an essential part of the solution to it! > ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-04-30 14:20 ` Arjan van de Ven @ 2008-05-01 12:53 ` Rafael J. Wysocki 2008-05-01 13:21 ` Adrian Bunk 1 sibling, 0 replies; 229+ messages in thread From: Rafael J. Wysocki @ 2008-05-01 12:53 UTC (permalink / raw) To: Arjan van de Ven Cc: Adrian Bunk, Linus Torvalds, Andrew Morton, davem, linux-kernel, jirislaby, Steven Rostedt On Wednesday, 30 of April 2008, Arjan van de Ven wrote: > On Thu, 1 May 2008 14:30:38 +0300 > Adrian Bunk <bunk@kernel.org> wrote: > > > On Wed, Apr 30, 2008 at 12:03:38AM -0700, Arjan van de Ven wrote: > > > On Thu, 1 May 2008 03:31:25 +0300 > > > Adrian Bunk <bunk@kernel.org> wrote: > > > > > > > On Wed, Apr 30, 2008 at 01:31:08PM -0700, Linus Torvalds wrote: > > > > > > > > > > > > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > > > > > > > > > <jumps up and down> > > > > > > > > > > > > There should be nothing in 2.6.x-rc1 which wasn't in > > > > > > 2.6.x-mm1! > > > > > > > > > > The problem I see with both -mm and linux-next is that they > > > > > tend to be better at finding the "physical conflict" kind of > > > > > issues (ie the merge itself fails) than the "code looks ok but > > > > > doesn't actually work" kind of issue. > > > > > > > > > > Why? > > > > > > > > > > The tester base is simply too small. > > > > > > > > > > Now, if *that* could be improved, that would be wonderful, but > > > > > I'm not seeing it as very likely. > > > > > > > > > > I think we have fairly good penetration these days with the > > > > > regular -git tree, but I think that one is quite frankly a > > > > > *lot* less scary than -mm or -next are, and there it has been > > > > > an absolutely huge boon to get the kernel into the Fedora > > > > > test-builds etc (and I _think_ Ubuntu and SuSE also started > > > > > something like that). > > > > > > > > > > So I'm very pessimistic about getting a lot of test coverage > > > > > before -rc1. > > > > > > > > > > Maybe too pessimistic, who knows? 
> > > > > > > > First of all: > > > > I 100% agree with Andrew that our biggest problems are in > > > > reviewing code and resolving bugs, not in finding bugs (we > > > > already have far too many unresolved bugs). > > > > > > I would argue instead that we don't know which bugs to fix first. > > > We're never going to fix all bugs, and to be honest, that's ok. > > >... > > > > That might be OK. > > > > But our current status quo is not OK: > > > > Check Rafael's regressions lists asking yourself > > "How many regressions are older than two weeks?" > > "ext4 doesn't compile on m68k". > YAWN. > > Wrong question... > "How many bugs that a sizable portion of users will hit in reality are there?" > is the right question to ask... > > > > > > We have unmaintained and de facto unmaintained parts of the kernel > > where even issues that might be easy to fix don't get fixed. > > And how many people are hitting those issues? If a part of the kernel is really > important to enough people, there tends to be someone who stands up to either fix > the issue or start de-facto maintaining that part. > And yes I know there's parts where that doesn't hold. But to be honest, there's > not that many of them that have active development (and thus get the biggest > share of regressions) > > > > > >... > > > So there's a few things we (and you / janitors) can do over time to > > > get better data on what issues people hit: > > > 1) Get automated collection of issues more wide spread. The wider > > > our net the better we know which issues get hit a lot, and plain > > > the more data we have on when things start, when they stop, etc > > > etc. Especially if you get a lot of testers in your project, I'd > > > like them to install the client for easy reporting of issues. 2) We > > > should add more WARN_ON()s on "known bad" conditions. If it > > > WARN_ON()'s, we can learn about it via the automated collection. 
> > > And we can then do the statistics to figure out which ones happen a > > > lot. 3) We need to get persistent-across-reboot oops saving going; > > > there's some venues for this > > > > No disagreement on this, it's just a different issue than our bug > > fixing problem. > > No it's not! Knowing earlier and better which bugs get hit is NOT different > from our bug fixing "problem", it's in fact an essential part of the solution to it! Agreed. Thanks, Rafael ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-04-30 14:20 ` Arjan van de Ven 2008-05-01 12:53 ` Rafael J. Wysocki @ 2008-05-01 13:21 ` Adrian Bunk 2008-05-01 15:49 ` Andrew Morton 2008-05-02 2:08 ` Paul Mackerras 1 sibling, 2 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 13:21 UTC (permalink / raw) To: Arjan van de Ven Cc: Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Wed, Apr 30, 2008 at 07:20:13AM -0700, Arjan van de Ven wrote: > On Thu, 1 May 2008 14:30:38 +0300 > Adrian Bunk <bunk@kernel.org> wrote: > > > On Wed, Apr 30, 2008 at 12:03:38AM -0700, Arjan van de Ven wrote: > > > On Thu, 1 May 2008 03:31:25 +0300 > > > Adrian Bunk <bunk@kernel.org> wrote: > > > > > > > On Wed, Apr 30, 2008 at 01:31:08PM -0700, Linus Torvalds wrote: > > > > > > > > > > > > > > > On Wed, 30 Apr 2008, Andrew Morton wrote: > > > > > > > > > > > > <jumps up and down> > > > > > > > > > > > > There should be nothing in 2.6.x-rc1 which wasn't in > > > > > > 2.6.x-mm1! > > > > > > > > > > The problem I see with both -mm and linux-next is that they > > > > > tend to be better at finding the "physical conflict" kind of > > > > > issues (ie the merge itself fails) than the "code looks ok but > > > > > doesn't actually work" kind of issue. > > > > > > > > > > Why? > > > > > > > > > > The tester base is simply too small. > > > > > > > > > > Now, if *that* could be improved, that would be wonderful, but > > > > > I'm not seeing it as very likely. > > > > > > > > > > I think we have fairly good penetration these days with the > > > > > regular -git tree, but I think that one is quite frankly a > > > > > *lot* less scary than -mm or -next are, and there it has been > > > > > an absolutely huge boon to get the kernel into the Fedora > > > > > test-builds etc (and I _think_ Ubuntu and SuSE also started > > > > > something like that). 
> > > > > > > > > > So I'm very pessimistic about getting a lot of test coverage > > > > > before -rc1. > > > > > > > > > > Maybe too pessimistic, who knows? > > > > > > > > First of all: > > > > I 100% agree with Andrew that our biggest problems are in > > > > reviewing code and resolving bugs, not in finding bugs (we > > > > already have far too many unresolved bugs). > > > > > > I would argue instead that we don't know which bugs to fix first. > > > We're never going to fix all bugs, and to be honest, that's ok. > > >... > > > > That might be OK. > > > > But our current status quo is not OK: > > > > Check Rafael's regressions lists asking yourself > > "How many regressions are older than two weeks?" > > "ext4 doesn't compile on m68k". > YAWN. > > Wrong question... > "How many bugs that a sizable portion of users will hit in reality are there?" > is the right question to ask... >... "Kernel oops while running kernbench and tbench on powerpc" took more than 2 months to get resolved, and we ship 2.6.25 with this regression. Granted that compared to x86 there's not a sizable portion of users crazy enough to run Linux on powerpc machines... cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 13:21 ` Adrian Bunk @ 2008-05-01 15:49 ` Andrew Morton 2008-05-01 1:13 ` Arjan van de Ven ` (2 more replies) 2008-05-02 2:08 ` Paul Mackerras 1 sibling, 3 replies; 229+ messages in thread From: Andrew Morton @ 2008-05-01 15:49 UTC (permalink / raw) To: Adrian Bunk Cc: Arjan van de Ven, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, 1 May 2008 16:21:59 +0300 Adrian Bunk <bunk@kernel.org> wrote: > > > But our current status quo is not OK: > > > > > > Check Rafael's regressions lists asking yourself > > > "How many regressions are older than two weeks?" > > > > "ext4 doesn't compile on m68k". > > YAWN. > > > > Wrong question... > > "How many bugs that a sizable portion of users will hit in reality are there?" > > is the right question to ask... > >... > > "Kernel oops while running kernbench and tbench on powerpc" took more > than 2 months to get resolved, and we ship 2.6.25 with this regression. Precisely. Cherry-picking a single example such as the 68k thing and then claiming that it reflects the general case is known as a "fallacy". > Granted that compared to x86 there's not a sizable portion of users > crazy enough to run Linux on powerpc machines... Another fallacy which Arjan is pushing (even though he doesn't appear to have realised it) is "all hardware is the same". Well, it isn't. And most of our bugs are hardware-specific. So, I'd venture, most of our bugs don't affect most people. So, over time, by Arjan's "important to enough people" observation we just get more and more and more unfixed bugs. And I believe this effect has been occurring. And please stop regaling us with this kerneloops.org stuff. It just isn't very interesting, useful or representative when considering the whole problem. Very few kernel bugs result in a trace, and when they do they are usually easy to fix and, because of this, they will get fixed, often quickly.
I expect netdevwatchdogeth0transmittimedout.org would tell a different story. One thing which muddies all this up is that bug reporters vanish. Over the years I have sent thousands and thousands of ping emails to people who have reported bugs via email, three to six months after the fact. Some were solved - maybe a fifth. About the same proportion of reporters reply and give some reason why they cannot work on the bug. In the majority of cases people don't reply at all and I suspect they're in the same category of cannot-work-on-the-bug. And why can't they work on the bug? Usually, because they found a workaround. People aren't going to spend months sitting in front of a non-functional computer waiting for kernel developers to decide if their machine is important enough to fix. They will find a workaround. They will buy new hardware. They will discover "noapic" (234000 google hits and rising!). They will swap it with a different machine. They will switch to a different distro which for some reason doesn't trigger the bug. They will use an older kernel. They will switch to Solaris. Etcetera. People are clever - they will find a way to get around it. I figure that after a bug is reported we have maybe 24 to 48 hours to send a good response before our chances of _ever_ fixing it have begun to decline sharply due to the clever minds at the other end. Which leads us to Arjan's third fallacy: "How many bugs that a sizable portion of users will hit in reality are there?" is the right question to ask... well no, it isn't. Because approximately zero of the hardware bugs affect a sizeable portion of users. With this logic we will end up with more and more and more and more bugs each of which affect a tiny number of users. Hundreds of different bugs. You know where this process ends up. Arjan's fourth fallacy: "We don't make (effective) prioritization decisions." lol. This implies that someone somewhere once sat down and wondered which bug he should most effectively work on.
Well, we don't do that. We ignore _all_ the bugs in favour of busily writing new ones. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 15:49 ` Andrew Morton @ 2008-05-01 1:13 ` Arjan van de Ven 2008-05-02 9:00 ` Adrian Bunk 2008-05-01 16:38 ` Steven Rostedt 2008-05-01 17:24 ` Theodore Tso 2 siblings, 1 reply; 229+ messages in thread From: Arjan van de Ven @ 2008-05-01 1:13 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, 1 May 2008 08:49:19 -0700 Andrew Morton <akpm@linux-foundation.org> wrote: > > Granted that compared to x86 there's not a sizable portion of users > > crazy enough to run Linux on powerpc machines... > > Another fallacy which Arjan is pushing (even though he doesn't appear > to have realised it) is "all hardware is the same". no I'm pushing "some classes of hardware are much more popular/relevant than others". > Well, it isn't. And most of our bugs are hardware-specific. So, I'd > venture, most of our bugs don't affect most people. So, over time, by > Arjan's "important to enough people" observation we just get more and > more and more unfixed bugs. I did not say "most people". I believe "most people" aren't hitting bugs right now (or there would be a lot more screaming). What I do believe is that *within the bugs that hit*, even the hardware specific ones, there's a clear prioritization by how many people hit the bug (or have the hardware in general). > > And I believe this effect has been occurring. > > And please stop regaling us with this kerneloops.org stuff. It just > isn't very interesting, useful or representative when considering the > whole problem. Very few kernel bugs result in a trace, and when they > do they are usually easy to fix and, because of this, they will get > fixed, often quickly. I expect > netdevwatchdogeth0transmittimedout.org would tell a different story. now that's a fallacy of your own.. 
if you care about that one, it's 1) trivial to track and/or 2) could contain a WARN_ON_ONCE(), at which point it's automatically tracked. (and more useful information I suspect, since it suddenly has a full backtrace including driver info in it) By your argument we should work hard to make sure we're better at creating traces for cases we detect something goes wrong. (I would not argue against that fwiw) > I figure that after a bug is reported we have maybe 24 to 48 hours to > send a good response before our chances of _ever_ fixing it have > begun to decline sharply due to the clever minds at the other end. > > Which leads us to Arjan's third fallacy: > > "How many bugs that a sizable portion of users will hit in reality > are there?" is the right question to ask... > > well no, it isn't. Because approximately zero of the hardware bugs if it's a hardware bug there's little we can do. If it's a hardware-specific bug, yeah then it becomes a function of how popular that hardware is. > affect a sizeable portion of users. With this logic we will end up > with more and more and more and more bugs each of which affect a tiny > number of users. Hundreds of different bugs. You know where this > process ends up. Given that a normal PC has maybe 10 components... yes we don't want bugcreep that affects common hardware over time. At the same time, by your argument, a bug that hits a piece of hardware of which 5 are made (or left on this planet) is equally important to a bug in something that everyone has. > > Arjan's fourth fallacy: "We don't make (effective) prioritization > decisions." lol. This implies that someone somewhere once sat down > and wondered which bug he should most effectively work on. Well, we > don't do that. We ignore _all_ the bugs in favour of busily writing > new ones This statement is so ridiculous, and so contradictory to what you said before, that I'm not even going to respond to it. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 1:13 ` Arjan van de Ven @ 2008-05-02 9:00 ` Adrian Bunk 0 siblings, 0 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-02 9:00 UTC (permalink / raw) To: Arjan van de Ven Cc: Andrew Morton, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Wed, Apr 30, 2008 at 06:13:38PM -0700, Arjan van de Ven wrote: > On Thu, 1 May 2008 08:49:19 -0700 > Andrew Morton <akpm@linux-foundation.org> wrote: > > > > Granted that compared to x86 there's not a sizable portion of users > > > crazy enough to run Linux on powerpc machines... > > > > Another fallacy which Arjan is pushing (even though he doesn't appear > > to have realised it) is "all hardware is the same". > > no I'm pushing "some classes of hardware are much more popular/relevant > than others". "popular/relevant" is hard to define. E.g. if we'd go after "popular" we should only keep architectures like ARM and x86 and ditch architectures like ia64 and s390 that have puny userbases. And how would you define "relevant"? > > Well, it isn't. And most of our bugs are hardware-specific. So, I'd > > venture, most of our bugs don't affect most people. So, over time, by > > Arjan's "important to enough people" observation we just get more and > > more and more unfixed bugs. > > I did not say "most people". I believe "most people" aren't hitting > bugs right now (or there would be a lot more screaming). > What I do believe is that *within the bugs that hit*, even the hardware > specific ones, there's a clear prioritization by how many people hit > the bug (or have the hardware in general). If your "or have the hardware in general" is meant seriously you have to convince people that ARM must become a very high priority. 
Whether one supports your "there's a clear prioritization" view or not, it doesn't currently work anyway, since the areas covered by people testing -rc kernels don't even remotely map to the most popular hardware in the field. > > And I believe this effect has been occurring. > > > And please stop regaling us with this kerneloops.org stuff. It just > > isn't very interesting, useful or representative when considering the > > whole problem. Very few kernel bugs result in a trace, and when they > > do they are usually easy to fix and, because of this, they will get > > fixed, often quickly. I expect > > netdevwatchdogeth0transmittimedout.org would tell a different story. > > now that's a fallacy of your own.. if you care about that one, it's 1) > trivial to track and/or 2) could contain a WARN_ON_ONCE(), at which > point it's automatically tracked. (and more useful information I > suspect, since it suddenly has a full backtrace including driver info > in it) > By your argument we should work hard to make sure we're better at > creating traces for cases we detect something goes wrong. > (I would not argue against that fwiw) >... kerneloops.org catches the easiest-to-solve bugs (there's a trace) and helps in getting them fixed. That's a very good thing. And if we get more bugs into this easy-to-resolve state that would be even better. But it's only a small part of the complete picture of incoming bug reports. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 15:49 ` Andrew Morton 2008-05-01 1:13 ` Arjan van de Ven @ 2008-05-01 16:38 ` Steven Rostedt 2008-05-01 17:18 ` Andrew Morton 2008-05-01 17:24 ` Theodore Tso 2 siblings, 1 reply; 229+ messages in thread From: Steven Rostedt @ 2008-05-01 16:38 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, Arjan van de Ven, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby On Thu, 1 May 2008, Andrew Morton wrote: > > Arjan's fourth fallacy: "We don't make (effective) prioritization > decisions." lol. This implies that someone somewhere once sat down and > wondered which bug he should most effectively work on. Well, we don't do > that. We ignore _all_ the bugs in favour of busily writing new ones. And actually, core kernel developers are best for writing new bugs. Really, the way I started out learning how the kernel ticks was to go and try to solve some bugs that I was seeing (this was years ago). I get people telling me that they want to learn to be a kernel developer, and asking what new feature they should work on. Well, honestly, the last thing a newbie kernel developer should be doing is writing new bugs. We need to send them to a URL that lists all the known bugs and have them pick one, any one, and have them solve it. This would be the best way to learn part of the kernel. I even find that I understand my own code better when I'm in the debugging phase. People here mention different places to look at code, and besides kerneloops.org I really don't even know where to look for bugs, because I haven't seen a URL to point me to. The next time someone asks me how to get started in kernel programming, I would love to tell them to go and look here, and solve the bugs. I'm guessing that I should just point them to: http://janitor.kernelnewbies.org/ and tell them to focus on getting real bugs (not just comments and such) fixed if they really want to learn the kernel.
-- Steve ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 16:38 ` Steven Rostedt @ 2008-05-01 17:18 ` Andrew Morton 0 siblings, 0 replies; 229+ messages in thread From: Andrew Morton @ 2008-05-01 17:18 UTC (permalink / raw) To: Steven Rostedt; +Cc: bunk, arjan, torvalds, rjw, davem, linux-kernel, jirislaby On Thu, 1 May 2008 12:38:23 -0400 (EDT) Steven Rostedt <rostedt@goodmis.org> wrote: > People here mention different places to look at code, and besides > kerneloops.org I really don't even know where to look for bugs, because I > haven't seen a URL to point me to. bugzilla.kernel.org is, umm, improving. It would be an interesting exercise for someone to spend a few days seeing how many of the bugzilla reports they personally can reproduce. I'd guess "zero". There's a lesson in that. The problem with bugzilla will be that it will be hard to find reports where the reporter will be able to work with you on the fix - we've let them go cold. The most fruitful place to find fixable bugs is linux-kernel. People who report bugs there are sufficiently motivated to have actually sent the email and the bug is still recent, so they probably haven't done the Solaris install yet. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 15:49 ` Andrew Morton 2008-05-01 1:13 ` Arjan van de Ven 2008-05-01 16:38 ` Steven Rostedt @ 2008-05-01 17:24 ` Theodore Tso 2008-05-01 19:26 ` Andrew Morton 2 siblings, 1 reply; 229+ messages in thread From: Theodore Tso @ 2008-05-01 17:24 UTC (permalink / raw) To: Andrew Morton Cc: Adrian Bunk, Arjan van de Ven, Linus Torvalds, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Thu, May 01, 2008 at 08:49:19AM -0700, Andrew Morton wrote: > Another fallacy which Arjan is pushing (even though he doesn't appear to > have realised it) is "all hardware is the same". > > Well, it isn't. And most of our bugs are hardware-specific. So, I'd > venture, most of our bugs don't affect most people. So, over time, by > Arjan's "important to enough people" observation we just get more and more > and more unfixed bugs. > > And I believe this effect has been occurring. So the question is if we have a thousand bugs which only affect one person each, and 70 million Linux users, how much should we beat ourselves up that 1,000 people can't use a particular version of the Linux kernel, versus the 99.9% of the people for whom the kernel works just fine? Sometimes, we can't make everyone happy. At the recent Linux Collaboration Summit, we had a local user walk up to a microphone, and loosely paraphrased, said, "WHINE WHINE WHINE WHINE I have a $30 DVD drive that doesn't work with Linux. WHINE WHINE WHINE WHINE WHINE What are *you* going to do to fix my problem?" Some people like James responded very diplomatically, with "Well, you have to understand, the developer might not have your hardware, and there's a lot of broken hardware out there, etc., etc." What I wanted to tell this user was, "Ask not what the Linux development community can do for you. Ask what *you* can do for Linux?" Suppose this person had filed a kernel bugzilla bug, and it was one of the hundreds or thousands of non-handled bugs. 
Sure, it's a tragedy that bugs pile up. But if they pile up because of crappy hardware, that's not a major tragedy. If we can figure out how to blacklist it, and move on, we should do so. > And why can't they work on the bug? Usually, because they found a > workaround. People aren't going to spend months sitting in front of a > non-functional computer waiting for kernel developers to decide if their > machine is important enough to fix. They will find a workaround. They > will buy new hardware. Hey, in this particular case, if this user worked around the problem by buying new hardware, it was probably the right solution. As far as we know we don't have a systematic problem where huge numbers of DVD drives aren't working, so if there are a few oddball ones out there, we just CAN'T keep flagellating ourselves over not fixing all bugs and letting some bugs pile up. > Which leads us to Arjan's third fallacy: > > "How many bugs that a sizable portion of users will hit in reality > are there?" is the right question to ask... > > well no, it isn't. Because approximately zero of the hardware bugs affect > a sizeable portion of users. With this logic we will end up with more and > more and more and more bugs each of which affect a tiny number of users. > Hundreds of different bugs. You know where this process ends up. ... and maybe we can't solve hardware bugs. Or maybe crappy hardware isn't worth holding back Linux development for. And I'm not sure ignoring it is that horrible of a thing. And in practice, if it's a hardware bug in something which is very common, it *will* get noticed very quickly and fixed. But if it's a hardware bug in some rare piece of hardware, the user is going to have to either (a) help us fix it, or (b) decide that his time is more valuable and that buying another $30 DVD drive might be a better use of his and our time. Back when I was the serial driver maintainer, I certainly made those kinds of triage decisions. 
I knew the serial driver was working for the vast majority of Linux users, because if it broke in a major way, I would hear about it in spades and get lots and lots of hate mail. And there were plenty of crappy ISA boards out there; I would help people out when I could, and sometimes spend more volunteer time helping them by changing one or two outb()'s to outb_p()'s (yes, that really made a difference; remember, we're talking about crappy PC-class hardware with hardware bugs), but at the end of the day, past a certain point, even with a willing and cooperative end-user, I would have to call it a day, give up, and tell them to get another serial card. (And back in the days of ISA boards, we couldn't even use blacklists.) And you know what? Linux didn't collapse into a steaming pile of dung when I did that. We're all volunteers, and we need to recognize there are limits to what we can do --- otherwise, it will be way too easy to burn out and become a bitter shell of a maintainer.... Even BSD fan boys will realize that in BSD land, you have to do even more of this; if there's random broken hardware, or simply a lack of a device driver, very often your only recourse is to work around the problem by buying another serial card, or wifi card, or whatever. And this happens much more with BSD than Linux, simply because they support fewer devices to begin with. - Ted P.S. We should really try to categorize bugs so we can figure out what percentage of the bugs are device driver bugs, what percentage are core kernel bugs, "if you stress the system too badly" sort of bugs, or "if you do something bad like yank the USB stick without unmounting the filesystem first" sort of thing. I think if we did this, the numbers wouldn't look quite so scary, because device driver problems with weird sh*t hardware are not comparable with core functionality bugs in the SLUB allocator, for example. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 17:24 ` Theodore Tso @ 2008-05-01 19:26 ` Andrew Morton 2008-05-01 19:39 ` Steven Rostedt 2008-05-02 10:23 ` Andi Kleen 0 siblings, 2 replies; 229+ messages in thread From: Andrew Morton @ 2008-05-01 19:26 UTC (permalink / raw) To: Theodore Tso Cc: bunk, arjan, torvalds, rjw, davem, linux-kernel, jirislaby, rostedt On Thu, 1 May 2008 13:24:34 -0400 Theodore Tso <tytso@MIT.EDU> wrote: > ... and maybe we can't solve hardware bugs. Many, many of these are regressions. If old-linux works on that hardware then new-linux can too. (still wants to know what we did 2-3 years ago which caused thousands of people to have to resort to using noapic and other apic-related boot option workarounds) ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 19:26 ` Andrew Morton @ 2008-05-01 19:39 ` Steven Rostedt 2008-05-02 10:23 ` Andi Kleen 1 sibling, 0 replies; 229+ messages in thread From: Steven Rostedt @ 2008-05-01 19:39 UTC (permalink / raw) To: Andrew Morton Cc: Theodore Tso, bunk, arjan, torvalds, rjw, davem, linux-kernel, jirislaby On Thu, 1 May 2008, Andrew Morton wrote: > On Thu, 1 May 2008 13:24:34 -0400 > Theodore Tso <tytso@MIT.EDU> wrote: > > > ... and maybe we can't solve hardware bugs. > > Many, many of these are regressions. If old-linux works on that > hardware then new-linux can too. > > (still wants to know what we did 2-3 years ago which caused thousands of > people to have to resort to using noapic and other apic-related boot option > workarounds) Perhaps 2-3 years ago more people started using more hardware that implements APIC. ;-) -- Steve ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 19:26 ` Andrew Morton 2008-05-01 19:39 ` Steven Rostedt @ 2008-05-02 10:23 ` Andi Kleen 1 sibling, 0 replies; 229+ messages in thread From: Andi Kleen @ 2008-05-02 10:23 UTC (permalink / raw) To: Andrew Morton Cc: Theodore Tso, bunk, arjan, torvalds, rjw, davem, linux-kernel, jirislaby, rostedt Andrew Morton <akpm@linux-foundation.org> writes: > > (still wants to know what we did 2-3 years ago which caused thousands of > people to have to resort to using noapic and other apic-related boot option > workarounds) Forcing APIC even when the BIOS didn't support them. -Andi ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 13:21 ` Adrian Bunk 2008-05-01 15:49 ` Andrew Morton @ 2008-05-02 2:08 ` Paul Mackerras 2008-05-02 3:10 ` Josh Boyer 1 sibling, 1 reply; 229+ messages in thread From: Paul Mackerras @ 2008-05-02 2:08 UTC (permalink / raw) To: Adrian Bunk Cc: Arjan van de Ven, Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt Adrian Bunk writes: > "Kernel oops while running kernbench and tbench on powerpc" took more > than 2 months to get resolved, and we ship 2.6.25 with this regression. That was a very subtle bug that only showed up on one particular powerpc machine. I was not able to replicate it on any of the powerpc machines I have here. Nevertheless, we found it and we have a fix for it. I think that's an example of the process working. :) Paul. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 2:08 ` Paul Mackerras @ 2008-05-02 3:10 ` Josh Boyer 2008-05-02 4:09 ` Paul Mackerras 0 siblings, 1 reply; 229+ messages in thread From: Josh Boyer @ 2008-05-02 3:10 UTC (permalink / raw) To: Paul Mackerras Cc: Adrian Bunk, Arjan van de Ven, Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Fri, 2008-05-02 at 12:08 +1000, Paul Mackerras wrote: > Adrian Bunk writes: > > > "Kernel oops while running kernbench and tbench on powerpc" took more > > than 2 months to get resolved, and we ship 2.6.25 with this regression. > > That was a very subtle bug that only showed up on one particular > powerpc machine. I was not able to replicate it on any of the powerpc > machines I have here. Nevertheless, we found it and we have a fix for > it. I think that's an example of the process working. :) Was it even a regression in the classical sense of the word? Seemed more of a latent bug that was simply never triggered before. josh ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 3:10 ` Josh Boyer @ 2008-05-02 4:09 ` Paul Mackerras 2008-05-02 8:29 ` Adrian Bunk 0 siblings, 1 reply; 229+ messages in thread From: Paul Mackerras @ 2008-05-02 4:09 UTC (permalink / raw) To: Josh Boyer Cc: Adrian Bunk, Arjan van de Ven, Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt Josh Boyer writes: > On Fri, 2008-05-02 at 12:08 +1000, Paul Mackerras wrote: > > Adrian Bunk writes: > > > > > "Kernel oops while running kernbench and tbench on powerpc" took more > > > than 2 months to get resolved, and we ship 2.6.25 with this regression. > > > > That was a very subtle bug that only showed up on one particular > > powerpc machine. I was not able to replicate it on any of the powerpc > > machines I have here. Nevertheless, we found it and we have a fix for > > it. I think that's an example of the process working. :) > > Was it even a regression in the classical sense of the word? Seemed > more of a latent bug that was simply never triggered before. That's right. The bug has been there basically forever (i.e. since before 2.6.12-rc2 ;) and no-one has been able to trigger it reliably before. Paul. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 4:09 ` Paul Mackerras @ 2008-05-02 8:29 ` Adrian Bunk 2008-05-02 10:16 ` Paul Mackerras 2008-05-02 14:58 ` Linus Torvalds 0 siblings, 2 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-02 8:29 UTC (permalink / raw) To: Paul Mackerras Cc: Josh Boyer, Arjan van de Ven, Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Fri, May 02, 2008 at 02:09:39PM +1000, Paul Mackerras wrote: > Josh Boyer writes: > > > On Fri, 2008-05-02 at 12:08 +1000, Paul Mackerras wrote: > > > Adrian Bunk writes: > > > > > > > "Kernel oops while running kernbench and tbench on powerpc" took more > > > > than 2 months to get resolved, and we ship 2.6.25 with this regression. > > > > > > That was a very subtle bug that only showed up on one particular > > > powerpc machine. I was not able to replicate it on any of the powerpc > > > machines I have here. Nevertheless, we found it and we have a fix for > > > it. I think that's an example of the process working. :) > > > > Was it even a regression in the classical sense of the word? Seemed > > more of a latent bug that was simply never triggered before. > > That's right. The bug has been there basically forever (i.e. since > before 2.6.12-rc2 ;) and no-one has been able to trigger it reliably > before. But for users this is a recent regression since 2.6.24 worked and 2.6.25 does not. If this problem was on x86 Linus himself and some other core developers would most likely have debugged this issue and Linus would have delayed the release of 2.6.25 for getting it fixed there. And stuff that "only showed up on one particular machine" often shows up on many machines (we only know in hindsight) and the "one particular machine" is often due to the fact that of the many machines that might trigger a regression only one was used for testing this -rc kernel. 
This is not in any way meant against you personally; due to the fact that the powerpc port is among the better-maintained parts of the kernel, this regression eventually got fixed, but in many other parts of the kernel it would have been one more of the many regressions that were reported and never fixed. > Paul. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 8:29 ` Adrian Bunk @ 2008-05-02 10:16 ` Paul Mackerras 2008-05-02 11:58 ` Adrian Bunk 2008-05-02 14:58 ` Linus Torvalds 1 sibling, 1 reply; 229+ messages in thread From: Paul Mackerras @ 2008-05-02 10:16 UTC (permalink / raw) To: Adrian Bunk Cc: Josh Boyer, Arjan van de Ven, Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt Adrian Bunk writes: > > That's right. The bug has been there basically forever (i.e. since > > before 2.6.12-rc2 ;) and no-one has been able to trigger it reliably > > before. > > But for users this is a recent regression since 2.6.24 worked > and 2.6.25 does not. I never actually saw a statement to that effect (i.e. that 2.6.24 worked) from Kamalesh. I think people assumed that because he reported it against version X that version X-1 worked, but we don't actually know that. > If this problem was on x86 Linus himself and some other core developers > would most likely have debugged this issue and Linus would have delayed > the release of 2.6.25 for getting it fixed there. If I had been able to replicate it, or if it had been seen on more than one machine, I would probably have asked Linus to wait while we fixed it. There's a risk management thing happening here. Delaying a release is a negative thing in itself, since it means that users have to wait longer for the improvements we have made. That has to be balanced against the negative of some users seeing a regression. It's not an absolute, black-and-white kind of thing. In this case, for a bug being seen on only one machine, of a somewhat unusual configuration, I considered it wasn't worth asking to delay the release. Paul. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 10:16 ` Paul Mackerras @ 2008-05-02 11:58 ` Adrian Bunk 0 siblings, 0 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-02 11:58 UTC (permalink / raw) To: Paul Mackerras Cc: Josh Boyer, Arjan van de Ven, Linus Torvalds, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Fri, May 02, 2008 at 08:16:49PM +1000, Paul Mackerras wrote: > Adrian Bunk writes: > > > > That's right. The bug has been there basically forever (i.e. since > > > before 2.6.12-rc2 ;) and no-one has been able to trigger it reliably > > > before. > > > > But for users this is a recent regression since 2.6.24 worked > > and 2.6.25 does not. > > I never actually saw a statement to that effect (i.e. that 2.6.24 > worked) from Kamalesh. I think people assumed that because he > reported it against version X that version X-1 worked, but we don't > actually know that. He reported it as [BUG] 2.6.25-rc2-git4 - Regression Kernel oops while running kernbench and tbench on powerpc and it was in the 2.6.25 regression lists for ages. > > If this problem was on x86 Linus himself and some other core developers > > would most likely have debugged this issue and Linus would have delayed > > the release of 2.6.25 for getting it fixed there. > > If I had been able to replicate it, or if it had been seen on more > than one machine, I would probably have asked Linus to wait while we > fixed it. > > There's a risk management thing happening here. Delaying a release is > a negative thing in itself, since it means that users have to wait > longer for the improvements we have made. That has to be balanced > against the negative of some users seeing a regression. It's not an > absolute, black-and-white kind of thing. In this case, for a bug > being seen on only one machine, of a somewhat unusual configuration, I > considered it wasn't worth asking to delay the release. No general disagreement on this. 
And my example was not in any way meant against you - it's actually unusual and positive that a bug that once got the attention of being on the regression lists gets fixed later. Even worse is the situation with regressions people run into when upgrading from 2.6.22 to 2.6.24 today... :-( > Paul. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 8:29 ` Adrian Bunk 2008-05-02 10:16 ` Paul Mackerras @ 2008-05-02 14:58 ` Linus Torvalds 2008-05-02 15:44 ` Carlos R. Mafra 1 sibling, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-02 14:58 UTC (permalink / raw) To: Adrian Bunk Cc: Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Fri, 2 May 2008, Adrian Bunk wrote: > > But for users this is a recent regression since 2.6.24 worked > and 2.6.25 does not. Totally and utterly immaterial. If it's a timing-related bug, as far as developers are concerned, nothing they did introduced the problem. So anybody who thinks that "process" should have caught it is just being stupid. Adrian, you're one of the absolutely *worst* in the camp of "everything should be perfect". You really need to realize that reality is messy, and things cannot be perfect. You also need to realize and *understand* that aiming for "good" is actually much BETTER than trying to aim for "perfect". Perfect is the enemy of good. Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 14:58 ` Linus Torvalds @ 2008-05-02 15:44 ` Carlos R. Mafra 2008-05-02 16:28 ` Linus Torvalds 0 siblings, 1 reply; 229+ messages in thread From: Carlos R. Mafra @ 2008-05-02 15:44 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Fri 2.May'08 at 7:58:25 -0700, Linus Torvalds wrote: > > > On Fri, 2 May 2008, Adrian Bunk wrote: > > > > But for users this is a recent regression since 2.6.24 worked > > and 2.6.25 does not. > > Totally and utterly immaterial. > > If it's a timing-related bug, as far as developers are concerned, nothing > they did introduced the problem. > > So anybody who thinks that "process" should have caught it is just being > stupid. So I would like to ask you what a user should do when facing what is probably a timing-related bug, as it appears I have the bad luck of hitting one. See for example my comments after this one http://bugzilla.kernel.org/show_bug.cgi?id=10117#c11 This same problem is still present with yesterday's git, and sometimes it hangs without hpet=disable and sometimes it doesn't. (And never with hpet=disable in the boot command line) And when it hangs I can see only _one_ "Switched to high resolution mode on CPU x" message before the hang point, and when it boots fine there are always two of them in sequence: Switched to high resolution mode on CPU 1 Switched to high resolution mode on CPU 0 And using vga=6 or vga=0x0364 makes a difference in the probability of hanging. I am just waiting for -rc1 to be released to send an email with my problem again, as I am unable to debug this myself. I think this is ok on my part, right? ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 15:44 ` Carlos R. Mafra @ 2008-05-02 16:28 ` Linus Torvalds 2008-05-02 17:15 ` Carlos R. Mafra 0 siblings, 1 reply; 229+ messages in thread From: Linus Torvalds @ 2008-05-02 16:28 UTC (permalink / raw) To: Carlos R. Mafra Cc: Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt On Fri, 2 May 2008, Carlos R. Mafra wrote: > > So I would like to ask you what a user should do when facing what is > probably a timing-related bug, as it appears I have the bad luck > of hitting one. Quite frankly, it will depend on the bug. If it's *reliably* timing-related (which sounds crazy, but is not at all unheard of), it can be reliably bisected down to some totally unrelated commit that doesn't actually introduce the problem at all, but that reliably turns it on or off. That can be very misleading, and can cause us to basically revert a good commit, only to not actually fix the bug (and possibly re-introduce the bug that the reverted commit tried to fix). But sometimes it gives us a clue where the timing problem is. But quite frankly, that seems to be the exception rather than the rule. There have been issues that literally seemed to depend on things like cacheline placement etc, where changing config options for code that was never actually even *run* would change timing just enough to show a bug pseudo-reliably or not at all. The good news is that those timing issues are really quite rare. The bad news is that when they happen, they are almost totally undebuggable. > This same problem is still present with yesterday's git, and sometimes > it hangs without hpet=disable and sometimes it doesn't. (And never > with hpet=disable in the boot command line) Hey, it may well be a HPET+NOHZ issue. But it could also be that HPET is the thing that just allows you to see the hang. 
> And using vga=6 or vga=0x0364 makes a difference in the probability > of hanging. .. and yeah, these kinds of really odd and obviously totally unrelated issues are a sign of a bug that is either simply hardware instability or very subtly timing-related. The reason I mention hardware instability is that there really are bugs that happen due to (for example) power supply instabilities. Brownouts under heavy load have been causes of problems, but perhaps surprisingly, so has _idle_ time thanks to sleep-states! The latter is probably due to bad power conditioning on the CPU power lines, where the huge current swings (going from high CPU power to low, and back again) not only have made some motherboards "sing" (or "hum", depending on frequency) but also cause voltage instability, and then the CPU crashes. Am I saying that's the reason you see problems? Probably not. Most instabilities really are due to kernel bugs. But hardware instabilities do happen, and they can have these kinds of odd effects. > I am just waiting for -rc1 to be released to send an email with my > problem again, as I am unable to debug this myself. > I think this is ok on my part, right? Yes. You've been a good bug reporter, and kept at it. It's not your fault that the bug is hard to pin down. Quite frankly, it does sound like the hang happens somewhere around the hpet_init -> hpet_acpi_add -> hpet_resources path, near the "hpet_resources: 0xfed00000 is busy" printk's you added (correct?), and we've had tons of issues with NO_HZ, so at a guess it is timer-related. (And I assume it's stable if/once it gets past that boot hang issue? That tends to mean that it's not some hardware instability, it's literally our init code). Linus ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 16:28 ` Linus Torvalds @ 2008-05-02 17:15 ` Carlos R. Mafra 2008-05-02 18:02 ` Pallipadi, Venkatesh 0 siblings, 1 reply; 229+ messages in thread From: Carlos R. Mafra @ 2008-05-02 17:15 UTC (permalink / raw) To: Linus Torvalds Cc: Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, venkatesh.pallipadi On Fri 2.May'08 at 9:28:08 -0700, Linus Torvalds wrote: > Quite frankly, it does sound like the hang happens somewhere around the > > hpet_init > hpet_acpi_add > hpet_resources > hpet_resources: 0xfed00000 is busy > > printk's you added (correct?) and we've had tons of issues with NO_HZ, so > at a guess it is timer-related. It happens a bit before that because when it hangs it doesn't print the above lines, and when it does not hang these lines are the ones right after the point where it hangs. > (And I assume it's stable if/once it gets past that boot hang issue? Yes you are right. When I have luck and the boot succeeds my Sony laptop is rock solid and the kernel is wonderful (even the card reader works!). > That > tends to mean that it's not some hardware instability, it's literally our > init code). A few days ago I found this message in lkml in reply to a hpet patch http://lkml.org/lkml/2007/5/7/361 in which the reporter also had a similar hang, which was cured by hpet=disable. So it is in my TODO list to try to check out if that patch is in the current -git and whether it can be reverted somehow (I added Venki to the Cc: now) Thanks a lot for the answer! ^ permalink raw reply [flat|nested] 229+ messages in thread
* RE: RFC: starting a kernel-testers group for newbies 2008-05-02 17:15 ` Carlos R. Mafra @ 2008-05-02 18:02 ` Pallipadi, Venkatesh 2008-05-09 16:32 ` Mark Lord 0 siblings, 1 reply; 229+ messages in thread From: Pallipadi, Venkatesh @ 2008-05-02 18:02 UTC (permalink / raw) To: Carlos R. Mafra, Linus Torvalds Cc: Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, tglx, Len Brown >-----Original Message----- >From: linux-kernel-owner@vger.kernel.org >[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of >Carlos R. Mafra >Sent: Friday, May 02, 2008 10:16 AM >To: Linus Torvalds >Cc: Adrian Bunk; Paul Mackerras; Josh Boyer; Arjan van de Ven; >Andrew Morton; Rafael J. Wysocki; davem@davemloft.net; >linux-kernel@vger.kernel.org; jirislaby@gmail.com; Steven >Rostedt; Pallipadi, Venkatesh >Subject: Re: RFC: starting a kernel-testers group for newbies > >On Fri 2.May'08 at 9:28:08 -0700, Linus Torvalds wrote: > >> Quite frankly, it does sound like the hang happens somewhere >around the >> >> hpet_init >> hpet_acpi_add >> hpet_resources >> hpet_resources: 0xfed00000 is busy >> >> printk's you added (correct?) and we've had tons of issues >with NO_HZ, so >> at a guess it is timer-related. > >It happens a bit before that because when it hangs it doesn't >print the above lines, and when it does not hang these lines are >the ones right after the point where it hangs. > >> (And I assume it's stable if/once it gets past that boot hang issue? > >Yes you are right. When I have luck and the boot succeeds my >Sony laptop >is rock solid and the kernel is wonderful (even the card >reader works!). > >> That >> tends to mean that it's not some hardware instability, it's >literally our >> init code). > >A few days ago I found this message in lkml in reply to a hpet patch >http://lkml.org/lkml/2007/5/7/361 in which the reporter also had >a similar hang, which was cured by hpet=disable. 
> >So it is in my TODO list to try to check out if that patch is >in the current -git and whether it can be reverted somehow (I >added Venki to the Cc: now) > >Thanks a lot for the answer! It depends on whether HPET is being force detected based on the chipset or whether it was exported by the BIOS in the ACPI table. If it was force enabled and the above patch is having any effect, then you should see a message like > Force enabled HPET at base address 0xfed00000 In any case, of late there seem to be quite a few breakages that are related to HPET/timer interrupts. One of them was on a system which has HPET exported by the BIOS http://bugzilla.kernel.org/show_bug.cgi?id=10409 And the other one where we are force enabling based on the chipset http://bugzilla.kernel.org/show_bug.cgi?id=10561 And then we have the once-in-a-while hangs reported by you, Roman and Mark here http://bugzilla.kernel.org/show_bug.cgi?id=10377 http://bugzilla.kernel.org/show_bug.cgi?id=10117 Thanks, Venki ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-02 18:02 ` Pallipadi, Venkatesh @ 2008-05-09 16:32 ` Mark Lord 2008-05-09 19:30 ` Carlos R. Mafra 0 siblings, 1 reply; 229+ messages in thread From: Mark Lord @ 2008-05-09 16:32 UTC (permalink / raw) To: Pallipadi, Venkatesh Cc: Carlos R. Mafra, Linus Torvalds, Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, tglx, Len Brown Pallipadi, Venkatesh wrote: > > >> -----Original Message----- >> From: linux-kernel-owner@vger.kernel.org >> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of >> Carlos R. Mafra >> Sent: Friday, May 02, 2008 10:16 AM >> To: Linus Torvalds >> Cc: Adrian Bunk; Paul Mackerras; Josh Boyer; Arjan van de Ven; >> Andrew Morton; Rafael J. Wysocki; davem@davemloft.net; >> linux-kernel@vger.kernel.org; jirislaby@gmail.com; Steven >> Rostedt; Pallipadi, Venkatesh >> Subject: Re: RFC: starting a kernel-testers group for newbies >> >> On Fri 2.May'08 at 9:28:08 -0700, Linus Torvalds wrote: >> >>> Quite frankly, it does sound like the hang happens somewhere >> around the >>> hpet_init >>> hpet_acpi_add >>> hpet_resources >>> hpet_resources: 0xfed00000 is busy >>> >>> printk's you added (correct?) and we've had tons of issues >> with NO_HZ, so >>> at a guess it is timer-related. >> It happens a bit before that because when it hangs it doesn't >> print the above lines, and when it does not hang these lines are >> the ones right after the point where it hangs. >> >>> (And I assume it's stable if/once it gets past that boot hang issue? >> Yes you are right. When I have luck and the boot succeeds my >> Sony laptop >> is rock solid and the kernel is wonderful (even the card >> reader works!). >> >>> That >>> tends to mean that it's not some hardware instability, it's >> literally our >>> init code). 
>> A few days ago I found this message in lkml in reply to a hpet patch >> http://lkml.org/lkml/2007/5/7/361 in which the reporter also had >> a similar hang, which was cured by hpet=disable. >> >> So it is in my TODO list to try to check out if that patch is >> in the current -git and whether it can be reverted somehow (I >> added Venki to the Cc: now) >> >> Thanks a lot for the answer! > > It depends on whether we are HPET is being force detected based on the > chipset or whether it was exported by the BIOS in ACPI table. > > If it was force enabled and above patch is having any effect, then you > should see a message like >> Force enabled HPET at base address 0xfed00000 > > In any case, off late there seems to be quite a few breakages that are > related to HPET/timer interrupts. One of them was on a system which has > HPET being exported by BIOS > http://bugzilla.kernel.org/show_bug.cgi?id=10409 > And the other one where we are force enabling based on chipset > http://bugzilla.kernel.org/show_bug.cgi?id=10561 > > And then we have hangs once in a while reports by you, Roman and Mark > here > http://bugzilla.kernel.org/show_bug.cgi?id=10377 > http://bugzilla.kernel.org/show_bug.cgi?id=10117 .. Yeah. This particular bug first appeared when NOHZ & HPET were added. Somebody once suggested it had something to do with an SMI interrupt happening in the midst of HPET calibration or some such thing. But nobody who works on the HPET code has ever shown more than a casual interest in helping to track down and fix whatever the problem is. Cheers ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-09 16:32 ` Mark Lord @ 2008-05-09 19:30 ` Carlos R. Mafra 2008-05-09 20:39 ` Mark Lord 0 siblings, 1 reply; 229+ messages in thread From: Carlos R. Mafra @ 2008-05-09 19:30 UTC (permalink / raw) To: Mark Lord Cc: Pallipadi, Venkatesh, Linus Torvalds, Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, tglx, Len Brown On Fri 9.May'08 at 12:32:51 -0400, Mark Lord wrote: > Pallipadi, Venkatesh wrote: >> >>> -----Original Message----- >>> From: linux-kernel-owner@vger.kernel.org >>> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Carlos R. Mafra >>> Sent: Friday, May 02, 2008 10:16 AM >>> To: Linus Torvalds >>> Cc: Adrian Bunk; Paul Mackerras; Josh Boyer; Arjan van de Ven; Andrew >>> Morton; Rafael J. Wysocki; davem@davemloft.net; >>> linux-kernel@vger.kernel.org; jirislaby@gmail.com; Steven Rostedt; >>> Pallipadi, Venkatesh >>> Subject: Re: RFC: starting a kernel-testers group for newbies >>> >>> On Fri 2.May'08 at 9:28:08 -0700, Linus Torvalds wrote: >>> >>>> Quite frankly, it does sound like the hang happens somewhere >>> around the >>>> hpet_init >>>> hpet_acpi_add >>>> hpet_resources >>>> hpet_resources: 0xfed00000 is busy >>>> >>>> printk's you added (correct?) and we've had tons of issues >>> with NO_HZ, so >>>> at a guess it is timer-related. >>> It happens a bit before that because when it hangs it doesn't print the >>> above lines, and when it does not hang these lines are >>> the ones right after the point where it hangs. >>>> (And I assume it's stable if/once it gets past that boot hang issue? >>> Yes you are right. When I have luck and the boot succeeds my Sony laptop >>> is rock solid and the kernel is wonderful (even the card reader works!). >>> >>>> That >>>> tends to mean that it's not some hardware instability, it's >>> literally our >>>> init code). 
>>> A few days ago I found this message in lkml in reply to a hpet patch >>> http://lkml.org/lkml/2007/5/7/361 in which the reporter also had a >>> similar hang, which was cured by hpet=disable. >>> So it is in my TODO list to try to check out if that patch is in the >>> current -git and whether it can be reverted somehow (I added Venki to the >>> Cc: now) >>> >>> Thanks a lot for the answer! >> >> It depends on whether we are HPET is being force detected based on the >> chipset or whether it was exported by the BIOS in ACPI table. >> >> If it was force enabled and above patch is having any effect, then you >> should see a message like >>> Force enabled HPET at base address 0xfed00000 >> >> In any case, off late there seems to be quite a few breakages that are >> related to HPET/timer interrupts. One of them was on a system which has >> HPET being exported by BIOS >> http://bugzilla.kernel.org/show_bug.cgi?id=10409 >> And the other one where we are force enabling based on chipset >> http://bugzilla.kernel.org/show_bug.cgi?id=10561 >> >> And then we have hangs once in a while reports by you, Roman and Mark >> here >> http://bugzilla.kernel.org/show_bug.cgi?id=10377 >> http://bugzilla.kernel.org/show_bug.cgi?id=10117 > .. > > Yeah. This particular bug first appeared when NOHZ & HPET were added. > Somebody once suggested it had something to do with an SMI interrupt > happening in the midst of HPET calibration or some such thing. > I said I was waiting for -rc1 to be released to send another email about my HPET problem, but curiously with v2.6.26-rc1-6-gafa26be my laptop did not hang after 30+ boots and counting. Somewhere between 2.6.25-07000-(something) and the above kernel something happened which changed significantly the probability of hanging during boot. I could not boot more than 3 times in a row without hanging with kernels up to 2.6.25-07000 (approximately), and now I am still booting v2.6.26-rc1-6-gafa26be a few times a day and no hangs yet. 
Yesterday I started a "reverse" bisection, trying to find which commit "fixed" it, but I still didn't finish (but it is past -7200). Of course I am not sure if after the 100th boot the latest -git won't hang but it definitely improved. > But nobody who works on the HPET code has ever shown more than a casual > interest in helping to track down and fix whatever the problem is. Well, I would like to thank Venki for his effort because he even answered some private emails from me about this issue and is tracking the bugzillas about it. ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-09 19:30 ` Carlos R. Mafra @ 2008-05-09 20:39 ` Mark Lord 0 siblings, 0 replies; 229+ messages in thread From: Mark Lord @ 2008-05-09 20:39 UTC (permalink / raw) To: Mark Lord, Pallipadi, Venkatesh, Linus Torvalds, Adrian Bunk, Paul Mackerras, Josh Boyer, Arjan van de Ven, Andrew Morton, Rafael J. Wysocki, davem, linux-kernel, jirislaby, Steven Rostedt, tglx, Len Brown Carlos R. Mafra wrote: > On Fri 9.May'08 at 12:32:51 -0400, Mark Lord wrote: >> Pallipadi, Venkatesh wrote: >>> >>>> -----Original Message----- >>>> From: linux-kernel-owner@vger.kernel.org >>>> [mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of Carlos R. Mafra >>>> Sent: Friday, May 02, 2008 10:16 AM >>>> To: Linus Torvalds >>>> Cc: Adrian Bunk; Paul Mackerras; Josh Boyer; Arjan van de Ven; Andrew >>>> Morton; Rafael J. Wysocki; davem@davemloft.net; >>>> linux-kernel@vger.kernel.org; jirislaby@gmail.com; Steven Rostedt; >>>> Pallipadi, Venkatesh >>>> Subject: Re: RFC: starting a kernel-testers group for newbies >>>> >>>> On Fri 2.May'08 at 9:28:08 -0700, Linus Torvalds wrote: >>>> >>>>> Quite frankly, it does sound like the hang happens somewhere >>>> around the >>>>> hpet_init >>>>> hpet_acpi_add >>>>> hpet_resources >>>>> hpet_resources: 0xfed00000 is busy >>>>> >>>>> printk's you added (correct?) and we've had tons of issues >>>> with NO_HZ, so >>>>> at a guess it is timer-related. >>>> It happens a bit before that because when it hangs it doesn't print the >>>> above lines, and when it does not hang these lines are >>>> the ones right after the point where it hangs. >>>>> (And I assume it's stable if/once it gets past that boot hang issue? >>>> Yes you are right. When I have luck and the boot succeeds my Sony laptop >>>> is rock solid and the kernel is wonderful (even the card reader works!). >>>> >>>>> That >>>>> tends to mean that it's not some hardware instability, it's >>>> literally our >>>>> init code). 
>>>> A few days ago I found this message in lkml in reply to a hpet patch >>>> http://lkml.org/lkml/2007/5/7/361 in which the reporter also had a >>>> similar hang, which was cured by hpet=disable. >>>> So it is in my TODO list to try to check out if that patch is in the >>>> current -git and whether it can be reverted somehow (I added Venki to the >>>> Cc: now) >>>> >>>> Thanks a lot for the answer! >>> It depends on whether we are HPET is being force detected based on the >>> chipset or whether it was exported by the BIOS in ACPI table. >>> >>> If it was force enabled and above patch is having any effect, then you >>> should see a message like >>>> Force enabled HPET at base address 0xfed00000 >>> In any case, off late there seems to be quite a few breakages that are >>> related to HPET/timer interrupts. One of them was on a system which has >>> HPET being exported by BIOS >>> http://bugzilla.kernel.org/show_bug.cgi?id=10409 >>> And the other one where we are force enabling based on chipset >>> http://bugzilla.kernel.org/show_bug.cgi?id=10561 >>> >>> And then we have hangs once in a while reports by you, Roman and Mark >>> here >>> http://bugzilla.kernel.org/show_bug.cgi?id=10377 >>> http://bugzilla.kernel.org/show_bug.cgi?id=10117 >> .. >> >> Yeah. This particular bug first appeared when NOHZ & HPET were added. >> Somebody once suggested it had something to do with an SMI interrupt >> happening in the midst of HPET calibration or some such thing. >> > > I said I was waiting for -rc1 to be released to send another email > about my HPET problem, but curiously with v2.6.26-rc1-6-gafa26be > my laptop did not hang after 30+ boots and counting. > > Somewhere between 2.6.25-07000-(something) and the above kernel > something happened which changed significantly the probability > of hanging during boot. 
> > I could not boot more than 3 times in > a row without hanging with kernels up to 2.6.25-07000 (approximately), > and now I am still booting v2.6.26-rc1-6-gafa26be a few times a day > and no hangs yet. > > Yesterday I started a "reverse" bisection, trying to find which > commit "fixed" it, but I still didn't finish (but it is past > -7200). > > Of course I am not sure if after the 100th boot the latest -git > won't hang but it definitely improved. > >> But nobody who works on the HPET code has ever shown more than a casual >> interest in helping to track down and fix whatever the problem is. > > Well, I would like to thank Venki for his effort because he even > answered some private emails from me about this issue and is > tracking the bugzillas about it. .. My experience with this bug, since 2.6.20 or so, has been that it comes and goes with even the most innocent change in the .config file, like turning frame pointers on/off. Cheers ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 0:31 ` RFC: starting a kernel-testers group for newbies Adrian Bunk 2008-04-30 7:03 ` Arjan van de Ven @ 2008-05-01 0:41 ` David Miller 2008-05-01 13:23 ` Adrian Bunk 1 sibling, 1 reply; 229+ messages in thread From: David Miller @ 2008-05-01 0:41 UTC (permalink / raw) To: bunk; +Cc: torvalds, akpm, rjw, linux-kernel, jirislaby, rostedt From: Adrian Bunk <bunk@kernel.org> Date: Thu, 1 May 2008 03:31:25 +0300 > - get a mailing list at vger kernel-testers@vger.kernel.org has been created, feel free to use it ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: RFC: starting a kernel-testers group for newbies 2008-05-01 0:41 ` David Miller @ 2008-05-01 13:23 ` Adrian Bunk 0 siblings, 0 replies; 229+ messages in thread From: Adrian Bunk @ 2008-05-01 13:23 UTC (permalink / raw) To: David Miller; +Cc: torvalds, akpm, rjw, linux-kernel, jirislaby, rostedt On Wed, Apr 30, 2008 at 05:41:58PM -0700, David Miller wrote: > From: Adrian Bunk <bunk@kernel.org> > Date: Thu, 1 May 2008 03:31:25 +0300 > > > - get a mailing list at vger > > kernel-testers@vger.kernel.org has been created, feel free to > use it Thanks :-) Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed ^ permalink raw reply [flat|nested] 229+ messages in thread
* Re: Slow DOWN, please!!! @ 2008-04-30 20:59 devzero 0 siblings, 0 replies; 229+ messages in thread From: devzero @ 2008-04-30 20:59 UTC (permalink / raw) To: linux-kernel yes, please ! i know, linux is an ever moving target, but from my user`s perspective i think the kernel more and more suffers from a quality problem. the deeper i`m involved in linux and the more i read into lkml or bugzilla, the more bad impression i get. ok, maybe windows has still more bugs and the transparency of linux gives a false impression, but it`s ridiculous that things get broken that often. e.g. i rarely saw a cd-rom/dvd fail on windows - but i have seen LOTs of problems with that in linux, especially with more recent kernels. i can somewhat understand why one of my colleagues at work (windoze evangelist, sigh) is constantly teasing "nahh, go away i don`t want to grapple with your DIY superstore OS" :) linux is absolutely great , but please make sure that quality and stability is priority number one ! ^ permalink raw reply [flat|nested] 229+ messages in thread