From: Linus Torvalds <torvalds@linux-foundation.org>
To: Yinghai Lu <yinghai@kernel.org>
Cc: Joerg Roedel <joro@8bytes.org>, Ingo Molnar <mingo@elte.hu>,
	Alex Deucher <alexdeucher@gmail.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	dri-devel@lists.freedesktop.org, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>, Tejun Heo <tj@kernel.org>
Subject: Re: Linux 2.6.39-rc3
Date: Wed, 13 Apr 2011 16:39:21 -0700
Message-ID: <BANLkTinud+4nUkZtnYRrU_gAFaMDWRa_AA@mail.gmail.com>
In-Reply-To: <4DA6145D.9070703@kernel.org>

On Wed, Apr 13, 2011 at 2:23 PM, Yinghai Lu <yinghai@kernel.org> wrote:
>>
>> What are all the magic numbers, and why would 0x80000000 be special?
>
> that is the old value when kernel was doing bottom-up bootmem allocation.

I understand, BUT THAT IS STILL A TOTALLY MAGIC NUMBER!

It makes it come out the same ON THAT ONE MACHINE.  So no, it's not
"the old value". It's a random value that gets the old value in one
specific case.
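
[Editorial illustration] The point can be sketched with a toy model. This is not the kernel's bootmem or memblock code: the free-range layouts, the 64 MB size and alignment, and the function names are all hypothetical, chosen only to show how a limit tuned on one machine can reproduce the old bottom-up result there and still give a different address on another machine.

```python
# Toy model only: hypothetical memory maps, sizes, and helper names.
SIZE = 0x4000000    # 64 MB request (assumed for illustration)
ALIGN = 0x4000000   # 64 MB alignment (assumed for illustration)


def align_up(x, a):
    return (x + a - 1) // a * a


def align_down(x, a):
    return x // a * a


def alloc_bottom_up(free_ranges, size, align):
    """Return the lowest aligned address that fits in any free range."""
    for start, end in sorted(free_ranges):
        addr = align_up(start, align)
        if addr + size <= end:
            return addr
    return None


def alloc_top_down(free_ranges, size, align, limit):
    """Return the highest aligned address that fits below `limit`."""
    for start, end in sorted(free_ranges, reverse=True):
        top = min(end, limit)
        addr = align_down(top - size, align)
        if addr >= start:
            return addr
    return None


# Two hypothetical machines with different free memory below 3 GB.
machine_a = [(0x80000000, 0xC0000000)]
machine_b = [(0x40000000, 0xC0000000)]

# A limit hand-tuned so that top-down matches bottom-up on machine A.
MAGIC_LIMIT = 0x80000000 + SIZE

if __name__ == "__main__":
    for name, ranges in (("A", machine_a), ("B", machine_b)):
        old = alloc_bottom_up(ranges, SIZE, ALIGN)
        tuned = alloc_top_down(ranges, SIZE, ALIGN, MAGIC_LIMIT)
        print(f"machine {name}: bottom-up={old:#x} tuned top-down={tuned:#x}")
```

In this sketch the tuned limit matches the bottom-up address on machine A only because of where that machine's free memory happens to start. On machine B the same limit still yields 0x80000000, while bottom-up would have returned 0x40000000.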

>> Why don't we write code that just works?
>>
>> Or absent a "just works" set of patches, why don't we revert to code
>> that has years of testing?
>>
>> This kind of "I broke things, so now I will jiggle things randomly
>> until they unbreak" is not acceptable.
>>
>> Either explain why that fixes a real BUG (and why the magic constants
>> need to be what they are), or just revert the patch that caused the
>> problem, and go back to the allocation patterns that have years of
>> experience.
>>
>> Guys, we've had this discussion before, in PCI allocation. We don't do
>> this. We tried switching the PCI region allocations to top-down, and
>> IT WAS A FAILURE. We reverted it to what we had years of testing with.
>>
>> Don't just make random changes. There really are only two acceptable
>> models of development: "think and analyze" or "years and years of
>> testing on thousands of machines". Those two really do work.
>
> We did do the analyzing, and only difference seems to be:

No.

Yinghai, we have had this discussion before, and dammit, you need to
understand the difference between "understanding the problem" and "put
in random values until it works on one machine".

There was absolutely _zero_ analysis done. You do not actually
understand WHY the numbers matter. You just look at two random
numbers, and one works, the other does not. That's not "analyzing".
That's just "random number games".

If you cannot see and understand the difference between an actual
analytical solution where you _understand_ what the code is doing and
why, and "random numbers that happen to work on one machine", I don't
know what to tell you.

> good one is using 0x80000000
> and bad one is using 0xa0000000.
>
> We are trying to figure out whether it needs a low address and only
> happened to work because the kernel was doing bottom-up allocation.

No.

Let me repeat my point one more time.

You have TWO choices. Not more, not less:

 - choice #1: go back to the old allocation model. It's tested. It
doesn't regress. Admittedly we may not know exactly _why_ it works,
and it might not work on all machines, but it doesn't cause
regressions (ie the machines it doesn't work on it _never_ worked on).

   And this doesn't mean "old value for that _one_ machine". It means
"old value for _every_ machine". So it means we revert the whole
top-down thing entirely. Not just "change one random number so that
the totally different allocation pattern happens to give the same
result on one particular machine".

   Quite frankly, I don't see the point of doing top-to-bottom anyway,
so I think we should do this regardless. Just revert the whole
"allocate from top". It didn't work for PCI, it's not working for this
case either. Stop doing it.

 - Choice #2: understand exactly _what_ goes wrong, and fix it
analytically (ie by _understanding_ the problem, and being able to
solve it exactly, and in a way you can argue about without having to
resort to "magic happens").

Now, the whole analytic approach (aka "computer sciency" approach),
where you can actually think about the problem without having any
pesky "reality" impact the solution is obviously the one we tend to
prefer. Sadly, it's seldom the one we can use in reality when it comes
to things like resource allocation, since we end up starting off with
often buggy approximations of what the actual hardware is all about
(ie broken firmware tables).

So I'd love to know exactly why one random number works, and why
another one doesn't. But as long as we do _not_ know the "Why" of it,
we will have to revert.

It really is that simple. It's _always_ that simple.

So the numbers shouldn't be "magic", they should have real
explanations. And in the absence of real explanation, the model that
works is "this is what we've always done". Including, very much, the
whole allocation order. Not just one random number on one random
machine.

                        Linus
