From: Ian Pratt <Ian.Pratt@cl.cam.ac.uk>
To: Andi Kleen <ak@suse.de>
Cc: Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, akpm@osdl.org,
	Ian Pratt <Ian.Pratt@cl.cam.ac.uk>,
	Steven.Hand@cl.cam.ac.uk, Christian.Limpach@cl.cam.ac.uk,
	Keir.Fraser@cl.cam.ac.uk, Ian.Pratt@cl.cam.ac.uk
Subject: Re: arch/xen is a bad idea
Date: Tue, 14 Dec 2004 22:40:20 +0000
Message-ID: <E1CeLLB-0000Sl-00@mta1.cl.cam.ac.uk>
In-Reply-To: Your message of "14 Dec 2004 19:59:50 +0100." <p73acsg1za1.fsf@bragg.suse.de>

> > Stunned silence I guess - merging an architecture is
> > usually much more controversial ;)
> 
> In my opinion it's still an extremely bad idea to have arch/xen
> as its own architecture. It will cause a lot of work long term
> to maintain it, especially when it gets x86-64 support too.
> It would be much better to just merge it with i386/x86-64.

Andi, I totally agree that merging into i386 could be a long-term
goal. However, it's just not feasible right now. The changes
required are way too intrusive. We put considerable effort into
investigating this approach, but came to the conclusion that with
the current structure of arch i386 it was going to be way too
messy. 

I really think the best approach is to get arch xen into
mainstream Linux, and then work toward integrating i386, x86_64
and xen. From our point of view, the first stage of this is to
increase the number of files that are shared unmodified between
i386 and xen/i386 (i.e. linked from xen into i386). There are
already many such files, but with a few relatively simple changes
to i386 we could get quite a few more.
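
For illustration only (this is not the actual Xen tooling), the
linking scheme could be sketched roughly as below. The function
name and directory arguments are hypothetical; the sketch assumes
that any file already present in the xen tree is a Xen-modified
copy, and that everything else can be shared via a symlink.

```python
import os


def link_shared_files(i386_dir, xen_dir):
    """Symlink into xen_dir every file under i386_dir that xen_dir lacks.

    Files already present in xen_dir are treated as Xen-modified copies
    and left alone. Returns the sorted relative paths that were linked.
    Both directory arguments are hypothetical, for illustration only.
    """
    linked = []
    for root, _dirs, files in os.walk(i386_dir):
        rel_root = os.path.relpath(root, i386_dir)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            dst = os.path.join(xen_dir, rel)
            if os.path.lexists(dst):
                continue  # xen carries its own (modified) version
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            src = os.path.join(root, name)
            # Use a relative link target so the tree can be relocated.
            os.symlink(os.path.relpath(src, os.path.dirname(dst)), dst)
            linked.append(rel)
    return sorted(linked)
```

The length of the returned list is then a crude measure of how
many files are shared unmodified between the two trees.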

> Currently it's already difficult enough to get people to
> add fixes to both i386 and x86-64, adding fixes to three
> or rather four (xen32 and xen64) architectures will be quite bad.
> In practice we'll likely get much worse code drift and missing
> fixes. Also I still suspect Ian is underestimating how much
> work it is long term to keep a Linux architecture up to date.

We're actually very well set up to handle this, having been doing
it for some time. Whenever Linus issues a new mainstream patch,
we have a script that rewrites the patch to duplicate the hunks
that apply to i386 such that they also hit files that we've
modified in xen/i386. This way we keep arch xen/i386 in perfect
sync with i386. It takes discipline, but we're pretty good at it
now.
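
A minimal sketch of what such a patch-rewriting step might look
like follows. It is not the actual script: the path prefixes, the
function name, and the assumption that each per-file section of
the patch starts with a line beginning "diff " are all
illustrative.

```python
# Assumed (hypothetical) path prefixes; the real tree layout may differ.
SRC_PREFIX = "arch/i386/"
DST_PREFIX = "arch/xen/i386/"


def duplicate_i386_hunks(patch_text, modified_in_xen):
    """Return patch_text with extra per-file sections aimed at xen/i386.

    modified_in_xen is a set of paths, relative to SRC_PREFIX, for which
    the xen tree carries its own modified copy. Sections touching other
    files are passed through once, unchanged.
    """
    # Split the patch into per-file sections at each "diff " line.
    sections, current = [], []
    for line in patch_text.splitlines(keepends=True):
        if line.startswith("diff ") and current:
            sections.append(current)
            current = []
        current.append(line)
    if current:
        sections.append(current)

    out = []
    for section in sections:
        out.extend(section)  # always keep the original section

        # Header lines are everything before the first "@@" hunk marker.
        header_len = len(section)
        for i, line in enumerate(section):
            if line.startswith("@@"):
                header_len = i
                break

        # Find the target file from the "+++ " header line, if any.
        target = None
        for line in section[:header_len]:
            if line.startswith("+++ "):
                parts = line[4:].split()
                target = parts[0] if parts else None
                break
        if target is None or SRC_PREFIX not in target:
            continue
        rel = target[target.find(SRC_PREFIX) + len(SRC_PREFIX):]
        if rel not in modified_in_xen:
            continue

        # Emit a copy whose headers point at the xen tree; hunks unchanged.
        for i, line in enumerate(section):
            if i < header_len:
                line = line.replace(SRC_PREFIX, DST_PREFIX)
            out.append(line)
    return "".join(out)
```

The duplicated hunks may of course fail to apply cleanly to the
modified copy, which is where the discipline comes in.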

> I cannot imagine the virtualization hooks are intrusive anyways. The
> only things it needs to hook idle and the page table updates,
> right?

It's rather more complicated than that if you want a clean
interface that gives good performance. We've taken a very
benchmark-driven approach to minimise the overhead of
virtualization. The aim is to have such a small overhead that
people are happy running virtualized the whole time. I think this
is a really important aim.

> Doing that cleanly in the existing architectures shouldn't be that
> hard.

I've appended a list of some of the areas we need to modify. I
think you may have underestimated what needs to happen.

> I suspect xen64 will be rather different from xen32 anyways
> because as far as I can see the tricks Xen32 uses to be
> fast (segment limits) just plain don't work on 64bit
> because the segments don't extend into 64bit space.
> So having both in one architecture may also end up messy.

We have subdirectories for the i386 and x86_64 specific files,
along with a common directory for stuff which is shared between
the two, e.g. virtual interrupt control.
 
> Also the other thing I'm worried about is that there is no clear
> specification on how the Xen<->Linux interface works.

We have an interface manual in the Xen bk repo, but I acknowledge
that we haven't always been totally prompt in keeping it up to
date and fully detailed. Now that the Xen2 interface is frozen,
that should be fixable. Even so, it hasn't prevented other
independent groups porting OSes to Xen, e.g. NetBSD, FreeBSD, Plan9.

Ian

Differences between i386 and xen/i386 files:
- irq setup
- pci bus scanning
- gdt/ldt must be in dedicated read-only pages
- gdt/ldt install/updates
- debug register updates
- pte quicklist/cache
- pmd/pgd are read-only pages
- highmem pte mapping
- dma memory allocation
- idle loop
- different fixmap
- *_to_* macros differ (like page_to_phys)
- *_val macros (pte_val, pmd_val)
- inb/outb
- switching cr3
- cr4 updates
- a few cpu flags need to be cleared
- msr access
- wbinvd call
- mtrr access
- io permission handling
- ioremap 
- access to hardware memory
- interrupt enabling/disabling
- setup of trap gates
- tlb flush
- fpu stack switches
- user/kernel mode test
- different segment selectors since the kernel runs in ring1
- pagefault handler stack layout is different
- startup is in 32-bit mode
- start of day initialization is different
- start of day memory probing and pagetable setup
- trap/fault handling
- timer is virtualized
- smp addtl. cpu startup
- smp ipi's



