From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Sander Eikelenboom <linux@eikelenboom.it>
Cc: xen-devel <xen-devel@lists.xensource.com>
Subject: Re: Load increase after memory upgrade (part2)
Date: Fri, 13 Jan 2012 10:13:07 -0500
Message-ID: <20120113151307.GC5025@phenom.dumpdata.com>
In-Reply-To: <1442969761.20120112230601@eikelenboom.it>

> >> > I also have done some experiments with the patch, in domU i also get the 0% full for my usb controllers with video grabbers , in dom0 my i get 12% full, both my realtek 8169 ethernet controllers seem to use the bounce buffering ...
> >> > And that with a iommu (amd) ? it all seems kind of strange, although it is also working ...
> >> > I'm not having much time now, hoping to get back with a full report soon.
> >> 
> >> Hm, so domU nothing, but dom0 it reports. Maybe the patch is incorrect
> >> when running as PV guest .. Will look in more details after the
> >> holidays. Thanks for being willing to try it out.
> 
> > Good news is I am able to reproduce this with my 32-bit NIC with 3.2 domU:
> 
> > [  771.896140] SWIOTLB is 11% full
> > [  776.896116] 0 [e1000 0000:00:00.0] bounce: from:222028(slow:0)to:2 map:222037 unmap:227220 sync:0
> > [  776.896126] 1 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:5188 map:5188 unmap:0 sync:0
> > [  776.896133] 3 [e1000 0000:00:00.0] bounce: from:0(slow:0)to:1 map:1 unmap:0 sync:0
> 
> > but interestingly enough, if I boot the guest as the first one I do not get these bounce
> > requests. I will shortly bootup a Xen-O-Linux kernel and see if I get these same
> > numbers.
> 
> 
> I started to expiriment some more with what i encountered.
> 
> On dom0 i was seeing that my r8169 ethernet controllers where using bounce buffering with the dump-swiotlb module.
> It was showing "12% full".
> Checking in sysfs shows:
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
> 32
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
> 32
> 
> If i remember correctly wasn't the allocation for dom0 changed to be to the top of memory instead of low .. somewhere between 2.6.32 and 3.0 ?

? We never actually had dom0 support in the upstream kernel until 2.6.37. The 2.6.32<->2.6.36 you are
referring to must have been the trees that I spun up - but the implementation of SWIOTLB in them
had not really changed.

> Could that change cause the need for all devices to need bounce buffering  and could it therefore explain some people seeing more cpu usage for dom0 ?

The issue I am seeing is not CPU usage in dom0, but rather the CPU usage in domU with guests.
And the older domUs (XenOLinux) do not have this.

That I can't understand - the implementation in both cases _looks_ to do the same thing.
There was one issue I found in the upstream one, but even with that fix I still
get that "bounce" usage in domU.

Interestingly enough, I get that only if I have launched, destroyed, launched, etc, the guest multiple
times. Which leads me to believe this is not a kernel issue but that we
have simply fragmented the Xen memory so much that when it launches the guest all of the
memory is above 4GB. But that seems counter-intuitive as by default Xen starts guests at the far end of
memory (so on my 16GB box it would stick a 4GB guest at 12GB->16GB roughly). The SWIOTLB
swizzles some memory under the 4GB, and this is where we get the bounce buffer effect
(as the memory from 4GB is then copied to the memory 12GB->16GB).

But it does not explain why on the first couple of starts I did not see this with pvops.
And it does not seem to happen with the XenOLinux kernel, so there must be something else
in here.

> 
> I have forced my r8169 to use 64bits dma mask (using use_dac=1)

Ah yes.
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat consistent_dma_mask_bits
> 32
> serveerstertje:/sys/bus/pci/devices/0000:09:00.0# cat dma_mask_bits
> 64
> 
> This results in dump-swiotlb reporting:
> 
> [ 1265.616106] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
> [ 1265.625043] SWIOTLB is 0% full
> [ 1270.626085] 0 [r8169 0000:08:00.0] bounce: from:6(slow:0)to:0 map:0 unmap:0 sync:12
> [ 1270.635024] SWIOTLB is 0% full
> [ 1275.635091] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10
> [ 1275.644261] SWIOTLB is 0% full
> [ 1280.654097] 0 [r8169 0000:09:00.0] bounce: from:5(slow:0)to:0 map:0 unmap:0 sync:10

Which is what we expect. No need to bounce since the PCI adapter can reach memory
above the 4GB mark.

> 
> 
> 
> So it has changed from 12% to 0%, although it still reports something about bouncing ? or am i mis interpreting stuff ?

The bouncing can happen due to two cases:
 - Memory is above 4GB
 - Memory crosses a page-boundary (rarely happens).
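
A minimal sketch of those two cases, in Python rather than kernel C. It is
not the kernel code: the function name, the frames_contiguous flag and the
4096-byte page size are assumptions made for illustration, and it reads the
page-boundary case as "the buffer spans pages whose machine frames are not
contiguous".

```python
PAGE_SIZE = 4096   # assumed page size
GB = 1 << 30


def needs_bounce(machine_addr, size, dma_mask_bits, frames_contiguous=True):
    """Return True if a buffer would have to go through the bounce pool.

    machine_addr      -- start of the buffer as the device would see it
    size              -- length in bytes
    dma_mask_bits     -- e.g. 32 or 64, as reported by sysfs dma_mask_bits
    frames_contiguous -- hypothetical flag: are the machine frames behind
                         the buffer contiguous?
    """
    dma_mask = (1 << dma_mask_bits) - 1
    last_byte = machine_addr + size - 1

    # Case 1: the device cannot address the end of the buffer
    # (with a 32-bit mask, anything reaching past 4GB).
    if last_byte > dma_mask:
        return True

    # Case 2: the buffer crosses a page boundary and the pages behind it
    # are not contiguous in machine memory.
    offset = machine_addr % PAGE_SIZE
    crosses_page = offset + size > PAGE_SIZE
    if crosses_page and not frames_contiguous:
        return True

    return False
```

With this, a 1500-byte buffer at 12GB needs a bounce for a device with a
32-bit mask (needs_bounce(12 * GB, 1500, 32)) but not for one with a 64-bit
mask.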
> 
> 
> Another thing i was wondering about, couldn't the hypervisor offer a small window in 32bit addressable mem to all (or only when pci passthrough is used) domU's to be used for DMA ?

It does. That is what the Xen SWIOTLB does with "swizzling" the pages in its pool.
But it can't do it for every part of memory. That is why there are DMA pools
which are used by graphics adapters, video capture devices, storage and network
drivers. They are used for small packet sizes so that the driver does not have
to allocate DMA buffers when it gets a 100-byte ping response. But for large
packets (say that ISO file you are downloading) it allocates memory on the fly
and "maps" it into the PCI space using the DMA API. That "mapping" sets up
a "physical memory" -> "guest memory" translation - and if that allocated
memory is above 4GB, part of this mapping is to copy ("bounce") the memory
under the 4GB (where XenSWIOTLB has allocated a pool), so that the adapter
can physically fetch/put the data. Once that is completed it is "sync"-ed
back, which is bouncing that data to the "allocated memory".
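
The map / copy / sync-back sequence described above can be sketched as a toy
model. This is only an illustration, not how SWIOTLB is implemented: the
BouncePool class, its slot layout and its counters are all made up here, and
plain Python bytearrays stand in for memory below and above 4GB.

```python
class BouncePool:
    """Toy stand-in for a pool of low (below 4GB) memory."""

    def __init__(self, nslots, slot_size=2048):
        self.slot_size = slot_size
        self.slots = [bytearray(slot_size) for _ in range(nslots)]
        self.free = list(range(nslots))
        self.to_device = 0    # copies down into the pool
        self.from_device = 0  # copies back up out of the pool

    def percent_full(self):
        used = len(self.slots) - len(self.free)
        return 100 * used // len(self.slots)

    def map(self, buf, to_device):
        """Reserve a slot; for transmit, copy the data down into it."""
        if len(buf) > self.slot_size:
            raise ValueError("buffer larger than one slot")
        slot = self.free.pop()  # raises IndexError if the pool is exhausted
        if to_device:
            self.slots[slot][:len(buf)] = buf
            self.to_device += 1
        return slot

    def unmap(self, slot, buf, from_device):
        """For receive, copy what the device wrote back up; free the slot."""
        if from_device:
            buf[:] = self.slots[slot][:len(buf)]
            self.from_device += 1
        self.free.append(slot)


pool = BouncePool(nslots=8)
rx = bytearray(64)                     # pretend this lives above 4GB
slot = pool.map(rx, to_device=False)   # device gets a low-memory slot
pool.slots[slot][:4] = b"ping"         # the "device" DMAs into low memory
pool.unmap(slot, rx, from_device=True) # data is bounced to the real buffer
```

The extra copy in map() or unmap() is the cost being discussed in this
thread: it is CPU work that a device able to reach the buffer directly would
not need.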

So having a DMA pool is very good - and most drivers use it. The thing I can't
figure out is:
 - why the DVB drivers do not seem to use it, even though they look to use the
   videobuf_dma driver.
 - why the XenOLinux does not seem to have this problem (and this might be false -
   perhaps it does have this problem and it just takes a couple of guest launches,
   destructions, starts, etc to actually see it).
 - are there any flags in the domain builder to say: "ok, this domain is going to
   service 32-bit cards, hence build the memory from 0->4GB". This seems like
   a good knob at first, but it probably is a bad idea (imagine using it by mistake
   on every guest). And also nowadays most cards are PCIe and they can do 64-bit, so
   it would not be that important in the future.
> 
> (oh yes, i haven't got i clue what i'm talking about ... so it probably make no sense at all :-) )

Nonsense. You were on the correct path. Hopefully the level of detail hasn't
scared you off now :-)

> 
> 
> --
> Sander
> 
> 
