From: "G.R." <firemeteor@users.sourceforge.net>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: Possible bug? DOM-U network stopped working after fatal error reported in DOM0
Date: Wed, 5 Jan 2022 00:05:39 +0800
Message-ID: <CAKhsbWbrvF6M-SAocACO5NvBaitUQ9mB5Qx+fMGtn_yVu0ZvEA@mail.gmail.com>
In-Reply-To: <YdQgf2+E467kuTxK@Air-de-Roger>

> > > > But it seems like this patch is not stable enough yet and has its own
> > > > issue -- memory is not properly released?
> > >
> > > I know. I've been working on improving it this morning and I'm
> > > attaching an updated version below.
> > >
> > Good news.
> > With this new patch, the NAS domU can serve the iSCSI disk without an OOM
> > panic, at least for a little while.
> > I'm going to keep it up and running for a while to see if it's stable over time.
>
> Thanks again for all the testing. Do you see any difference
> performance-wise?
I'm still on a *debug* kernel build to capture any potential panic --
none so far -- so no performance testing yet.
Since I'm a home user with a relatively lightweight workload, so far I
haven't observed any difference in daily usage.

I did some quick iperf3 testing just now.
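For reference, the runs were plain client / server tests roughly along
these lines (a sketch from memory -- the exact flags may have differed,
and <domU-ip> is a placeholder); I swapped the two roles to test each
direction:

    # on the side acting as the server (e.g. the NAS domU)
    iperf3 -s
    # on the side acting as the client (e.g. dom0), default 10-second TCP run
    iperf3 -c <domU-ip> -t 10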
1. Between the NAS domU <=> Linux dom0, on an old i7-3770-based box.
The peak is roughly 12 Gbits/s when domU is the server.
But I do see a regression down to ~8.5 Gbits/s when I repeat the test in
short bursts.
The throughput recovers after I leave the system idle for a while.

When dom0 is the iperf3 server, the transfer rate is much lower, all the
way down to 1.x Gbits/s.
Sometimes I can see the following kernel log repeating during the test,
likely contributing to the slowdown (see the note after the second test
scenario below):
             interrupt storm detected on "irq2328:"; throttling interrupt source
Another thing that looks alarming is that the retransmission count is high:
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   212 MBytes  1.78 Gbits/sec  110    231 KBytes
[  5]   1.00-2.00   sec   230 MBytes  1.92 Gbits/sec    1    439 KBytes
[  5]   2.00-3.00   sec   228 MBytes  1.92 Gbits/sec    3    335 KBytes
[  5]   3.00-4.00   sec   204 MBytes  1.71 Gbits/sec    1    486 KBytes
[  5]   4.00-5.00   sec   201 MBytes  1.69 Gbits/sec  812    258 KBytes
[  5]   5.00-6.00   sec   179 MBytes  1.51 Gbits/sec    1    372 KBytes
[  5]   6.00-7.00   sec  50.5 MBytes   423 Mbits/sec    2    154 KBytes
[  5]   7.00-8.00   sec   194 MBytes  1.63 Gbits/sec  339    172 KBytes
[  5]   8.00-9.00   sec   156 MBytes  1.30 Gbits/sec  854    215 KBytes
[  5]   9.00-10.00  sec   143 MBytes  1.20 Gbits/sec  997   93.8 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.76 GBytes  1.51 Gbits/sec  3120             sender
[  5]   0.00-10.45  sec  1.76 GBytes  1.44 Gbits/sec                  receiver

2. Between a remote box <=> the NAS domU, through a 1 Gbps Ethernet cable.
It roughly saturates the link when domU is the server, without an obvious
perf drop.
When domU runs as the client, the achieved bandwidth is ~30 Mbps lower
than the peak.
Retransmissions sometimes also show up in this scenario, more seriously
when domU is the client.
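
A note on the "interrupt storm detected" message above: assuming it comes
from the FreeBSD side (the NAS domU), my understanding is that this is the
kernel's interrupt-storm protection throttling that interrupt source
(presumably the netfront event channel). If it helps the investigation, the
detection threshold can be inspected and, purely as an experiment, raised or
disabled via sysctl:

    # current threshold, in interrupts per second
    sysctl hw.intr_storm_threshold
    # raise it, or set it to 0 to disable storm detection (experiment only)
    sysctl hw.intr_storm_threshold=0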

I cannot immediately test with the stock kernel or with your patch in a
release-mode build.
But judging from the observed imbalance between the inbound and outbound
paths, I guess a non-trivial penalty applies?
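
One extra data point I could collect without switching kernels is to repeat
the dom0-as-server case with iperf3's reverse mode, which flips the traffic
direction without swapping the server and client roles (a sketch; <domU-ip>
is again a placeholder):

    # from dom0, with the iperf3 server still running inside the domU
    iperf3 -c <domU-ip> -R

If the low number follows the traffic direction rather than the server
placement, that would point at the domU transmit path.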

> > BTW, an unrelated question:
> > What's the current status of HVM domU on top of a storage driver domain?
> > About 7 years ago, one user on the list was able to get this setup up
> > and running with your help (a patch). [1]
> > When I attempted to reproduce a similar setup two years later, I
> > discovered that the patch had not been submitted.
> > And even with that patch the setup could not be reproduced successfully.
> > We spent some time debugging the problem together [2], but didn't get to
> > the bottom of the root cause at that time.
> > In case it's still broken and you still have the interest and time, I
> > can launch a separate thread on this topic and provide the required
> > testing environment.
>
> Yes, better as a new thread please.
>
> FWIW, I haven't looked at this in a long time, but I recall some
> fixes in order to be able to use driver domains with HVM guests, which
> require attaching the disk to dom0 in order for the device model
> (QEMU) to access it.
>
> I would give it a try without using stubdomains and see what you get.
> You will need to run `xl devd` inside of the driver domain, so you
> will need to install xen-tools on the domU. There's an init script to
> launch `xl devd` at boot, it's called 'xendriverdomain'.
Looks like I'm unlucky once again. Let's follow up in a separate thread.
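
For the new thread, the setup I take from your description is roughly the
following sketch (the 'nas' backend domain name and the zvol path are just
placeholders, and whether the 'xendriverdomain' rc script is available on my
FreeBSD-based domU is something I still need to check):

    # inside the driver domain, with xen-tools installed:
    service xendriverdomain start    # or run the backend daemon by hand:
    xl devd

    # in the HVM guest config, point the disk at the driver domain:
    disk = [ 'format=raw, vdev=xvda, backend=nas, target=/dev/zvol/tank/guest-disk' ]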

> Thanks, Roger.


Thread overview: 33+ messages
2021-12-18 18:35 Possible bug? DOM-U network stopped working after fatal error reported in DOM0 G.R.
2021-12-19  6:10 ` Juergen Gross
2021-12-19 17:31 ` G.R.
2021-12-20 17:13   ` G.R.
2021-12-21 13:50     ` Roger Pau Monné
2021-12-21 18:19       ` G.R.
2021-12-21 19:12         ` Roger Pau Monné
2021-12-23 15:49           ` G.R.
2021-12-24 11:24             ` Roger Pau Monné
2021-12-25 16:39               ` G.R.
2021-12-25 18:06                 ` G.R.
2021-12-27 19:04                   ` Roger Pau Monné
     [not found]                     ` <CAKhsbWY5=vENgwgq3NV44KSZQgpOPY=33CMSZo=jweAcRDjBwg@mail.gmail.com>
2021-12-29  8:32                       ` Roger Pau Monné
2021-12-29  9:13                         ` G.R.
2021-12-29 10:27                           ` Roger Pau Monné
2021-12-29 19:07                             ` Roger Pau Monné
2021-12-30 15:12                               ` G.R.
2021-12-30 18:51                                 ` Roger Pau Monné
2021-12-31 14:47                                   ` G.R.
2022-01-04 10:25                                     ` Roger Pau Monné
2022-01-04 16:05                                       ` G.R. [this message]
2022-01-05 14:33                                         ` Roger Pau Monné
2022-01-07 17:14                                           ` G.R.
2022-01-10 14:53                                             ` Roger Pau Monné
2022-01-11 14:24                                               ` G.R.
2022-10-30 16:36                                               ` G.R.
2022-11-03  6:58                                                 ` Paul Leiber
2022-11-03 12:22                                                   ` Roger Pau Monné
2022-12-14  6:16                                                     ` G.R.
2024-01-09 11:13                                                       ` Niklas Hallqvist
2024-01-09 13:53                                                         ` Roger Pau Monné
2024-01-19 15:51                                                           ` G.R.
2021-12-20 13:51 ` Roger Pau Monné
