From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751800AbcDBFdr (ORCPT ); Sat, 2 Apr 2016 01:33:47 -0400 Received: from mx2.suse.de ([195.135.220.15]:48968 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750940AbcDBFdp (ORCPT ); Sat, 2 Apr 2016 01:33:45 -0400 Date: Sat, 02 Apr 2016 07:33:40 +0200 Message-ID: From: Takashi Iwai To: "Luis R. Rodriguez" Cc: Stefano Stabellini , Andrew Cooper , Stephen Hemminger , Jim Gettys , Dario Faggioli , George.Dunlap@eu.citrix.com, Ingo Molnar , Thomas Gleixner , Steven Rostedt , Mel Gorman , Andy Lutomirski , Konstantin Ozerkov , "linux-kernel@vger.kernel.org" , ALSA development , Matt Fleming , Borislav Petkov , xen-devel , Joerg Roedel Subject: Re: Getting rid of inside_vm in intel8x0 In-Reply-To: <20160401222831.GX1990@wotan.suse.de> References: <20160331222618.GE1990@wotan.suse.de> <20160401222831.GX1990@wotan.suse.de> User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL/10.8 Emacs/24.5 (x86_64-suse-linux-gnu) MULE/6.0 (HANACHIRUSATO) MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, 02 Apr 2016 00:28:31 +0200, Luis R. Rodriguez wrote: > > On Fri, Apr 01, 2016 at 07:34:10AM +0200, Takashi Iwai wrote: > > On Fri, 01 Apr 2016 00:26:18 +0200, > > Luis R. Rodriguez wrote: > > > > > > On Wed, Mar 30, 2016 at 08:07:04AM +0200, Takashi Iwai wrote: > > > > On Tue, 29 Mar 2016 23:37:32 +0200, > > > > Andy Lutomirski wrote: > > > > > > > > > > Would it be possible to revert: > > > > > > > > > > commit 228cf79376f13b98f2e1ac10586311312757675c > > > > > Author: Konstantin Ozerkov > > > > > Date: Wed Oct 26 19:11:01 2011 +0400 > > > > > > > > > > ALSA: intel8x0: Improve performance in virtual environment > > > > > > > > > > Presumably one or more of the following is true: > > > > > > > > > > a) The inside_vm == true case is just an optimization and should apply > > > > > unconditionally. > > > > > > > > > > b) The inside_vm == true case is incorrect and should be fixed or disabled. > > > > > > > > > > c) The inside_vm == true case is a special case that makes sense then > > > > > IO is very very slow but doesn't make sense when IO is fast. If so, > > > > > why not literally measure the time that the IO takes and switch over > > > > > to the "inside VM" path when IO is slow? > > > > > > BTW can we simulate this on bare metal by throttling an IO bus, or > > > perhaps mucking with the scheduler ? > > > > > > I ask as I wonder if similar type of optimization may be useful > > > to first simulate with other types of buses for other IO devices > > > we might use in virtualization environments. If so, I'd be curious > > > to know if similar type of optimizations might be possible for > > > other sounds cards, or other IO devices. > > > > There aren't so many sound devices requiring such a workaround. > > Why not, what makes this special? The hardware buggyness. > > > > More important condition is rather that the register updates of CIV > > > > and PICB are atomic. > > > > > > To help with this can you perhaps elaborate a bit more on what the code > > > does? As I read it snd_intel8x0_pcm_pointer() gets a pointer to some > > > sort of audio frame we are in and uses two values to see if we are > > > going to be evaluating the right frame, we use an optimization of > > > some sort to skip one check for virtual environments. We seem to need > > > this given that on a virtual environment it is assumed that the sound > > > card is emulated, and as such an IO read there is rather expensive. > > > > > > Can you confirm and/or elaborate a bit more what this does ? > > > > > > To try to help understand what is going on can you describe what CIV > > > and PICB are exactly ? > > > > CIV and PICB registers are a pair and we calculate the linear position > > in a ring buffer from both two. > > What type of ring buffer is this ? A normal PCM ring buffer via PCI DMA transfer. > > However, they are divorced sometimes > > under stress, and the position calculated from such values may go > > backward wrongly. For avoiding it, there is the second read of the > > PICB register and compare with the previous value, and loop until it > > matches. This check is skipped on VM. > > I see. Is this a software emulation *bug*, or an IO issue due to > virtualization? I tried to read what the pointer() (that's what its called) > callback does but since there is no documentation for any of the callbacks I > have no clue what so ever. > > If the former, could a we somehow detect an emulated device other than through > this type of check ? Or could we *add* a capability of some sort to detect it > on the driver ? This would not address the removal, but it could mean finding a > way to address emulation issues. > > If its an IO issue -- exactly what is the causing the delays in IO ? Luis, there is no problem about emulation itself. It's rather an optimization to lighten the host side load, as I/O access on a VM is much heavier. > > > > This is satisfied mostly only on VM, and can't > > > > be measured easily unlike the IO read speed. > > > > > > Interesting, note the original patch claimed it was for KVM and > > > Parallels hypervisor only, but since the code uses: > > > > > > +#if defined(__i386__) || defined(__x86_64__) > > > + inside_vm = inside_vm || boot_cpu_has(X86_FEATURE_HYPERVISOR); > > > +#endif > > > > > > This makes it apply also to Xen as well, this makes this hack more > > > broad, but does is it only applicable when an emulated device is > > > used ? What about if a hypervisor is used and PCI passthrough is > > > used ? > > > > A good question. Xen was added there at the time from positive > > results by quick tests, but it might show an issue if it's running on > > a very old chip with PCI passthrough. But I'm not sure whether PCI > > passthrough would work on such old chipsets at all. > > If it did have an issue then that would have to be special cased, that > is the module parameter would not need to be enabled for such type of > systems, and heuristics would be needed. As you note, fortunately this > may not be common though... Actually this *is* module parametered. If set to a boolean value, it can be applied / skipped forcibly. So, if there has been a problem on Xen, this should have been reported. That's why I wrote it's no common case. This comes from the real experience. > but if this type of work around may be > taken as a precedent to enable other types of hacks in other drivers > I'm very fearful of more hacks later needing these considerations as > well. > > > > > > There are a pile of nonsensical "are we in a VM" checks of various > > > > > sorts scattered throughout the kernel, they're all a mess to maintain > > > > > (there are lots of kinds of VMs in the world, and Linux may not even > > > > > know it's a guest), and, in most cases, it appears that the correct > > > > > solution is to delete the checks. I just removed a nasty one in the > > > > > x86_32 entry asm, and this one is written in C so it should be a piece > > > > > of cake :) > > > > > > > > This cake looks sweet, but a worm is hidden behind the cream. > > > > The loop in the code itself is already a kludge for the buggy hardware > > > > where the inconsistent read happens not so often (only at the boundary > > > > and in a racy way). It would be nice if we can have a more reliably > > > > way to know the hardware buggyness, but it's difficult, > > > > unsurprisingly. > > > > > > The concern here is setting precedents for VM cases sprinkled in the kernel. > > > The assumption here is such special cases are really paper'ing over another > > > type of issue, so its best to ultimately try to root cause the issue in > > > a more generalized fashion. > > > > Well, it's rather bare metal that shows the buggy behavior, thus we > > need to paper over it. In that sense, it's other way round; we don't > > tune for VM. The VM check we're discussing is rather for skipping the > > strange workaround. > > What is it exactly about a VM that enables this work around to be skipped? > I don't quite get it yet. VM -- at least the full one with the sound hardware emulation -- doesn't have the hardware bug. So, the check isn't needed. > > You may ask whether we can reduce the whole workaround instead. It's > > practically impossible. We don't know which models doing so and which > > not. And, the hardware in question are (literally) thousands of > > variants of damn old PC mobos. Any fundamental change needs to be > > verified on all these machines... > > What if we can come up with algorithm on the ring buffer that would > satisfy both cases without special casing it ? Is removing this VM > check impossible really? Yes, it's impossible practically, see my comment above. Whatever you change, you need to verify it on real machines. And it's very difficult to achieve. thanks, Takashi