From mboxrd@z Thu Jan 1 00:00:00 1970
From: Greg Kroah-Hartman
Subject: Re: [PATCH] vt_buffer: drop console buffer copying optimisations
Date: Thu, 29 Jan 2015 15:57:42 -0800
Message-ID: <20150129235742.GB14741__8770.91366933218$1422580867$gmane$org@kroah.com>
References: <1422504685-7864-1-git-send-email-airlied@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Errors-To: dri-devel-bounces@lists.sourceforge.net
To: Linus Torvalds
Cc: Dave Airlie, Tomi Valkeinen, Linux Kernel Mailing List, dri-devel@lists.sf.net
List-Id: dri-devel@lists.freedesktop.org

On Thu, Jan 29, 2015 at 03:40:33PM -0800, Linus Torvalds wrote:
> On Wed, Jan 28, 2015 at 8:11 PM, Dave Airlie wrote:
> >
> > Linus, this came up a while back I finally got some confirmation
> > that it fixes those servers.
>
> I'm certainly ok with this. which way should it go in? The users are:
>
>  - drivers/tty/vt/vt.c (Greg KH, "tty layer")
>
>  - drivers/video/console/* (fbcon people: Tomi Valkeinen and friends)
>
> and it might make sense to have *some* indication of how much worse
> this makes fbcon performance in particular..
>
> Greg/Tomi - the patch is removing this:
>
>     #define scr_memcpyw(d, s, c) memcpy(d, s, c)
>     #define scr_memmovew(d, s, c) memmove(d, s, c)
>     #define VT_BUF_HAVE_MEMCPYW
>     #define VT_BUF_HAVE_MEMMOVEW
>
> from , because some stupid graphics cards
> apparently cannot handle 64-bit accesses of regular memcpy/memmove.
>
> And on other setups, this will be the reverse: 8-bit accesses due to
> using "rep movsb", which is the fast way to move/clear memory on
> modern Intel CPU's, but is really wrong for MMIO where it will be slow
> as hell.
>
> So just getting rid of the memcpy/memmove is likely the right thing in
> general, since the fallbacks go this the traditional 16-bit-at-a-time
> way.
> And getting rid of the memcpy _may_ speed things up.
>
> But if it slows things down, we might have to try something else. Like
> saying "all cards we've ever seen have been ok with aligned 32-bit
> accesses", and extend the open-coded scr_memcpy/memmove functions to
> do that.
>
> Hmm?

I can take this through the tty tree, but can I put it in linux-next and
wait for the 3.20 merge window to give people who might notice a
slow-down a chance to object?

thanks,

greg k-h
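
For readers following along, the "16-bit-at-a-time" fallback mentioned in
the quoted text can be sketched in plain userspace C roughly as below. This
is only an illustration under stated assumptions, not the kernel's code:
copy_words16 and move_words16 are hypothetical names, the count is assumed
to be in bytes (as in the memcpy-based macros above), and volatile pointers
stand in for whatever MMIO accessors a real console driver would use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy `count` bytes one 16-bit cell at a time.
 * The volatile qualifiers keep the compiler from merging the accesses
 * into wider loads/stores or replacing the loop with memcpy(). */
static void copy_words16(volatile uint16_t *dst,
                         const volatile uint16_t *src, size_t count)
{
    count /= 2;                 /* bytes -> 16-bit cells */
    while (count--)
        *dst++ = *src++;
}

/* Hypothetical overlap-safe variant: picks the copy direction the way
 * memmove() does, but still touches memory 16 bits at a time. */
static void move_words16(volatile uint16_t *dst,
                         const volatile uint16_t *src, size_t count)
{
    count /= 2;                 /* bytes -> 16-bit cells */
    if (dst < src) {
        while (count--)         /* forward copy */
            *dst++ = *src++;
    } else {
        dst += count;           /* backward copy, starting from the end */
        src += count;
        while (count--)
            *--dst = *--src;
    }
}
```

The point of the sketch is only that every access has a fixed, known width,
which is what cards that dislike 64-bit or byte-wide accesses need; whether
that is faster or slower than memcpy on a given setup is the open question
in this thread.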