From: ard.biesheuvel@linaro.org (Ard Biesheuvel)
To: linux-arm-kernel@lists.infradead.org
Subject: Call for testing/opinions: Optimized memset/memcpy
Date: Sun, 14 Jul 2013 16:09:20 +0200
Message-ID: <CAKv+Gu8JcQUH6d1mWh+sGcMXU6rPW7yk8P16PgqnF1Zdfn3OkA@mail.gmail.com>
In-Reply-To: <loom.20130714T152342-120@post.gmane.org>

On 14 July 2013 15:33, Harm Hanemaaijer <fgenfb@yahoo.com> wrote:
> Ard Biesheuvel <ard.biesheuvel <at> linaro.org> writes:
>
>>
>> You will clobber the userland NEON contents of the register file if
>> you don't preserve them properly. Also, kernel preemption (if enabled)
>> may put your task to sleep at any time, and the context switching
>> machinery is totally oblivious of NEON being used in the kernel, so
>> the kernel side will get corrupted as well in this case.
>>
>> I have a patch series pending (i.e., accepted but not pulled yet by
>> Russell) which addresses these issues.
>>
>
> That was what I was afraid of concerning NEON. It must be tricky to solve
> without sacrificing performance, since saving/restoring the entire NEON
> register file would obviously seriously impact context switch performance.
> For memcpy-like applications, basically only four dword registers are
> required (d0-d3) which could possibly be optimized for.
>

Well, the whole lazy preserve/restore mechanism is based on the
premise that preserve/restore is only required when multiple users are
contending for the NEON (or in the SMP case, when a task gets migrated
to another CPU). As we will not be allowing NEON in interrupt context
nor in a preemptible section, the burden of the more costly context
switches should not grow disproportionately, even if tasks may be
contending for the NEON with themselves in a way (userland vs kernel).
However, it also means that a NEON based memcpy() is going to be
problematic, not only for the reasons pointed out by Russell, but also
because you will need a fallback to use from interrupt context.
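
To illustrate the lazy preserve/restore idea described above, here is a
minimal userland model, not the kernel's actual code. All names
(task_ctx, neon_unit, neon_acquire, spills) are hypothetical and exist
only for this sketch. It assumes a single NEON unit with 32 doubleword
registers (d0-d31) and shows that state is only copied when a different
user claims the unit.

```c
#include <assert.h>
#include <string.h>

#define NREGS    32     /* d0-d31 */
#define NO_OWNER (-1)

/* Hypothetical model: one saved register image per task. */
struct task_ctx {
	unsigned long long saved[NREGS];
};

/* Hypothetical model of one CPU's NEON unit. */
struct neon_unit {
	unsigned long long regs[NREGS];	/* live register file */
	int owner;			/* task whose state is live, or NO_OWNER */
	unsigned int spills;		/* number of preserve/restore operations */
};

/* Give 'task' the unit; copy state only if someone else holds it. */
static void neon_acquire(struct neon_unit *u, struct task_ctx *tasks, int task)
{
	assert(task >= 0);

	if (u->owner == task)
		return;		/* state is already live: no cost */

	if (u->owner != NO_OWNER)	/* preserve the previous user's state */
		memcpy(tasks[u->owner].saved, u->regs, sizeof(u->regs));

	/* restore the new user's state */
	memcpy(u->regs, tasks[task].saved, sizeof(u->regs));
	u->owner = task;
	u->spills++;
}
```

In this model a task that keeps the unit to itself pays nothing on
repeated use; the cost only appears when users alternate.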

Perhaps for sufficiently large sizes, it makes sense to take the hit
of testing whether NEON is allowable at that particular moment, and
doing the preserve in that case. In the end, the numbers should speak
for themselves: if you manage a considerable speedup in a real-world
case, and no deterioration in others, people are usually quite
receptive.
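
As a rough sketch of what such a size-gated check could look like, here
is a userland model. The threshold NEON_MIN_BYTES, the exec_ctx
structure and sketch_memcpy() are all hypothetical names made up for
this example, the threshold value is arbitrary, and plain memcpy()
stands in for both the integer and the NEON copy loops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NEON_MIN_BYTES 512	/* hypothetical cut-off, would need measuring */

enum copy_path { PATH_SCALAR, PATH_NEON };

/* Hypothetical stand-in for the kernel's execution context. */
struct exec_ctx {
	bool in_interrupt;
	int preempt_count;	/* > 0 means preemption is disabled */
};

static enum copy_path sketch_memcpy(struct exec_ctx *ctx, void *dst,
				    const void *src, size_t n)
{
	assert(dst && src);

	/* Small copies and interrupt context take the fallback path. */
	if (n < NEON_MIN_BYTES || ctx->in_interrupt) {
		memcpy(dst, src, n);	/* stands in for the integer copy loop */
		return PATH_SCALAR;
	}

	ctx->preempt_count++;	/* no preemption while NEON registers are live */
	/* A real version would preserve the userland NEON state here. */
	memcpy(dst, src, n);	/* stands in for the NEON copy loop */
	ctx->preempt_count--;

	return PATH_NEON;
}
```

The check itself costs a couple of branches per call, which is why it
only pays off above some size; where that size lies is exactly what
the benchmarks would have to show.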

-- 
Ard.
