All of lore.kernel.org
 help / color / mirror / Atom feed
From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH V2 0/3] PTE fixes for arm and arm64
Date: Wed, 26 Mar 2014 13:48:03 +0000	[thread overview]
Message-ID: <20140326134803.GA7528@n2100.arm.linux.org.uk> (raw)
In-Reply-To: <20140326132318.GA30750@linaro.org>

On Wed, Mar 26, 2014 at 01:23:19PM +0000, Steve Capper wrote:
> On Wed, Mar 26, 2014 at 11:01:41AM +0000, Russell King - ARM Linux wrote:
> > On Wed, Mar 26, 2014 at 10:23:19AM +0000, Steve Capper wrote:
> > > If there are no objections, I was going to put the following into
> > > Russell's patch system:
> > > 	arm: mm: Double logical invert for pte accessors
> > > 	arm: mm: Switch back to L_PTE_WRITE
> > 
> > I'm not all that happy with double inversions - I think they just serve
> > to cause confusion (and it was confusing, which is why I removed it.)
> > I'll only take them if you have a really good reason why you want to
> > bring it back.
> 
> Hi Russell,
> The problem I'm trying to solve is for LPAE, where we have flags in the
> upper 32 bits of a page table entry that are tested for with a bitwise
> and, then subsequently downcast by a store to 32-bit integer:
> 
> 	gather_stats(page, md, pte_dirty(*pte), 1);
> and,
> 
> 	static inline unsigned long huge_pte_write(pte_t pte)
> 	{
> 		return pte_write(pte);
> 	}
> 
> (and other cases that may arise in future).

I think I have already said that these cases should be dealt with by
ensuring that they return sensible values in such cases.  The official
return type for pte_write() and pte_dirty() if they aren't a macro is
"int", and that makes a 64-bit AND operation returning a bit set in
the high 32-bits incorrect behaviour.

So, the return value from all these functions must fit within "int" and
be of the appropriate true/false indication according to C rules depending
on the test.

While we can use the shortcut of doing a 32-bit AND to test a bit in the
32-bit case, we can't use this with LPAE nor 64-bit PTEs where "int" is
not 64-bit - in that case, these functions must adjust the value
appropriately.

> I had tried to create a helper macro, pte_isset, but this didn't attract
> any positive comments:
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-February/235380.html

It imposes overhead for _everyone_ whether they need that overhead or not.

The problem with !!value is that the compiler has to generate more code
to convert "value" into a one-or-zero in /every/ case, because by doing
that, you've told the compiler not "I want a true/false" value but "I
want a one or zero value".  So, what you end up with in the 32-bit case
is:

	load pte
	test pte bit
	set another register to 0 if test was zero
	set register to 1 if test was non-zero
	test register for zero or non-zero
	... do something ...

which is rather inefficient when you're doing that lots of times.  As I
say, we only need this if the bit being tested is not representable
within 32-bit.

So, rather than:

+#define pte_isset(pte, val)    (!!(pte_val(pte) & (val)))

maybe:

#define pte_isset(pte, val) ((u32)(val) == (val) ? pte_val(pte) & (val) : !!(pte_val(pte) & (val)))

What this says is, if the bit fits within 32-bit, use our existing logic,
otherwise use the new logic.  Since "val" will always be a constant, the
compiler should be able to optimise this, completely eliminating one or
other branches.  It would be worth checking the assembly from the above
on 32-bit LPAE, because the compiler will probably do a 64-bit test even
for values which fit in 32-bit - this may create even better code:

#define pte_isset(pte, val) ((u32)(val) == (val) ? (u32)pte_val(pte) & (u32)(val) : !!(pte_val(pte) & (val)))

but again, it needs the assembly read to work out how it behaves.

Also, it may be worth considering a pte_isclear() macro, since we don't
need the logic in that case - it can just be a plain and simple:

#define pte_isclear(pte, val) (!(pte_val(pte) & (val)))

since we always need the negation.  Again, as per the above, it may be
better on 32-bit LPAE whether a similar trick here would be worth it -
there's no point testing both halves of a 64-bit register pair when you
know that one half is always zero.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.

  reply	other threads:[~2014-03-26 13:48 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-25 11:38 [PATCH V2 0/3] PTE fixes for arm and arm64 Steve Capper
2014-02-25 11:38 ` [PATCH V2 1/3] arm: mm: Double logical invert for pte accessors Steve Capper
2014-02-28 15:45   ` Catalin Marinas
2014-03-26 11:01   ` Catalin Marinas
2014-02-25 11:38 ` [PATCH V2 2/3] arm64: mm: Add double logical invert to " Steve Capper
2014-02-28 15:46   ` Catalin Marinas
2014-02-25 11:38 ` [PATCH V2 3/3] arm: mm: Switch back to L_PTE_WRITE Steve Capper
2014-03-26 11:00   ` Catalin Marinas
2014-02-25 15:22 ` [PATCH V2 0/3] PTE fixes for arm and arm64 Will Deacon
2014-02-25 17:08   ` Steve Capper
2014-03-26 10:23 ` Steve Capper
2014-03-26 11:01   ` Russell King - ARM Linux
2014-03-26 13:23     ` Steve Capper
2014-03-26 13:48       ` Russell King - ARM Linux [this message]
2014-03-27 10:01         ` Steve Capper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140326134803.GA7528@n2100.arm.linux.org.uk \
    --to=linux@arm.linux.org.uk \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.