* arch/powerpc/math-emu/mtfsf.c - incorrect mask?
@ 2014-02-06 2:09 Stephen N Chivers
2014-02-06 8:26 ` Gabriel Paubert
0 siblings, 1 reply; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-06 2:09 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Chris Proctor
I have a MPC8548e based board and an application that makes
extensive use of floating point including numerous calls to cos.
In the same program there is the use of an sqlite database.
The kernel is derived from 2.6.31 and is compiled with math emulation.
At some point after the reading of the SQLITE database, the
return result from cos goes from an in range value to an out
of range value.
This is as a result of the FP rounding mode mutating from "round to
nearest"
to "round toward zero".
The cos in the glibc version being used is known to be sensitive to
rounding
direction and Joseph Myers has previously fixed glibc.
The failure does not occur on a machine that has a hardware floating
point unit (a MPC7410 processor).
I have traced the mutation to the following series of instructions:
mffs f0
mtfsb1 4*cr7+so
mtfsb0 4*cr7+eq
fadd f13,f1,f2
mtfsf 1, f0
The instructions are part of the stream emitted by gcc for the conversion
of a 128 bit floating point value into an integer in the sqlite database
read.
Immediately before the execution of the mffs instruction the "rounding
mode" is "round to nearest".
On the MPC8548 board, the execution of the mtfsf instruction does not
restore the rounding mode to "round to nearest".
I believe that the mask computation in mtfsf.c is incorrect and is
reversed.
In the latest version of the file (linux-3.14-rc1), the mask is computed
by:
mask = 0;
if (FM & (1 << 0))
mask |= 0x90000000;
if (FM & (1 << 1))
mask |= 0x0f000000;
if (FM & (1 << 2))
mask |= 0x00f00000;
if (FM & (1 << 3))
mask |= 0x000f0000;
if (FM & (1 << 4))
mask |= 0x0000f000;
if (FM & (1 << 5))
mask |= 0x00000f00;
if (FM & (1 << 6))
mask |= 0x000000f0;
if (FM & (1 << 7))
mask |= 0x0000000f;
I think it should be:
mask = 0;
if (FM & (1 << 0))
mask |= 0x0000000f;
if (FM & (1 << 1))
mask |= 0x000000f0;
if (FM & (1 << 2))
mask |= 0x00000f00;
if (FM & (1 << 3))
mask |= 0x0000f000;
if (FM & (1 << 4))
mask |= 0x000f0000;
if (FM & (1 << 5))
mask |= 0x00f00000;
if (FM & (1 << 6))
mask |= 0x0f000000;
if (FM & (1 << 7))
mask |= 0x90000000;
With the above mask computation I get consistent results for both the
MPC8548
and MPC7410 boards.
Am I missing something subtle?
Stephen Chivers,
CSC Australia Pty. Ltd.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-06 2:09 arch/powerpc/math-emu/mtfsf.c - incorrect mask? Stephen N Chivers
@ 2014-02-06 8:26 ` Gabriel Paubert
2014-02-07 1:27 ` Stephen N Chivers
0 siblings, 1 reply; 44+ messages in thread
From: Gabriel Paubert @ 2014-02-06 8:26 UTC (permalink / raw)
To: Stephen N Chivers; +Cc: Chris Proctor, linuxppc-dev
On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
> I have a MPC8548e based board and an application that makes
> extensive use of floating point including numerous calls to cos.
> In the same program there is the use of an sqlite database.
>
> The kernel is derived from 2.6.31 and is compiled with math emulation.
>
> At some point after the reading of the SQLITE database, the
> return result from cos goes from an in range value to an out
> of range value.
>
> This is as a result of the FP rounding mode mutating from "round to
> nearest"
> to "round toward zero".
>
> The cos in the glibc version being used is known to be sensitive to
> rounding
> direction and Joseph Myers has previously fixed glibc.
>
> The failure does not occur on a machine that has a hardware floating
> point unit (a MPC7410 processor).
>
> I have traced the mutation to the following series of instructions:
>
> mffs f0
> mtfsb1 4*cr7+so
> mtfsb0 4*cr7+eq
> fadd f13,f1,f2
> mtfsf 1, f0
>
> The instructions are part of the stream emitted by gcc for the conversion
> of a 128 bit floating point value into an integer in the sqlite database
> read.
>
> Immediately before the execution of the mffs instruction the "rounding
> mode" is "round to nearest".
>
> On the MPC8548 board, the execution of the mtfsf instruction does not
> restore the rounding mode to "round to nearest".
>
> I believe that the mask computation in mtfsf.c is incorrect and is
> reversed.
>
> In the latest version of the file (linux-3.14-rc1), the mask is computed
> by:
>
> mask = 0;
> if (FM & (1 << 0))
> mask |= 0x90000000;
> if (FM & (1 << 1))
> mask |= 0x0f000000;
> if (FM & (1 << 2))
> mask |= 0x00f00000;
> if (FM & (1 << 3))
> mask |= 0x000f0000;
> if (FM & (1 << 4))
> mask |= 0x0000f000;
> if (FM & (1 << 5))
> mask |= 0x00000f00;
> if (FM & (1 << 6))
> mask |= 0x000000f0;
> if (FM & (1 << 7))
> mask |= 0x0000000f;
>
> I think it should be:
>
> mask = 0;
> if (FM & (1 << 0))
> mask |= 0x0000000f;
> if (FM & (1 << 1))
> mask |= 0x000000f0;
> if (FM & (1 << 2))
> mask |= 0x00000f00;
> if (FM & (1 << 3))
> mask |= 0x0000f000;
> if (FM & (1 << 4))
> mask |= 0x000f0000;
> if (FM & (1 << 5))
> mask |= 0x00f00000;
> if (FM & (1 << 6))
> mask |= 0x0f000000;
> if (FM & (1 << 7))
> mask |= 0x90000000;
>
> With the above mask computation I get consistent results for both the
> MPC8548
> and MPC7410 boards.
>
> Am I missing something subtle?
No I think you are correct. This said, this code may probably be optimized
to eliminate a lot of the conditional branches. I think that:
mask = (FM & 1);
mask |= (FM << 3) & 0x10;
mask |= (FM << 6) & 0x100;
mask |= (FM << 9) & 0x1000;
mask |= (FM << 12) & 0x10000;
mask |= (FM << 15) & 0x100000;
mask |= (FM << 18) & 0x1000000;
mask |= (FM << 21) & 0x10000000;
mask *= 15;
should do the job, in less code space and without a single branch.
Each one of the "mask |=" lines should be translated into an
rlwinm instruction followed by an "or". Actually it should be possible
to transform each of these lines into a single "rlwimi" instruction
but I don't know how to coerce gcc to reach this level of optimization.
Another way of optomizing this could be:
mask = (FM & 0x0f) | ((FM << 12) & 0x000f0000);
mask = (mask & 0x00030003) | ((mask << 6) & 0x03030303);
mask = (mask & 0x01010101) | ((mask << 3) & 0x10101010);
mask *= 15;
It's not easy to see which of the solutions is faster, the second one
needs to create quite a few constants, but its dependency length is
lower. It is very likely that the first solution is faster in cache-cold
case and the second in cache-hot.
Regardless, the original code is rather naïve, larger and slower in all cases,
with timing variation depending on branch mispredictions.
Regards,
Gabriel
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-06 8:26 ` Gabriel Paubert
@ 2014-02-07 1:27 ` Stephen N Chivers
2014-02-07 10:10 ` Gabriel Paubert
0 siblings, 1 reply; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-07 1:27 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: Chris Proctor, linuxppc-dev
Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
> From: Gabriel Paubert <paubert@iram.es>
> To: Stephen N Chivers <schivers@csc.com.au>
> Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor <cproctor@csc.com.au>
> Date: 02/06/2014 07:26 PM
> Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
>=20
> On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
> > I have a MPC8548e based board and an application that makes
> > extensive use of floating point including numerous calls to cos.
> > In the same program there is the use of an sqlite database.
> >=20
> > The kernel is derived from 2.6.31 and is compiled with math emulation.
> >=20
> > At some point after the reading of the SQLITE database, the
> > return result from cos goes from an in range value to an out
> > of range value.
> >=20
> > This is as a result of the FP rounding mode mutating from "round to=20
> > nearest"
> > to "round toward zero".
> >=20
> > The cos in the glibc version being used is known to be sensitive to=20
> > rounding
> > direction and Joseph Myers has previously fixed glibc.
> >=20
> > The failure does not occur on a machine that has a hardware floating
> > point unit (a MPC7410 processor).
> >=20
> > I have traced the mutation to the following series of instructions:
> >=20
> > mffs f0
> > mtfsb1 4*cr7+so
> > mtfsb0 4*cr7+eq
> > fadd f13,f1,f2
> > mtfsf 1, f0
> >=20
> > The instructions are part of the stream emitted by gcc for the=20
conversion
> > of a 128 bit floating point value into an integer in the sqlite=20
database=20
> > read.
> >=20
> > Immediately before the execution of the mffs instruction the "rounding
> > mode" is "round to nearest".
> >=20
> > On the MPC8548 board, the execution of the mtfsf instruction does not
> > restore the rounding mode to "round to nearest".
> >=20
> > I believe that the mask computation in mtfsf.c is incorrect and is=20
> > reversed.
> >=20
> > In the latest version of the file (linux-3.14-rc1), the mask is=20
computed=20
> > by:
> >=20
> > mask =3D 0;
> > if (FM & (1 << 0))
> > mask |=3D 0x90000000;
> > if (FM & (1 << 1))
> > mask |=3D 0x0f000000;
> > if (FM & (1 << 2))
> > mask |=3D 0x00f00000;
> > if (FM & (1 << 3))
> > mask |=3D 0x000f0000;
> > if (FM & (1 << 4))
> > mask |=3D 0x0000f000;
> > if (FM & (1 << 5))
> > mask |=3D 0x00000f00;
> > if (FM & (1 << 6))
> > mask |=3D 0x000000f0;
> > if (FM & (1 << 7))
> > mask |=3D 0x0000000f;
> >=20
> > I think it should be:
> >=20
> > mask =3D 0;
> > if (FM & (1 << 0))
> > mask |=3D 0x0000000f;
> > if (FM & (1 << 1))
> > mask |=3D 0x000000f0;
> > if (FM & (1 << 2))
> > mask |=3D 0x00000f00;
> > if (FM & (1 << 3))
> > mask |=3D 0x0000f000;
> > if (FM & (1 << 4))
> > mask |=3D 0x000f0000;
> > if (FM & (1 << 5))
> > mask |=3D 0x00f00000;
> > if (FM & (1 << 6))
> > mask |=3D 0x0f000000;
> > if (FM & (1 << 7))
> > mask |=3D 0x90000000;
> >=20
> > With the above mask computation I get consistent results for both the=20
> > MPC8548
> > and MPC7410 boards.
> >=20
> > Am I missing something subtle?
>=20
> No I think you are correct. This said, this code may probably be=20
optimized=20
> to eliminate a lot of the conditional branches. I think that:
>=20
> mask =3D (FM & 1);
> mask |=3D (FM << 3) & 0x10;
> mask |=3D (FM << 6) & 0x100;
> mask |=3D (FM << 9) & 0x1000;
> mask |=3D (FM << 12) & 0x10000;
> mask |=3D (FM << 15) & 0x100000;
> mask |=3D (FM << 18) & 0x1000000;
> mask |=3D (FM << 21) & 0x10000000;
> mask *=3D 15;
>=20
> should do the job, in less code space and without a single branch.
>=20
> Each one of the "mask |=3D" lines should be translated into an
> rlwinm instruction followed by an "or". Actually it should be possible
> to transform each of these lines into a single "rlwimi" instruction
> but I don't know how to coerce gcc to reach this level of optimization.
>=20
> Another way of optomizing this could be:
>=20
> mask =3D (FM & 0x0f) | ((FM << 12) & 0x000f0000);
> mask =3D (mask & 0x00030003) | ((mask << 6) & 0x03030303);
> mask =3D (mask & 0x01010101) | ((mask << 3) & 0x10101010);
> mask *=3D 15;
>=20
> It's not easy to see which of the solutions is faster, the second one
> needs to create quite a few constants, but its dependency length is=20
> lower. It is very likely that the first solution is faster in cache-cold
> case and the second in cache-hot.=20
>=20
> Regardless, the original code is rather na=EFve, larger and slower in all=
=20
cases,
> with timing variation depending on branch mispredictions.
Thanks for the response, it is appreciated.
I have tried simple test versions of the two suggestions above, both
produce the same results as the original formulation.
My toolchain is gcc-4.1.2 with binutils 2.17.
When compiled without optimization I get:
original 58 instructions 20 memory accesses 9 branches
method1 53 instructions 27 memory accesses 0 branches
method2 37 instructions 13 memory accesses 0 branches
with optimization:
original 25 instructions 0 memory accesses 8 branches
method1 18 instructions 0 memory accesses 0 branches
method2 21 instructions 0 memory accesses 0 branches
The memory accesses do not include "setup" such as moving FM to
a register.
The instruction counts for method1 and method2 include an extra
and operation to preserve the original behaviour wrt the sticky
FX bit (I think) although maybe that is also something else
that it is wrong with the original implementation.
In my naivety I would go with method2 as it generates fewer
instructions when not optimized and isn't far from method1 when
optimized.
I will await more comments before providing a patch next week.
>=20
> Regards,
> Gabriel
Stephen Chivers,
CSC Australia Pty. Ltd.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-07 1:27 ` Stephen N Chivers
@ 2014-02-07 10:10 ` Gabriel Paubert
2014-02-07 20:49 ` James Yang
0 siblings, 1 reply; 44+ messages in thread
From: Gabriel Paubert @ 2014-02-07 10:10 UTC (permalink / raw)
To: Stephen N Chivers; +Cc: Chris Proctor, linuxppc-dev
Hi Stephen,
On Fri, Feb 07, 2014 at 11:27:57AM +1000, Stephen N Chivers wrote:
> Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
>
> > From: Gabriel Paubert <paubert@iram.es>
> > To: Stephen N Chivers <schivers@csc.com.au>
> > Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor <cproctor@csc.com.au>
> > Date: 02/06/2014 07:26 PM
> > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> >
> > On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
> > > I have a MPC8548e based board and an application that makes
> > > extensive use of floating point including numerous calls to cos.
> > > In the same program there is the use of an sqlite database.
> > >
> > > The kernel is derived from 2.6.31 and is compiled with math emulation.
> > >
> > > At some point after the reading of the SQLITE database, the
> > > return result from cos goes from an in range value to an out
> > > of range value.
> > >
> > > This is as a result of the FP rounding mode mutating from "round to
> > > nearest"
> > > to "round toward zero".
> > >
> > > The cos in the glibc version being used is known to be sensitive to
> > > rounding
> > > direction and Joseph Myers has previously fixed glibc.
> > >
> > > The failure does not occur on a machine that has a hardware floating
> > > point unit (a MPC7410 processor).
> > >
> > > I have traced the mutation to the following series of instructions:
> > >
> > > mffs f0
> > > mtfsb1 4*cr7+so
> > > mtfsb0 4*cr7+eq
> > > fadd f13,f1,f2
> > > mtfsf 1, f0
> > >
> > > The instructions are part of the stream emitted by gcc for the
> conversion
> > > of a 128 bit floating point value into an integer in the sqlite
> database
> > > read.
> > >
> > > Immediately before the execution of the mffs instruction the "rounding
> > > mode" is "round to nearest".
> > >
> > > On the MPC8548 board, the execution of the mtfsf instruction does not
> > > restore the rounding mode to "round to nearest".
> > >
> > > I believe that the mask computation in mtfsf.c is incorrect and is
> > > reversed.
> > >
> > > In the latest version of the file (linux-3.14-rc1), the mask is
> computed
> > > by:
> > >
> > > mask = 0;
> > > if (FM & (1 << 0))
> > > mask |= 0x90000000;
> > > if (FM & (1 << 1))
> > > mask |= 0x0f000000;
> > > if (FM & (1 << 2))
> > > mask |= 0x00f00000;
> > > if (FM & (1 << 3))
> > > mask |= 0x000f0000;
> > > if (FM & (1 << 4))
> > > mask |= 0x0000f000;
> > > if (FM & (1 << 5))
> > > mask |= 0x00000f00;
> > > if (FM & (1 << 6))
> > > mask |= 0x000000f0;
> > > if (FM & (1 << 7))
> > > mask |= 0x0000000f;
> > >
> > > I think it should be:
> > >
> > > mask = 0;
> > > if (FM & (1 << 0))
> > > mask |= 0x0000000f;
> > > if (FM & (1 << 1))
> > > mask |= 0x000000f0;
> > > if (FM & (1 << 2))
> > > mask |= 0x00000f00;
> > > if (FM & (1 << 3))
> > > mask |= 0x0000f000;
> > > if (FM & (1 << 4))
> > > mask |= 0x000f0000;
> > > if (FM & (1 << 5))
> > > mask |= 0x00f00000;
> > > if (FM & (1 << 6))
> > > mask |= 0x0f000000;
> > > if (FM & (1 << 7))
> > > mask |= 0x90000000;
> > >
> > > With the above mask computation I get consistent results for both the
> > > MPC8548
> > > and MPC7410 boards.
> > >
> > > Am I missing something subtle?
> >
> > No I think you are correct. This said, this code may probably be
> optimized
> > to eliminate a lot of the conditional branches. I think that:
> >
> > mask = (FM & 1);
> > mask |= (FM << 3) & 0x10;
> > mask |= (FM << 6) & 0x100;
> > mask |= (FM << 9) & 0x1000;
> > mask |= (FM << 12) & 0x10000;
> > mask |= (FM << 15) & 0x100000;
> > mask |= (FM << 18) & 0x1000000;
> > mask |= (FM << 21) & 0x10000000;
> > mask *= 15;
> >
> > should do the job, in less code space and without a single branch.
> >
> > Each one of the "mask |=" lines should be translated into an
> > rlwinm instruction followed by an "or". Actually it should be possible
> > to transform each of these lines into a single "rlwimi" instruction
> > but I don't know how to coerce gcc to reach this level of optimization.
> >
> > Another way of optomizing this could be:
> >
> > mask = (FM & 0x0f) | ((FM << 12) & 0x000f0000);
> > mask = (mask & 0x00030003) | ((mask << 6) & 0x03030303);
> > mask = (mask & 0x01010101) | ((mask << 3) & 0x10101010);
> > mask *= 15;
> >
> > It's not easy to see which of the solutions is faster, the second one
> > needs to create quite a few constants, but its dependency length is
> > lower. It is very likely that the first solution is faster in cache-cold
> > case and the second in cache-hot.
> >
> > Regardless, the original code is rather naïve, larger and slower in all
> cases,
> > with timing variation depending on branch mispredictions.
>
> Thanks for the response, it is appreciated.
>
> I have tried simple test versions of the two suggestions above, both
> produce the same results as the original formulation.
>
> My toolchain is gcc-4.1.2 with binutils 2.17.
>
> When compiled without optimization I get:
>
> original 58 instructions 20 memory accesses 9 branches
> method1 53 instructions 27 memory accesses 0 branches
> method2 37 instructions 13 memory accesses 0 branches
>
> with optimization:
>
> original 25 instructions 0 memory accesses 8 branches
> method1 18 instructions 0 memory accesses 0 branches
> method2 21 instructions 0 memory accesses 0 branches
>
> The memory accesses do not include "setup" such as moving FM to
> a register.
>
> The instruction counts for method1 and method2 include an extra
> and operation to preserve the original behaviour wrt the sticky
> FX bit (I think) although maybe that is also something else
> that it is wrong with the original implementation.
These bits are not sticky, they are actually a logical combination
of other bits in the register. The code in mtfsf.c recomputes
the FPSCR_VX and FPSCR_VEX bits from the contents of the new FPSCR,
so, unless I miss something, the and operation is unnecessary.
> In my naivety I would go with method2 as it generates fewer
> instructions when not optimized and isn't far from method1 when
> optimized.
I would never count the size of non-optimized gcc code as a valid
benchmark. At least -O1 is necessary to implement basic, trivial
optimizations, as you see in the number of memory accesses.
Ok, I finally edited my sources and test compiled the suggestions
I gave. I'd say that method2 is the best overall indeed. You can
actually save one more instruction by setting mask to all ones in
the case FM=0xff, but that's about all in this area.
This said, there are better ways to write the end of the routine
to be more compact. The __FPU_FPSCR is a global variable which is
accessed very often, and it would be better to use more the local
fpscr variable. I see really stupid things in the code, for example
the following:
and. 8,10,0 #, tmp184, D.11439
==> stw 0,740(2) # current.0_21->thread.fp_state.fpscr, D.11439
beq- 0,.L4 #
oris 0,0,0x2000 #, tmp188, D.11439,
==> stw 0,740(2) # current.0_21->thread.fp_state.fpscr, tmp188
.L4:
==> lwz 0,740(9) # current.0_21->thread.fp_state.fpscr, fpscr
lis 11,0x2000 # tmp227,
(and yes gcc also copies around r2, the current thread pointer, to
other registers, for no good reason).
Anyway, the first step would be to get the current patch in, then
work on more optimizations.
Regards,
Gabriel
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-07 10:10 ` Gabriel Paubert
@ 2014-02-07 20:49 ` James Yang
2014-02-09 19:42 ` Stephen N Chivers
2014-02-10 11:03 ` Gabriel Paubert
0 siblings, 2 replies; 44+ messages in thread
From: James Yang @ 2014-02-07 20:49 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: Chris Proctor, Stephen N Chivers, linuxppc-dev
On Fri, 7 Feb 2014, Gabriel Paubert wrote:
> Hi Stephen,
>
> On Fri, Feb 07, 2014 at 11:27:57AM +1000, Stephen N Chivers wrote:
> > Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
> >
> > > From: Gabriel Paubert <paubert@iram.es>
> > > To: Stephen N Chivers <schivers@csc.com.au>
> > > Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor <cproctor@csc.com.au>
> > > Date: 02/06/2014 07:26 PM
> > > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> > >
> > > On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
> > > >
> > > > mask = 0;
> > > > if (FM & (1 << 0))
> > > > mask |= 0x0000000f;
> > > > if (FM & (1 << 1))
> > > > mask |= 0x000000f0;
> > > > if (FM & (1 << 2))
> > > > mask |= 0x00000f00;
> > > > if (FM & (1 << 3))
> > > > mask |= 0x0000f000;
> > > > if (FM & (1 << 4))
> > > > mask |= 0x000f0000;
> > > > if (FM & (1 << 5))
> > > > mask |= 0x00f00000;
> > > > if (FM & (1 << 6))
> > > > mask |= 0x0f000000;
> > > > if (FM & (1 << 7))
> > > > mask |= 0x90000000;
> > > >
> > > > With the above mask computation I get consistent results for
> > > > both the MPC8548 and MPC7410 boards.
> > > >
> > > > Am I missing something subtle?
> > >
> > > No I think you are correct. This said, this code may probably be
> > optimized
> > > to eliminate a lot of the conditional branches. I think that:
If the compiler is enabled to generate isel instructions, it would not
use a conditional branch for this code. (ignore the andi's values,
this is an old compile)
c0037c2c <mtfsf>:
c0037c2c: 2c 03 00 00 cmpwi r3,0
c0037c30: 41 82 01 1c beq- c0037d4c <mtfsf+0x120>
c0037c34: 2f 83 00 ff cmpwi cr7,r3,255
c0037c38: 41 9e 01 28 beq- cr7,c0037d60 <mtfsf+0x134>
c0037c3c: 70 66 00 01 andi. r6,r3,1
c0037c40: 3d 00 90 00 lis r8,-28672
c0037c44: 7d 20 40 9e iseleq r9,r0,r8
c0037c48: 70 6a 00 02 andi. r10,r3,2
c0037c4c: 65 28 0f 00 oris r8,r9,3840
c0037c50: 7d 29 40 9e iseleq r9,r9,r8
c0037c54: 70 66 00 04 andi. r6,r3,4
c0037c58: 65 28 00 f0 oris r8,r9,240
c0037c5c: 7d 29 40 9e iseleq r9,r9,r8
c0037c60: 70 6a 00 08 andi. r10,r3,8
c0037c64: 65 28 00 0f oris r8,r9,15
c0037c68: 7d 29 40 9e iseleq r9,r9,r8
c0037c6c: 70 66 00 10 andi. r6,r3,16
c0037c70: 61 28 f0 00 ori r8,r9,61440
c0037c74: 7d 29 40 9e iseleq r9,r9,r8
c0037c78: 70 6a 00 20 andi. r10,r3,32
c0037c7c: 61 28 0f 00 ori r8,r9,3840
c0037c80: 54 6a cf fe rlwinm r10,r3,25,31,31
c0037c84: 7d 29 40 9e iseleq r9,r9,r8
c0037c88: 2f 8a 00 00 cmpwi cr7,r10,0
c0037c8c: 70 66 00 40 andi. r6,r3,64
c0037c90: 61 28 00 f0 ori r8,r9,240
c0037c94: 7d 29 40 9e iseleq r9,r9,r8
c0037c98: 41 9e 00 08 beq- cr7,c0037ca0 <mtfsf+0x74>
c0037c9c: 61 29 00 0f ori r9,r9,15
...
However, your other solutions are better.
> > >
> > > mask = (FM & 1);
> > > mask |= (FM << 3) & 0x10;
> > > mask |= (FM << 6) & 0x100;
> > > mask |= (FM << 9) & 0x1000;
> > > mask |= (FM << 12) & 0x10000;
> > > mask |= (FM << 15) & 0x100000;
> > > mask |= (FM << 18) & 0x1000000;
> > > mask |= (FM << 21) & 0x10000000;
> > > mask *= 15;
> > >
> > > should do the job, in less code space and without a single branch.
> > >
> > > Each one of the "mask |=" lines should be translated into an
> > > rlwinm instruction followed by an "or". Actually it should be possible
> > > to transform each of these lines into a single "rlwimi" instruction
> > > but I don't know how to coerce gcc to reach this level of optimization.
> > >
> > > Another way of optomizing this could be:
> > >
> > > mask = (FM & 0x0f) | ((FM << 12) & 0x000f0000);
> > > mask = (mask & 0x00030003) | ((mask << 6) & 0x03030303);
> > > mask = (mask & 0x01010101) | ((mask << 3) & 0x10101010);
> > > mask *= 15;
> > >
> Ok, I finally edited my sources and test compiled the suggestions
> I gave. I'd say that method2 is the best overall indeed. You can
> actually save one more instruction by setting mask to all ones in
> the case FM=0xff, but that's about all in this area.
My measurements show method1 to be smaller and faster than method2 due
to the number of instructions needed to generate the constant masks in
method2, but it may depend upon your compiler and hardware. Both are
faster than the original with isel.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-07 20:49 ` James Yang
@ 2014-02-09 19:42 ` Stephen N Chivers
2014-02-10 16:50 ` James Yang
2014-02-10 11:03 ` Gabriel Paubert
1 sibling, 1 reply; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-09 19:42 UTC (permalink / raw)
To: James Yang; +Cc: Chris Proctor, linuxppc-dev, Stephen N Chivers
James Yang <James.Yang@freescale.com> wrote on 02/08/2014 07:49:40 AM:
> From: James Yang <James.Yang@freescale.com>
> To: Gabriel Paubert <paubert@iram.es>
> Cc: Stephen N Chivers <schivers@csc.com.au>, Chris Proctor
> <cproctor@csc.com.au>, <linuxppc-dev@lists.ozlabs.org>
> Date: 02/08/2014 07:49 AM
> Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
>
> On Fri, 7 Feb 2014, Gabriel Paubert wrote:
>
> > Hi Stephen,
> >
> > On Fri, Feb 07, 2014 at 11:27:57AM +1000, Stephen N Chivers wrote:
> > > Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
> > >
> > > > From: Gabriel Paubert <paubert@iram.es>
> > > > To: Stephen N Chivers <schivers@csc.com.au>
> > > > Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor
<cproctor@csc.com.au>
> > > > Date: 02/06/2014 07:26 PM
> > > > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> > > >
> > > > On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
>
> > > > >
> > > > > mask = 0;
> > > > > if (FM & (1 << 0))
> > > > > mask |= 0x0000000f;
> > > > > if (FM & (1 << 1))
> > > > > mask |= 0x000000f0;
> > > > > if (FM & (1 << 2))
> > > > > mask |= 0x00000f00;
> > > > > if (FM & (1 << 3))
> > > > > mask |= 0x0000f000;
> > > > > if (FM & (1 << 4))
> > > > > mask |= 0x000f0000;
> > > > > if (FM & (1 << 5))
> > > > > mask |= 0x00f00000;
> > > > > if (FM & (1 << 6))
> > > > > mask |= 0x0f000000;
> > > > > if (FM & (1 << 7))
> > > > > mask |= 0x90000000;
> > > > >
> > > > > With the above mask computation I get consistent results for
> > > > > both the MPC8548 and MPC7410 boards.
> > > > >
> > > > > Am I missing something subtle?
> > > >
> > > > No I think you are correct. This said, this code may probably be
> > > optimized
> > > > to eliminate a lot of the conditional branches. I think that:
>
>
> If the compiler is enabled to generate isel instructions, it would not
> use a conditional branch for this code. (ignore the andi's values,
> this is an old compile)
>
>From limited research, the 440GP is a processor
that doesn't implement the isel instruction and it does
not implement floating point.
The kernel emulates isel and so using that instruction
for the 440GP would have a double trap penalty.
Correct me if I am wrong, the isel instruction first appears
in PowerPC ISA v2.04 around mid 2007.
Stephen Chivers,
CSC Australia Pty. Ltd.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-07 20:49 ` James Yang
2014-02-09 19:42 ` Stephen N Chivers
@ 2014-02-10 11:03 ` Gabriel Paubert
2014-02-10 11:17 ` David Laight
2014-02-10 17:03 ` James Yang
1 sibling, 2 replies; 44+ messages in thread
From: Gabriel Paubert @ 2014-02-10 11:03 UTC (permalink / raw)
To: James Yang; +Cc: Chris Proctor, Stephen N Chivers, linuxppc-dev
On Fri, Feb 07, 2014 at 02:49:40PM -0600, James Yang wrote:
> On Fri, 7 Feb 2014, Gabriel Paubert wrote:
>
> > Hi Stephen,
> >
> > On Fri, Feb 07, 2014 at 11:27:57AM +1000, Stephen N Chivers wrote:
> > > Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
> > >
> > > > From: Gabriel Paubert <paubert@iram.es>
> > > > To: Stephen N Chivers <schivers@csc.com.au>
> > > > Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor <cproctor@csc.com.au>
> > > > Date: 02/06/2014 07:26 PM
> > > > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> > > >
> > > > On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
>
> > > > >
> > > > > mask = 0;
> > > > > if (FM & (1 << 0))
> > > > > mask |= 0x0000000f;
> > > > > if (FM & (1 << 1))
> > > > > mask |= 0x000000f0;
> > > > > if (FM & (1 << 2))
> > > > > mask |= 0x00000f00;
> > > > > if (FM & (1 << 3))
> > > > > mask |= 0x0000f000;
> > > > > if (FM & (1 << 4))
> > > > > mask |= 0x000f0000;
> > > > > if (FM & (1 << 5))
> > > > > mask |= 0x00f00000;
> > > > > if (FM & (1 << 6))
> > > > > mask |= 0x0f000000;
> > > > > if (FM & (1 << 7))
> > > > > mask |= 0x90000000;
> > > > >
> > > > > With the above mask computation I get consistent results for
> > > > > both the MPC8548 and MPC7410 boards.
> > > > >
> > > > > Am I missing something subtle?
> > > >
> > > > No I think you are correct. This said, this code may probably be
> > > optimized
> > > > to eliminate a lot of the conditional branches. I think that:
>
>
> If the compiler is enabled to generate isel instructions, it would not
> use a conditional branch for this code. (ignore the andi's values,
> this is an old compile)
>
> c0037c2c <mtfsf>:
> c0037c2c: 2c 03 00 00 cmpwi r3,0
> c0037c30: 41 82 01 1c beq- c0037d4c <mtfsf+0x120>
> c0037c34: 2f 83 00 ff cmpwi cr7,r3,255
> c0037c38: 41 9e 01 28 beq- cr7,c0037d60 <mtfsf+0x134>
> c0037c3c: 70 66 00 01 andi. r6,r3,1
> c0037c40: 3d 00 90 00 lis r8,-28672
> c0037c44: 7d 20 40 9e iseleq r9,r0,r8
> c0037c48: 70 6a 00 02 andi. r10,r3,2
> c0037c4c: 65 28 0f 00 oris r8,r9,3840
> c0037c50: 7d 29 40 9e iseleq r9,r9,r8
> c0037c54: 70 66 00 04 andi. r6,r3,4
> c0037c58: 65 28 00 f0 oris r8,r9,240
> c0037c5c: 7d 29 40 9e iseleq r9,r9,r8
> c0037c60: 70 6a 00 08 andi. r10,r3,8
> c0037c64: 65 28 00 0f oris r8,r9,15
> c0037c68: 7d 29 40 9e iseleq r9,r9,r8
> c0037c6c: 70 66 00 10 andi. r6,r3,16
> c0037c70: 61 28 f0 00 ori r8,r9,61440
> c0037c74: 7d 29 40 9e iseleq r9,r9,r8
> c0037c78: 70 6a 00 20 andi. r10,r3,32
> c0037c7c: 61 28 0f 00 ori r8,r9,3840
> c0037c80: 54 6a cf fe rlwinm r10,r3,25,31,31
> c0037c84: 7d 29 40 9e iseleq r9,r9,r8
> c0037c88: 2f 8a 00 00 cmpwi cr7,r10,0
> c0037c8c: 70 66 00 40 andi. r6,r3,64
> c0037c90: 61 28 00 f0 ori r8,r9,240
> c0037c94: 7d 29 40 9e iseleq r9,r9,r8
> c0037c98: 41 9e 00 08 beq- cr7,c0037ca0 <mtfsf+0x74>
> c0037c9c: 61 29 00 0f ori r9,r9,15
> ...
>
> However, your other solutions are better.
>
>
> > > >
> > > > mask = (FM & 1);
> > > > mask |= (FM << 3) & 0x10;
> > > > mask |= (FM << 6) & 0x100;
> > > > mask |= (FM << 9) & 0x1000;
> > > > mask |= (FM << 12) & 0x10000;
> > > > mask |= (FM << 15) & 0x100000;
> > > > mask |= (FM << 18) & 0x1000000;
> > > > mask |= (FM << 21) & 0x10000000;
> > > > mask *= 15;
> > > >
> > > > should do the job, in less code space and without a single branch.
> > > >
> > > > Each one of the "mask |=" lines should be translated into an
> > > > rlwinm instruction followed by an "or". Actually it should be possible
> > > > to transform each of these lines into a single "rlwimi" instruction
> > > > but I don't know how to coerce gcc to reach this level of optimization.
> > > >
> > > > Another way of optomizing this could be:
> > > >
> > > > mask = (FM & 0x0f) | ((FM << 12) & 0x000f0000);
> > > > mask = (mask & 0x00030003) | ((mask << 6) & 0x03030303);
> > > > mask = (mask & 0x01010101) | ((mask << 3) & 0x10101010);
> > > > mask *= 15;
> > > >
>
>
> > Ok, I finally edited my sources and test compiled the suggestions
> > I gave. I'd say that method2 is the best overall indeed. You can
> > actually save one more instruction by setting mask to all ones in
> > the case FM=0xff, but that's about all in this area.
>
>
> My measurements show method1 to be smaller and faster than method2 due
> to the number of instructions needed to generate the constant masks in
> method2, but it may depend upon your compiler and hardware. Both are
> faster than the original with isel.
Ok, if you have measured that method1 is faster than method2, let us go for it.
I believe method2 would be faster if you had a large out-of-order execution
window, because more parallelism can be extracted from it, but this is probably
only true for high end cores, which do not need FPU emulation in the first place.
The other place where we can optimize is the generation of FEX. Here is
my current patch:
diff --git a/arch/powerpc/math-emu/mtfsf.c b/arch/powerpc/math-emu/mtfsf.c
index dbce92e..b57b3fa8 100644
--- a/arch/powerpc/math-emu/mtfsf.c
+++ b/arch/powerpc/math-emu/mtfsf.c
@@ -11,48 +11,35 @@ mtfsf(unsigned int FM, u32 *frB)
u32 mask;
u32 fpscr;
- if (FM == 0)
+ if (likely(FM == 0xff))
+ mask = 0xffffffff;
+ else if (unlikely(FM == 0))
return 0;
-
- if (FM == 0xff)
- mask = 0x9fffffff;
else {
- mask = 0;
- if (FM & (1 << 0))
- mask |= 0x90000000;
- if (FM & (1 << 1))
- mask |= 0x0f000000;
- if (FM & (1 << 2))
- mask |= 0x00f00000;
- if (FM & (1 << 3))
- mask |= 0x000f0000;
- if (FM & (1 << 4))
- mask |= 0x0000f000;
- if (FM & (1 << 5))
- mask |= 0x00000f00;
- if (FM & (1 << 6))
- mask |= 0x000000f0;
- if (FM & (1 << 7))
- mask |= 0x0000000f;
+ mask = (FM & 1);
+ mask |= (FM << 3) & 0x10;
+ mask |= (FM << 6) & 0x100;
+ mask |= (FM << 9) & 0x1000;
+ mask |= (FM << 12) & 0x10000;
+ mask |= (FM << 15) & 0x100000;
+ mask |= (FM << 18) & 0x1000000;
+ mask |= (FM << 21) & 0x10000000;
+ mask *= 15;
}
- __FPU_FPSCR &= ~(mask);
- __FPU_FPSCR |= (frB[1] & mask);
+ fpscr = ((__FPU_FPSCR & ~mask) | (frB[1] & mask)) &
+ ~(FPSCR_VX | FPSCR_FEX);
- __FPU_FPSCR &= ~(FPSCR_VX);
- if (__FPU_FPSCR & (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI |
+ if (fpscr & (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI |
FPSCR_VXZDZ | FPSCR_VXIMZ | FPSCR_VXVC |
FPSCR_VXSOFT | FPSCR_VXSQRT | FPSCR_VXCVI))
- __FPU_FPSCR |= FPSCR_VX;
-
- fpscr = __FPU_FPSCR;
- fpscr &= ~(FPSCR_FEX);
- if (((fpscr & FPSCR_VX) && (fpscr & FPSCR_VE)) ||
- ((fpscr & FPSCR_OX) && (fpscr & FPSCR_OE)) ||
- ((fpscr & FPSCR_UX) && (fpscr & FPSCR_UE)) ||
- ((fpscr & FPSCR_ZX) && (fpscr & FPSCR_ZE)) ||
- ((fpscr & FPSCR_XX) && (fpscr & FPSCR_XE)))
- fpscr |= FPSCR_FEX;
+ fpscr |= FPSCR_VX;
+
+ /* The bit order of exception enables and exception status
+ * is the same. Simply shift and mask to check for enabled
+ * exceptions.
+ */
+ if (fpscr & (fpscr >> 22) & 0xf8) fpscr |= FPSCR_FEX;
__FPU_FPSCR = fpscr;
#ifdef DEBUG
mtfsf.c | 57 ++++++++++++++++++++++-----------------------------------
1 file changed, 22 insertions(+), 35 deletions(-)
Notes:
1) I'm really unsure on whether 0xff is frequent or not. So the likely()
statement at the beginning may be wrong. Actually, if it is not very likely,
it might be better to remove the special casef for FM==0xff. A look at
GCC sources shows that it never generates a mask of 0xff. From glibc
sources, there vast majority of cases uses 0x1, only isnan() uses 0xff.
2) it may be better to remove the check for FM==0, after all, the instruction
efectively becomes a nop, and generating the instruction in the first place
would be too stupid for words.
3) if might be better to #define the magic constants (22 and 0xf8) used
in the last if statement.
Gabriel
>
^ permalink raw reply related [flat|nested] 44+ messages in thread
* RE: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-10 11:03 ` Gabriel Paubert
@ 2014-02-10 11:17 ` David Laight
2014-02-10 12:21 ` Gabriel Paubert
2014-02-10 17:03 ` James Yang
1 sibling, 1 reply; 44+ messages in thread
From: David Laight @ 2014-02-10 11:17 UTC (permalink / raw)
To: 'Gabriel Paubert', James Yang
Cc: Chris Proctor, Stephen N Chivers, linuxppc-dev
> > However, your other solutions are better.
> >
> >
> > > > >
> > > > > mask =3D (FM & 1);
> > > > > mask |=3D (FM << 3) & 0x10;
> > > > > mask |=3D (FM << 6) & 0x100;
> > > > > mask |=3D (FM << 9) & 0x1000;
> > > > > mask |=3D (FM << 12) & 0x10000;
> > > > > mask |=3D (FM << 15) & 0x100000;
> > > > > mask |=3D (FM << 18) & 0x1000000;
> > > > > mask |=3D (FM << 21) & 0x10000000;
> > > > > mask *=3D 15;
> > > > >
> > > > > should do the job, in less code space and without a single branch=
.
...
> > > > > Another way of optomizing this could be:
> > > > >
> > > > > mask =3D (FM & 0x0f) | ((FM << 12) & 0x000f0000);
> > > > > mask =3D (mask & 0x00030003) | ((mask << 6) & 0x03030303);
> > > > > mask =3D (mask & 0x01010101) | ((mask << 3) & 0x10101010);
> > > > > mask *=3D 15;
...
> Ok, if you have measured that method1 is faster than method2, let us go f=
or it.
> I believe method2 would be faster if you had a large out-of-order executi=
on
> window, because more parallelism can be extracted from it, but this is pr=
obably
> only true for high end cores, which do not need FPU emulation in the firs=
t place.
FWIW the second has a long dependency chain on 'mask', whereas the first ca=
n execute
the shift/and in any order and then merge the results.
So on most superscalar cpu, or one with result delays for arithmetic, the f=
irst
is likely to be faster.
David
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-10 11:17 ` David Laight
@ 2014-02-10 12:21 ` Gabriel Paubert
2014-02-10 12:32 ` David Laight
0 siblings, 1 reply; 44+ messages in thread
From: Gabriel Paubert @ 2014-02-10 12:21 UTC (permalink / raw)
To: David Laight; +Cc: James Yang, Chris Proctor, Stephen N Chivers, linuxppc-dev
On Mon, Feb 10, 2014 at 11:17:38AM +0000, David Laight wrote:
> > > However, your other solutions are better.
> > >
> > >
> > > > > >
> > > > > > mask = (FM & 1);
> > > > > > mask |= (FM << 3) & 0x10;
> > > > > > mask |= (FM << 6) & 0x100;
> > > > > > mask |= (FM << 9) & 0x1000;
> > > > > > mask |= (FM << 12) & 0x10000;
> > > > > > mask |= (FM << 15) & 0x100000;
> > > > > > mask |= (FM << 18) & 0x1000000;
> > > > > > mask |= (FM << 21) & 0x10000000;
> > > > > > mask *= 15;
> > > > > >
> > > > > > should do the job, in less code space and without a single branch.
> ...
> > > > > > Another way of optomizing this could be:
> > > > > >
> > > > > > mask = (FM & 0x0f) | ((FM << 12) & 0x000f0000);
> > > > > > mask = (mask & 0x00030003) | ((mask << 6) & 0x03030303);
> > > > > > mask = (mask & 0x01010101) | ((mask << 3) & 0x10101010);
> > > > > > mask *= 15;
> ...
> > Ok, if you have measured that method1 is faster than method2, let us go for it.
> > I believe method2 would be faster if you had a large out-of-order execution
> > window, because more parallelism can be extracted from it, but this is probably
> > only true for high end cores, which do not need FPU emulation in the first place.
>
> FWIW the second has a long dependency chain on 'mask', whereas the first can execute
> the shift/and in any order and then merge the results.
> So on most superscalar cpu, or one with result delays for arithmetic, the first
> is likely to be faster.
I disagree, perhaps mostly because the compiler is not clever enough, but right
now the code for solution 1 is (actually I have rewritten the code
and it reads:
mask = (FM & 1)
| ((FM << 3) & 0x10)
| ((FM << 6) & 0x100)
| ((FM << 9) & 0x1000)
| ((FM << 12) & 0x10000)
| ((FM << 15) & 0x100000)
| ((FM << 18) & 0x1000000)
| ((FM << 21) & 0x10000000);
to avoid sequence point in case it hampers the compiler)
and the output is:
rlwinm 10,3,3,27,27 # D.11621, FM,,
rlwinm 9,3,6,23,23 # D.11621, FM,,
or 9,10,9 #, D.11621, D.11621, D.11621
rlwinm 10,3,0,31,31 # D.11621, FM,
or 9,9,10 #, D.11621, D.11621, D.11621
rlwinm 10,3,9,19,19 # D.11621, FM,,
or 9,9,10 #, D.11621, D.11621, D.11621
rlwinm 10,3,12,15,15 # D.11621, FM,,
or 9,9,10 #, D.11621, D.11621, D.11621
rlwinm 10,3,15,11,11 # D.11621, FM,,
or 9,9,10 #, D.11621, D.11621, D.11621
rlwinm 10,3,18,7,7 # D.11621, FM,,
or 9,9,10 #, D.11621, D.11621, D.11621
rlwinm 3,3,21,3,3 # D.11621, FM,,
or 9,9,3 #, mask, D.11621, D.11621
mulli 9,9,15 # mask, mask,
see that r9 is used 7 times as both input and output operand, plus
once for rlwinm. This gives a dependency length of 8 at least.
In the other case (I've deleted the code) the dependency length
was significantly shorter. In any case that one is fewer instructions,
which is good for occasional use.
Gabriel
^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-10 12:21 ` Gabriel Paubert
@ 2014-02-10 12:32 ` David Laight
2014-02-10 13:00 ` Gabriel Paubert
0 siblings, 1 reply; 44+ messages in thread
From: David Laight @ 2014-02-10 12:32 UTC (permalink / raw)
To: 'Gabriel Paubert'
Cc: James Yang, Chris Proctor, Stephen N Chivers, linuxppc-dev
> I disagree, perhaps mostly because the compiler is not clever enough, but=
right
> now the code for solution 1 is (actually I have rewritten the code
> and it reads:
>=20
> mask =3D (FM & 1)
> | ((FM << 3) & 0x10)
> | ((FM << 6) & 0x100)
> | ((FM << 9) & 0x1000)
> | ((FM << 12) & 0x10000)
> | ((FM << 15) & 0x100000)
> | ((FM << 18) & 0x1000000)
> | ((FM << 21) & 0x10000000);
> to avoid sequence point in case it hampers the compiler)
>=20
> and the output is:
>=20
> rlwinm 10,3,3,27,27 # D.11621, FM,,
> rlwinm 9,3,6,23,23 # D.11621, FM,,
> or 9,10,9 #, D.11621, D.11621, D.11621
> rlwinm 10,3,0,31,31 # D.11621, FM,
> or 9,9,10 #, D.11621, D.11621, D.11621
> rlwinm 10,3,9,19,19 # D.11621, FM,,
> or 9,9,10 #, D.11621, D.11621, D.11621
> rlwinm 10,3,12,15,15 # D.11621, FM,,
> or 9,9,10 #, D.11621, D.11621, D.11621
> rlwinm 10,3,15,11,11 # D.11621, FM,,
> or 9,9,10 #, D.11621, D.11621, D.11621
> rlwinm 10,3,18,7,7 # D.11621, FM,,
> or 9,9,10 #, D.11621, D.11621, D.11621
> rlwinm 3,3,21,3,3 # D.11621, FM,,
> or 9,9,3 #, mask, D.11621, D.11621
> mulli 9,9,15 # mask, mask,
>=20
> see that r9 is used 7 times as both input and output operand, plus
> once for rlwinm. This gives a dependency length of 8 at least.
>=20
> In the other case (I've deleted the code) the dependency length
> was significantly shorter. In any case that one is fewer instructions,
> which is good for occasional use.
Hmmm... I hand-counted a dependency length of 8 for the other version.
Maybe there are some ppc instructions that reduce it.
Stupid compiler :-)
Trouble is, I bet that even if you code it as:
mask1 =3D (FM & 1) | ((FM << 3) & 0x10);
mask2 =3D ((FM << 6) & 0x100) | ((FM << 9) & 0x1000);
mask3 =3D ((FM << 12) & 0x10000) | ((FM << 15) & 0x100000);
mask4 =3D ((FM << 18) & 0x1000000) | ((FM << 21) & 0x10000000);
mask1 |=3D mask2;
mask3 |=3D mask4;
mask =3D mask1 | mask3;
the compiler will 'optimise' it to the above before code generation.
If it doesn't adding () to pair the | might be enough.
Then a new version of the compiler will change the behaviour again.
David
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-10 12:32 ` David Laight
@ 2014-02-10 13:00 ` Gabriel Paubert
0 siblings, 0 replies; 44+ messages in thread
From: Gabriel Paubert @ 2014-02-10 13:00 UTC (permalink / raw)
To: David Laight; +Cc: James Yang, Chris Proctor, Stephen N Chivers, linuxppc-dev
On Mon, Feb 10, 2014 at 12:32:18PM +0000, David Laight wrote:
> > I disagree, perhaps mostly because the compiler is not clever enough, but right
> > now the code for solution 1 is (actually I have rewritten the code
> > and it reads:
> >
> > mask = (FM & 1)
> > | ((FM << 3) & 0x10)
> > | ((FM << 6) & 0x100)
> > | ((FM << 9) & 0x1000)
> > | ((FM << 12) & 0x10000)
> > | ((FM << 15) & 0x100000)
> > | ((FM << 18) & 0x1000000)
> > | ((FM << 21) & 0x10000000);
> > to avoid sequence point in case it hampers the compiler)
> >
> > and the output is:
> >
> > rlwinm 10,3,3,27,27 # D.11621, FM,,
> > rlwinm 9,3,6,23,23 # D.11621, FM,,
> > or 9,10,9 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,0,31,31 # D.11621, FM,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,9,19,19 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,12,15,15 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,15,11,11 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 10,3,18,7,7 # D.11621, FM,,
> > or 9,9,10 #, D.11621, D.11621, D.11621
> > rlwinm 3,3,21,3,3 # D.11621, FM,,
> > or 9,9,3 #, mask, D.11621, D.11621
> > mulli 9,9,15 # mask, mask,
> >
> > see that r9 is used 7 times as both input and output operand, plus
> > once for rlwinm. This gives a dependency length of 8 at least.
> >
> > In the other case (I've deleted the code) the dependency length
> > was significantly shorter. In any case that one is fewer instructions,
> > which is good for occasional use.
>
> Hmmm... I hand-counted a dependency length of 8 for the other version.
> Maybe there are some ppc instructions that reduce it.
Either I misread the generated code or I got somewhat less.
What helps for method1 is the rotate and mask instructions of PPC. Each of
left shift and mask becomes a single rlwinm.
>
> Stupid compiler :-)
Indeed. I've trying to coerce it into generating rlwimi instructions
(in which case the whole building of the mask reduces to 8 assembly
instructions) and failed. It seems that the compiler lacks some patterns
some patterns that would directly map to rlwimi.
> Trouble is, I bet that even if you code it as:
> mask1 = (FM & 1) | ((FM << 3) & 0x10);
> mask2 = ((FM << 6) & 0x100) | ((FM << 9) & 0x1000);
> mask3 = ((FM << 12) & 0x10000) | ((FM << 15) & 0x100000);
> mask4 = ((FM << 18) & 0x1000000) | ((FM << 21) & 0x10000000);
> mask1 |= mask2;
> mask3 |= mask4;
> mask = mask1 | mask3;
> the compiler will 'optimise' it to the above before code generation.
Indeed it's what it does :-(
I believe that the current suggestion is good enough.
Gabriel
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-09 19:42 ` Stephen N Chivers
@ 2014-02-10 16:50 ` James Yang
0 siblings, 0 replies; 44+ messages in thread
From: James Yang @ 2014-02-10 16:50 UTC (permalink / raw)
To: Stephen N Chivers; +Cc: James Yang, Chris Proctor, linuxppc-dev
On Sun, 9 Feb 2014, Stephen N Chivers wrote:
> James Yang <James.Yang@freescale.com> wrote on 02/08/2014 07:49:40 AM:
>
> > From: James Yang <James.Yang@freescale.com>
> > To: Gabriel Paubert <paubert@iram.es>
> > Cc: Stephen N Chivers <schivers@csc.com.au>, Chris Proctor
> > <cproctor@csc.com.au>, <linuxppc-dev@lists.ozlabs.org>
> > Date: 02/08/2014 07:49 AM
> > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> >
> > On Fri, 7 Feb 2014, Gabriel Paubert wrote:
> >
> > > Hi Stephen,
> > >
> > > On Fri, Feb 07, 2014 at 11:27:57AM +1000, Stephen N Chivers wrote:
> > > > Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
> > > >
> > > > > From: Gabriel Paubert <paubert@iram.es>
> > > > > To: Stephen N Chivers <schivers@csc.com.au>
> > > > > Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor
> <cproctor@csc.com.au>
> > > > > Date: 02/06/2014 07:26 PM
> > > > > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> > > > >
> > > > > On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
> > > > > > With the above mask computation I get consistent results for
> > > > > > both the MPC8548 and MPC7410 boards.
> > > > > >
> > > > > > Am I missing something subtle?
> > > > >
> > > > > No I think you are correct. This said, this code may probably be
> > > > optimized
> > > > > to eliminate a lot of the conditional branches. I think that:
> >
> >
> > If the compiler is enabled to generate isel instructions, it would not
> > use a conditional branch for this code. (ignore the andi's values,
> > this is an old compile)
> >
> From limited research, the 440GP is a processor
> that doesn't implement the isel instruction and it does
> not implement floating point.
>
> The kernel emulates isel and so using that instruction
> for the 440GP would have a double trap penalty.
Are you writing about something outside the scope of this thread? OP
was using MPC8548 not a 440GP. The compiler should not be using or
targeting 8548 for a 440GP so having to emulate isel shouldn't be an
issue because there wouldn't be any. (The assembly listing I posted
was generated by gcc targeting 8548.) Anyway, I had measured the
non-isel routines to be faster and that works without illop traps.
> Correct me if I am wrong, the isel instruction first appears
> in PowerPC ISA v2.04 around mid 2007.
isel appeared in 2003 in the e500 (v1) core that is in the MPC8540.
The instruction is Power ISA 2.03 (9/2006).
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-10 11:03 ` Gabriel Paubert
2014-02-10 11:17 ` David Laight
@ 2014-02-10 17:03 ` James Yang
2014-02-11 7:26 ` Gabriel Paubert
1 sibling, 1 reply; 44+ messages in thread
From: James Yang @ 2014-02-10 17:03 UTC (permalink / raw)
To: Gabriel Paubert
Cc: James Yang, Chris Proctor, Stephen N Chivers, linuxppc-dev
On Mon, 10 Feb 2014, Gabriel Paubert wrote:
> On Fri, Feb 07, 2014 at 02:49:40PM -0600, James Yang wrote:
> > On Fri, 7 Feb 2014, Gabriel Paubert wrote:
> >
> > > Hi Stephen,
> > >
> > > On Fri, Feb 07, 2014 at 11:27:57AM +1000, Stephen N Chivers wrote:
> > > > Gabriel Paubert <paubert@iram.es> wrote on 02/06/2014 07:26:37 PM:
> > > >
> > > > > From: Gabriel Paubert <paubert@iram.es>
> > > > > To: Stephen N Chivers <schivers@csc.com.au>
> > > > > Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor <cproctor@csc.com.au>
> > > > > Date: 02/06/2014 07:26 PM
> > > > > Subject: Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
> > > > >
> > > > > On Thu, Feb 06, 2014 at 12:09:00PM +1000, Stephen N Chivers wrote:
> >
> > > > > >
> > > > > > mask = 0;
> > > > > > if (FM & (1 << 0))
> > > > > > mask |= 0x0000000f;
> > > > > > if (FM & (1 << 1))
> > > > > > mask |= 0x000000f0;
> > > > > > if (FM & (1 << 2))
> > > > > > mask |= 0x00000f00;
> > > > > > if (FM & (1 << 3))
> > > > > > mask |= 0x0000f000;
> > > > > > if (FM & (1 << 4))
> > > > > > mask |= 0x000f0000;
> > > > > > if (FM & (1 << 5))
> > > > > > mask |= 0x00f00000;
> > > > > > if (FM & (1 << 6))
> > > > > > mask |= 0x0f000000;
> > > > > > if (FM & (1 << 7))
> > > > > > mask |= 0x90000000;
> > > > > >
> > > > > > With the above mask computation I get consistent results for
> > > > > > both the MPC8548 and MPC7410 boards.
> Ok, if you have measured that method1 is faster than method2, let us go for it.
> I believe method2 would be faster if you had a large out-of-order execution
> window, because more parallelism can be extracted from it, but this is probably
> only true for high end cores, which do not need FPU emulation in the first place.
Yeah, 8548 can issue 2 SFX instructions per cycle which is what the
compiler generated.
> The other place where we can optimize is the generation of FEX. Here is
> my current patch:
>
>
> diff --git a/arch/powerpc/math-emu/mtfsf.c b/arch/powerpc/math-emu/mtfsf.c
> index dbce92e..b57b3fa8 100644
> --- a/arch/powerpc/math-emu/mtfsf.c
> +++ b/arch/powerpc/math-emu/mtfsf.c
> @@ -11,48 +11,35 @@ mtfsf(unsigned int FM, u32 *frB)
> u32 mask;
> u32 fpscr;
>
> - if (FM == 0)
> + if (likely(FM == 0xff))
> + mask = 0xffffffff;
> + else if (unlikely(FM == 0))
> return 0;
> -
> - if (FM == 0xff)
> - mask = 0x9fffffff;
> else {
> - mask = 0;
> - if (FM & (1 << 0))
> - mask |= 0x90000000;
> - if (FM & (1 << 1))
> - mask |= 0x0f000000;
> - if (FM & (1 << 2))
> - mask |= 0x00f00000;
> - if (FM & (1 << 3))
> - mask |= 0x000f0000;
> - if (FM & (1 << 4))
> - mask |= 0x0000f000;
> - if (FM & (1 << 5))
> - mask |= 0x00000f00;
> - if (FM & (1 << 6))
> - mask |= 0x000000f0;
> - if (FM & (1 << 7))
> - mask |= 0x0000000f;
> + mask = (FM & 1);
> + mask |= (FM << 3) & 0x10;
> + mask |= (FM << 6) & 0x100;
> + mask |= (FM << 9) & 0x1000;
> + mask |= (FM << 12) & 0x10000;
> + mask |= (FM << 15) & 0x100000;
> + mask |= (FM << 18) & 0x1000000;
> + mask |= (FM << 21) & 0x10000000;
> + mask *= 15;
Needs to also mask out bits 1 and 2, they aren't to be set from frB.
mask &= 0x9FFFFFFF;
> }
>
> - __FPU_FPSCR &= ~(mask);
> - __FPU_FPSCR |= (frB[1] & mask);
> + fpscr = ((__FPU_FPSCR & ~mask) | (frB[1] & mask)) &
> + ~(FPSCR_VX | FPSCR_FEX);
>
> - __FPU_FPSCR &= ~(FPSCR_VX);
> - if (__FPU_FPSCR & (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI |
> + if (fpscr & (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI |
> FPSCR_VXZDZ | FPSCR_VXIMZ | FPSCR_VXVC |
> FPSCR_VXSOFT | FPSCR_VXSQRT | FPSCR_VXCVI))
> - __FPU_FPSCR |= FPSCR_VX;
> -
> - fpscr = __FPU_FPSCR;
> - fpscr &= ~(FPSCR_FEX);
> - if (((fpscr & FPSCR_VX) && (fpscr & FPSCR_VE)) ||
> - ((fpscr & FPSCR_OX) && (fpscr & FPSCR_OE)) ||
> - ((fpscr & FPSCR_UX) && (fpscr & FPSCR_UE)) ||
> - ((fpscr & FPSCR_ZX) && (fpscr & FPSCR_ZE)) ||
> - ((fpscr & FPSCR_XX) && (fpscr & FPSCR_XE)))
> - fpscr |= FPSCR_FEX;
> + fpscr |= FPSCR_VX;
> +
> + /* The bit order of exception enables and exception status
> + * is the same. Simply shift and mask to check for enabled
> + * exceptions.
> + */
> + if (fpscr & (fpscr >> 22) & 0xf8) fpscr |= FPSCR_FEX;
> __FPU_FPSCR = fpscr;
>
> #ifdef DEBUG
> mtfsf.c | 57 ++++++++++++++++++++++-----------------------------------
> 1 file changed, 22 insertions(+), 35 deletions(-)
>
>
> Notes:
>
> 1) I'm really unsure on whether 0xff is frequent or not. So the likely()
> statement at the beginning may be wrong. Actually, if it is not very likely,
> it might be better to remove the special casef for FM==0xff. A look at
> GCC sources shows that it never generates a mask of 0xff. From glibc
> sources, there vast majority of cases uses 0x1, only isnan() uses 0xff.
Can't handle all cases here.
> 2) it may be better to remove the check for FM==0, after all, the instruction
> efectively becomes a nop, and generating the instruction in the first place
> would be too stupid for words.
Hmm a heavy no-op. I wonder if it is heavier than a sync.
> 3) if might be better to #define the magic constants (22 and 0xf8) used
> in the last if statement.
>
> Gabriel
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: arch/powerpc/math-emu/mtfsf.c - incorrect mask?
2014-02-10 17:03 ` James Yang
@ 2014-02-11 7:26 ` Gabriel Paubert
2014-02-11 20:57 ` Linux-3.14-rc2: Order of serial node compatibles in DTS files Stephen N Chivers
0 siblings, 1 reply; 44+ messages in thread
From: Gabriel Paubert @ 2014-02-11 7:26 UTC (permalink / raw)
To: James Yang; +Cc: Chris Proctor, Stephen N Chivers, linuxppc-dev
Hi James,
On Mon, Feb 10, 2014 at 11:03:07AM -0600, James Yang wrote:
[snipped]
> > Ok, if you have measured that method1 is faster than method2, let us go for it.
> > I believe method2 would be faster if you had a large out-of-order execution
> > window, because more parallelism can be extracted from it, but this is probably
> > only true for high end cores, which do not need FPU emulation in the first place.
>
> Yeah, 8548 can issue 2 SFX instructions per cycle which is what the
> compiler generated.
>
Then it is method1.
>
> > The other place where we can optimize is the generation of FEX. Here is
> > my current patch:
> >
> >
> > diff --git a/arch/powerpc/math-emu/mtfsf.c b/arch/powerpc/math-emu/mtfsf.c
> > index dbce92e..b57b3fa8 100644
> > --- a/arch/powerpc/math-emu/mtfsf.c
> > +++ b/arch/powerpc/math-emu/mtfsf.c
> > @@ -11,48 +11,35 @@ mtfsf(unsigned int FM, u32 *frB)
> > u32 mask;
> > u32 fpscr;
> >
> > - if (FM == 0)
> > + if (likely(FM == 0xff))
> > + mask = 0xffffffff;
> > + else if (unlikely(FM == 0))
> > return 0;
> > -
> > - if (FM == 0xff)
> > - mask = 0x9fffffff;
> > else {
> > - mask = 0;
> > - if (FM & (1 << 0))
> > - mask |= 0x90000000;
> > - if (FM & (1 << 1))
> > - mask |= 0x0f000000;
> > - if (FM & (1 << 2))
> > - mask |= 0x00f00000;
> > - if (FM & (1 << 3))
> > - mask |= 0x000f0000;
> > - if (FM & (1 << 4))
> > - mask |= 0x0000f000;
> > - if (FM & (1 << 5))
> > - mask |= 0x00000f00;
> > - if (FM & (1 << 6))
> > - mask |= 0x000000f0;
> > - if (FM & (1 << 7))
> > - mask |= 0x0000000f;
> > + mask = (FM & 1);
> > + mask |= (FM << 3) & 0x10;
> > + mask |= (FM << 6) & 0x100;
> > + mask |= (FM << 9) & 0x1000;
> > + mask |= (FM << 12) & 0x10000;
> > + mask |= (FM << 15) & 0x100000;
> > + mask |= (FM << 18) & 0x1000000;
> > + mask |= (FM << 21) & 0x10000000;
> > + mask *= 15;
>
>
> Needs to also mask out bits 1 and 2, they aren't to be set from frB.
>
> mask &= 0x9FFFFFFF;
>
>
Look at the following lines:
>
>
> > }
> >
> > - __FPU_FPSCR &= ~(mask);
> > - __FPU_FPSCR |= (frB[1] & mask);
> > + fpscr = ((__FPU_FPSCR & ~mask) | (frB[1] & mask)) &
> > + ~(FPSCR_VX | FPSCR_FEX);
It's here (masking FPSCR_VX and FPSCR_FEX).
Actually the previous code was redundant, it cleared FEX and VX in the
mask computation and later again when recomputing them. Clearing them
once should be enough.
> >
> > - __FPU_FPSCR &= ~(FPSCR_VX);
> > - if (__FPU_FPSCR & (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI |
> > + if (fpscr & (FPSCR_VXSNAN | FPSCR_VXISI | FPSCR_VXIDI |
> > FPSCR_VXZDZ | FPSCR_VXIMZ | FPSCR_VXVC |
> > FPSCR_VXSOFT | FPSCR_VXSQRT | FPSCR_VXCVI))
> > - __FPU_FPSCR |= FPSCR_VX;
> > -
> > - fpscr = __FPU_FPSCR;
> > - fpscr &= ~(FPSCR_FEX);
> > - if (((fpscr & FPSCR_VX) && (fpscr & FPSCR_VE)) ||
> > - ((fpscr & FPSCR_OX) && (fpscr & FPSCR_OE)) ||
> > - ((fpscr & FPSCR_UX) && (fpscr & FPSCR_UE)) ||
> > - ((fpscr & FPSCR_ZX) && (fpscr & FPSCR_ZE)) ||
> > - ((fpscr & FPSCR_XX) && (fpscr & FPSCR_XE)))
> > - fpscr |= FPSCR_FEX;
> > + fpscr |= FPSCR_VX;
> > +
> > + /* The bit order of exception enables and exception status
> > + * is the same. Simply shift and mask to check for enabled
> > + * exceptions.
> > + */
> > + if (fpscr & (fpscr >> 22) & 0xf8) fpscr |= FPSCR_FEX;
> > __FPU_FPSCR = fpscr;
> >
> > #ifdef DEBUG
> > mtfsf.c | 57 ++++++++++++++++++++++-----------------------------------
> > 1 file changed, 22 insertions(+), 35 deletions(-)
> >
> >
> > Notes:
> >
> > 1) I'm really unsure on whether 0xff is frequent or not. So the likely()
> > statement at the beginning may be wrong. Actually, if it is not very likely,
> > it might be better to remove the special casef for FM==0xff. A look at
> > GCC sources shows that it never generates a mask of 0xff. From glibc
> > sources, there vast majority of cases uses 0x1, only isnan() uses 0xff.
>
> Can't handle all cases here.
That's why I would go for the simplest possible code. Conditionals are
expensive and minimizing cache footprint is often the best measure of
performance for infrequently used code. With this in mind, I would get
rid of all the tests for special FM values and rely on the optimized
generic case.
>
> > 2) it may be better to remove the check for FM==0, after all, the instruction
> > effectively becomes a nop, and generating the instruction in the first place
> > would be too stupid for words.
>
> Hmm a heavy no-op. I wonder if it is heavier than a sync.
In theory not. It contains the equivalent of several isync (taking an exception
and returning from it), but not any synchronization wrt the memory accesses.
Gabriel
^ permalink raw reply [flat|nested] 44+ messages in thread
* Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 7:26 ` Gabriel Paubert
@ 2014-02-11 20:57 ` Stephen N Chivers
[not found] ` <OF6F6A0029.3B20EF5B-ONCA257C7C.0071BFDA-CA257C7C.00732AD3-SmukeSwxQOQ@public.gmane.org>
0 siblings, 1 reply; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-11 20:57 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Chris Proctor
I have been trial booting a 3.14-rc2 kernel for a 85xx platform
(dtbImage).
After mounting the root filesystem there are no messages from the init
scripts
and the serial console is not available for login.
In the kernel log messages there is:
of_serial f1004500.serial: Unknown serial port found, ignored.
The serial nodes in boards dts file are specified as:
serial0: serial@4500 {
cell-index = <0>;
device_type = "serial";
compatible = "fsl,ns16550", "ns16550";
reg = <0x4500 0x100>;
clock-frequency = <0>;
interrupts = <0x2a 0x2>;
interrupt-parent = <&mpic>;
};
Reversing the order of the compatible:
compatible = "ns16550", "fsl,ns16550";
restores the serial console.
Linux-3.13 does not have this behaviour.
There are 49 dts files in Linux-3.14-rc2 that have the fsl,ns16550
compatible first.
Stephen Chivers,
CSC Australia Pty. Ltd.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 20:57 ` Linux-3.14-rc2: Order of serial node compatibles in DTS files Stephen N Chivers
@ 2014-02-11 22:33 ` Kumar Gala
0 siblings, 0 replies; 44+ messages in thread
From: Kumar Gala @ 2014-02-11 22:33 UTC (permalink / raw)
To: Stephen N Chivers
Cc: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Chris Proctor, devicetree,
sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w
On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers-znpAAEhiOVUQrrorzV6ljw@public.gmane.org> wrote:
> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
> (dtbImage).
>
> After mounting the root filesystem there are no messages from the init
> scripts
> and the serial console is not available for login.
>
> In the kernel log messages there is:
>
> of_serial f1004500.serial: Unknown serial port found, ignored.
>
> The serial nodes in boards dts file are specified as:
>
> serial0: serial@4500 {
> cell-index = <0>;
> device_type = "serial";
> compatible = "fsl,ns16550", "ns16550";
> reg = <0x4500 0x100>;
> clock-frequency = <0>;
> interrupts = <0x2a 0x2>;
> interrupt-parent = <&mpic>;
> };
>
> Reversing the order of the compatible:
>
> compatible = "ns16550", "fsl,ns16550";
>
> restores the serial console.
>
> Linux-3.13 does not have this behaviour.
>
> There are 49 dts files in Linux-3.14-rc2 that have the fsl,ns16550
> compatible first.
Hmm,
Wondering if this caused the issue:
commit 105353145eafb3ea919f5cdeb652a9d8f270228e
Author: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Date: Tue Dec 3 14:52:00 2013 +0100
OF: base: match each node compatible against all given matches first
- k
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-11 22:33 ` Kumar Gala
0 siblings, 0 replies; 44+ messages in thread
From: Kumar Gala @ 2014-02-11 22:33 UTC (permalink / raw)
To: Stephen N Chivers
Cc: Chris Proctor, sebastian.hesselbarth, linuxppc-dev, devicetree
On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers@csc.com.au> wrote:
> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
> (dtbImage).
>
> After mounting the root filesystem there are no messages from the init
> scripts
> and the serial console is not available for login.
>
> In the kernel log messages there is:
>
> of_serial f1004500.serial: Unknown serial port found, ignored.
>
> The serial nodes in boards dts file are specified as:
>
> serial0: serial@4500 {
> cell-index = <0>;
> device_type = "serial";
> compatible = "fsl,ns16550", "ns16550";
> reg = <0x4500 0x100>;
> clock-frequency = <0>;
> interrupts = <0x2a 0x2>;
> interrupt-parent = <&mpic>;
> };
>
> Reversing the order of the compatible:
>
> compatible = "ns16550", "fsl,ns16550";
>
> restores the serial console.
>
> Linux-3.13 does not have this behaviour.
>
> There are 49 dts files in Linux-3.14-rc2 that have the fsl,ns16550
> compatible first.
Hmm,
Wondering if this caused the issue:
commit 105353145eafb3ea919f5cdeb652a9d8f270228e
Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Date: Tue Dec 3 14:52:00 2013 +0100
OF: base: match each node compatible against all given matches first
- k
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 22:33 ` Kumar Gala
@ 2014-02-11 22:51 ` Sebastian Hesselbarth
-1 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-11 22:51 UTC (permalink / raw)
To: Kumar Gala, Stephen N Chivers
Cc: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Chris Proctor, devicetree,
Arnd Bergmann
On 02/11/2014 11:33 PM, Kumar Gala wrote:
>
> On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers-znpAAEhiOVUQrrorzV6ljw@public.gmane.org> wrote:
>
>> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
>> (dtbImage).
>>
>> After mounting the root filesystem there are no messages from the init
>> scripts
>> and the serial console is not available for login.
>>
>> In the kernel log messages there is:
>>
>> of_serial f1004500.serial: Unknown serial port found, ignored.
>>
>> The serial nodes in boards dts file are specified as:
>>
>> serial0: serial@4500 {
>> cell-index = <0>;
>> device_type = "serial";
>> compatible = "fsl,ns16550", "ns16550";
>> reg = <0x4500 0x100>;
>> clock-frequency = <0>;
>> interrupts = <0x2a 0x2>;
>> interrupt-parent = <&mpic>;
>> };
>>
>> Reversing the order of the compatible:
>>
>> compatible = "ns16550", "fsl,ns16550";
>>
>> restores the serial console.
>>
>> Linux-3.13 does not have this behaviour.
>>
>> There are 49 dts files in Linux-3.14-rc2 that have the fsl,ns16550
>> compatible first.
>
> Hmm,
>
> Wondering if this caused the issue:
>
> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> Author: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> Date: Tue Dec 3 14:52:00 2013 +0100
>
> OF: base: match each node compatible against all given matches first
[adding Arnd on Cc]
Could be. I checked tty/serial/of_serial.c and it does not provide a
compatible for "fsl,ns16550". Does reverting the patch fix the issue
observed?
I don't think the missing compatible is causing it, but of_serial
provides a DT match for .type = "serial" just to fail later on
with the error seen above.
The commit in question reorders of_match_device in a way that match
table order is not relevant anymore. This can cause it to match
.type = "serial" first here.
Rather than touching the commit, I suggest to remove the problematic
.type = "serial" from the match table. It is of no use anyway.
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-11 22:51 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-11 22:51 UTC (permalink / raw)
To: Kumar Gala, Stephen N Chivers
Cc: Chris Proctor, linuxppc-dev, Arnd Bergmann, devicetree
On 02/11/2014 11:33 PM, Kumar Gala wrote:
>
> On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers@csc.com.au> wrote:
>
>> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
>> (dtbImage).
>>
>> After mounting the root filesystem there are no messages from the init
>> scripts
>> and the serial console is not available for login.
>>
>> In the kernel log messages there is:
>>
>> of_serial f1004500.serial: Unknown serial port found, ignored.
>>
>> The serial nodes in boards dts file are specified as:
>>
>> serial0: serial@4500 {
>> cell-index = <0>;
>> device_type = "serial";
>> compatible = "fsl,ns16550", "ns16550";
>> reg = <0x4500 0x100>;
>> clock-frequency = <0>;
>> interrupts = <0x2a 0x2>;
>> interrupt-parent = <&mpic>;
>> };
>>
>> Reversing the order of the compatible:
>>
>> compatible = "ns16550", "fsl,ns16550";
>>
>> restores the serial console.
>>
>> Linux-3.13 does not have this behaviour.
>>
>> There are 49 dts files in Linux-3.14-rc2 that have the fsl,ns16550
>> compatible first.
>
> Hmm,
>
> Wondering if this caused the issue:
>
> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> Date: Tue Dec 3 14:52:00 2013 +0100
>
> OF: base: match each node compatible against all given matches first
[adding Arnd on Cc]
Could be. I checked tty/serial/of_serial.c and it does not provide a
compatible for "fsl,ns16550". Does reverting the patch fix the issue
observed?
I don't think the missing compatible is causing it, but of_serial
provides a DT match for .type = "serial" just to fail later on
with the error seen above.
The commit in question reorders of_match_device in a way that match
table order is not relevant anymore. This can cause it to match
.type = "serial" first here.
Rather than touching the commit, I suggest to remove the problematic
.type = "serial" from the match table. It is of no use anyway.
Sebastian
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 22:51 ` Sebastian Hesselbarth
(?)
@ 2014-02-11 23:38 ` Stephen N Chivers
[not found] ` <OFB203CA90.B048F8AA-ONCA257C7C.0081A816-CA257C7C.0081DCE3-SmukeSwxQOQ@public.gmane.org>
-1 siblings, 1 reply; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-11 23:38 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
linuxppc-dev
Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> wrote on
02/12/2014 09:51:43 AM:
> From: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> To: Kumar Gala <galak@kernel.crashing.org>, Stephen N Chivers
> <schivers@csc.com.au>
> Cc: linuxppc-dev@lists.ozlabs.org, Chris Proctor
> <cproctor@csc.com.au>, devicetree <devicetree@vger.kernel.org>, Arnd
> Bergmann <arnd@arndb.de>
> Date: 02/12/2014 09:51 AM
> Subject: Re: Linux-3.14-rc2: Order of serial node compatibles in DTS
files.
>
> On 02/11/2014 11:33 PM, Kumar Gala wrote:
> >
> > On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers@csc.com.au>
wrote:
> >
> >> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
> >> (dtbImage).
> >>
> >> After mounting the root filesystem there are no messages from the
init
> >> scripts
> >> and the serial console is not available for login.
> >>
> >> In the kernel log messages there is:
> >>
> >> of_serial f1004500.serial: Unknown serial port found, ignored.
> >>
> >> The serial nodes in boards dts file are specified as:
> >>
> >> serial0: serial@4500 {
> >> cell-index = <0>;
> >> device_type = "serial";
> >> compatible = "fsl,ns16550", "ns16550";
> >> reg = <0x4500 0x100>;
> >> clock-frequency = <0>;
> >> interrupts = <0x2a 0x2>;
> >> interrupt-parent = <&mpic>;
> >> };
> >>
> >> Reversing the order of the compatible:
> >>
> >> compatible = "ns16550", "fsl,ns16550";
> >>
> >> restores the serial console.
> >>
> >> Linux-3.13 does not have this behaviour.
> >>
> >> There are 49 dts files in Linux-3.14-rc2 that have the fsl,ns16550
> >> compatible first.
> >
> > Hmm,
> >
> > Wondering if this caused the issue:
> >
> > commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> > Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> > Date: Tue Dec 3 14:52:00 2013 +0100
> >
> > OF: base: match each node compatible against all given matches
first
>
> [adding Arnd on Cc]
>
> Could be. I checked tty/serial/of_serial.c and it does not provide a
> compatible for "fsl,ns16550". Does reverting the patch fix the issue
> observed?
>
> I don't think the missing compatible is causing it, but of_serial
> provides a DT match for .type = "serial" just to fail later on
> with the error seen above.
>
> The commit in question reorders of_match_device in a way that match
> table order is not relevant anymore. This can cause it to match
> .type = "serial" first here.
>
> Rather than touching the commit, I suggest to remove the problematic
> .type = "serial" from the match table. It is of no use anyway.
Deleting the "serial" line from the match table fixes the problem.
I tested it for both orderings of compatible.
>
> Sebastian
Thanks,
Stephen Chivers,
CSC Australia Pty. Ltd.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 22:51 ` Sebastian Hesselbarth
@ 2014-02-11 23:41 ` Scott Wood
-1 siblings, 0 replies; 44+ messages in thread
From: Scott Wood @ 2014-02-11 23:41 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Kumar Gala, Stephen N Chivers, Chris Proctor,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Arnd Bergmann, devicetree
On Tue, 2014-02-11 at 23:51 +0100, Sebastian Hesselbarth wrote:
> On 02/11/2014 11:33 PM, Kumar Gala wrote:
> > Hmm,
> >
> > Wondering if this caused the issue:
> >
> > commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> > Author: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> > Date: Tue Dec 3 14:52:00 2013 +0100
> >
> > OF: base: match each node compatible against all given matches first
>
> [adding Arnd on Cc]
>
> Could be. I checked tty/serial/of_serial.c and it does not provide a
> compatible for "fsl,ns16550". Does reverting the patch fix the issue
> observed?
>
> I don't think the missing compatible is causing it, but of_serial
> provides a DT match for .type = "serial" just to fail later on
> with the error seen above.
>
> The commit in question reorders of_match_device in a way that match
> table order is not relevant anymore. This can cause it to match
> .type = "serial" first here.
>
> Rather than touching the commit, I suggest to remove the problematic
> .type = "serial" from the match table. It is of no use anyway.
Regardless of whether .type = "serial" gets removed, it seems wrong for
of_match_node() to accept a .type-only match (or .name, or anything else
that doesn't involve .compatible) before it accepts a compatible match
other than the first in the compatible property.
-Scott
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-11 23:41 ` Scott Wood
0 siblings, 0 replies; 44+ messages in thread
From: Scott Wood @ 2014-02-11 23:41 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
linuxppc-dev
On Tue, 2014-02-11 at 23:51 +0100, Sebastian Hesselbarth wrote:
> On 02/11/2014 11:33 PM, Kumar Gala wrote:
> > Hmm,
> >
> > Wondering if this caused the issue:
> >
> > commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> > Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> > Date: Tue Dec 3 14:52:00 2013 +0100
> >
> > OF: base: match each node compatible against all given matches first
>
> [adding Arnd on Cc]
>
> Could be. I checked tty/serial/of_serial.c and it does not provide a
> compatible for "fsl,ns16550". Does reverting the patch fix the issue
> observed?
>
> I don't think the missing compatible is causing it, but of_serial
> provides a DT match for .type = "serial" just to fail later on
> with the error seen above.
>
> The commit in question reorders of_match_device in a way that match
> table order is not relevant anymore. This can cause it to match
> .type = "serial" first here.
>
> Rather than touching the commit, I suggest to remove the problematic
> .type = "serial" from the match table. It is of no use anyway.
Regardless of whether .type = "serial" gets removed, it seems wrong for
of_match_node() to accept a .type-only match (or .name, or anything else
that doesn't involve .compatible) before it accepts a compatible match
other than the first in the compatible property.
-Scott
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 23:38 ` Stephen N Chivers
@ 2014-02-11 23:43 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-11 23:43 UTC (permalink / raw)
To: Stephen N Chivers
Cc: Arnd Bergmann, Chris Proctor, devicetree, Kumar Gala,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
[-- Attachment #1: Type: text/plain, Size: 2132 bytes --]
On 02/12/2014 12:38 AM, Stephen N Chivers wrote:
> Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote on
>> On 02/11/2014 11:33 PM, Kumar Gala wrote:
>>> On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers-znpAAEhiOVUQrrorzV6ljw@public.gmane.org> wrote:
>>>> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
>>>> (dtbImage).
[...]
>>>>
>>>> of_serial f1004500.serial: Unknown serial port found, ignored.
>>>>
>>>> The serial nodes in boards dts file are specified as:
>>>>
>>>> serial0: serial@4500 {
>>>> cell-index = <0>;
>>>> device_type = "serial";
>>>> compatible = "fsl,ns16550", "ns16550";
>>>> reg = <0x4500 0x100>;
>>>> clock-frequency = <0>;
>>>> interrupts = <0x2a 0x2>;
>>>> interrupt-parent = <&mpic>;
>>>> };
>>>
>>> Wondering if this caused the issue:
>>>
>>> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
>>> Author: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Date: Tue Dec 3 14:52:00 2013 +0100
>>>
>>> OF: base: match each node compatible against all given matches first
>>
[...]
>>
>> I don't think the missing compatible is causing it, but of_serial
>> provides a DT match for .type = "serial" just to fail later on
>> with the error seen above.
>>
>> The commit in question reorders of_match_device in a way that match
>> table order is not relevant anymore. This can cause it to match
>> .type = "serial" first here.
>>
>> Rather than touching the commit, I suggest to remove the problematic
>> .type = "serial" from the match table. It is of no use anyway.
> Deleting the "serial" line from the match table fixes the problem.
> I tested it for both orderings of compatible.
I revert my statement about removing anything from of_serial.c. Instead
we should try to prefer matches with compatibles over type/name without
compatibles. Something like the patch below (compile tested only)
[-- Attachment #2: of_base_match.patch --]
[-- Type: text/x-patch, Size: 1484 bytes --]
diff --git a/drivers/of/base.c b/drivers/of/base.c
index ff85450d5683..60da53b385ff 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -734,6 +734,7 @@ static
const struct of_device_id *__of_match_node(const struct of_device_id *matches,
const struct device_node *node)
{
+ const struct of_device_id *m;
const char *cp;
int cplen, l;
@@ -742,15 +743,15 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
cp = __of_get_property(node, "compatible", &cplen);
do {
- const struct of_device_id *m = matches;
+ m = matches;
/* Check against matches with current compatible string */
while (m->name[0] || m->type[0] || m->compatible[0]) {
int match = 1;
- if (m->name[0])
+ if (m->name[0] && m->compatible[0])
match &= node->name
&& !strcmp(m->name, node->name);
- if (m->type[0])
+ if (m->type[0] && m->compatible[0])
match &= node->type
&& !strcmp(m->type, node->type);
if (m->compatible[0])
@@ -770,6 +771,21 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
}
} while (cp && (cplen > 0));
+ /* Check against matches without compatible string */
+ m = matches;
+ while (m->name[0] || m->type[0]) {
+ int match = 1;
+ if (m->name[0])
+ match &= node->name
+ && !strcmp(m->name, node->name);
+ if (m->type[0])
+ match &= node->type
+ && !strcmp(m->type, node->type);
+ if (match)
+ return m;
+ m++;
+ }
+
return NULL;
}
^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-11 23:43 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-11 23:43 UTC (permalink / raw)
To: Stephen N Chivers; +Cc: Chris Proctor, linuxppc-dev, Arnd Bergmann, devicetree
[-- Attachment #1: Type: text/plain, Size: 2043 bytes --]
On 02/12/2014 12:38 AM, Stephen N Chivers wrote:
> Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> wrote on
>> On 02/11/2014 11:33 PM, Kumar Gala wrote:
>>> On Feb 11, 2014, at 2:57 PM, Stephen N Chivers <schivers@csc.com.au> wrote:
>>>> I have been trial booting a 3.14-rc2 kernel for a 85xx platform
>>>> (dtbImage).
[...]
>>>>
>>>> of_serial f1004500.serial: Unknown serial port found, ignored.
>>>>
>>>> The serial nodes in boards dts file are specified as:
>>>>
>>>> serial0: serial@4500 {
>>>> cell-index = <0>;
>>>> device_type = "serial";
>>>> compatible = "fsl,ns16550", "ns16550";
>>>> reg = <0x4500 0x100>;
>>>> clock-frequency = <0>;
>>>> interrupts = <0x2a 0x2>;
>>>> interrupt-parent = <&mpic>;
>>>> };
>>>
>>> Wondering if this caused the issue:
>>>
>>> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
>>> Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
>>> Date: Tue Dec 3 14:52:00 2013 +0100
>>>
>>> OF: base: match each node compatible against all given matches first
>>
[...]
>>
>> I don't think the missing compatible is causing it, but of_serial
>> provides a DT match for .type = "serial" just to fail later on
>> with the error seen above.
>>
>> The commit in question reorders of_match_device in a way that match
>> table order is not relevant anymore. This can cause it to match
>> .type = "serial" first here.
>>
>> Rather than touching the commit, I suggest to remove the problematic
>> .type = "serial" from the match table. It is of no use anyway.
> Deleting the "serial" line from the match table fixes the problem.
> I tested it for both orderings of compatible.
I revert my statement about removing anything from of_serial.c. Instead
we should try to prefer matches with compatibles over type/name without
compatibles. Something like the patch below (compile tested only)
[-- Attachment #2: of_base_match.patch --]
[-- Type: text/x-patch, Size: 1484 bytes --]
diff --git a/drivers/of/base.c b/drivers/of/base.c
index ff85450d5683..60da53b385ff 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -734,6 +734,7 @@ static
const struct of_device_id *__of_match_node(const struct of_device_id *matches,
const struct device_node *node)
{
+ const struct of_device_id *m;
const char *cp;
int cplen, l;
@@ -742,15 +743,15 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
cp = __of_get_property(node, "compatible", &cplen);
do {
- const struct of_device_id *m = matches;
+ m = matches;
/* Check against matches with current compatible string */
while (m->name[0] || m->type[0] || m->compatible[0]) {
int match = 1;
- if (m->name[0])
+ if (m->name[0] && m->compatible[0])
match &= node->name
&& !strcmp(m->name, node->name);
- if (m->type[0])
+ if (m->type[0] && m->compatible[0])
match &= node->type
&& !strcmp(m->type, node->type);
if (m->compatible[0])
@@ -770,6 +771,21 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
}
} while (cp && (cplen > 0));
+ /* Check against matches without compatible string */
+ m = matches;
+ while (m->name[0] || m->type[0]) {
+ int match = 1;
+ if (m->name[0])
+ match &= node->name
+ && !strcmp(m->name, node->name);
+ if (m->type[0])
+ match &= node->type
+ && !strcmp(m->type, node->type);
+ if (match)
+ return m;
+ m++;
+ }
+
return NULL;
}
^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 23:41 ` Scott Wood
@ 2014-02-11 23:46 ` Sebastian Hesselbarth
-1 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-11 23:46 UTC (permalink / raw)
To: Scott Wood
Cc: Kumar Gala, Stephen N Chivers, Chris Proctor,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Arnd Bergmann, devicetree
On 02/12/2014 12:41 AM, Scott Wood wrote:
> On Tue, 2014-02-11 at 23:51 +0100, Sebastian Hesselbarth wrote:
>> On 02/11/2014 11:33 PM, Kumar Gala wrote:
>>> Hmm,
>>>
>>> Wondering if this caused the issue:
>>>
>>> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
>>> Author: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>>> Date: Tue Dec 3 14:52:00 2013 +0100
>>>
>>> OF: base: match each node compatible against all given matches first
>>
>> [adding Arnd on Cc]
>>
>> Could be. I checked tty/serial/of_serial.c and it does not provide a
>> compatible for "fsl,ns16550". Does reverting the patch fix the issue
>> observed?
>>
>> I don't think the missing compatible is causing it, but of_serial
>> provides a DT match for .type = "serial" just to fail later on
>> with the error seen above.
>>
>> The commit in question reorders of_match_device in a way that match
>> table order is not relevant anymore. This can cause it to match
>> .type = "serial" first here.
>>
>> Rather than touching the commit, I suggest to remove the problematic
>> .type = "serial" from the match table. It is of no use anyway.
>
> Regardless of whether .type = "serial" gets removed, it seems wrong for
> of_match_node() to accept a .type-only match (or .name, or anything else
> that doesn't involve .compatible) before it accepts a compatible match
> other than the first in the compatible property.
Right, I thought about it and came to the same conclusion. I sent a
patch a second ago to prefer .compatible != NULL matches over those
with .compatible == NULL.
Would be great if Stephen can re-test that. If it solves the issue, I
can send a patch tomorrow.
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-11 23:46 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-11 23:46 UTC (permalink / raw)
To: Scott Wood
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
linuxppc-dev
On 02/12/2014 12:41 AM, Scott Wood wrote:
> On Tue, 2014-02-11 at 23:51 +0100, Sebastian Hesselbarth wrote:
>> On 02/11/2014 11:33 PM, Kumar Gala wrote:
>>> Hmm,
>>>
>>> Wondering if this caused the issue:
>>>
>>> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
>>> Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
>>> Date: Tue Dec 3 14:52:00 2013 +0100
>>>
>>> OF: base: match each node compatible against all given matches first
>>
>> [adding Arnd on Cc]
>>
>> Could be. I checked tty/serial/of_serial.c and it does not provide a
>> compatible for "fsl,ns16550". Does reverting the patch fix the issue
>> observed?
>>
>> I don't think the missing compatible is causing it, but of_serial
>> provides a DT match for .type = "serial" just to fail later on
>> with the error seen above.
>>
>> The commit in question reorders of_match_device in a way that match
>> table order is not relevant anymore. This can cause it to match
>> .type = "serial" first here.
>>
>> Rather than touching the commit, I suggest to remove the problematic
>> .type = "serial" from the match table. It is of no use anyway.
>
> Regardless of whether .type = "serial" gets removed, it seems wrong for
> of_match_node() to accept a .type-only match (or .name, or anything else
> that doesn't involve .compatible) before it accepts a compatible match
> other than the first in the compatible property.
Right, I thought about it and came to the same conclusion. I sent a
patch a second ago to prefer .compatible != NULL matches over those
with .compatible == NULL.
Would be great if Stephen can re-test that. If it solves the issue, I
can send a patch tomorrow.
Sebastian
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 23:46 ` Sebastian Hesselbarth
@ 2014-02-12 0:21 ` Stephen N Chivers
-1 siblings, 0 replies; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-12 0:21 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Arnd Bergmann, Chris Proctor, devicetree, Kumar Gala,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Stephen N Chivers,
Scott Wood
Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote on
02/12/2014 10:46:36 AM:
> From: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> To: Scott Wood <scottwood-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
> Cc: Kumar Gala <galak-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>, Stephen N Chivers
> <schivers-znpAAEhiOVUQrrorzV6ljw@public.gmane.org>, Chris Proctor <cproctor-znpAAEhiOVUQrrorzV6ljw@public.gmane.org>,
> linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
> devicetree <devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
> Date: 02/12/2014 11:04 AM
> Subject: Re: Linux-3.14-rc2: Order of serial node compatibles in DTS
files.
>
> On 02/12/2014 12:41 AM, Scott Wood wrote:
> > On Tue, 2014-02-11 at 23:51 +0100, Sebastian Hesselbarth wrote:
> >> On 02/11/2014 11:33 PM, Kumar Gala wrote:
> >>> Hmm,
> >>>
> >>> Wondering if this caused the issue:
> >>>
> >>> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> >>> Author: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
> >>> Date: Tue Dec 3 14:52:00 2013 +0100
> >>>
> >>> OF: base: match each node compatible against all given matches
first
> >>
> >> [adding Arnd on Cc]
> >>
> >> Could be. I checked tty/serial/of_serial.c and it does not provide a
> >> compatible for "fsl,ns16550". Does reverting the patch fix the issue
> >> observed?
> >>
> >> I don't think the missing compatible is causing it, but of_serial
> >> provides a DT match for .type = "serial" just to fail later on
> >> with the error seen above.
> >>
> >> The commit in question reorders of_match_device in a way that match
> >> table order is not relevant anymore. This can cause it to match
> >> .type = "serial" first here.
> >>
> >> Rather than touching the commit, I suggest to remove the problematic
> >> .type = "serial" from the match table. It is of no use anyway.
> >
> > Regardless of whether .type = "serial" gets removed, it seems wrong
for
> > of_match_node() to accept a .type-only match (or .name, or anything
else
> > that doesn't involve .compatible) before it accepts a compatible match
> > other than the first in the compatible property.
>
> Right, I thought about it and came to the same conclusion. I sent a
> patch a second ago to prefer .compatible != NULL matches over those
> with .compatible == NULL.
>
> Would be great if Stephen can re-test that. If it solves the issue, I
> can send a patch tomorrow.
Done.
But, the Interrupt Controller (MPIC)
goes AWOL and it is down hill from there.
The MPIC is specified in the DTS as:
mpic: pic@40000 {
interrupt-controller;
#address-cells = <0>;
#interrupt-cells = <2>;
reg = <0x40000 0x40000>;
compatible = "chrp,open-pic";
device_type = "open-pic";
big-endian;
};
The board support file has the standard mechanism for allocating
the PIC:
struct mpic *mpic;
mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
BUG_ON(mpic == NULL);
mpic_init(mpic);
I checked for damage in applying the patch and it has applied
correctly.
Stephen Chivers,
CSC Australia Pty. Ltd.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 0:21 ` Stephen N Chivers
0 siblings, 0 replies; 44+ messages in thread
From: Stephen N Chivers @ 2014-02-12 0:21 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
Scott Wood, linuxppc-dev
Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> wrote on
02/12/2014 10:46:36 AM:
> From: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> To: Scott Wood <scottwood@freescale.com>
> Cc: Kumar Gala <galak@kernel.crashing.org>, Stephen N Chivers
> <schivers@csc.com.au>, Chris Proctor <cproctor@csc.com.au>,
> linuxppc-dev@lists.ozlabs.org, Arnd Bergmann <arnd@arndb.de>,
> devicetree <devicetree@vger.kernel.org>
> Date: 02/12/2014 11:04 AM
> Subject: Re: Linux-3.14-rc2: Order of serial node compatibles in DTS
files.
>
> On 02/12/2014 12:41 AM, Scott Wood wrote:
> > On Tue, 2014-02-11 at 23:51 +0100, Sebastian Hesselbarth wrote:
> >> On 02/11/2014 11:33 PM, Kumar Gala wrote:
> >>> Hmm,
> >>>
> >>> Wondering if this caused the issue:
> >>>
> >>> commit 105353145eafb3ea919f5cdeb652a9d8f270228e
> >>> Author: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
> >>> Date: Tue Dec 3 14:52:00 2013 +0100
> >>>
> >>> OF: base: match each node compatible against all given matches
first
> >>
> >> [adding Arnd on Cc]
> >>
> >> Could be. I checked tty/serial/of_serial.c and it does not provide a
> >> compatible for "fsl,ns16550". Does reverting the patch fix the issue
> >> observed?
> >>
> >> I don't think the missing compatible is causing it, but of_serial
> >> provides a DT match for .type = "serial" just to fail later on
> >> with the error seen above.
> >>
> >> The commit in question reorders of_match_device in a way that match
> >> table order is not relevant anymore. This can cause it to match
> >> .type = "serial" first here.
> >>
> >> Rather than touching the commit, I suggest to remove the problematic
> >> .type = "serial" from the match table. It is of no use anyway.
> >
> > Regardless of whether .type = "serial" gets removed, it seems wrong
for
> > of_match_node() to accept a .type-only match (or .name, or anything
else
> > that doesn't involve .compatible) before it accepts a compatible match
> > other than the first in the compatible property.
>
> Right, I thought about it and came to the same conclusion. I sent a
> patch a second ago to prefer .compatible != NULL matches over those
> with .compatible == NULL.
>
> Would be great if Stephen can re-test that. If it solves the issue, I
> can send a patch tomorrow.
Done.
But, the Interrupt Controller (MPIC)
goes AWOL and it is down hill from there.
The MPIC is specified in the DTS as:
mpic: pic@40000 {
interrupt-controller;
#address-cells = <0>;
#interrupt-cells = <2>;
reg = <0x40000 0x40000>;
compatible = "chrp,open-pic";
device_type = "open-pic";
big-endian;
};
The board support file has the standard mechanism for allocating
the PIC:
struct mpic *mpic;
mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
BUG_ON(mpic == NULL);
mpic_init(mpic);
I checked for damage in applying the patch and it has applied
correctly.
Stephen Chivers,
CSC Australia Pty. Ltd.
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 0:21 ` Stephen N Chivers
@ 2014-02-12 5:28 ` Kevin Hao
-1 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 5:28 UTC (permalink / raw)
To: Stephen N Chivers
Cc: Sebastian Hesselbarth, Chris Proctor, Arnd Bergmann, devicetree,
Scott Wood, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
[-- Attachment #1: Type: text/plain, Size: 3128 bytes --]
On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
> But, the Interrupt Controller (MPIC)
> goes AWOL and it is down hill from there.
>
> The MPIC is specified in the DTS as:
>
> mpic: pic@40000 {
> interrupt-controller;
> #address-cells = <0>;
> #interrupt-cells = <2>;
> reg = <0x40000 0x40000>;
> compatible = "chrp,open-pic";
> device_type = "open-pic";
> big-endian;
> };
>
> The board support file has the standard mechanism for allocating
> the PIC:
>
> struct mpic *mpic;
>
> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
> BUG_ON(mpic == NULL);
>
> mpic_init(mpic);
>
> I checked for damage in applying the patch and it has applied
> correctly.
How about the following fix?
diff --git a/drivers/of/base.c b/drivers/of/base.c
index ff85450d5683..ca91984d3c4b 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -730,32 +730,40 @@ out:
}
EXPORT_SYMBOL(of_find_node_with_property);
+static int of_match_type_name(const struct device_node *node,
+ const struct of_device_id *m)
+{
+ int match = 1;
+
+ if (m->name[0])
+ match &= node->name && !strcmp(m->name, node->name);
+
+ if (m->type[0])
+ match &= node->type && !strcmp(m->type, node->type);
+
+ return match;
+}
+
static
const struct of_device_id *__of_match_node(const struct of_device_id *matches,
const struct device_node *node)
{
const char *cp;
int cplen, l;
+ const struct of_device_id *m;
+ int match;
if (!matches)
return NULL;
cp = __of_get_property(node, "compatible", &cplen);
do {
- const struct of_device_id *m = matches;
+ m = matches;
/* Check against matches with current compatible string */
- while (m->name[0] || m->type[0] || m->compatible[0]) {
- int match = 1;
- if (m->name[0])
- match &= node->name
- && !strcmp(m->name, node->name);
- if (m->type[0])
- match &= node->type
- && !strcmp(m->type, node->type);
- if (m->compatible[0])
- match &= cp
- && !of_compat_cmp(m->compatible, cp,
+ while (m->compatible[0]) {
+ match = of_match_type_name(node, m);
+ match &= cp && !of_compat_cmp(m->compatible, cp,
strlen(m->compatible));
if (match)
return m;
@@ -770,6 +778,15 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
}
} while (cp && (cplen > 0));
+ /* Check against matches without compatible string */
+ m = matches;
+ while (!m->compatible[0] && (m->name[0] || m->type[0])) {
+ match = of_match_type_name(node, m);
+ if (match)
+ return m;
+ m++;
+ }
+
return NULL;
}
Thanks,
Kevin
>
> Stephen Chivers,
> CSC Australia Pty. Ltd.
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 5:28 ` Kevin Hao
0 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 5:28 UTC (permalink / raw)
To: Stephen N Chivers
Cc: Chris Proctor, Arnd Bergmann, devicetree, Scott Wood,
linuxppc-dev, Sebastian Hesselbarth
[-- Attachment #1: Type: text/plain, Size: 3105 bytes --]
On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
> But, the Interrupt Controller (MPIC)
> goes AWOL and it is down hill from there.
>
> The MPIC is specified in the DTS as:
>
> mpic: pic@40000 {
> interrupt-controller;
> #address-cells = <0>;
> #interrupt-cells = <2>;
> reg = <0x40000 0x40000>;
> compatible = "chrp,open-pic";
> device_type = "open-pic";
> big-endian;
> };
>
> The board support file has the standard mechanism for allocating
> the PIC:
>
> struct mpic *mpic;
>
> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
> BUG_ON(mpic == NULL);
>
> mpic_init(mpic);
>
> I checked for damage in applying the patch and it has applied
> correctly.
How about the following fix?
diff --git a/drivers/of/base.c b/drivers/of/base.c
index ff85450d5683..ca91984d3c4b 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -730,32 +730,40 @@ out:
}
EXPORT_SYMBOL(of_find_node_with_property);
+static int of_match_type_name(const struct device_node *node,
+ const struct of_device_id *m)
+{
+ int match = 1;
+
+ if (m->name[0])
+ match &= node->name && !strcmp(m->name, node->name);
+
+ if (m->type[0])
+ match &= node->type && !strcmp(m->type, node->type);
+
+ return match;
+}
+
static
const struct of_device_id *__of_match_node(const struct of_device_id *matches,
const struct device_node *node)
{
const char *cp;
int cplen, l;
+ const struct of_device_id *m;
+ int match;
if (!matches)
return NULL;
cp = __of_get_property(node, "compatible", &cplen);
do {
- const struct of_device_id *m = matches;
+ m = matches;
/* Check against matches with current compatible string */
- while (m->name[0] || m->type[0] || m->compatible[0]) {
- int match = 1;
- if (m->name[0])
- match &= node->name
- && !strcmp(m->name, node->name);
- if (m->type[0])
- match &= node->type
- && !strcmp(m->type, node->type);
- if (m->compatible[0])
- match &= cp
- && !of_compat_cmp(m->compatible, cp,
+ while (m->compatible[0]) {
+ match = of_match_type_name(node, m);
+ match &= cp && !of_compat_cmp(m->compatible, cp,
strlen(m->compatible));
if (match)
return m;
@@ -770,6 +778,15 @@ const struct of_device_id *__of_match_node(const struct of_device_id *matches,
}
} while (cp && (cplen > 0));
+ /* Check against matches without compatible string */
+ m = matches;
+ while (!m->compatible[0] && (m->name[0] || m->type[0])) {
+ match = of_match_type_name(node, m);
+ if (match)
+ return m;
+ m++;
+ }
+
return NULL;
}
Thanks,
Kevin
>
> Stephen Chivers,
> CSC Australia Pty. Ltd.
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply related [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 0:21 ` Stephen N Chivers
@ 2014-02-12 8:25 ` Sebastian Hesselbarth
-1 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-12 8:25 UTC (permalink / raw)
To: Stephen N Chivers
Cc: Arnd Bergmann, Chris Proctor, devicetree, Kumar Gala,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ, Scott Wood
On 02/12/2014 01:21 AM, Stephen N Chivers wrote:
> Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote on
> 02/12/2014 10:46:36 AM:
>
>> From: Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>> To: Scott Wood <scottwood-KZfg59tc24xl57MIdRCFDg@public.gmane.org>
>> Cc: Kumar Gala <galak-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>, Stephen N Chivers
>> <schivers-znpAAEhiOVUQrrorzV6ljw@public.gmane.org>, Chris Proctor <cproctor-znpAAEhiOVUQrrorzV6ljw@public.gmane.org>,
>> linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org, Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
>> devicetree <devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
>> Date: 02/12/2014 11:04 AM
>> Subject: Re: Linux-3.14-rc2: Order of serial node compatibles in DTS
> files.
>>
>> On 02/12/2014 12:41 AM, Scott Wood wrote:
>>>
>>> Regardless of whether .type = "serial" gets removed, it seems wrong for
>>> of_match_node() to accept a .type-only match (or .name, or anything else
>>> that doesn't involve .compatible) before it accepts a compatible match
>>> other than the first in the compatible property.
>>
>> Right, I thought about it and came to the same conclusion. I sent a
>> patch a second ago to prefer .compatible != NULL matches over those
>> with .compatible == NULL.
>>
>> Would be great if Stephen can re-test that. If it solves the issue, I
>> can send a patch tomorrow.
> Done.
>
> But, the Interrupt Controller (MPIC)
> goes AWOL and it is down hill from there.
>
> The MPIC is specified in the DTS as:
>
> mpic: pic@40000 {
> interrupt-controller;
> #address-cells = <0>;
> #interrupt-cells = <2>;
> reg = <0x40000 0x40000>;
> compatible = "chrp,open-pic";
> device_type = "open-pic";
> big-endian;
> };
>
> The board support file has the standard mechanism for allocating
> the PIC:
>
> struct mpic *mpic;
>
> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
> BUG_ON(mpic == NULL);
>
> mpic_init(mpic);
>
> I checked for damage in applying the patch and it has applied
> correctly.
Hmm, I did a mistake in the patch:
- while (m->name[0] || m->type[0]) {
+ while (m->compatible[0] || m->name[0] || m->type[0]) {
for the second added match. Otherwise, the matches are not
evaluated down to the sentinel but the loop quits on the first
match table entry without .name and .type set.
Sebastian
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 8:25 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-12 8:25 UTC (permalink / raw)
To: Stephen N Chivers
Cc: Chris Proctor, Arnd Bergmann, devicetree, Scott Wood, linuxppc-dev
On 02/12/2014 01:21 AM, Stephen N Chivers wrote:
> Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> wrote on
> 02/12/2014 10:46:36 AM:
>
>> From: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
>> To: Scott Wood <scottwood@freescale.com>
>> Cc: Kumar Gala <galak@kernel.crashing.org>, Stephen N Chivers
>> <schivers@csc.com.au>, Chris Proctor <cproctor@csc.com.au>,
>> linuxppc-dev@lists.ozlabs.org, Arnd Bergmann <arnd@arndb.de>,
>> devicetree <devicetree@vger.kernel.org>
>> Date: 02/12/2014 11:04 AM
>> Subject: Re: Linux-3.14-rc2: Order of serial node compatibles in DTS
> files.
>>
>> On 02/12/2014 12:41 AM, Scott Wood wrote:
>>>
>>> Regardless of whether .type = "serial" gets removed, it seems wrong for
>>> of_match_node() to accept a .type-only match (or .name, or anything else
>>> that doesn't involve .compatible) before it accepts a compatible match
>>> other than the first in the compatible property.
>>
>> Right, I thought about it and came to the same conclusion. I sent a
>> patch a second ago to prefer .compatible != NULL matches over those
>> with .compatible == NULL.
>>
>> Would be great if Stephen can re-test that. If it solves the issue, I
>> can send a patch tomorrow.
> Done.
>
> But, the Interrupt Controller (MPIC)
> goes AWOL and it is down hill from there.
>
> The MPIC is specified in the DTS as:
>
> mpic: pic@40000 {
> interrupt-controller;
> #address-cells = <0>;
> #interrupt-cells = <2>;
> reg = <0x40000 0x40000>;
> compatible = "chrp,open-pic";
> device_type = "open-pic";
> big-endian;
> };
>
> The board support file has the standard mechanism for allocating
> the PIC:
>
> struct mpic *mpic;
>
> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
> BUG_ON(mpic == NULL);
>
> mpic_init(mpic);
>
> I checked for damage in applying the patch and it has applied
> correctly.
Hmm, I did a mistake in the patch:
- while (m->name[0] || m->type[0]) {
+ while (m->compatible[0] || m->name[0] || m->type[0]) {
for the second added match. Otherwise, the matches are not
evaluated down to the sentinel but the loop quits on the first
match table entry without .name and .type set.
Sebastian
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 5:28 ` Kevin Hao
@ 2014-02-12 8:30 ` Sebastian Hesselbarth
-1 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-12 8:30 UTC (permalink / raw)
To: Kevin Hao, Stephen N Chivers
Cc: Chris Proctor, Arnd Bergmann, devicetree, Scott Wood,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
On 02/12/2014 06:28 AM, Kevin Hao wrote:
> On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
>> But, the Interrupt Controller (MPIC)
>> goes AWOL and it is down hill from there.
>>
>> The MPIC is specified in the DTS as:
>>
>> mpic: pic@40000 {
>> interrupt-controller;
>> #address-cells = <0>;
>> #interrupt-cells = <2>;
>> reg = <0x40000 0x40000>;
>> compatible = "chrp,open-pic";
>> device_type = "open-pic";
>> big-endian;
>> };
>>
>> The board support file has the standard mechanism for allocating
>> the PIC:
>>
>> struct mpic *mpic;
>>
>> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
>> BUG_ON(mpic == NULL);
>>
>> mpic_init(mpic);
>>
>> I checked for damage in applying the patch and it has applied
>> correctly.
>
> How about the following fix?
>
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index ff85450d5683..ca91984d3c4b 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -730,32 +730,40 @@ out:
> }
> EXPORT_SYMBOL(of_find_node_with_property);
>
> +static int of_match_type_name(const struct device_node *node,
> + const struct of_device_id *m)
I am fine with having a sub-function here, but it should rather be
named of_match_type_or_name.
> +{
> + int match = 1;
> +
> + if (m->name[0])
> + match &= node->name && !strcmp(m->name, node->name);
> +
> + if (m->type[0])
> + match &= node->type && !strcmp(m->type, node->type);
> +
> + return match;
> +}
[...]
> + /* Check against matches without compatible string */
> + m = matches;
> + while (!m->compatible[0] && (m->name[0] || m->type[0])) {
We shouldn't check for anything else than the sentinel here.
Although I guess yours will not quit early as mine did but that
way we don't have to worry about it.
Sebastian
> + match = of_match_type_name(node, m);
> + if (match)
> + return m;
> + m++;
> + }
> +
> return NULL;
> }
>
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 8:30 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-12 8:30 UTC (permalink / raw)
To: Kevin Hao, Stephen N Chivers
Cc: Scott Wood, Chris Proctor, linuxppc-dev, Arnd Bergmann, devicetree
On 02/12/2014 06:28 AM, Kevin Hao wrote:
> On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
>> But, the Interrupt Controller (MPIC)
>> goes AWOL and it is down hill from there.
>>
>> The MPIC is specified in the DTS as:
>>
>> mpic: pic@40000 {
>> interrupt-controller;
>> #address-cells = <0>;
>> #interrupt-cells = <2>;
>> reg = <0x40000 0x40000>;
>> compatible = "chrp,open-pic";
>> device_type = "open-pic";
>> big-endian;
>> };
>>
>> The board support file has the standard mechanism for allocating
>> the PIC:
>>
>> struct mpic *mpic;
>>
>> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
>> BUG_ON(mpic == NULL);
>>
>> mpic_init(mpic);
>>
>> I checked for damage in applying the patch and it has applied
>> correctly.
>
> How about the following fix?
>
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index ff85450d5683..ca91984d3c4b 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -730,32 +730,40 @@ out:
> }
> EXPORT_SYMBOL(of_find_node_with_property);
>
> +static int of_match_type_name(const struct device_node *node,
> + const struct of_device_id *m)
I am fine with having a sub-function here, but it should rather be
named of_match_type_or_name.
> +{
> + int match = 1;
> +
> + if (m->name[0])
> + match &= node->name && !strcmp(m->name, node->name);
> +
> + if (m->type[0])
> + match &= node->type && !strcmp(m->type, node->type);
> +
> + return match;
> +}
[...]
> + /* Check against matches without compatible string */
> + m = matches;
> + while (!m->compatible[0] && (m->name[0] || m->type[0])) {
We shouldn't check for anything else than the sentinel here.
Although I guess yours will not quit early as mine did but that
way we don't have to worry about it.
Sebastian
> + match = of_match_type_name(node, m);
> + if (match)
> + return m;
> + m++;
> + }
> +
> return NULL;
> }
>
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 8:30 ` Sebastian Hesselbarth
@ 2014-02-12 10:31 ` Kevin Hao
-1 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 10:31 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Stephen N Chivers, Chris Proctor, Arnd Bergmann, devicetree,
Scott Wood, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
[-- Attachment #1: Type: text/plain, Size: 2551 bytes --]
On Wed, Feb 12, 2014 at 09:30:00AM +0100, Sebastian Hesselbarth wrote:
> On 02/12/2014 06:28 AM, Kevin Hao wrote:
> >On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
> >>But, the Interrupt Controller (MPIC)
> >>goes AWOL and it is down hill from there.
> >>
> >>The MPIC is specified in the DTS as:
> >>
> >> mpic: pic@40000 {
> >> interrupt-controller;
> >> #address-cells = <0>;
> >> #interrupt-cells = <2>;
> >> reg = <0x40000 0x40000>;
> >> compatible = "chrp,open-pic";
> >> device_type = "open-pic";
> >> big-endian;
> >> };
> >>
> >>The board support file has the standard mechanism for allocating
> >>the PIC:
> >>
> >> struct mpic *mpic;
> >>
> >> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
> >> BUG_ON(mpic == NULL);
> >>
> >> mpic_init(mpic);
> >>
> >>I checked for damage in applying the patch and it has applied
> >>correctly.
> >
> >How about the following fix?
> >
> >diff --git a/drivers/of/base.c b/drivers/of/base.c
> >index ff85450d5683..ca91984d3c4b 100644
> >--- a/drivers/of/base.c
> >+++ b/drivers/of/base.c
> >@@ -730,32 +730,40 @@ out:
> > }
> > EXPORT_SYMBOL(of_find_node_with_property);
> >
> >+static int of_match_type_name(const struct device_node *node,
> >+ const struct of_device_id *m)
>
> I am fine with having a sub-function here, but it should rather be
> named of_match_type_or_name.
OK.
>
> >+{
> >+ int match = 1;
> >+
> >+ if (m->name[0])
> >+ match &= node->name && !strcmp(m->name, node->name);
> >+
> >+ if (m->type[0])
> >+ match &= node->type && !strcmp(m->type, node->type);
> >+
> >+ return match;
> >+}
> [...]
> >+ /* Check against matches without compatible string */
> >+ m = matches;
> >+ while (!m->compatible[0] && (m->name[0] || m->type[0])) {
>
> We shouldn't check for anything else than the sentinel here.
> Although I guess yours will not quit early as mine did but that
> way we don't have to worry about it.
Yes, this is still buggy. I will change something like this:
m = matches;
/* Check against matches without compatible string */
while (m->name[0] || m->type[0] || m->compatible[0]) {
if (m->compatible[0]) {
m++;
continue;
}
match = of_match_type_name(node, m);
if (match)
return m;
m++;
}
Thanks,
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 10:31 ` Kevin Hao
0 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 10:31 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
Scott Wood, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 2551 bytes --]
On Wed, Feb 12, 2014 at 09:30:00AM +0100, Sebastian Hesselbarth wrote:
> On 02/12/2014 06:28 AM, Kevin Hao wrote:
> >On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
> >>But, the Interrupt Controller (MPIC)
> >>goes AWOL and it is down hill from there.
> >>
> >>The MPIC is specified in the DTS as:
> >>
> >> mpic: pic@40000 {
> >> interrupt-controller;
> >> #address-cells = <0>;
> >> #interrupt-cells = <2>;
> >> reg = <0x40000 0x40000>;
> >> compatible = "chrp,open-pic";
> >> device_type = "open-pic";
> >> big-endian;
> >> };
> >>
> >>The board support file has the standard mechanism for allocating
> >>the PIC:
> >>
> >> struct mpic *mpic;
> >>
> >> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
> >> BUG_ON(mpic == NULL);
> >>
> >> mpic_init(mpic);
> >>
> >>I checked for damage in applying the patch and it has applied
> >>correctly.
> >
> >How about the following fix?
> >
> >diff --git a/drivers/of/base.c b/drivers/of/base.c
> >index ff85450d5683..ca91984d3c4b 100644
> >--- a/drivers/of/base.c
> >+++ b/drivers/of/base.c
> >@@ -730,32 +730,40 @@ out:
> > }
> > EXPORT_SYMBOL(of_find_node_with_property);
> >
> >+static int of_match_type_name(const struct device_node *node,
> >+ const struct of_device_id *m)
>
> I am fine with having a sub-function here, but it should rather be
> named of_match_type_or_name.
OK.
>
> >+{
> >+ int match = 1;
> >+
> >+ if (m->name[0])
> >+ match &= node->name && !strcmp(m->name, node->name);
> >+
> >+ if (m->type[0])
> >+ match &= node->type && !strcmp(m->type, node->type);
> >+
> >+ return match;
> >+}
> [...]
> >+ /* Check against matches without compatible string */
> >+ m = matches;
> >+ while (!m->compatible[0] && (m->name[0] || m->type[0])) {
>
> We shouldn't check for anything else than the sentinel here.
> Although I guess yours will not quit early as mine did but that
> way we don't have to worry about it.
Yes, this is still buggy. I will change something like this:
m = matches;
/* Check against matches without compatible string */
while (m->name[0] || m->type[0] || m->compatible[0]) {
if (m->compatible[0]) {
m++;
continue;
}
match = of_match_type_name(node, m);
if (match)
return m;
m++;
}
Thanks,
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 8:25 ` Sebastian Hesselbarth
@ 2014-02-12 10:35 ` Kevin Hao
-1 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 10:35 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Stephen N Chivers, Chris Proctor, Arnd Bergmann, devicetree,
Scott Wood, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
[-- Attachment #1: Type: text/plain, Size: 710 bytes --]
On Wed, Feb 12, 2014 at 09:25:24AM +0100, Sebastian Hesselbarth wrote:
>
> Hmm, I did a mistake in the patch:
>
> - while (m->name[0] || m->type[0]) {
> + while (m->compatible[0] || m->name[0] || m->type[0]) {
>
> for the second added match. Otherwise, the matches are not
> evaluated down to the sentinel but the loop quits on the first
> match table entry without .name and .type set.
But this is still not right. We also need to skip the matches with
.compatible here.
Thanks,
Kevin
>
> Sebastian
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ@public.gmane.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 10:35 ` Kevin Hao
0 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 10:35 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
Scott Wood, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 687 bytes --]
On Wed, Feb 12, 2014 at 09:25:24AM +0100, Sebastian Hesselbarth wrote:
>
> Hmm, I did a mistake in the patch:
>
> - while (m->name[0] || m->type[0]) {
> + while (m->compatible[0] || m->name[0] || m->type[0]) {
>
> for the second added match. Otherwise, the matches are not
> evaluated down to the sentinel but the loop quits on the first
> match table entry without .name and .type set.
But this is still not right. We also need to skip the matches with
.compatible here.
Thanks,
Kevin
>
> Sebastian
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-11 23:43 ` Sebastian Hesselbarth
@ 2014-02-12 11:00 ` Arnd Bergmann
-1 siblings, 0 replies; 44+ messages in thread
From: Arnd Bergmann @ 2014-02-12 11:00 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Stephen N Chivers, Chris Proctor, devicetree, Kumar Gala,
linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
On Wednesday 12 February 2014, Sebastian Hesselbarth wrote:
> On 02/12/2014 12:38 AM, Stephen N Chivers wrote:
> > Sebastian Hesselbarth <sebastian.hesselbarth-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote on
> >> I don't think the missing compatible is causing it, but of_serial
> >> provides a DT match for .type = "serial" just to fail later on
> >> with the error seen above.
> >>
> >> The commit in question reorders of_match_device in a way that match
> >> table order is not relevant anymore. This can cause it to match
> >> .type = "serial" first here.
> >>
> >> Rather than touching the commit, I suggest to remove the problematic
> >> .type = "serial" from the match table. It is of no use anyway.
> > Deleting the "serial" line from the match table fixes the problem.
> > I tested it for both orderings of compatible.
>
> I revert my statement about removing anything from of_serial.c. Instead
> we should try to prefer matches with compatibles over type/name without
> compatibles. Something like the patch below (compile tested only)
That would probably be a good idea. However, I think in this
case we also want to remove the line from the driver, as it clearly
never works on any hardware and the driver just errors out for the
device_type match.
Arnd
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 11:00 ` Arnd Bergmann
0 siblings, 0 replies; 44+ messages in thread
From: Arnd Bergmann @ 2014-02-12 11:00 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, linuxppc-dev, Stephen N Chivers, devicetree
On Wednesday 12 February 2014, Sebastian Hesselbarth wrote:
> On 02/12/2014 12:38 AM, Stephen N Chivers wrote:
> > Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com> wrote on
> >> I don't think the missing compatible is causing it, but of_serial
> >> provides a DT match for .type = "serial" just to fail later on
> >> with the error seen above.
> >>
> >> The commit in question reorders of_match_device in a way that match
> >> table order is not relevant anymore. This can cause it to match
> >> .type = "serial" first here.
> >>
> >> Rather than touching the commit, I suggest to remove the problematic
> >> .type = "serial" from the match table. It is of no use anyway.
> > Deleting the "serial" line from the match table fixes the problem.
> > I tested it for both orderings of compatible.
>
> I revert my statement about removing anything from of_serial.c. Instead
> we should try to prefer matches with compatibles over type/name without
> compatibles. Something like the patch below (compile tested only)
That would probably be a good idea. However, I think in this
case we also want to remove the line from the driver, as it clearly
never works on any hardware and the driver just errors out for the
device_type match.
Arnd
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 10:31 ` Kevin Hao
@ 2014-02-12 11:26 ` Sebastian Hesselbarth
-1 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-12 11:26 UTC (permalink / raw)
To: Kevin Hao
Cc: Stephen N Chivers, Chris Proctor, Arnd Bergmann, devicetree,
Scott Wood, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
On 02/12/14 11:31, Kevin Hao wrote:
> On Wed, Feb 12, 2014 at 09:30:00AM +0100, Sebastian Hesselbarth wrote:
>> On 02/12/2014 06:28 AM, Kevin Hao wrote:
>>> On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
>>>> But, the Interrupt Controller (MPIC)
>>>> goes AWOL and it is down hill from there.
>>>>
>>>> The MPIC is specified in the DTS as:
>>>>
>>>> mpic: pic@40000 {
>>>> interrupt-controller;
>>>> #address-cells = <0>;
>>>> #interrupt-cells = <2>;
>>>> reg = <0x40000 0x40000>;
>>>> compatible = "chrp,open-pic";
>>>> device_type = "open-pic";
>>>> big-endian;
>>>> };
>>>>
>>>> The board support file has the standard mechanism for allocating
>>>> the PIC:
>>>>
>>>> struct mpic *mpic;
>>>>
>>>> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
>>>> BUG_ON(mpic == NULL);
>>>>
>>>> mpic_init(mpic);
>>>>
>>>> I checked for damage in applying the patch and it has applied
>>>> correctly.
>>>
>>> How about the following fix?
>>>
>>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>>> index ff85450d5683..ca91984d3c4b 100644
>>> --- a/drivers/of/base.c
>>> +++ b/drivers/of/base.c
>>> @@ -730,32 +730,40 @@ out:
>>> }
>>> EXPORT_SYMBOL(of_find_node_with_property);
>>>
>>> +static int of_match_type_name(const struct device_node *node,
>>> + const struct of_device_id *m)
>>
>> I am fine with having a sub-function here, but it should rather be
>> named of_match_type_or_name.
>
> OK.
>
>>
>>> +{
>>> + int match = 1;
>>> +
>>> + if (m->name[0])
>>> + match &= node->name && !strcmp(m->name, node->name);
>>> +
>>> + if (m->type[0])
>>> + match &= node->type && !strcmp(m->type, node->type);
>>> +
>>> + return match;
>>> +}
>> [...]
>>> + /* Check against matches without compatible string */
>>> + m = matches;
>>> + while (!m->compatible[0] && (m->name[0] || m->type[0])) {
>>
>> We shouldn't check for anything else than the sentinel here.
>> Although I guess yours will not quit early as mine did but that
>> way we don't have to worry about it.
>
> Yes, this is still buggy. I will change something like this:
>
> m = matches;
> /* Check against matches without compatible string */
> while (m->name[0] || m->type[0] || m->compatible[0]) {
> if (m->compatible[0]) {
> m++;
> continue;
> }
>
> match = of_match_type_name(node, m);
> if (match)
> return m;
> m++;
> }
You can cook it down to:
m = matches;
/* Check against matches without compatible string */
while (m->name[0] || m->type[0] || m->compatible[0]) {
if (!m->compatible[0] && of_match_type_or_name(node, m)
return m;
m++;
}
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 11:26 ` Sebastian Hesselbarth
0 siblings, 0 replies; 44+ messages in thread
From: Sebastian Hesselbarth @ 2014-02-12 11:26 UTC (permalink / raw)
To: Kevin Hao
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
Scott Wood, linuxppc-dev
On 02/12/14 11:31, Kevin Hao wrote:
> On Wed, Feb 12, 2014 at 09:30:00AM +0100, Sebastian Hesselbarth wrote:
>> On 02/12/2014 06:28 AM, Kevin Hao wrote:
>>> On Wed, Feb 12, 2014 at 10:21:58AM +1000, Stephen N Chivers wrote:
>>>> But, the Interrupt Controller (MPIC)
>>>> goes AWOL and it is down hill from there.
>>>>
>>>> The MPIC is specified in the DTS as:
>>>>
>>>> mpic: pic@40000 {
>>>> interrupt-controller;
>>>> #address-cells = <0>;
>>>> #interrupt-cells = <2>;
>>>> reg = <0x40000 0x40000>;
>>>> compatible = "chrp,open-pic";
>>>> device_type = "open-pic";
>>>> big-endian;
>>>> };
>>>>
>>>> The board support file has the standard mechanism for allocating
>>>> the PIC:
>>>>
>>>> struct mpic *mpic;
>>>>
>>>> mpic = mpic_alloc(NULL, 0, 0, 0, 256, " OpenPIC ");
>>>> BUG_ON(mpic == NULL);
>>>>
>>>> mpic_init(mpic);
>>>>
>>>> I checked for damage in applying the patch and it has applied
>>>> correctly.
>>>
>>> How about the following fix?
>>>
>>> diff --git a/drivers/of/base.c b/drivers/of/base.c
>>> index ff85450d5683..ca91984d3c4b 100644
>>> --- a/drivers/of/base.c
>>> +++ b/drivers/of/base.c
>>> @@ -730,32 +730,40 @@ out:
>>> }
>>> EXPORT_SYMBOL(of_find_node_with_property);
>>>
>>> +static int of_match_type_name(const struct device_node *node,
>>> + const struct of_device_id *m)
>>
>> I am fine with having a sub-function here, but it should rather be
>> named of_match_type_or_name.
>
> OK.
>
>>
>>> +{
>>> + int match = 1;
>>> +
>>> + if (m->name[0])
>>> + match &= node->name && !strcmp(m->name, node->name);
>>> +
>>> + if (m->type[0])
>>> + match &= node->type && !strcmp(m->type, node->type);
>>> +
>>> + return match;
>>> +}
>> [...]
>>> + /* Check against matches without compatible string */
>>> + m = matches;
>>> + while (!m->compatible[0] && (m->name[0] || m->type[0])) {
>>
>> We shouldn't check for anything else than the sentinel here.
>> Although I guess yours will not quit early as mine did but that
>> way we don't have to worry about it.
>
> Yes, this is still buggy. I will change something like this:
>
> m = matches;
> /* Check against matches without compatible string */
> while (m->name[0] || m->type[0] || m->compatible[0]) {
> if (m->compatible[0]) {
> m++;
> continue;
> }
>
> match = of_match_type_name(node, m);
> if (match)
> return m;
> m++;
> }
You can cook it down to:
m = matches;
/* Check against matches without compatible string */
while (m->name[0] || m->type[0] || m->compatible[0]) {
if (!m->compatible[0] && of_match_type_or_name(node, m)
return m;
m++;
}
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
2014-02-12 11:26 ` Sebastian Hesselbarth
@ 2014-02-12 11:32 ` Kevin Hao
-1 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 11:32 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Stephen N Chivers, Chris Proctor, Arnd Bergmann, devicetree,
Scott Wood, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
[-- Attachment #1: Type: text/plain, Size: 361 bytes --]
On Wed, Feb 12, 2014 at 12:26:14PM +0100, Sebastian Hesselbarth wrote:
> You can cook it down to:
>
> m = matches;
> /* Check against matches without compatible string */
> while (m->name[0] || m->type[0] || m->compatible[0]) {
> if (!m->compatible[0] && of_match_type_or_name(node, m)
> return m;
> m++;
> }
Will do.
Thanks,
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Linux-3.14-rc2: Order of serial node compatibles in DTS files.
@ 2014-02-12 11:32 ` Kevin Hao
0 siblings, 0 replies; 44+ messages in thread
From: Kevin Hao @ 2014-02-12 11:32 UTC (permalink / raw)
To: Sebastian Hesselbarth
Cc: Chris Proctor, Arnd Bergmann, devicetree, Stephen N Chivers,
Scott Wood, linuxppc-dev
[-- Attachment #1: Type: text/plain, Size: 361 bytes --]
On Wed, Feb 12, 2014 at 12:26:14PM +0100, Sebastian Hesselbarth wrote:
> You can cook it down to:
>
> m = matches;
> /* Check against matches without compatible string */
> while (m->name[0] || m->type[0] || m->compatible[0]) {
> if (!m->compatible[0] && of_match_type_or_name(node, m)
> return m;
> m++;
> }
Will do.
Thanks,
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 490 bytes --]
^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread, other threads:[~2014-02-12 11:33 UTC | newest]
Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-02-06 2:09 arch/powerpc/math-emu/mtfsf.c - incorrect mask? Stephen N Chivers
2014-02-06 8:26 ` Gabriel Paubert
2014-02-07 1:27 ` Stephen N Chivers
2014-02-07 10:10 ` Gabriel Paubert
2014-02-07 20:49 ` James Yang
2014-02-09 19:42 ` Stephen N Chivers
2014-02-10 16:50 ` James Yang
2014-02-10 11:03 ` Gabriel Paubert
2014-02-10 11:17 ` David Laight
2014-02-10 12:21 ` Gabriel Paubert
2014-02-10 12:32 ` David Laight
2014-02-10 13:00 ` Gabriel Paubert
2014-02-10 17:03 ` James Yang
2014-02-11 7:26 ` Gabriel Paubert
2014-02-11 20:57 ` Linux-3.14-rc2: Order of serial node compatibles in DTS files Stephen N Chivers
[not found] ` <OF6F6A0029.3B20EF5B-ONCA257C7C.0071BFDA-CA257C7C.00732AD3-SmukeSwxQOQ@public.gmane.org>
2014-02-11 22:33 ` Kumar Gala
2014-02-11 22:33 ` Kumar Gala
[not found] ` <63AEBD99-AA87-4FD7-BBDA-0CE419959F14-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
2014-02-11 22:51 ` Sebastian Hesselbarth
2014-02-11 22:51 ` Sebastian Hesselbarth
2014-02-11 23:38 ` Stephen N Chivers
[not found] ` <OFB203CA90.B048F8AA-ONCA257C7C.0081A816-CA257C7C.0081DCE3-SmukeSwxQOQ@public.gmane.org>
2014-02-11 23:43 ` Sebastian Hesselbarth
2014-02-11 23:43 ` Sebastian Hesselbarth
[not found] ` <52FAB5A7.7080208-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-02-12 11:00 ` Arnd Bergmann
2014-02-12 11:00 ` Arnd Bergmann
[not found] ` <52FAA97F.4060600-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-02-11 23:41 ` Scott Wood
2014-02-11 23:41 ` Scott Wood
[not found] ` <1392162080.6733.404.camel-88ow+0ZRuxG2UiBs7uKeOtHuzzzSOjJt@public.gmane.org>
2014-02-11 23:46 ` Sebastian Hesselbarth
2014-02-11 23:46 ` Sebastian Hesselbarth
[not found] ` <52FAB65C.4090201-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-02-12 0:21 ` Stephen N Chivers
2014-02-12 0:21 ` Stephen N Chivers
[not found] ` <OF9BA019D5.46DA1E29-ONCA257C7D.000089EA-CA257C7D.00020333-SmukeSwxQOQ@public.gmane.org>
2014-02-12 5:28 ` Kevin Hao
2014-02-12 5:28 ` Kevin Hao
[not found] ` <20140212052816.GA15434-2Y/eJ/Q4tBKYoDLmhA0N63T7OC0tnCxdQQ4Iyu8u01E@public.gmane.org>
2014-02-12 8:30 ` Sebastian Hesselbarth
2014-02-12 8:30 ` Sebastian Hesselbarth
[not found] ` <52FB3108.3000202-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-02-12 10:31 ` Kevin Hao
2014-02-12 10:31 ` Kevin Hao
[not found] ` <20140212103153.GC15434-2Y/eJ/Q4tBKYoDLmhA0N63T7OC0tnCxdQQ4Iyu8u01E@public.gmane.org>
2014-02-12 11:26 ` Sebastian Hesselbarth
2014-02-12 11:26 ` Sebastian Hesselbarth
[not found] ` <52FB5A56.2050402-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-02-12 11:32 ` Kevin Hao
2014-02-12 11:32 ` Kevin Hao
2014-02-12 8:25 ` Sebastian Hesselbarth
2014-02-12 8:25 ` Sebastian Hesselbarth
[not found] ` <52FB2FF4.6060905-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2014-02-12 10:35 ` Kevin Hao
2014-02-12 10:35 ` Kevin Hao
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.