* [OpenRISC] PCCR and PCRM registers
       [not found] <009d01d58416$7125fe80$5371fb80$@2se.es>
@ 2019-10-17  5:13 ` Stafford Horne
  2019-10-17 15:01   ` ecalvo
  0 siblings, 1 reply; 7+ messages in thread
From: Stafford Horne @ 2019-10-17  5:13 UTC (permalink / raw)
  To: openrisc

+cc mailing list,

Hi Elisa,

Which toolchain are you using? I guess newlib?

It has functions like or1k_mfspr() and or1k_mtspr(); see the or1k-support.h and
or1k-sprs.h headers for details.

-Stafford

On Wed, Oct 16, 2019, 8:40 PM <ecalvo@2se.es> wrote:

> Hi Stafford,
>
> I am working with the PCCR and PCMR registers. I have seen that I can access
> them from assembly, but are there functions to access them from C? Do you
> have any examples of their usage?
>
> I have already confirmed my subscription to the mailing list.
>
> Thanks
> Elisa
>
> -----Original Message-----
> From: Stafford Horne <shorne@gmail.com>
> Sent: Wednesday, October 9, 2019 13:38
> To: ecalvo at 2se.es; Julius Baxter <juliusbaxter@gmail.com>
> Subject: Re: other doubt
>
> Hello Elisa,
>
> If you simulate with Icarus or modelsim you will be able to measure pretty
> much the same performance characteristics as FPGA. So there is no need to
> go straight to FPGA.
>
> As for my example, C code is one option.  You can also read timer data
> directly from the tick timer in assembly and achieve the same thing.
>
> If you are interested we can CC the mailing list and get more opinions.
>
> -Stafford
>
> On Wed, Oct 9, 2019 at 5:09 PM <ecalvo@2se.es> wrote:
> >
> > Hi Stafford,
> >
> > Nice to meet you and, first of all, thanks a lot for your guidance. I am
> > new to this, and although there is some documentation, some points are
> > sometimes difficult for me, even if they may be basic.
> >
> > OK, on to your comments. If "a simulator like QEMU or or1ksim will not
> > give an exact representation of the CPU's real-time performance", then if
> > I simulate the processor directly with ModelSim, Icarus or a similar tool,
> > I do not get real performance either, do I? And the values of the counters
> > that you tell me to enable are not real either, are they? Should I run it
> > directly on the FPGA, and will the result depend on the implementation?
> >
> > OK, on to the C code. I have understood the dependency on the toolchain.
> >
> > Thanks a lot again.
> > Best regards,
> > Elisa
> >
> >
> > -----Original Message-----
> > From: Stafford Horne <shorne@gmail.com>
> > Sent: Tuesday, October 8, 2019 16:18
> > To: Julius Baxter <juliusbaxter@gmail.com>
> > CC: ecalvo at 2se.es
> > Subject: Re: other doubt
> >
> > Hi Elisa,
> >
> > OpenRISC CPUs can run any algorithm, but how well it performs depends
> > on many things:
> >
> >   - Compiler optimization flags (e.g. -O3)
> >   - Whether or not you are doing FPU instructions and have FPU enabled
> >   - Whether or not you use multiply and divide and have these
> >     instructions enabled
> >   - The frequency you are running at
> >   - Cache settings (Icache, Dcache)
> >   - The type of algorithm: does it require lots of data, which will
> >     cause many cache misses?
> >
> > A simulator like QEMU or or1ksim will not give an exact representation
> > of the CPU's real-time performance.  It can tell you which instructions
> > will be executed, but not how fast those will run or how many pipeline
> > stalls or cache misses will happen.
> >
> > You can use the performance counters; they are supported in mor1kx if
> > you enable them with the FEATURE_PERFCOUNTERS='ENABLED' parameter.  They
> > can count how many events of a given kind happen between two other
> > events.  You can then combine them with a timer and watchpoints to
> > detect, for example, how many times a loop can execute in 1000 clock
> > cycles.  Please read about PCCRn and PCMRn in the architecture manual.
> >
> > It might be just as easy to use simple timing in a C program, though.
> > Depending on the toolchain you use, you can compare times between runs
> > of your algorithm, e.g.:
> >
> >     #include <time.h>
> >     #include <stdio.h>
> >
> >     /* Convert a timespec to microseconds. */
> >     static long to_micro(struct timespec *time) {
> >       return (time->tv_sec * 1000000) + (time->tv_nsec / 1000);
> >     }
> >
> >     int main() {
> >       /* volatile keeps the loop from being optimized away; unsigned
> >          makes the overflow in the multiply well defined. */
> >       volatile unsigned int j = 0;
> >
> >       struct timespec before, after;
> >
> >       clock_gettime(CLOCK_MONOTONIC, &before);
> >       /* Super complex algorithm */
> >       for (int i = 0; i < 100; i++) {
> >         j = (j+1) * (j+2);
> >       }
> >       clock_gettime(CLOCK_MONOTONIC, &after);
> >
> >       printf("time to run algorithm %ld uSecs\n",
> >              to_micro(&after) - to_micro(&before));
> >
> >       return 0;
> >     }
> >
> > $ or1k-smh-linux-gnu-gcc timer.c
> > $ ./glibc-build-scripts/qemu-or1k-libc ./a.out
> > time to run algorithm 164 uSecs
> >
> > I hope it helps.
> >
> > -Stafford
> >
> > On Tue, Oct 08, 2019 at 10:54:29PM +1100, Julius Baxter wrote:
> > > Hi,
> > >
> > > No problem.
> > >
> > > There are performance counters in the OpenRISC architecture but
> > > whether they're implemented in a particular implementation is another
> > > matter.
> > >
> > > You can use these registers to measure various things the CPU is
> > > doing while it's executing. If you read the ISA document it'll tell
> > > you about them.
> > >
> > > I'm CCing Stafford because he's the main OpenRISC man these days and
> > > probably knows about the state of the performance counter registers
> > > in various simulators and RTL implementations.
> > >
> > > Cheers,
> > > Julius
> > >
> > > On Tue., 8 Oct. 2019, 10:43 pm , <ecalvo@2se.es> wrote:
> > >
> > > > Hi Julius,
> > > >
> > > >
> > > >
> > > > > Sorry for bothering you again ☹. May I ask you another quick
> > > > > question related to OpenRISC? If not, please ignore this email.
> > > >
> > > >
> > > >
> > > > > Is there any way to characterize the type of application that I
> > > > > can run on OpenRISC? I mean, can one measure (with numbers) whether
> > > > > an algorithm can be executed on it and the speed that it will
> > > > > achieve? Is it possible to do this using or1ksim?
> > > >
> > > >
> > > >
> > > > > Sorry if this is too basic and general ☹
> > > >
> > > >
> > > >
> > > > Thanks in advance
> > > >
> > > > Elisa
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > *From:* Julius Baxter <juliusbaxter@gmail.com> *Sent:* Monday,
> > > > > September 16, 2019 13:11
> > > > > *To:* ecalvo at 2se.es
> > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > >
> > > >
> > > >
> > > > Also! To let you know, if you're in Spain, we will soon be having
> > > > our ORConf conference in Europe, and it's in Bordeaux, France,
> > > > just across the border. :)
> > > >
> > > >
> > > >
> > > > > There are several people there who can help you get up to speed,
> > > > > one of whom is Stafford Horne who knows most about the OpenRISC IP
> > > > > lately. He will be presenting. If you can attend, it'd be helpful,
> > > > > I'm sure.
> > > >
> > > >
> > > >
> > > > All info at https://orconf.org
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Julius
> > > >
> > > >
> > > >
> > > > On Mon, 16 Sep 2019 at 21:09, Julius Baxter
> > > > <juliusbaxter@gmail.com>
> > > > wrote:
> > > >
> > > > Hi Elisa,
> > > >
> > > >
> > > >
> > > > Sorry for the delay in this response.
> > > >
> > > >
> > > >
> > > > You should be using an SoC toplevel. FPGAs have everything you
> > > > need on board like memories and IO blocks and lots of other FPGA
> > > > fabric for you to implement other pieces of hardware.
> > > >
> > > >
> > > >
> > > > FuseSoC provides a really nice and easy way to build an mor1kx
> > > > design for the DE0 nano I believe:
> > > >
> > > >
> > > >
> > > > https://github.com/olofk/de0_nano
> > > >
> > > >
> > > >
> > > > That github page has a rough guide to getting it going.
> > > >
> > > >
> > > >
> > > > > If you need help I recommend posting to the OpenRISC mailing list,
> > > > > where people will probably respond more promptly than I can. (I
> > > > > recommend getting to know how to use mailing lists:
> > > > > https://openrisc.io/community)
> > > >
> > > >
> > > >
> > > > There are more resources here: https://openrisc.io/tutorials
> > > >
> > > >
> > > >
> > > > I hope that's helpful.
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Julius
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, 11 Sep 2019 at 20:09, <ecalvo@2se.es> wrote:
> > > >
> > > > Hi Julius,
> > > >
> > > >
> > > >
> > > > Thanks a lot for the quick answer.
> > > >
> > > >
> > > >
> > > > > Yes, this is the problem: I am using the mor1kx module itself as
> > > > > the top level. You mean that I also need to synthesize these cores
> > > > > in reconfigurable logic, don’t you? I thought that I could have
> > > > > these elements as external components on a development board.
> > > >
> > > >
> > > >
> > > > Thanks again,
> > > >
> > > > Cheers
> > > >
> > > > Elisa
> > > >
> > > >
> > > >
> > > > > *From:* Julius Baxter <juliusbaxter@gmail.com> *Sent:*
> > > > > Wednesday, September 11, 2019 12:02
> > > > > *To:* ecalvo at 2se.es
> > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > >
> > > >
> > > >
> > > > Hi Elisa,
> > > >
> > > >
> > > >
> > > > Thanks for getting in touch, that sounds like a cool project.
> > > >
> > > >
> > > >
> > > > Can you tell me about the toplevel - are you using a system
> > > > toplevel, or is your toplevel the mor1kx module itself?
> > > >
> > > >
> > > >
> > > > > If it's the latter, then that's not the best way to do it - you
> > > > > need a system toplevel which instantiates memories and some reset
> > > > > circuitry and likely some IO (UART, GPIO, JTAG debug, etc.) to talk
> > > > > to the outside world.
> > > >
> > > >
> > > >
> > > > Is that helpful?
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Julius
> > > >
> > > >
> > > >
> > > > On Wed, 11 Sep 2019 at 19:47, <ecalvo@2se.es> wrote:
> > > >
> > > > Dear Dr. Baxter,
> > > >
> > > >
> > > >
> > > > > My name is Elisa Calvo Gallego. I am writing to you because I have
> > > > > started to work with OpenRISC in the framework of a research
> > > > > project developed at the company where I work (Space
> > > > > Submicron Electronics, 2SE), and I am having some basic trouble.
> > > > > Could you help me?
> > > >
> > > >
> > > >
> > > > > Although the FPGA that we are planning to use is larger, I have
> > > > > synthesized mor1kx for a DE0 Nano board as a first step (this is the
> > > > > board used in the majority of guides and tutorials). My problem is
> > > > > that the results I have obtained are similar in area and
> > > > > resources, except for IOBs: the design needs more IOBs than are
> > > > > available in the device. Do you know what I am doing wrong? Should I
> > > > > comment out the debug lines or something like that? I apologize if
> > > > > the answer is obvious. I didn't find it and I'm new to this.
> > > >
> > > >
> > > >
> > > > Thanks very much in advance.
> > > >
> > > > Best regards,
> > > >
> > > >
> > > >
> > > > Elisa
> > > >
> > > >
> >
>
>

* [OpenRISC] PCCR and PCRM registers
  2019-10-17  5:13 ` [OpenRISC] PCCR and PCRM registers Stafford Horne
@ 2019-10-17 15:01   ` ecalvo
  2019-10-17 21:41     ` Stafford Horne
  0 siblings, 1 reply; 7+ messages in thread
From: ecalvo @ 2019-10-17 15:01 UTC (permalink / raw)
  To: openrisc

Hi Stafford, 

 

Yes, I am using newlib. I had already found both files. These are the register definitions in or1k-sprs.h:

 

/******************************/
/* Performance Counters Group */
/******************************/
#define OR1K_SPR_PERF_GROUP 0x07

/* Performance Counters Count Registers */
#define OR1K_SPR_PERF_PCCR_BASE     OR1K_UNSIGNED(0x000)
#define OR1K_SPR_PERF_PCCR_COUNT    OR1K_UNSIGNED(0x008)
#define OR1K_SPR_PERF_PCCR_STEP     OR1K_UNSIGNED(0x001)
#define OR1K_SPR_PERF_PCCR_INDEX(N) (OR1K_SPR_PERF_PCCR_BASE + ((N) * OR1K_SPR_PERF_PCCR_STEP))
#define OR1K_SPR_PERF_PCCR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCCR_INDEX(N))

/* Performance Counters Mode Registers */
#define OR1K_SPR_PERF_PCMR_BASE     OR1K_UNSIGNED(0x008)
#define OR1K_SPR_PERF_PCMR_COUNT    OR1K_UNSIGNED(0x008)
#define OR1K_SPR_PERF_PCMR_STEP     OR1K_UNSIGNED(0x001)
#define OR1K_SPR_PERF_PCMR_INDEX(N) (OR1K_SPR_PERF_PCMR_BASE + ((N) * OR1K_SPR_PERF_PCMR_STEP))
#define OR1K_SPR_PERF_PCMR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCMR_INDEX(N))

/* Performance Counters Configuration */
#define OR1K_SPR_SYS_PCCFGR_INDEX OR1K_UNSIGNED(0x008)
#define OR1K_SPR_SYS_PCCFGR_ADDR  OR1K_UNSIGNED(0x0008)

/* Number of Performance Counters */
#define OR1K_SPR_SYS_PCCFGR_NPC_LSB    0
#define OR1K_SPR_SYS_PCCFGR_NPC_MSB    2
#define OR1K_SPR_SYS_PCCFGR_NPC_BITS   3
#define OR1K_SPR_SYS_PCCFGR_NPC_MASK   OR1K_UNSIGNED(0x00000007)
#define OR1K_SPR_SYS_PCCFGR_NPC_GET(X) (((X) >> 0) & OR1K_UNSIGNED(0x00000007))
#define OR1K_SPR_SYS_PCCFGR_NPC_SET(X, Y) (((X) & OR1K_UNSIGNED(0xfffffff8)) | ((Y) << 0))

 

And these functions in or1k-support.h:

static inline void or1k_mtspr (uint32_t spr, uint32_t value);
static inline uint32_t or1k_mfspr (uint32_t spr);

 

Despite this, it is not clear to me how to use them.

 

1.	If I call or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, 0), does that configure the PCCFGR for one performance counter? Is this the same as or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, OR1K_SPR_SYS_PCCFGR_NPC_LSB), or do OR1K_SPR_SYS_PCCFGR_NPC_LSB, OR1K_SPR_SYS_PCCFGR_NPC_MSB, etc. provide different functions for each performance counter?

2.	What is the meaning of PCCR_BASE, PCCR_COUNT, PCCR_STEP, PCCR_INDEX(N) and PCCR_ADDR(N)? (The same question applies to PCMR.) Is BASE the base address of all the PCCRs and ADDR the position of each one of them? Why do PCMR_BASE and PCMR_COUNT have the same value, OR1K_UNSIGNED(0x008)?

3.	Should I first set up PCCFGR, then PCMR, and finally read PCCR?

 

Thanks, and sorry for the inconvenience.

Elisa

 


* [OpenRISC] PCCR and PCRM registers
  2019-10-17 15:01   ` ecalvo
@ 2019-10-17 21:41     ` Stafford Horne
  2019-10-18 10:51       ` ecalvo
  0 siblings, 1 reply; 7+ messages in thread
From: Stafford Horne @ 2019-10-17 21:41 UTC (permalink / raw)
  To: openrisc

Hi Elisa,

Right, these are the functions.  You only need to be concerned with:
  OR1K_SPR_PERF_PCCR_ADDR(n)
  OR1K_SPR_PERF_PCMR_ADDR(n)
  OR1K_SPR_SYS_PCCFGR_ADDR
  OR1K_SPR_SYS_PCCFGR_NPC_GET(x)

The others are used internally.
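For reference, here is a minimal sketch of how the `_ADDR(N)` macros build an SPR address from the BASE, STEP and INDEX values, which is why those rarely need to be used directly. It assumes OR1K_SPR_GROUP_LSB is 11 (a 5-bit group number above an 11-bit register index), a value that does not appear in the header excerpt quoted in this thread. The `SPR_*` names and the two functions are hypothetical stand-ins for the macros, written so the arithmetic can be checked on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the group number sits above an 11-bit register index. */
#define SPR_GROUP_LSB   11u
#define SPR_PERF_GROUP  0x07u   /* OR1K_SPR_PERF_GROUP */
#define SPR_PCCR_BASE   0x000u  /* OR1K_SPR_PERF_PCCR_BASE */
#define SPR_PCMR_BASE   0x008u  /* OR1K_SPR_PERF_PCMR_BASE */
#define SPR_PERF_COUNT  8u      /* ..._COUNT: registers of each kind */
#define SPR_PERF_STEP   1u      /* ..._STEP: index distance between them */

/* Mirrors OR1K_SPR_PERF_PCCR_ADDR(n): address of count register n. */
static uint32_t perf_pccr_addr(unsigned n) {
  assert(n < SPR_PERF_COUNT);
  return (SPR_PERF_GROUP << SPR_GROUP_LSB)
         | (SPR_PCCR_BASE + n * SPR_PERF_STEP);
}

/* Mirrors OR1K_SPR_PERF_PCMR_ADDR(n): address of mode register n. */
static uint32_t perf_pcmr_addr(unsigned n) {
  assert(n < SPR_PERF_COUNT);
  return (SPR_PERF_GROUP << SPR_GROUP_LSB)
         | (SPR_PCMR_BASE + n * SPR_PERF_STEP);
}
```

Under that assumption PCCR0 is SPR 0x3800 and PCMR0 is SPR 0x3808. PCMR_BASE equals the COUNT value because the eight mode registers start right after the eight count registers.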
The PCCFGR is read only; it specifies how many performance counters
your CPU has built in.  Its NPC field may be 0-7, meaning one to eight counters.
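As a rough sketch of how the PCCFGR value is decoded (there is nothing to write, because the register is read only): the mask below is the one used by OR1K_SPR_SYS_PCCFGR_NPC_GET(X), while the `_LSB`, `_MSB`, `_BITS` and `_MASK` macros only describe where the field sits. The function names are hypothetical, and the "field value plus one" interpretation is an assumption taken from the `+ 1` in the example program in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Same mask as OR1K_SPR_SYS_PCCFGR_NPC_GET(X): bits 2..0 of PCCFGR. */
static unsigned pccfgr_npc_field(uint32_t pccfgr) {
  return (unsigned)(pccfgr & 0x7u);
}

/* Assumption: the NPC field stores the number of counters minus one. */
static unsigned pccfgr_num_counters(uint32_t pccfgr) {
  unsigned n = pccfgr_npc_field(pccfgr) + 1u;
  assert(n >= 1u && n <= 8u);
  return n;
}
```

On the target the argument would come from or1k_mfspr(OR1K_SPR_SYS_PCCFGR_ADDR).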

An example of how to use them:

#include <or1k-support.h>
#include <or1k-sprs.h>
#include <stdint.h>
#include <stdio.h>

/* PCMR bits, see the PCMRn layout in the architecture manual */
#define PCMR_CISM (1 << 2)  /* count in supervisor mode */
#define PCMR_CIUM (1 << 3)  /* count in user mode */
#define PCMR_IF   (1 << 6)  /* count instruction fetch events */

int main() {

  int number_of_pcs;
  uint32_t pcmr, pccr, upr, pccfgr;

  /* Check if PCs are even available */
  upr = or1k_mfspr(OR1K_SPR_SYS_UPR_ADDR);

  if (OR1K_SPR_SYS_UPR_PCUP_GET(upr)) {

    /* The + 1 treats the NPC field as "number of counters minus one" */
    pccfgr = or1k_mfspr(OR1K_SPR_SYS_PCCFGR_ADDR);
    number_of_pcs = OR1K_SPR_SYS_PCCFGR_NPC_GET(pccfgr) + 1;

    printf ("We have %d performance counters.\n", number_of_pcs);

    pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
    printf ("PCCR before setup %lx\n", (unsigned long) pccr);

    /* Turn on counter and enable instruction fetch counting */
    pcmr = or1k_mfspr(OR1K_SPR_PERF_PCMR_ADDR(0));
    pcmr |= PCMR_CISM | PCMR_CIUM | PCMR_IF;
    or1k_mtspr(OR1K_SPR_PERF_PCMR_ADDR(0), pcmr);

    /* Read the PCCR after we are done */
    printf ("Run a printf.\n");
    pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
    printf ("PCCR after printf %lx\n", (unsigned long) pccr);
  } else {
    printf ("No performance counters available.\n");
  }

  return 0;
}
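
The same pattern can be wrapped in a small helper so that a region of code is measured as the difference of two PCCR readings rather than one absolute value. This is only a sketch: `spr_read_fn`, `measure_region` and the `fake_*` helpers are hypothetical names, and on the target the reader would simply be a thin wrapper around or1k_mfspr(). Passing the reader as a function pointer is a choice made here so the logic can be tried on a host machine without the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader type; on the target wrap or1k_mfspr() in it. */
typedef uint32_t (*spr_read_fn)(uint32_t spr);

/* Run work(arg) and return how far the counter at SPR address
   pccr_addr advanced.  Unsigned subtraction stays correct if the
   32-bit counter wraps around once during the region. */
static uint32_t measure_region(spr_read_fn read_spr, uint32_t pccr_addr,
                               void (*work)(void *), void *arg) {
  assert(read_spr != NULL && work != NULL);
  uint32_t before = read_spr(pccr_addr);
  work(arg);
  uint32_t after = read_spr(pccr_addr);
  return after - before;
}

/* Host-side stand-ins for the hardware: a fake counter and a fake
   workload that advances it by the amount pointed to by arg. */
static uint32_t fake_pccr;

static uint32_t fake_read(uint32_t spr) {
  (void)spr;  /* only one fake register exists */
  return fake_pccr;
}

static void fake_work(void *arg) {
  fake_pccr += *(uint32_t *)arg;
}
```

A call such as measure_region(fake_read, 0x3800, fake_work, &step) then returns the value of step, even when fake_pccr starts close to 0xffffffff and wraps. The counter still has to be enabled through its PCMR first, as in the program above.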

On Fri, Oct 18, 2019 at 12:01 AM <ecalvo@2se.es> wrote:
>
> Hi Stafford,
>
>
>
> Yes, I am using newlib. I had discovered both files; these registers in sprs.h
>
>
>
> /******************************/
>
> /* Performance Counters Group */
>
> /******************************/
>
> #define OR1K_SPR_PERF_GROUP 0x07
>
>
>
> /* Performance Counters Count Registers */
>
> #define OR1K_SPR_PERF_PCCR_BASE     OR1K_UNSIGNED(0x000)
>
> #define OR1K_SPR_PERF_PCCR_COUNT    OR1K_UNSIGNED(0x008)
>
> #define OR1K_SPR_PERF_PCCR_STEP     OR1K_UNSIGNED(0x001)
>
> #define OR1K_SPR_PERF_PCCR_INDEX(N) (OR1K_SPR_PERF_PCCR_BASE + ((N) * OR1K_SPR_PERF_PCCR_STEP))
>
> #define OR1K_SPR_PERF_PCCR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCCR_INDEX(N))
>
>
>
> /* Performance Counters Mode Registers */
>
> #define OR1K_SPR_PERF_PCMR_BASE     OR1K_UNSIGNED(0x008)
>
> #define OR1K_SPR_PERF_PCMR_COUNT    OR1K_UNSIGNED(0x008)
>
> #define OR1K_SPR_PERF_PCMR_STEP     OR1K_UNSIGNED(0x001)
>
> #define OR1K_SPR_PERF_PCMR_INDEX(N) (OR1K_SPR_PERF_PCMR_BASE + ((N) *OR1K_SPR_PERF_PCMR_STEP))
>
> #define OR1K_SPR_PERF_PCMR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCMR_INDEX(N))
>
>
>
> /* Performance Counters Configuration */
>
> #define OR1K_SPR_SYS_PCCFGR_INDEX OR1K_UNSIGNED(0x008)
>
> #define OR1K_SPR_SYS_PCCFGR_ADDR  OR1K_UNSIGNED(0x0008)
>
>
>
> /* Number of Performance Counters */
>
> #define OR1K_SPR_SYS_PCCFGR_NPC_LSB    0
>
> #define OR1K_SPR_SYS_PCCFGR_NPC_MSB    2
>
> #define OR1K_SPR_SYS_PCCFGR_NPC_BITS   3
>
> #define OR1K_SPR_SYS_PCCFGR_NPC_MASK   OR1K_UNSIGNED(0x00000007)
>
> #define OR1K_SPR_SYS_PCCFGR_NPC_GET(X) (((X) >> 0) & OR1K_UNSIGNED(0x00000007))
>
> #define OR1K_SPR_SYS_PCCFGR_NPC_SET(X, Y) (((X) & OR1K_UNSIGNED(0xfffffff8)) | ((Y) << 0))
>
>
>
> And these functions in support.h
>
>
>
> static inline void or1k_mtspr (uint32_t spr, uint32_t value)
>
> static inline uint32_t or1k_mfspr (uint32_t spr)
>
>
>
> Despite this I don’t have clear how to use it.
>
>
>
> If I do: or1k_mtspr (OR1K_SPR_SYS_PCCFGR_ADDR , 0)  -> Does it allow me to configure the PCCFGR to one performance counter?
>
> Is This the same than or1k_mtspr (OR1K_SPR_SYS_PCCFGR_ADDR , OR1K_SPR_SYS_PCCFGR_NPC_LSB ) or  Do OR1K_SPR_SYS_PCCFGR_NPC_LSB   , OR1K_SPR_SYS_PCCFGR_NPC_MSB, etc. provide different functions on each one performance counter?
>
> What is the meaning of PCCR_BASE,  PCCR_COUNT, PCCR_STEP, PCMR_INDEX(N), PCMR_ADDR(N) ? (the same for PCMR) (Is BASE the base address of all PCCR and ADDR the position of each one of them? ….why PCMR_BASE and COUNT hasta de same value OR1K_UNSIGNED(0x008)? )
> Should I define first PCCFGR, second PCMR and last get PCCR?
>
>
>
> Thanks and sorry for the inconveniences.
>
> Elisa
>
>
>
> De: Stafford Horne <shorne@gmail.com>
> Enviado el: jueves, 17 de octubre de 2019 7:14
> Para: ecalvo at 2se.es; Openrisc <openrisc@lists.librecores.org>
> Asunto: Re: PCCR and PCRM registers
>
>
>
> +cc mailing list,
>
>
>
> Hi Elisa,
>
>
>
> Which toolchain are you using? I guess newlib?
>
>
>
> It has functions like or1k_mfspr() and or1k_mtspr() see or1k-support.h and or1k-sprs.h headers for details.
>
>
>
> -Stafford
>
> On Wed, Oct 16, 2019, 8:40 PM <ecalvo@2se.es> wrote:
>
> Hi Stafford,
>
> I am with PCCR and PCRM registers. I have seen that I can access from asm language, but there is functions to access from C? Have you got any example about their usage?
>
> I have already confirmed my subscription to the mailing list.
>
> Thanks
> Elisa
>
> -----Mensaje original-----
> De: Stafford Horne <shorne@gmail.com>
> Enviado el: miércoles, 9 de octubre de 2019 13:38
> Para: ecalvo at 2se.es; Julius Baxter <juliusbaxter@gmail.com>
> Asunto: Re: other doubt
>
> Hello Elisa,
>
> If you simulate with Icarus or modelsim you will be able to measure pretty much the same performance characteristics as FPGA. So there is no need to go straight to FPGA.
>
> In terms of my example C code is one option.  You can also read timer data directly from the tick timer in assembly and achieve the same thing.
>
> If you are interested we can CC the mailing list and get more opinions.
>
> -Stafford
>
> On Wed, Oct 9, 2019 at 5:09 PM <ecalvo@2se.es> wrote:
> >
> > Hi Stafford,
> >
> > Nice to meet you and, first of all, thanks a lot for your guidance. I am new on this, and although there is some documentation, sometimes it is difficult some point which maybe it is basic.
> >
> > Ok, to your comments. If.."A simulator like QEMU or or1ksim will not give and exact representation of the CPUs real time performance"...then...if I simulate directly the processor with modelsim, icarus or a similar tool...neither I get a real performance, don’t I? And values for the counters that you tell me to enable, neither are real, isn't it? should I execute it directly on the FPGA and it will depends on the implementation?
> >
> > Ok, to C code. I have understood the dependency with toolchain.
> >
> > Thanks a lot again.
> > Best regards,
> > Elisa
> >
> >
> > -----Mensaje original-----
> > De: Stafford Horne <shorne@gmail.com>
> > Enviado el: martes, 8 de octubre de 2019 16:18
> > Para: Julius Baxter <juliusbaxter@gmail.com>
> > CC: ecalvo at 2se.es
> > Asunto: Re: other doubt
> >
> > Hi Elisa,
> >
> > OpenRISC CPUs can run any algorithm, but how well it will perform depends on many things:
> >
> >   - Compiler optimization flags (e.g. -O3)
> >   - Whether or not you are doing FPU instructions and have FPU enabled
> >   - Whether or not you use multiply and divide and have these instructions
> >     enabled
> >   - The frequency you are running at
> >   - Cache settings (Icache, Dcache)
> >   - The type of algorithm: does it require lots of data, which will cause many
> >     cache misses?
> >
> > A simulator like QEMU or or1ksim will not give an exact representation of the CPU's real-time performance.  It can tell you which instructions will be executed, but not how fast those will run or how many pipeline stalls or cache misses will happen.
> >
> > You can use the performance counters; they are supported in mor1kx if you enable them with the FEATURE_PERFCOUNTERS='ENABLED' parameter.  They can help count how many events happen between certain events.  Then you can combine them with a timer and watchpoints to detect, for example, how many times a loop can execute in 1000 clock cycles.  Please read about PCCRn and PCMRn in the architecture manual.
> >
> > It might be just as easy to use simple timing in a C program, though.  Depending on the toolchain you use, you can compare times between runs of your algorithm,
> > e.g.
> >
> >     #include <time.h>
> >     #include <stdio.h>
> >
> >     static long to_micro(struct timespec *time) {
> >       return (time->tv_sec * 1000000) + (time->tv_nsec / 1000);
> >     }
> >
> >     int main() {
> >       int j = 0;
> >
> >       struct timespec before, after;
> >
> >       clock_gettime(CLOCK_MONOTONIC, &before);
> >       /* Super complex algorithm */
> >       for (int i = 0; i < 100; i++) {
> >         j = (j+1) * (j+2);
> >       }
> >       clock_gettime(CLOCK_MONOTONIC, &after);
> >
> >       printf("time to run algorithm %ld uSecs\n",
> >              to_micro(&after) - to_micro(&before));
> >
> >       return 0;
> >     }
> >
> > $ or1k-smh-linux-gnu-gcc timer.c
> > $ ./glibc-build-scripts/qemu-or1k-libc ./a.out
> > time to run algorithm 164 uSecs
> >
> > I hope it helps.
> >
> > -Stafford
> >
> > On Tue, Oct 08, 2019 at 10:54:29PM +1100, Julius Baxter wrote:
> > > Hi,
> > >
> > > No problem.
> > >
> > > There are performance counters in the OpenRISC architecture but
> > > whether they're implemented in a particular implementation is another matter.
> > >
> > > You can use these registers to measure various things the CPU is
> > > doing while it's executing. If you read the ISA document it'll tell
> > > you about them.
> > >
> > > I'm CCing Stafford because he's the main OpenRISC man these days and
> > > probably knows about the state of the performance counter registers
> > > in various simulators and RTL implementations.
> > >
> > > Cheers,
> > > Julius
> > >
> > > On Tue., 8 Oct. 2019, 10:43 pm , <ecalvo@2se.es> wrote:
> > >
> > > > Hi Julius,
> > > >
> > > >
> > > >
> > > > Sorry for bothering you again ☹. May I ask you another quick question
> > > > about OpenRISC? If not, please ignore this email.
> > > >
> > > > Is there any way to characterize the type of application that I
> > > > can run on OpenRISC? I mean, can one measure (with numbers) whether
> > > > an algorithm can be executed on it and the speed that it will achieve?
> > > > Is it possible to do this using or1ksim?
> > > >
> > > > Sorry if this is too basic and general ☹
> > > >
> > > >
> > > >
> > > > Thanks in advance
> > > >
> > > > Elisa
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > *From:* Julius Baxter <juliusbaxter@gmail.com> *Sent:* Monday,
> > > > 16 September 2019 13:11
> > > > *To:* ecalvo at 2se.es
> > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > >
> > > >
> > > >
> > > > Also! To let you know, if you're in Spain, we will soon be having
> > > > our ORConf conference in Europe, and it's in Bordeaux, France,
> > > > just across the border. :)
> > > >
> > > >
> > > >
> > > > There are several people there who can help you get up to speed,
> > > > one of whom is Stafford Horne who knows most about the OpenRISC IP
> > > > lately. He will be presenting. If you can attend, it'd be helpful, I'm sure.
> > > >
> > > >
> > > >
> > > > All info at https://orconf.org
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Julius
> > > >
> > > >
> > > >
> > > > On Mon, 16 Sep 2019 at 21:09, Julius Baxter
> > > > <juliusbaxter@gmail.com>
> > > > wrote:
> > > >
> > > > Hi Elisa,
> > > >
> > > >
> > > >
> > > > Sorry for the delay in this response.
> > > >
> > > >
> > > >
> > > > You should be using an SoC toplevel. FPGAs have everything you
> > > > need on board like memories and IO blocks and lots of other FPGA
> > > > fabric for you to implement other pieces of hardware.
> > > >
> > > >
> > > >
> > > > FuseSoC provides a really nice and easy way to build an mor1kx
> > > > design for the DE0 nano I believe:
> > > >
> > > >
> > > >
> > > > https://github.com/olofk/de0_nano
> > > >
> > > >
> > > >
> > > > That github page has a rough guide to getting it going.
> > > >
> > > >
> > > >
> > > > If you need help I recommend posting to the OpenRISC mailing list,
> > > > where people will probably respond more promptly than I. (I
> > > > recommend getting to know how to use mailing lists:
> > > > https://openrisc.io/community)
> > > >
> > > >
> > > >
> > > > There are more resources here: https://openrisc.io/tutorials
> > > >
> > > >
> > > >
> > > > I hope that's helpful.
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Julius
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, 11 Sep 2019 at 20:09, <ecalvo@2se.es> wrote:
> > > >
> > > > Hi Julius,
> > > >
> > > >
> > > >
> > > > Thanks a lot for the quick answer.
> > > >
> > > >
> > > >
> > > > Yes, this is the problem: I am using the mor1kx module itself as
> > > > the top level. You mean that I also need to synthesize these cores
> > > > in reconfigurable logic, don't you? I thought that I could have
> > > > these elements externally on a development board.
> > > >
> > > >
> > > >
> > > > Thanks again,
> > > >
> > > > Cheers
> > > >
> > > > Elisa
> > > >
> > > >
> > > >
> > > > *From:* Julius Baxter <juliusbaxter@gmail.com> *Sent:*
> > > > Wednesday, 11 September 2019 12:02
> > > > *To:* ecalvo at 2se.es
> > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > >
> > > >
> > > >
> > > > Hi Elisa,
> > > >
> > > >
> > > >
> > > > Thanks for getting in touch, that sounds like a cool project.
> > > >
> > > >
> > > >
> > > > Can you tell me about the toplevel - are you using a system
> > > > toplevel, or is your toplevel the mor1kx module itself?
> > > >
> > > >
> > > >
> > > > If it's the latter, then that's not the best way to do it - you
> > > > need a system toplevel which instantiates memories and some reset
> > > > circuitry and likely some IO (UART, GPIO, JTAG debug, etc.) to talk to the outside world.
> > > >
> > > >
> > > >
> > > > Is that helpful?
> > > >
> > > >
> > > >
> > > > Cheers,
> > > >
> > > > Julius
> > > >
> > > >
> > > >
> > > > On Wed, 11 Sep 2019 at 19:47, <ecalvo@2se.es> wrote:
> > > >
> > > > Dear Dr. Baxter,
> > > >
> > > >
> > > >
> > > > My name is Elisa Calvo Gallego. I am writing to you because I have
> > > > started to work with OpenRISC in the framework of a research
> > > > project developed in the company where I work (Space
> > > > Submicron Electronics, 2SE), and I am having some basic trouble. Could you help me?
> > > >
> > > > Although the FPGA that we are planning to use is larger, I have
> > > > synthesized mor1kx for a DE0 nano board as a first step (this is the
> > > > board used in the majority of guides and tutorials). My problem is
> > > > that the results I have obtained are similar in area and
> > > > resources, except for the IOBs, which exceed the IOBs available in
> > > > the device. Do you know what I am doing wrong? Should I comment out
> > > > the debug lines or something like that? I apologize if the question is
> > > > trivial. I didn't find the answer and I'm new to this.
> > > >
> > > >
> > > >
> > > > Thanks very much in advance.
> > > >
> > > > Best regards,
> > > >
> > > >
> > > >
> > > > Elisa
> > > >
> > > >
> >

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [OpenRISC] PCCR and PCRM registers
  2019-10-17 21:41     ` Stafford Horne
@ 2019-10-18 10:51       ` ecalvo
  2019-10-18 14:40         ` Stafford Horne
  0 siblings, 1 reply; 7+ messages in thread
From: ecalvo @ 2019-10-18 10:51 UTC (permalink / raw)
  To: openrisc

Hi Stafford, 

Sorry for bothering you again. 

The program hangs when accessing the PCMR and PCCR registers. Should I change any other features apart from the perf counters?

Elisa

-----Original Message-----
From: Stafford Horne <shorne@gmail.com>
Sent: Thursday, 17 October 2019 23:42
To: ecalvo at 2se.es
CC: Openrisc <openrisc@lists.librecores.org>
Subject: Re: PCCR and PCRM registers

Hi Elisa,

Right, these are the functions.  You only need to be concerned with:
  OR1K_SPR_PERF_PCCR_ADDR(n)
  OR1K_SPR_PERF_PCMR_ADDR(n)
  OR1K_SPR_SYS_PCCFGR_ADDR
  OR1K_SPR_SYS_PCCFGR_NPC_GET(x)

The others are used internally.
The PCCFGR is read-only; its NPC field specifies how many performance counters your CPU has built in.  The field value may be 0-7, which encodes 1 to 8 counters.
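
As a rough sketch of how those address macros compose, the snippet below rebuilds them with local stand-in names so it compiles on any host. It assumes OR1K_SPR_GROUP_LSB is 11 (an SPR address being group << 11 | index) and that the NPC field holds the counter count minus one; the SPR_* macro names and the helper functions are illustrative only, not part of newlib.

```c
#include <stdint.h>

/* Local stand-ins for the or1k-sprs.h macros (hypothetical names). */
#define SPR_GROUP_LSB  11u    /* assumed value of OR1K_SPR_GROUP_LSB */
#define SPR_PERF_GROUP 0x07u  /* performance counters group */
#define SPR_PCCR_BASE  0x000u /* index of the first count register */
#define SPR_PCMR_BASE  0x008u /* index of the first mode register */

/* SPR address of count register n: (group << LSB) | (base + n). */
static uint32_t spr_pccr_addr(uint32_t n)
{
  return (SPR_PERF_GROUP << SPR_GROUP_LSB) | (SPR_PCCR_BASE + n);
}

/* SPR address of mode register n, built the same way. */
static uint32_t spr_pcmr_addr(uint32_t n)
{
  return (SPR_PERF_GROUP << SPR_GROUP_LSB) | (SPR_PCMR_BASE + n);
}

/* Counter count from a raw PCCFGR value: NPC is the low three bits,
 * assumed to hold (number of counters - 1). */
static uint32_t pccfgr_num_counters(uint32_t pccfgr)
{
  return (pccfgr & 0x7u) + 1u;
}
```

Under these assumptions PCCR0 is SPR 0x3800 and PCMR0 is SPR 0x3808, so the mode registers start right after the eight count registers.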

An example of how to use them:

#include <or1k-support.h>
#include <or1k-sprs.h>
#include <stdio.h>

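/* PCMR bits, see PCMRn in the architecture manual */
#define PCMR_CISM (1 << 2)  /* count in supervisor mode */
#define PCMR_CIUM (1 << 3)  /* count in user mode */
#define PCMR_IF   (1 << 6)  /* instruction fetch event */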

int main() {

  int number_of_pcs;
  uint32_t pcmr, pccr, upr, pccfgr;

  /* Check if PCs are even available */
  upr = or1k_mfspr(OR1K_SPR_SYS_UPR_ADDR);

  if (OR1K_SPR_SYS_UPR_PCUP_GET(upr)) {

    pccfgr = or1k_mfspr(OR1K_SPR_SYS_PCCFGR_ADDR);
    number_of_pcs = OR1K_SPR_SYS_PCCFGR_NPC_GET(pccfgr) + 1;

    printf ("We have %d program counters.\n", number_of_pcs);

    pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
    printf ("PCCR before setup %x\n", pccr);

    /* Turn on counter and enable instruction fetch counting */
    pcmr = or1k_mfspr(OR1K_SPR_PERF_PCMR_ADDR(0));
    pcmr |= PCMR_CISM | PCMR_CIUM | PCMR_IF;
    or1k_mtspr(OR1K_SPR_PERF_PCMR_ADDR(0), pcmr);

    /* Read the PCCR after we are done */
    printf ("Run a printf.");
    pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
    printf ("PCCR after printf %x\n", pccr);
  } else {
    printf ("No performance counters available.\n");
  }

  return 0;
}
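
Since the example prints raw PCCR values, a small helper for the difference between two samples may be handy. This is only a sketch: pccr_delta is a hypothetical name, not something newlib provides, and it assumes the counter is 32 bits wide and wraps at most once between the two reads.

```c
#include <stdint.h>

/* Events counted between two PCCR samples.
 * Unsigned subtraction is modulo 2^32, so the result is still correct
 * if the counter wrapped once between the two reads. */
static uint32_t pccr_delta(uint32_t before, uint32_t after)
{
  return after - before;
}
```

On the target, before and after would be the values returned by or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0)) on either side of the code being measured.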

On Fri, Oct 18, 2019 at 12:01 AM <ecalvo@2se.es> wrote:
>
> Hi Stafford,
>
>
>
> Yes, I am using newlib. I had found both files. These registers are in
> sprs.h:
>
> /******************************/
> /* Performance Counters Group */
> /******************************/
> #define OR1K_SPR_PERF_GROUP 0x07
>
> /* Performance Counters Count Registers */
> #define OR1K_SPR_PERF_PCCR_BASE     OR1K_UNSIGNED(0x000)
> #define OR1K_SPR_PERF_PCCR_COUNT    OR1K_UNSIGNED(0x008)
> #define OR1K_SPR_PERF_PCCR_STEP     OR1K_UNSIGNED(0x001)
> #define OR1K_SPR_PERF_PCCR_INDEX(N) (OR1K_SPR_PERF_PCCR_BASE + ((N) * OR1K_SPR_PERF_PCCR_STEP))
> #define OR1K_SPR_PERF_PCCR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCCR_INDEX(N))
>
> /* Performance Counters Mode Registers */
> #define OR1K_SPR_PERF_PCMR_BASE     OR1K_UNSIGNED(0x008)
> #define OR1K_SPR_PERF_PCMR_COUNT    OR1K_UNSIGNED(0x008)
> #define OR1K_SPR_PERF_PCMR_STEP     OR1K_UNSIGNED(0x001)
> #define OR1K_SPR_PERF_PCMR_INDEX(N) (OR1K_SPR_PERF_PCMR_BASE + ((N) * OR1K_SPR_PERF_PCMR_STEP))
> #define OR1K_SPR_PERF_PCMR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCMR_INDEX(N))
>
> /* Performance Counters Configuration */
> #define OR1K_SPR_SYS_PCCFGR_INDEX OR1K_UNSIGNED(0x008)
> #define OR1K_SPR_SYS_PCCFGR_ADDR  OR1K_UNSIGNED(0x0008)
>
> /* Number of Performance Counters */
> #define OR1K_SPR_SYS_PCCFGR_NPC_LSB    0
> #define OR1K_SPR_SYS_PCCFGR_NPC_MSB    2
> #define OR1K_SPR_SYS_PCCFGR_NPC_BITS   3
> #define OR1K_SPR_SYS_PCCFGR_NPC_MASK   OR1K_UNSIGNED(0x00000007)
> #define OR1K_SPR_SYS_PCCFGR_NPC_GET(X) (((X) >> 0) & OR1K_UNSIGNED(0x00000007))
> #define OR1K_SPR_SYS_PCCFGR_NPC_SET(X, Y) (((X) & OR1K_UNSIGNED(0xfffffff8)) | ((Y) << 0))
>
> And these functions are in support.h:
>
> static inline void or1k_mtspr (uint32_t spr, uint32_t value)
> static inline uint32_t or1k_mfspr (uint32_t spr)
>
> Despite this, it is not clear to me how to use them.
>
>
>
> If I do or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, 0), does it configure the PCCFGR for one performance counter?
>
> Is this the same as or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, OR1K_SPR_SYS_PCCFGR_NPC_LSB), or do OR1K_SPR_SYS_PCCFGR_NPC_LSB, OR1K_SPR_SYS_PCCFGR_NPC_MSB, etc. provide different functions for each performance counter?
>
> What is the meaning of PCCR_BASE, PCCR_COUNT, PCCR_STEP, PCCR_INDEX(N) and PCCR_ADDR(N) (and the same for PCMR)? Is BASE the base address of all PCCRs and ADDR the position of each one of them? Why do PCMR_BASE and PCMR_COUNT have the same value, OR1K_UNSIGNED(0x008)? Should I first set PCCFGR, then PCMR, and finally read PCCR?
>
> Thanks and sorry for the inconvenience.
>
> Elisa
>
>
>
> From: Stafford Horne <shorne@gmail.com>
> Sent: Thursday, 17 October 2019 7:14
> To: ecalvo at 2se.es; Openrisc <openrisc@lists.librecores.org>
> Subject: Re: PCCR and PCRM registers
>
>
>
> +cc mailing list,
>
>
>
> Hi Elisa,
>
>
>
> Which toolchain are you using? I guess newlib?
>
>
>
> It has functions like or1k_mfspr() and or1k_mtspr() see or1k-support.h and or1k-sprs.h headers for details.
>
>
>
> -Stafford
>


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [OpenRISC] PCCR and PCRM registers
  2019-10-18 10:51       ` ecalvo
@ 2019-10-18 14:40         ` Stafford Horne
       [not found]           ` <013f01d58976$1c2c7700$54856500$@2se.es>
  0 siblings, 1 reply; 7+ messages in thread
From: Stafford Horne @ 2019-10-18 14:40 UTC (permalink / raw)
  To: openrisc

Hello,

Which core are you using?  With mor1kx I did not have an issue with
hanging.  I had to enable perf counters via parameters.

I extended and posted my example code here:
https://gist.github.com/stffrdhrn/6343706cb1d8124bbac6bb579b6913b0

The results look like:

 Compile: or1k-elf-gcc or1k-perfcounters.c
 Run: fusesoc run --target mor1kx_tb mor1kx-generic --elf-load ./a.out

 Example Output:

 We have 4 program counters.
 PCCR before setup 0
 Run a printf, to generate instructions..
 IF 1825
 ICS 237

 This shows that the printf took 1825 instructions and that the pipeline stalled
 237 times due to instruction cache misses.
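
To put the two numbers in relation, one could compute stalls per thousand instructions. The following is a minimal sketch with a hypothetical helper name (per_mille is not part of the gist or of newlib); it uses integer math so it does not depend on floating point support in printf.

```c
#include <stdint.h>

/* Events per 1000 reference events, rounded down.
 * Returns 0 when total is 0 to avoid dividing by zero. */
static uint32_t per_mille(uint32_t events, uint32_t total)
{
  if (total == 0)
    return 0;
  /* Widen to 64 bits so events * 1000 cannot overflow. */
  return (uint32_t)(((uint64_t)events * 1000u) / total);
}
```

For the output above, per_mille(237, 1825) gives 129, i.e. roughly 13 stalls per 100 instructions.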

On Fri, Oct 18, 2019 at 7:51 PM <ecalvo@2se.es> wrote:
>
> Hi Stafford,
>
> Sorry for bothering you again.
>
> The program hangs when accessing the PCMR and PCCR registers. Should I change any other features apart from the perf counters?
>
> Elisa
>
> -----Original Message-----
> From: Stafford Horne <shorne@gmail.com>
> Sent: Thursday, 17 October 2019 23:42
> To: ecalvo at 2se.es
> CC: Openrisc <openrisc@lists.librecores.org>
> Subject: Re: PCCR and PCRM registers
>
> Hi Elisa,
>
> Right, these are the functions.  You only need to be concerned with:
>   OR1K_SPR_PERF_PCCR_ADDR(n)
>   OR1K_SPR_PERF_PCMR_ADDR(n)
>   OR1K_SPR_SYS_PCCFGR_ADDR
>   OR1K_SPR_SYS_PCCFGR_NPC_GET(x)
>
> The others are used internally.
> The PCCFGR is read-only; its NPC field specifies how many performance counters your CPU has built in.  The field value may be 0-7, which encodes 1 to 8 counters.
>
> An example of how to use them:
>
> #include <or1k-support.h>
> #include <or1k-sprs.h>
> #include <stdio.h>
>
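> /* PCMR bits, see PCMRn in the architecture manual */
> #define PCMR_CISM (1 << 2)  /* count in supervisor mode */
> #define PCMR_CIUM (1 << 3)  /* count in user mode */
> #define PCMR_IF   (1 << 6)  /* instruction fetch event */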
>
> int main() {
>
>   int number_of_pcs;
>   uint32_t pcmr, pccr, upr, pccfgr;
>
>   /* Check if PCs are even available */
>   upr = or1k_mfspr(OR1K_SPR_SYS_UPR_ADDR);
>
>   if (OR1K_SPR_SYS_UPR_PCUP_GET(upr)) {
>
>     pccfgr = or1k_mfspr(OR1K_SPR_SYS_PCCFGR_ADDR);
>     number_of_pcs = OR1K_SPR_SYS_PCCFGR_NPC_GET(pccfgr) + 1;
>
>     printf ("We have %d program counters.\n", number_of_pcs);
>
>     pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
>     printf ("PCCR before setup %x\n", pccr);
>
>     /* Turn on counter and enable instruction fetch counting */
>     pcmr = or1k_mfspr(OR1K_SPR_PERF_PCMR_ADDR(0));
>     pcmr |= PCMR_CISM | PCMR_CIUM | PCMR_IF;
>     or1k_mtspr(OR1K_SPR_PERF_PCMR_ADDR(0), pcmr);
>
>     /* Read the PCCR after we are done */
>     printf ("Run a printf.");
>     pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
>     printf ("PCCR after printf %x\n", pccr);
>   } else {
>     printf ("No performance counters available.\n");
>   }
>
>   return 0;
> }
>
> On Fri, Oct 18, 2019 at 12:01 AM <ecalvo@2se.es> wrote:
> >
> > Hi Stafford,
> >
> >
> >
> > Yes, I am using newlib. I had found both files. These registers are in
> > sprs.h:
> >
> >
> >
> > /******************************/
> >
> > /* Performance Counters Group */
> >
> > /******************************/
> >
> > #define OR1K_SPR_PERF_GROUP 0x07
> >
> >
> >
> > /* Performance Counters Count Registers */
> >
> > #define OR1K_SPR_PERF_PCCR_BASE     OR1K_UNSIGNED(0x000)
> >
> > #define OR1K_SPR_PERF_PCCR_COUNT    OR1K_UNSIGNED(0x008)
> >
> > #define OR1K_SPR_PERF_PCCR_STEP     OR1K_UNSIGNED(0x001)
> >
> > #define OR1K_SPR_PERF_PCCR_INDEX(N) (OR1K_SPR_PERF_PCCR_BASE + ((N) *
> > OR1K_SPR_PERF_PCCR_STEP))
> >
> > #define OR1K_SPR_PERF_PCCR_ADDR(N)  ((OR1K_SPR_PERF_GROUP <<
> > OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCCR_INDEX(N))
> >
> >
> >
> > /* Performance Counters Mode Registers */
> >
> > #define OR1K_SPR_PERF_PCMR_BASE     OR1K_UNSIGNED(0x008)
> >
> > #define OR1K_SPR_PERF_PCMR_COUNT    OR1K_UNSIGNED(0x008)
> >
> > #define OR1K_SPR_PERF_PCMR_STEP     OR1K_UNSIGNED(0x001)
> >
> > #define OR1K_SPR_PERF_PCMR_INDEX(N) (OR1K_SPR_PERF_PCMR_BASE + ((N)
> > *OR1K_SPR_PERF_PCMR_STEP))
> >
> > #define OR1K_SPR_PERF_PCMR_ADDR(N)  ((OR1K_SPR_PERF_GROUP <<
> > OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCMR_INDEX(N))
> >
> >
> >
> > /* Performance Counters Configuration */
> >
> > #define OR1K_SPR_SYS_PCCFGR_INDEX OR1K_UNSIGNED(0x008)
> >
> > #define OR1K_SPR_SYS_PCCFGR_ADDR  OR1K_UNSIGNED(0x0008)
> >
> >
> >
> > /* Number of Performance Counters */
> >
> > #define OR1K_SPR_SYS_PCCFGR_NPC_LSB    0
> >
> > #define OR1K_SPR_SYS_PCCFGR_NPC_MSB    2
> >
> > #define OR1K_SPR_SYS_PCCFGR_NPC_BITS   3
> >
> > #define OR1K_SPR_SYS_PCCFGR_NPC_MASK   OR1K_UNSIGNED(0x00000007)
> >
> > #define OR1K_SPR_SYS_PCCFGR_NPC_GET(X) (((X) >> 0) &
> > OR1K_UNSIGNED(0x00000007))
> >
> > #define OR1K_SPR_SYS_PCCFGR_NPC_SET(X, Y) (((X) &
> > OR1K_UNSIGNED(0xfffffff8)) | ((Y) << 0))
> >
> >
> >
> > And these functions are in support.h:
> >
> >
> >
> > static inline void or1k_mtspr (uint32_t spr, uint32_t value)
> >
> > static inline uint32_t or1k_mfspr (uint32_t spr)
> >
> >
> >
> > Despite this, it is not clear to me how to use them.
> >
> >
> >
> > If I do or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, 0), does it configure the PCCFGR for one performance counter?
> >
> > Is this the same as or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, OR1K_SPR_SYS_PCCFGR_NPC_LSB), or do OR1K_SPR_SYS_PCCFGR_NPC_LSB, OR1K_SPR_SYS_PCCFGR_NPC_MSB, etc. provide different functions for each performance counter?
> >
> > What is the meaning of PCCR_BASE, PCCR_COUNT, PCCR_STEP, PCCR_INDEX(N) and PCCR_ADDR(N) (and the same for PCMR)? Is BASE the base address of all PCCRs and ADDR the position of each one of them? Why do PCMR_BASE and PCMR_COUNT have the same value, OR1K_UNSIGNED(0x008)? Should I first set PCCFGR, then PCMR, and finally read PCCR?
> >
> > Thanks and sorry for the inconvenience.
> >
> > Elisa
> >
> >
> >
> > From: Stafford Horne <shorne@gmail.com>
> > Sent: Thursday, 17 October 2019 7:14
> > To: ecalvo at 2se.es; Openrisc <openrisc@lists.librecores.org>
> > Subject: Re: PCCR and PCRM registers
> >
> >
> >
> > +cc mailing list,
> >
> >
> >
> > Hi Elisa,
> >
> >
> >
> > Which toolchain are you using? I guess newlib?
> >
> >
> >
> > It has functions like or1k_mfspr() and or1k_mtspr() see or1k-support.h and or1k-sprs.h headers for details.
> >
> >
> >
> > -Stafford
> >
> > On Wed, Oct 16, 2019, 8:40 PM <ecalvo@2se.es> wrote:
> >
> > Hi Stafford,
> >
> > I am working with the PCCR and PCMR registers. I have seen that I can access them from assembly language, but are there functions to access them from C? Do you have any examples of their usage?
> >
> > I have already confirmed my subscription to the mailing list.
> >
> > Thanks
> > Elisa
> >
> > -----Original Message-----
> > From: Stafford Horne <shorne@gmail.com>
> > Sent: Wednesday, 9 October 2019 13:38
> > To: ecalvo at 2se.es; Julius Baxter <juliusbaxter@gmail.com>
> > Subject: Re: other doubt
> >
> > Hello Elisa,
> >
> > If you simulate with Icarus or ModelSim you will be able to measure pretty much the same performance characteristics as on an FPGA. So there is no need to go straight to FPGA.
> >
> > As for my example, C code is one option.  You can also read timer data directly from the tick timer in assembly and achieve the same thing.
> >
> > If you are interested we can CC the mailing list and get more opinions.
> >
> > -Stafford
> >
> > On Wed, Oct 9, 2019 at 5:09 PM <ecalvo@2se.es> wrote:
> > >
> > > Hi Stafford,
> > >
> > > Nice to meet you and, first of all, thanks a lot for your guidance. I am new to this, and although there is some documentation, some points that may be basic are sometimes difficult to understand.
> > >
> > > OK, to your comments. If "a simulator like QEMU or or1ksim will not give an exact representation of the CPU's real-time performance", then if I simulate the processor directly with ModelSim, Icarus or a similar tool, I will not get real performance either, will I? And the values of the counters that you tell me to enable will not be real either, will they? Should I run it directly on the FPGA, and will the result depend on the implementation?
> > >
> > > OK, to the C code. I have understood the dependency on the toolchain.
> > >
> > > Thanks a lot again.
> > > Best regards,
> > > Elisa
> > >
> > >
> > > -----Original Message-----
> > > From: Stafford Horne <shorne@gmail.com>
> > > Sent: Tuesday, 8 October 2019 16:18
> > > To: Julius Baxter <juliusbaxter@gmail.com>
> > > CC: ecalvo at 2se.es
> > > Subject: Re: other doubt
> > >
> > > Hi Elisa,
> > >
> > > OpenRISC CPUs can run any algorithm, but how well it performs depends on many things:
> > >
> > >   - Compiler optimization flags (e.g. -O3)
> > >   - Whether or not you are doing FPU instructions and have FPU enabled
> > >   - Whether or not you use multiply and divide and have these instructions
> > >     enabled
> > >   - The frequency you are running at
> > >   - Cache settings (icache, dcache)
> > >   - The type of algorithm: does it require lots of data, which will cause many
> > >     cache misses?
> > >
> > > A simulator like QEMU or or1ksim will not give an exact representation of the CPU's real-time performance.  It can tell you which instructions will be executed, but not how fast they will run or how many pipeline stalls or cache misses will happen.
> > >
> > > You can use the performance counters; they are supported in mor1kx if you enable them with the FEATURE_PERFCOUNTERS='ENABLED' parameter.  They can help count how many events happen between certain events.  Then you can combine them with a timer and watchpoints to detect how many times a loop can execute in 1000 clock cycles etc.  Please read about PCCRn and PCMRn in the architecture manual.
> > >
> > > It might be just as easy to use simple timing in a C program, though. Depending on the toolchain you use, you can compare times between runs of your algorithm.
> > > For example:
> > >
> > >     #include <time.h>
> > >     #include <stdio.h>
> > >
> > >     /* Convert a timespec to microseconds. */
> > >     static long to_micro(struct timespec *time) {
> > >       return (time->tv_sec * 1000000) + (time->tv_nsec / 1000);
> > >     }
> > >
> > >     int main() {
> > >       /* unsigned, so the product below wraps instead of overflowing */
> > >       unsigned int j = 0;
> > >
> > >       struct timespec before, after;
> > >
> > >       clock_gettime(CLOCK_MONOTONIC, &before);
> > >       /* Super complex algorithm */
> > >       for (int i = 0; i < 100; i++) {
> > >         j = (j+1) * (j+2);
> > >       }
> > >       clock_gettime(CLOCK_MONOTONIC, &after);
> > >
> > >       printf("time to run algorithm %ld uSecs\n",
> > >              to_micro(&after) - to_micro(&before));
> > >
> > >       return 0;
> > >     }
> > >
> > > $ or1k-smh-linux-gnu-gcc timer.c
> > > $ ./glibc-build-scripts/qemu-or1k-libc ./a.out
> > > time to run algorithm 164 uSecs
> > >
> > > I hope it helps.
> > >
> > > -Stafford
> > >
> > > On Tue, Oct 08, 2019 at 10:54:29PM +1100, Julius Baxter wrote:
> > > > Hi,
> > > >
> > > > No problem.
> > > >
> > > > There are performance counters in the OpenRISC architecture but
> > > > whether they're implemented in a particular implementation is another matter.
> > > >
> > > > You can use these registers to measure various things the CPU is
> > > > doing while it's executing. If you read the ISA document it'll
> > > > tell you about them.
> > > >
> > > > I'm CCing Stafford because he's the main OpenRISC man these days
> > > > and probably knows about the state of the performance counter
> > > > registers in various simulators and RTL implementations.
> > > >
> > > > Cheers,
> > > > Julius
> > > >
> > > > On Tue., 8 Oct. 2019, 10:43 pm , <ecalvo@2se.es> wrote:
> > > >
> > > > > Hi Julius,
> > > > >
> > > > >
> > > > >
> > > > > Sorry for bothering you again ☹. Can I ask you another quick
> > > > > question related to OpenRISC? If not, please ignore this email.
> > > > >
> > > > >
> > > > >
> > > > > Is there any way to characterize the type of application that I
> > > > > can run on OpenRISC? I mean, can you measure (with numbers) whether
> > > > > an algorithm can be executed on it and the speed that it will achieve?
> > > > > Is it possible to do it using or1ksim?
> > > > >
> > > > >
> > > > >
> > > > > Sorry because maybe it is so basic and general ☹
> > > > >
> > > > >
> > > > >
> > > > > Thanks in advance
> > > > >
> > > > > Elisa
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > *From:* Julius Baxter <juliusbaxter@gmail.com>
> > > > > *Sent:* Monday, 16 September 2019 13:11
> > > > > *To:* ecalvo at 2se.es
> > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > > >
> > > > >
> > > > >
> > > > > Also! To let you know, if you're in Spain, we will soon be
> > > > > having our ORConf conference in Europe, and it's in Bordeaux,
> > > > > France, just across the border. :)
> > > > >
> > > > >
> > > > >
> > > > > There are several people there who can help you get up to speed,
> > > > > one of whom is Stafford Horne who knows most about the OpenRISC
> > > > > IP lately. He will be presenting. If you can attend, it'd be helpful, I'm sure.
> > > > >
> > > > >
> > > > >
> > > > > All info at https://orconf.org
> > > > >
> > > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Julius
> > > > >
> > > > >
> > > > >
> > > > > On Mon, 16 Sep 2019 at 21:09, Julius Baxter
> > > > > <juliusbaxter@gmail.com>
> > > > > wrote:
> > > > >
> > > > > Hi Elisa,
> > > > >
> > > > >
> > > > >
> > > > > Sorry for the delay in this response.
> > > > >
> > > > >
> > > > >
> > > > > You should be using an SoC toplevel. FPGAs have everything you
> > > > > need on board like memories and IO blocks and lots of other FPGA
> > > > > fabric for you to implement other pieces of hardware.
> > > > >
> > > > >
> > > > >
> > > > > FuseSoC provides a really nice and easy way to build an mor1kx
> > > > > design for the DE0 nano I believe:
> > > > >
> > > > >
> > > > >
> > > > > https://github.com/olofk/de0_nano
> > > > >
> > > > >
> > > > >
> > > > > That github page has a rough guide to getting it going.
> > > > >
> > > > >
> > > > >
> > > > > If you need help I recommend posting to the OpenRISC mailing
> > > > > list and people will probably respond more promptly than I. (I
> > > > > recommend getting to know how to use mailing lists.)
> > > > > https://openrisc.io/community
> > > > >
> > > > >
> > > > >
> > > > > There are more resources here: https://openrisc.io/tutorials
> > > > >
> > > > >
> > > > >
> > > > > I hope that's helpful.
> > > > >
> > > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Julius
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, 11 Sep 2019 at 20:09, <ecalvo@2se.es> wrote:
> > > > >
> > > > > Hi Julius,
> > > > >
> > > > >
> > > > >
> > > > > Thanks a lot for the quick answer.
> > > > >
> > > > >
> > > > >
> > > > > Yes, this is the problem: I am using the mor1kx module itself as
> > > > > the top level. You mean that I also need to synthesize these cores
> > > > > in reconfigurable logic, don't you? I thought that I
> > > > > could have these elements as external components on a development board.
> > > > >
> > > > >
> > > > >
> > > > > Thanks again,
> > > > >
> > > > > Cheers
> > > > >
> > > > > Elisa
> > > > >
> > > > >
> > > > >
> > > > > *From:* Julius Baxter <juliusbaxter@gmail.com>
> > > > > *Sent:* Wednesday, 11 September 2019 12:02
> > > > > *To:* ecalvo at 2se.es
> > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > > >
> > > > >
> > > > >
> > > > > Hi Elisa,
> > > > >
> > > > >
> > > > >
> > > > > Thanks for getting in touch, that sounds like a cool project.
> > > > >
> > > > >
> > > > >
> > > > > Can you tell me about the toplevel - are you using a system
> > > > > toplevel, or is your toplevel the mor1kx module itself?
> > > > >
> > > > >
> > > > >
> > > > > If it's the latter, then that's not the best way to do it - you
> > > > > need a system toplevel which instantiates memories and some
> > > > > reset circuitry and likely some IO (UART, GPIO, JTAG debug, etc.) to talk to the outside world.
> > > > >
> > > > >
> > > > >
> > > > > Is that helpful?
> > > > >
> > > > >
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Julius
> > > > >
> > > > >
> > > > >
> > > > > On Wed, 11 Sep 2019 at 19:47, <ecalvo@2se.es> wrote:
> > > > >
> > > > > Dear Dr. Baxter,
> > > > >
> > > > >
> > > > >
> > > > > My name is Elisa Calvo Gallego. I am writing to you because I have
> > > > > started to work with OpenRISC in the framework of a research
> > > > > project developed at the company where I work (Space
> > > > > Submicron Electronics, 2SE), and I am having some basic trouble. Could you help me?
> > > > >
> > > > >
> > > > >
> > > > > Although the FPGA that we are planning to use is larger, I have
> > > > > synthesized mor1kx for a DE0 nano board as a first step (this is
> > > > > the board used in the majority of guides and tutorials). My
> > > > > problem is that the results I have obtained are similar in
> > > > > area and resources, except for IOBs: the design needs more
> > > > > IOBs than the device has available. Do you know what I am doing wrong?
> > > > > Should I comment out the debug lines or something like that? I apologize
> > > > > if the question is trivial. I didn't find the answer and I'm new to this.
> > > > >
> > > > >
> > > > >
> > > > > Thanks very much in advance.
> > > > >
> > > > > Best regards,
> > > > >
> > > > >
> > > > >
> > > > > Elisa
> > > > >
> > > > >
> > >
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [OpenRISC] PCCR and PCRM registers
       [not found]             ` <20191023210532.GI24874@lianli.shorne-pla.net>
@ 2019-10-23 23:05               ` Stafford Horne
  2019-10-24  7:18                 ` ecalvo
  0 siblings, 1 reply; 7+ messages in thread
From: Stafford Horne @ 2019-10-23 23:05 UTC (permalink / raw)
  To: openrisc

+ cc OpenRISC list

On Thu, Oct 24, 2019, 6:05 AM Stafford Horne <shorne@gmail.com> wrote:

> Hi Elisa,
>
> On Wed, Oct 23, 2019 at 09:47:24AM +0200, ecalvo at 2se.es wrote:
> > Hi Stafford,
> >
> > Sorry for the delay in the answer. I have been out of the office.
> >
> > I am using mor1kx 4.1. What I have done is change FEATURE_PERFCOUNTERS
> > to ENABLED in orpsoc_top.v in the FuseSoC library. I have also changed
> > OPTION_PERFCOUNTERS_NUM, but I have not seen any change in the output.
>
> The versions I am using are (as reported by fusesoc):
>
> INFO: Preparing ::adv_debug_sys:3.1.0-r1
> INFO: Preparing ::cdc_utils:0.1
> INFO: Preparing ::elf-loader:1.0.2
> INFO: Preparing ::intgen:0
> INFO: Preparing ::jtag_tap:1.13-r1
> INFO: Preparing ::jtag_vpi:0-r4
> INFO: Preparing ::mor1kx:5.0-r3
> INFO: Preparing ::uart16550:1.5.5-r1
> INFO: Preparing ::verilog-arbiter:0-r2
> INFO: Preparing ::vlog_tb_utils:1.1
> INFO: Preparing ::wb_common:1.0.3
> INFO: Preparing ::wb_bfm:1.2.1
> INFO: Preparing ::wb_intercon:1.2.2
> INFO: Preparing ::wb_ram:1.1
> INFO: Preparing ::mor1kx-generic:1.1
>
> The changes I made in mor1kx-generic are:
>
> diff --git a/mor1kx-generic.core b/mor1kx-generic.core
> index afc3607..f3160b5 100644
> --- a/mor1kx-generic.core
> +++ b/mor1kx-generic.core
> @@ -10,7 +10,6 @@ filesets:
>    marocchino_modules:
>      depend:
>        - or1k_marocchino
> -
>    soc_files:
>      files:
>        - rtl/verilog/wb_intercon.vh: {is_include_file: true}
> @@ -31,7 +30,7 @@ filesets:
>      file_type: verilogSource
>      depend:
>        - elf-loader
> -      - "=jtag_vpi-r2"
> +      - ">=jtag_vpi-r2"
>        - ">=vlog_tb_utils-1.0"
>    verilator_tb_files:
>      files:
> diff --git a/rtl/verilog/orpsoc_top.v b/rtl/verilog/orpsoc_top.v
> index e2b04d6..4530e1d 100644
> --- a/rtl/verilog/orpsoc_top.v
> +++ b/rtl/verilog/orpsoc_top.v
> @@ -266,6 +266,8 @@ mor1kx #(
>         .OPTION_RF_NUM_SHADOW_GPR       (option_rf_num_shadow_gpr),
>         .IBUS_WB_TYPE                   ("B3_REGISTERED_FEEDBACK"),
>         .DBUS_WB_TYPE                   ("B3_REGISTERED_FEEDBACK"),
> +       .FEATURE_PERFCOUNTERS           ("ENABLED"),
> +       .OPTION_PERFCOUNTERS_NUM        (3),
>         .OPTION_CPU0                    (pipeline),
>         .OPTION_RESET_PC                (32'h00000100)
>  ) mor1kx0 (
>
>
> If using this version is not possible, then it would be best to look into
> traces and try to get some insight into where things are getting stuck.
>
> -Stafford
>
> > Output:
> > We have 1 program counters.
> >  (hung)
> >
> > I have checked that the features have been modified in the FuseSoC build files.
> >
> > Elisa
> >
> >
> >
> > -----Original Message-----
> > From: Stafford Horne <shorne@gmail.com>
> > Sent: Friday, 18 October 2019 16:41
> > To: ecalvo at 2se.es
> > CC: Openrisc <openrisc@lists.librecores.org>
> > Subject: Re: PCCR and PCRM registers
> >
> > Hello,
> >
> > Which core are you using?  With mor1kx I did not have an issue with
> > hanging.  I had to enable perf counters via parameters.
> >
> > I extended and posted my example code here:
> > https://gist.github.com/stffrdhrn/6343706cb1d8124bbac6bb579b6913b0
> >
> > The results look like:
> >
> >  Compile: or1k-elf-gcc or1k-perfcounters.c
> >  Run: fusesoc run --target mor1kx_tb mor1kx-generic --elf-load ./a.out
> >
> >  Example Output:
> >
> >  We have 4 program counters.
> >  PCCR before setup 0
> >  Run a printf, to generate instructions..
> >  IF 1825
> >  ICS 237
> >
> >  This shows that the printf took 1825 instructions and that the pipeline
> >  stalled 237 times due to instruction cache misses.
> >
> > On Fri, Oct 18, 2019 at 7:51 PM <ecalvo@2se.es> wrote:
> > >
> > > Hi Stafford,
> > >
> > > Sorry for bothering you again.
> > >
> > > The program hangs when accessing the PCMR and PCCR registers. Should I
> > > change other features apart from perfcounters?
> > >
> > > Elisa
> > >
> > > -----Original Message-----
> > > From: Stafford Horne <shorne@gmail.com>
> > > Sent: Thursday, 17 October 2019 23:42
> > > To: ecalvo at 2se.es
> > > CC: Openrisc <openrisc@lists.librecores.org>
> > > Subject: Re: PCCR and PCRM registers
> > >
> > > Hi Elisa,
> > >
> > > Right, these are the functions.  You only need to be concerned with:
> > >   OR1K_SPR_PERF_PCCR_ADDR(n)
> > >   OR1K_SPR_PERF_PCMR_ADDR(n)
> > >   OR1K_SPR_SYS_PCCFGR_ADDR
> > >   OR1K_SPR_SYS_PCCFGR_NPC_GET(x)
> > >
> > > The others are used internally.
> > > The PCCFGR is read-only; it specifies how many performance counters
> > > your CPU has built in.  Its NPC field may be 0-7.
> > >
> > > An example of how to use them:
> > >
> > > #include <or1k-support.h>
> > > #include <or1k-sprs.h>
> > > #include <stdio.h>
> > >
> > > #define PCMR_CISM (1<<2)  /* count in supervisor mode */
> > > #define PCMR_CIUM (1<<3)  /* count in user mode */
> > > #define PCMR_IF   (1<<6)  /* instruction fetch event */
> > >
> > > int main() {
> > >
> > >   int number_of_pcs;
> > >   uint32_t pcmr, pccr, upr, pccfgr;
> > >
> > >   /* Check if PCs are even available */
> > >   upr = or1k_mfspr(OR1K_SPR_SYS_UPR_ADDR);
> > >
> > >   if (OR1K_SPR_SYS_UPR_PCUP_GET(upr)) {
> > >
> > >     pccfgr = or1k_mfspr(OR1K_SPR_SYS_PCCFGR_ADDR);
> > >     number_of_pcs = OR1K_SPR_SYS_PCCFGR_NPC_GET(pccfgr) + 1;
> > >
> > >     printf ("We have %d program counters.\n", number_of_pcs);
> > >
> > >     pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
> > >     printf ("PCCR before setup %x\n", pccr);
> > >
> > >     /* Turn on counter and enable instruction fetch counting */
> > >     pcmr = or1k_mfspr(OR1K_SPR_PERF_PCMR_ADDR(0));
> > >     pcmr |= PCMR_CISM | PCMR_CIUM | PCMR_IF;
> > >     or1k_mtspr(OR1K_SPR_PERF_PCMR_ADDR(0), pcmr);
> > >
> > >     /* Read the PCCR after we are done */
> > >     printf ("Run a printf.");
> > >     pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
> > >     printf ("PCCR after printf %x\n", pccr);
> > >   } else {
> > >     printf ("No performance counters available.\n");
> > >   }
> > >
> > >   return 0;
> > > }
> > >
> > > On Fri, Oct 18, 2019 at 12:01 AM <ecalvo@2se.es> wrote:
> > > >
> > > > Hi Stafford,
> > > >
> > > >
> > > >
> > > > Yes, I am using newlib. I had found both files. These are the register
> > > > definitions in or1k-sprs.h:
> > > >
> > > >
> > > >
> > > > /******************************/
> > > > /* Performance Counters Group */
> > > > /******************************/
> > > > #define OR1K_SPR_PERF_GROUP 0x07
> > > >
> > > > /* Performance Counters Count Registers */
> > > > #define OR1K_SPR_PERF_PCCR_BASE     OR1K_UNSIGNED(0x000)
> > > > #define OR1K_SPR_PERF_PCCR_COUNT    OR1K_UNSIGNED(0x008)
> > > > #define OR1K_SPR_PERF_PCCR_STEP     OR1K_UNSIGNED(0x001)
> > > > #define OR1K_SPR_PERF_PCCR_INDEX(N) (OR1K_SPR_PERF_PCCR_BASE + ((N) * OR1K_SPR_PERF_PCCR_STEP))
> > > > #define OR1K_SPR_PERF_PCCR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCCR_INDEX(N))
> > > >
> > > > /* Performance Counters Mode Registers */
> > > > #define OR1K_SPR_PERF_PCMR_BASE     OR1K_UNSIGNED(0x008)
> > > > #define OR1K_SPR_PERF_PCMR_COUNT    OR1K_UNSIGNED(0x008)
> > > > #define OR1K_SPR_PERF_PCMR_STEP     OR1K_UNSIGNED(0x001)
> > > > #define OR1K_SPR_PERF_PCMR_INDEX(N) (OR1K_SPR_PERF_PCMR_BASE + ((N) * OR1K_SPR_PERF_PCMR_STEP))
> > > > #define OR1K_SPR_PERF_PCMR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCMR_INDEX(N))
> > > >
> > > > /* Performance Counters Configuration */
> > > > #define OR1K_SPR_SYS_PCCFGR_INDEX OR1K_UNSIGNED(0x008)
> > > > #define OR1K_SPR_SYS_PCCFGR_ADDR  OR1K_UNSIGNED(0x0008)
> > > >
> > > > /* Number of Performance Counters */
> > > > #define OR1K_SPR_SYS_PCCFGR_NPC_LSB    0
> > > > #define OR1K_SPR_SYS_PCCFGR_NPC_MSB    2
> > > > #define OR1K_SPR_SYS_PCCFGR_NPC_BITS   3
> > > > #define OR1K_SPR_SYS_PCCFGR_NPC_MASK   OR1K_UNSIGNED(0x00000007)
> > > > #define OR1K_SPR_SYS_PCCFGR_NPC_GET(X) (((X) >> 0) & OR1K_UNSIGNED(0x00000007))
> > > > #define OR1K_SPR_SYS_PCCFGR_NPC_SET(X, Y) (((X) & OR1K_UNSIGNED(0xfffffff8)) | ((Y) << 0))
> > > >
> > > >
> > > >
> > > > And these functions in or1k-support.h:
> > > >
> > > > static inline void or1k_mtspr (uint32_t spr, uint32_t value)
> > > > static inline uint32_t or1k_mfspr (uint32_t spr)
> > > >
> > > > Despite this, it is not clear to me how to use them.
> > > >
> > > >
> > > >
> > > > If I do or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, 0), does it allow
> > > > me to configure the PCCFGR for one performance counter?
> > > >
> > > > Is this the same as or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR,
> > > > OR1K_SPR_SYS_PCCFGR_NPC_LSB), or do OR1K_SPR_SYS_PCCFGR_NPC_LSB,
> > > > OR1K_SPR_SYS_PCCFGR_NPC_MSB, etc. provide different functions for each
> > > > performance counter?
> > > >
> > > > What is the meaning of PCCR_BASE, PCCR_COUNT, PCCR_STEP,
> > > > PCCR_INDEX(N) and PCCR_ADDR(N) (and the same for PCMR)? Is BASE the base
> > > > address of all the PCCRs and ADDR the position of each one of them? Why do
> > > > PCMR_BASE and COUNT have the same value, OR1K_UNSIGNED(0x008)? Should I
> > > > define PCCFGR first, PCMR second, and read PCCR last?
> > > >
> > > >
> > > >
> > > > Thanks and sorry for the inconveniences.
> > > >
> > > > Elisa
> > > >
> > > >
> > > >
> > > > From: Stafford Horne <shorne@gmail.com>
> > > > Sent: Thursday, 17 October 2019 7:14
> > > > To: ecalvo at 2se.es; Openrisc <openrisc@lists.librecores.org>
> > > > Subject: Re: PCCR and PCRM registers
> > > >
> > > >
> > > >
> > > > +cc mailing list,
> > > >
> > > >
> > > >
> > > > Hi Elisa,
> > > >
> > > >
> > > >
> > > > Which toolchain are you using? I guess newlib?
> > > >
> > > >
> > > >
> > > > It has functions like or1k_mfspr() and or1k_mtspr(); see the
> > > > or1k-support.h and or1k-sprs.h headers for details.
> > > >
> > > >
> > > >
> > > > -Stafford
> > > >
> > > > On Wed, Oct 16, 2019, 8:40 PM <ecalvo@2se.es> wrote:
> > > >
> > > > Hi Stafford,
> > > >
> > > > I am working with the PCCR and PCMR registers. I have seen that I can
> > > > access them from assembly, but are there functions to access them from C?
> > > > Do you have any examples of their usage?
> > > >
> > > > I have already confirmed my subscription to the mailing list.
> > > >
> > > > Thanks
> > > > Elisa
> > > >
> > > > -----Original Message-----
> > > > From: Stafford Horne <shorne@gmail.com>
> > > > Sent: Wednesday, 9 October 2019 13:38
> > > > To: ecalvo at 2se.es; Julius Baxter <juliusbaxter@gmail.com>
> > > > Subject: Re: other doubt
> > > >
> > > > Hello Elisa,
> > > >
> > > > If you simulate with Icarus or modelsim you will be able to measure
> > > > pretty much the same performance characteristics as on an FPGA. So there
> > > > is no need to go straight to FPGA.
> > > >
> > > > As for my example, C code is one option.  You can also read
> > > > timer data directly from the tick timer in assembly and achieve the same
> > > > thing.
> > > >
> > > > If you are interested we can CC the mailing list and get more
> > > > opinions.
> > > >
> > > > -Stafford
> > > >
> > > > On Wed, Oct 9, 2019 at 5:09 PM <ecalvo@2se.es> wrote:
> > > > >
> > > > > Hi Stafford,
> > > > >
> > > > > Nice to meet you and, first of all, thanks a lot for your
> > > > > guidance. I am new to this, and although there is some documentation,
> > > > > it is sometimes hard to work out points that may be basic.
> > > > >
> > > > > OK, to your comments. If "A simulator like QEMU or or1ksim will
> > > > > not give an exact representation of the CPU's real-time
> > > > > performance", then if I simulate the processor directly with modelsim,
> > > > > icarus or a similar tool, I do not get real performance either, do I?
> > > > > And the values of the counters that you tell me to enable are not real
> > > > > either, are they? Should I run it directly on the FPGA, and will the
> > > > > result depend on the implementation?
> > > > >
> > > > > OK, regarding the C code: I have understood the dependency on the toolchain.
> > > > >
> > > > > Thanks a lot again.
> > > > > Best regards,
> > > > > Elisa
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Stafford Horne <shorne@gmail.com>
> > > > > > Sent: Tuesday, 8 October 2019 16:18
> > > > > > To: Julius Baxter <juliusbaxter@gmail.com>
> > > > > > CC: ecalvo at 2se.es
> > > > > > Subject: Re: other doubt
> > > > >
> > > > > Hi Elisa,
> > > > >
> > > > > > OpenRISC CPUs can run any algorithm, but how well it performs
> > > > > > depends on many things:
> > > > > >
> > > > > >   - Compiler optimization flags (e.g. -O3)
> > > > > >   - Whether or not you are doing FPU instructions and have FPU
> > > > > >     enabled
> > > > > >   - Whether or not you use multiply and divide and have these
> > > > > >     instructions enabled
> > > > > >   - The frequency you are running at
> > > > > >   - Cache settings (icache, dcache)
> > > > > >   - The type of algorithm: does it require lots of data, which will
> > > > > >     cause many cache misses?
> > > > > >
> > > > > > A simulator like QEMU or or1ksim will not give an exact
> > > > > > representation of the CPU's real-time performance.  It can tell you which
> > > > > > instructions will be executed, but not how fast they will run or how many
> > > > > > pipeline stalls or cache misses will happen.
> > > > > >
> > > > > > You can use the performance counters; they are supported in mor1kx
> > > > > > if you enable them with the FEATURE_PERFCOUNTERS='ENABLED' parameter.  They
> > > > > > can help count how many events happen between certain events.  Then you can
> > > > > > combine them with a timer and watchpoints to detect how many times a loop
> > > > > > can execute in 1000 clock cycles etc.  Please read about PCCRn and PCMRn in
> > > > > > the architecture manual.
> > > > > >
> > > > > > It might be just as easy to use simple timing in a C program,
> > > > > > though. Depending on the toolchain you use, you can compare times between
> > > > > > runs of your algorithm.
> > > > > > For example:
> > > > >
> > > > > >     #include <time.h>
> > > > > >     #include <stdio.h>
> > > > > >
> > > > > >     /* Convert a timespec to microseconds. */
> > > > > >     static long to_micro(struct timespec *time) {
> > > > > >       return (time->tv_sec * 1000000) + (time->tv_nsec / 1000);
> > > > > >     }
> > > > > >
> > > > > >     int main() {
> > > > > >       /* unsigned, so the product below wraps instead of overflowing */
> > > > > >       unsigned int j = 0;
> > > > > >
> > > > > >       struct timespec before, after;
> > > > > >
> > > > > >       clock_gettime(CLOCK_MONOTONIC, &before);
> > > > > >       /* Super complex algorithm */
> > > > > >       for (int i = 0; i < 100; i++) {
> > > > > >         j = (j+1) * (j+2);
> > > > > >       }
> > > > > >       clock_gettime(CLOCK_MONOTONIC, &after);
> > > > > >
> > > > > >       printf("time to run algorithm %ld uSecs\n",
> > > > > >              to_micro(&after) - to_micro(&before));
> > > > > >
> > > > > >       return 0;
> > > > > >     }
> > > > > >
> > > > > > $ or1k-smh-linux-gnu-gcc timer.c
> > > > > > $ ./glibc-build-scripts/qemu-or1k-libc ./a.out
> > > > > > time to run algorithm 164 uSecs
> > > > >
> > > > > I hope it helps.
> > > > >
> > > > > -Stafford
> > > > >
> > > > > On Tue, Oct 08, 2019 at 10:54:29PM +1100, Julius Baxter wrote:
> > > > > > Hi,
> > > > > >
> > > > > > No problem.
> > > > > >
> > > > > > There are performance counters in the OpenRISC architecture but
> > > > > > whether they're implemented in a particular implementation is
> > > > > > > another matter.
> > > > > >
> > > > > > You can use these registers to measure various things the CPU is
> > > > > > doing while it's executing. If you read the ISA document it'll
> > > > > > tell you about them.
> > > > > >
> > > > > > I'm CCing Stafford because he's the main OpenRISC man these days
> > > > > > and probably knows about the state of the performance counter
> > > > > > registers in various simulators and RTL implementations.
> > > > > >
> > > > > > Cheers,
> > > > > > Julius
> > > > > >
> > > > > > On Tue., 8 Oct. 2019, 10:43 pm , <ecalvo@2se.es> wrote:
> > > > > >
> > > > > > > Hi Julius,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Sorry for bothering you again ☹. Can I ask you another quick
> > > > > > > > question related to OpenRISC? If not, please ignore this email.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Is there any way to characterize the type of application that
> > > > > > > > I can run on OpenRISC? I mean, can you measure (with
> > > > > > > > numbers) whether an algorithm can be executed on it and the speed
> > > > > > > > that it will achieve?
> > > > > > > > Is it possible to do it using or1ksim?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Sorry because maybe it is so basic and general ☹
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks in advance
> > > > > > >
> > > > > > > Elisa
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > *From:* Julius Baxter <juliusbaxter@gmail.com>
> > > > > > > > *Sent:* Monday, 16 September 2019 13:11
> > > > > > > > *To:* ecalvo at 2se.es
> > > > > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Also! To let you know, if you're in Spain, we will soon be
> > > > > > > having our ORConf conference in Europe, and it's in Bordeaux,
> > > > > > > France, just across the border. :)
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > There are several people there who can help you get up to
> > > > > > > speed, one of whom is Stafford Horne who knows most about the
> > > > > > > OpenRISC IP lately. He will be presenting. If you can attend,
> > > > > > > > it'd be helpful, I'm sure.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > All info at https://orconf.org
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Julius
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, 16 Sep 2019 at 21:09, Julius Baxter
> > > > > > > <juliusbaxter@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > Hi Elisa,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Sorry for the delay in this response.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > You should be using an SoC toplevel. FPGAs have everything you
> > > > > > > need on board like memories and IO blocks and lots of other
> > > > > > > FPGA fabric for you to implement other pieces of hardware.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > FuseSoC provides a really nice and easy way to build an mor1kx
> > > > > > > design for the DE0 nano I believe:
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > https://github.com/olofk/de0_nano
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > That github page has a rough guide to getting it going.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > If you need help I recommend posting to the OpenRISC mailing
> > > > > > > > list and people will probably respond more promptly than I. (I
> > > > > > > > recommend getting to know how to use mailing lists.)
> > > > > > > > https://openrisc.io/community
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > There are more resources here: https://openrisc.io/tutorials
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I hope that's helpful.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Julius
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 11 Sep 2019 at 20:09, <ecalvo@2se.es> wrote:
> > > > > > >
> > > > > > > Hi Julius,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks a lot for the quick answer.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Yes, this is the problem: I am using the mor1kx module itself as
> > > > > > > > the top level. You mean that I also need to synthesize these cores
> > > > > > > > in reconfigurable logic, don't you? I thought that I
> > > > > > > > could have these elements as external components on a development board.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks again,
> > > > > > >
> > > > > > > Cheers
> > > > > > >
> > > > > > > Elisa
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > *From:* Julius Baxter <juliusbaxter@gmail.com>
> > > > > > > > *Sent:* Wednesday, 11 September 2019 12:02
> > > > > > > > *To:* ecalvo at 2se.es
> > > > > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Hi Elisa,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks for getting in touch, that sounds like a cool project.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Can you tell me about the toplevel - are you using a system
> > > > > > > toplevel, or is your toplevel the mor1kx module itself?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > If it's the latter, then that's not the best way to do it -
> > > > > > > you need a system toplevel which instantiates memories and
> > > > > > > some reset circuitry and likely some IO (UART, GPIO, JTAG
> > > > > > > > debug, etc.) to talk to the outside world.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Is that helpful?
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Julius
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 11 Sep 2019 at 19:47, <ecalvo@2se.es> wrote:
> > > > > > >
> > > > > > > Dear Dr. Baxter,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > My name is Elisa Calvo Gallego. I am writing to you because I
> > > > > > > > have started to work with OpenRISC in the framework of a
> > > > > > > > research project developed at the company where I work
> > > > > > > > (Space Submicron Electronics, 2SE), and I am having some basic
> > > > > > > > trouble. Could you help me?
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Although the FPGA that we are planning to use is larger, I
> > > > > > > > have synthesized mor1kx for a DE0 nano board as a first step
> > > > > > > > (this is the board used in the majority of guides and
> > > > > > > > tutorials). My problem is that the results I have
> > > > > > > > obtained are similar in area and resources, except for IOBs:
> > > > > > > > the design needs more IOBs than the device has available. Do you
> > > > > > > > know what I am doing wrong?
> > > > > > > > Should I comment out the debug lines or something like that? I
> > > > > > > > apologize if the question is trivial. I didn't find the
> > > > > > > > answer and I'm new to this.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Thanks very much in advance.
> > > > > > >
> > > > > > > Best regards,
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Elisa
> > > > > > >
> > > > > > >
> > > > >
> > >
> >
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [OpenRISC] PCCR and PCRM registers
  2019-10-23 23:05               ` Stafford Horne
@ 2019-10-24  7:18                 ` ecalvo
  0 siblings, 0 replies; 7+ messages in thread
From: ecalvo @ 2019-10-24  7:18 UTC (permalink / raw)
  To: openrisc

OK. I will check the version numbers first and, if that does not help, I will look into the traces.

I will get back to you if I find the problem.

Thanks a lot Stafford.  

-Elisa

 

From: Stafford Horne <shorne@gmail.com>
Sent: Thursday, 24 October 2019 1:06
To: ecalvo at 2se.es
CC: Openrisc <openrisc@lists.librecores.org>
Subject: Re: PCCR and PCRM registers

 

+ cc OpenRISC list

 

On Thu, Oct 24, 2019, 6:05 AM Stafford Horne <shorne at gmail.com> wrote:

Hi Elisa,

On Wed, Oct 23, 2019 at 09:47:24AM +0200, ecalvo at 2se.es wrote:
> Hi Stafford, 
> 
> Sorry for the delay in the answer. I have been out of the office. 
> 
> I am using mor1kx 4.1. I changed FEATURE_PERFCOUNTERS to ENABLED in orpsoc_top.v in the FuseSoC library. I also changed OPTION_PERFCOUNTERS_NUM, but I have not seen any change in the output.

The versions I am using are (as reported by fusesoc):

INFO: Preparing ::adv_debug_sys:3.1.0-r1
INFO: Preparing ::cdc_utils:0.1
INFO: Preparing ::elf-loader:1.0.2
INFO: Preparing ::intgen:0
INFO: Preparing ::jtag_tap:1.13-r1
INFO: Preparing ::jtag_vpi:0-r4
INFO: Preparing ::mor1kx:5.0-r3
INFO: Preparing ::uart16550:1.5.5-r1
INFO: Preparing ::verilog-arbiter:0-r2
INFO: Preparing ::vlog_tb_utils:1.1
INFO: Preparing ::wb_common:1.0.3
INFO: Preparing ::wb_bfm:1.2.1
INFO: Preparing ::wb_intercon:1.2.2
INFO: Preparing ::wb_ram:1.1
INFO: Preparing ::mor1kx-generic:1.1

The changes I made in mor1kx-generic are:

diff --git a/mor1kx-generic.core b/mor1kx-generic.core
index afc3607..f3160b5 100644
--- a/mor1kx-generic.core
+++ b/mor1kx-generic.core
@@ -10,7 +10,6 @@ filesets:
   marocchino_modules:
     depend:
       - or1k_marocchino
-
   soc_files:
     files:
       - rtl/verilog/wb_intercon.vh: {is_include_file: true}
@@ -31,7 +30,7 @@ filesets:
     file_type: verilogSource
     depend:
       - elf-loader
-      - "=jtag_vpi-r2"
+      - ">=jtag_vpi-r2"
       - ">=vlog_tb_utils-1.0"
   verilator_tb_files:
     files:
diff --git a/rtl/verilog/orpsoc_top.v b/rtl/verilog/orpsoc_top.v
index e2b04d6..4530e1d 100644
--- a/rtl/verilog/orpsoc_top.v
+++ b/rtl/verilog/orpsoc_top.v
@@ -266,6 +266,8 @@ mor1kx #(
        .OPTION_RF_NUM_SHADOW_GPR       (option_rf_num_shadow_gpr),
        .IBUS_WB_TYPE                   ("B3_REGISTERED_FEEDBACK"),
        .DBUS_WB_TYPE                   ("B3_REGISTERED_FEEDBACK"),
+       .FEATURE_PERFCOUNTERS           ("ENABLED"),
+       .OPTION_PERFCOUNTERS_NUM        (3),
        .OPTION_CPU0                    (pipeline),
        .OPTION_RESET_PC                (32'h00000100)
 ) mor1kx0 (


If you cannot use this version, it would be best to look into the traces
and try to find out where things are getting stuck.

-Stafford

> Output: 
> We have 1 program counters. 
>  (hangs here)
>
> I have checked that the features have been modified in the FuseSoC build files.
> 
> Elisa
> 
> 
> 
> -----Original Message-----
> From: Stafford Horne <shorne at gmail.com>
> Sent: Friday, 18 October 2019 16:41
> To: ecalvo at 2se.es
> CC: Openrisc <openrisc at lists.librecores.org>
> Subject: Re: PCCR and PCRM registers
> 
> Hello,
> 
> Which core are you using?  With mor1kx I did not have an issue with hanging.  I had to enable perf counters via parameters.
> 
> I extended and posted my example code here:
> https://gist.github.com/stffrdhrn/6343706cb1d8124bbac6bb579b6913b0
> 
> The results look like:
> 
>  Compile: or1k-elf-gcc or1k-perfcounters.c
>  Run: fusesoc run --target mor1kx_tb mor1kx-generic --elf-load ./a.out
> 
>  Example Output:
> 
>  We have 4 program counters.
>  PCCR before setup 0
>  Run a printf, to generate instructions..
>  IF 1825
>  ICS 237
> 
>  This shows that the printf fetched 1825 instructions and that the pipeline stalled 237 times due to instruction cache misses.
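
To compare runs of different lengths, the two counts above can be normalised. The following is only a sketch: the function name is hypothetical and not part of newlib or the posted gist, and it assumes the two values were read from PCCR registers configured as in the example.

```c
#include <stdint.h>

/* Hypothetical helper (not part of newlib or the gist): stall events per
 * 1000 fetched instructions, computed from two PCCR readings. */
static double stalls_per_kilo_instr(uint32_t instr_fetches,
                                    uint32_t stall_events)
{
    if (instr_fetches == 0) {
        return 0.0;  /* avoid dividing by zero when nothing was counted */
    }
    return 1000.0 * (double)stall_events / (double)instr_fetches;
}
```

With the sample output above (IF 1825, ICS 237) this gives roughly 130 stalls per 1000 fetched instructions.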
> 
> On Fri, Oct 18, 2019 at 7:51 PM <ecalvo at 2se.es> wrote:
> >
> > Hi Stafford,
> >
> > Sorry for bothering you again.
> >
> > The program hangs when accessing the PCMR and PCCR registers. Should I change any other features apart from the perfcounters?
> >
> > Elisa
> >
> > -----Original Message-----
> > From: Stafford Horne <shorne at gmail.com>
> > Sent: Thursday, 17 October 2019 23:42
> > To: ecalvo at 2se.es
> > CC: Openrisc <openrisc at lists.librecores.org>
> > Subject: Re: PCCR and PCRM registers
> >
> > Hi Elisa,
> >
> > Right, these are the functions.  You only need to be concerned with:
> >   OR1K_SPR_PERF_PCCR_ADDR(n)
> >   OR1K_SPR_PERF_PCMR_ADDR(n)
> >   OR1K_SPR_SYS_PCCFGR_ADDR
> >   OR1K_SPR_SYS_PCCFGR_NPC_GET(x)
> >
> > The others are used internally.
> > The PCCFGR is read-only; it specifies how many performance counters your CPU has built in.  Its NPC field holds a value from 0 to 7; the example below adds 1 to turn it into a count.
> >
> > An example of how to use them:
> >
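> > #include <or1k-support.h>
> > #include <or1k-sprs.h>
> > #include <stdint.h>
> > #include <stdio.h>
> >
> > /* PCMR bits: count in supervisor mode, count in user mode,
> >    count instruction fetches */
> > #define PCMR_CISM (1 << 2)
> > #define PCMR_CIUM (1 << 3)
> > #define PCMR_IF   (1 << 6)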
> >
> > int main() {
> >
> >   int number_of_pcs;
> >   uint32_t pcmr, pccr, upr, pccfgr;
> >
> >   /* Check if PCs are even available */
> >   upr = or1k_mfspr(OR1K_SPR_SYS_UPR_ADDR);
> >
> >   if (OR1K_SPR_SYS_UPR_PCUP_GET(upr)) {
> >
> >     pccfgr = or1k_mfspr(OR1K_SPR_SYS_PCCFGR_ADDR);
> >     number_of_pcs = OR1K_SPR_SYS_PCCFGR_NPC_GET(pccfgr) + 1;
> >
> >     printf ("We have %d program counters.\n", number_of_pcs);
> >
> >     pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
> >     printf ("PCCR before setup %x\n", pccr);
> >
> >     /* Turn on counter and enable instruction fetch counting */
> >     pcmr = or1k_mfspr(OR1K_SPR_PERF_PCMR_ADDR(0));
> >     pcmr |= PCMR_CISM | PCMR_CIUM | PCMR_IF;
> >     or1k_mtspr(OR1K_SPR_PERF_PCMR_ADDR(0), pcmr);
> >
> >     /* Read the PCCR after we are done */
> >     printf ("Run a printf.");
> >     pccr = or1k_mfspr(OR1K_SPR_PERF_PCCR_ADDR(0));
> >     printf ("PCCR after printf %x\n", pccr);
> >   } else {
> >     printf ("No performance counters available.\n");
> >   }
> >
> >   return 0;
> > }
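
The ADDR macros used in the example combine an SPR group number with a register index. The sketch below shows that composition on its own so it can be checked on any host. All names here are hypothetical stand-ins for the or1k-sprs.h macros, and the group shift of 11 (the value of OR1K_SPR_GROUP_LSB) is an assumption that does not appear in the header excerpt quoted in this thread.

```c
#include <stdint.h>

/* Assumed SPR address layout: group number in the upper bits, register
 * index in the lower 11 bits. The shift stands in for OR1K_SPR_GROUP_LSB. */
enum {
    SPR_GROUP_LSB  = 11,    /* assumption, see lead-in */
    SPR_GROUP_PERF = 0x07   /* OR1K_SPR_PERF_GROUP in the quoted header */
};

/* Combine a group number and an index into one SPR address. */
static uint32_t spr_addr(uint32_t group, uint32_t index)
{
    return (group << SPR_GROUP_LSB) | index;
}

/* PCCRn: BASE 0x000 plus n * STEP (STEP is 1). */
static uint32_t pccr_addr(uint32_t n)
{
    return spr_addr(SPR_GROUP_PERF, 0x000u + n);
}

/* PCMRn: BASE 0x008 plus n * STEP (STEP is 1). */
static uint32_t pcmr_addr(uint32_t n)
{
    return spr_addr(SPR_GROUP_PERF, 0x008u + n);
}

/* Counter count derived from PCCFGR the way the example does it:
 * the 3-bit NPC field plus one. */
static unsigned counters_from_pccfgr(uint32_t pccfgr)
{
    return (unsigned)(pccfgr & 0x7u) + 1u;
}
```

Under these assumptions BASE is the index of the first register in a bank, COUNT is how many registers the bank has, and ADDR(n) is the value to pass to or1k_mfspr() or or1k_mtspr(). PCMR_BASE and the COUNT values are both 8 simply because the eight PCMR registers start right after the eight PCCR registers.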
> >
> > On Fri, Oct 18, 2019 at 12:01 AM <ecalvo at 2se.es> wrote:
> > >
> > > Hi Stafford,
> > >
> > >
> > >
> > > Yes, I am using newlib. I had found both files. These register
> > > definitions are in or1k-sprs.h:
> > >
> > >
> > >
> > > /******************************/
> > > /* Performance Counters Group */
> > > /******************************/
> > > #define OR1K_SPR_PERF_GROUP 0x07
> > >
> > > /* Performance Counters Count Registers */
> > > #define OR1K_SPR_PERF_PCCR_BASE     OR1K_UNSIGNED(0x000)
> > > #define OR1K_SPR_PERF_PCCR_COUNT    OR1K_UNSIGNED(0x008)
> > > #define OR1K_SPR_PERF_PCCR_STEP     OR1K_UNSIGNED(0x001)
> > > #define OR1K_SPR_PERF_PCCR_INDEX(N) (OR1K_SPR_PERF_PCCR_BASE + ((N) * OR1K_SPR_PERF_PCCR_STEP))
> > > #define OR1K_SPR_PERF_PCCR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCCR_INDEX(N))
> > >
> > > /* Performance Counters Mode Registers */
> > > #define OR1K_SPR_PERF_PCMR_BASE     OR1K_UNSIGNED(0x008)
> > > #define OR1K_SPR_PERF_PCMR_COUNT    OR1K_UNSIGNED(0x008)
> > > #define OR1K_SPR_PERF_PCMR_STEP     OR1K_UNSIGNED(0x001)
> > > #define OR1K_SPR_PERF_PCMR_INDEX(N) (OR1K_SPR_PERF_PCMR_BASE + ((N) * OR1K_SPR_PERF_PCMR_STEP))
> > > #define OR1K_SPR_PERF_PCMR_ADDR(N)  ((OR1K_SPR_PERF_GROUP << OR1K_SPR_GROUP_LSB) | OR1K_SPR_PERF_PCMR_INDEX(N))
> > >
> > > /* Performance Counters Configuration */
> > > #define OR1K_SPR_SYS_PCCFGR_INDEX OR1K_UNSIGNED(0x008)
> > > #define OR1K_SPR_SYS_PCCFGR_ADDR  OR1K_UNSIGNED(0x0008)
> > >
> > > /* Number of Performance Counters */
> > > #define OR1K_SPR_SYS_PCCFGR_NPC_LSB    0
> > > #define OR1K_SPR_SYS_PCCFGR_NPC_MSB    2
> > > #define OR1K_SPR_SYS_PCCFGR_NPC_BITS   3
> > > #define OR1K_SPR_SYS_PCCFGR_NPC_MASK   OR1K_UNSIGNED(0x00000007)
> > > #define OR1K_SPR_SYS_PCCFGR_NPC_GET(X) (((X) >> 0) & OR1K_UNSIGNED(0x00000007))
> > > #define OR1K_SPR_SYS_PCCFGR_NPC_SET(X, Y) (((X) & OR1K_UNSIGNED(0xfffffff8)) | ((Y) << 0))
> > >
> > >
> > >
> > > And these functions are in or1k-support.h:
> > >
> > > static inline void or1k_mtspr (uint32_t spr, uint32_t value)
> > > static inline uint32_t or1k_mfspr (uint32_t spr)
> > >
> > > Despite this, it is not clear to me how to use them.
> > >
> > >
> > >
> > > If I call or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, 0), does that configure PCCFGR for one performance counter?
> > >
> > > Is this the same as or1k_mtspr(OR1K_SPR_SYS_PCCFGR_ADDR, OR1K_SPR_SYS_PCCFGR_NPC_LSB), or do OR1K_SPR_SYS_PCCFGR_NPC_LSB, OR1K_SPR_SYS_PCCFGR_NPC_MSB, etc. select different functions for each performance counter?
> > >
> > > What do PCCR_BASE, PCCR_COUNT, PCCR_STEP, PCCR_INDEX(N) and PCCR_ADDR(N) mean (and the same for PCMR)? Is BASE the base address of all the PCCRs and ADDR the position of each one? Why do PCMR_BASE and PCMR_COUNT have the same value, OR1K_UNSIGNED(0x008)? Should I set up PCCFGR first, then PCMR, and finally read PCCR?
> > >
> > >
> > >
> > > Thanks, and sorry for the inconvenience.
> > >
> > > Elisa
> > >
> > >
> > >
> > > From: Stafford Horne <shorne at gmail.com>
> > > Sent: Thursday, 17 October 2019 7:14
> > > To: ecalvo at 2se.es; Openrisc <openrisc at lists.librecores.org>
> > > Subject: Re: PCCR and PCRM registers
> > >
> > >
> > >
> > > +cc mailing list,
> > >
> > >
> > >
> > > Hi Elisa,
> > >
> > >
> > >
> > > Which toolchain are you using? I guess newlib?
> > >
> > >
> > >
> > > It has functions like or1k_mfspr() and or1k_mtspr() see or1k-support.h and or1k-sprs.h headers for details.
> > >
> > >
> > >
> > > -Stafford
> > >
> > > On Wed, Oct 16, 2019, 8:40 PM <ecalvo at 2se.es> wrote:
> > >
> > > Hi Stafford,
> > >
> > > I am working with the PCCR and PCMR registers. I have seen that I can access them from assembly language, but are there functions to access them from C? Do you have any example of their usage?
> > >
> > > I have already confirmed my subscription to the mailing list.
> > >
> > > Thanks
> > > Elisa
> > >
> > > -----Original Message-----
> > > From: Stafford Horne <shorne at gmail.com>
> > > Sent: Wednesday, 9 October 2019 13:38
> > > To: ecalvo at 2se.es; Julius Baxter <juliusbaxter at gmail.com>
> > > Subject: Re: other doubt
> > >
> > > Hello Elisa,
> > >
> > > If you simulate with Icarus or modelsim you will be able to measure pretty much the same performance characteristics as FPGA. So there is no need to go straight to FPGA.
> > >
> > > As for my example, the C code is one option.  You can also read timer data directly from the tick timer in assembly and achieve the same thing.
> > >
> > > If you are interested we can CC the mailing list and get more opinions.
> > >
> > > -Stafford
> > >
> > > On Wed, Oct 9, 2019 at 5:09 PM <ecalvo at 2se.es> wrote:
> > > >
> > > > Hi Stafford,
> > > >
> > > > Nice to meet you and, first of all, thanks a lot for your guidance. I am new to this and, although there is some documentation, some points that may be basic are sometimes difficult to work out.
> > > >
> > > > OK, on your comments. If "a simulator like QEMU or or1ksim will not give an exact representation of the CPU's real-time performance", then if I simulate the processor directly with ModelSim, Icarus or a similar tool, I will not get real performance either, will I? And the values of the counters that you tell me to enable will not be real either, will they? Should I run it directly on the FPGA, and will the result depend on the implementation?
> > > >
> > > > OK, on the C code: I understand the dependency on the toolchain.
> > > >
> > > > Thanks a lot again.
> > > > Best regards,
> > > > Elisa
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Stafford Horne <shorne at gmail.com>
> > > > Sent: Tuesday, 8 October 2019 16:18
> > > > To: Julius Baxter <juliusbaxter at gmail.com>
> > > > CC: ecalvo at 2se.es
> > > > Subject: Re: other doubt
> > > >
> > > > Hi Elisa,
> > > >
> > > > OpenRISC CPUs can run any algorithm, but how well it will perform depends on many things:
> > > >
> > > >   - Compiler optimization flags (e.g. -O3)
> > > >   - Whether or not you are doing FPU instructions and have the FPU enabled
> > > >   - Whether or not you use multiply and divide and have these instructions
> > > >     enabled
> > > >   - The frequency you are running at
> > > >   - Cache settings (instruction cache and data cache)
> > > >   - The type of algorithm: does it require lots of data, which will cause many
> > > >     cache misses?
> > > >
> > > > A simulator like QEMU or or1ksim will not give an exact representation of the CPU's real-time performance.  It can tell you which instructions will be executed, but not how fast they will run or how many pipeline stalls or cache misses will happen.
> > > >
> > > > You can use the performance counters; they are supported in mor1kx if you enable them with the FEATURE_PERFCOUNTERS='ENABLED' parameter.  They can help count how many events happen between certain events.  Then you can combine them with a timer and watchpoints to detect, for example, how many times a loop can execute in 1000 clock cycles.  Please read about PCCRn and PCMRn in the architecture manual.
> > > >
> > > > It might be just as easy to use simple timing in a C program though; depending on the toolchain you use, you can compare times between runs of your algorithm,
> > > > e.g.:
> > > >
> > > >     #include <time.h>
> > > >     #include <stdio.h>
> > > >
> > > >     static long to_micro(struct timespec *time) {
> > > >       return (time->tv_sec * 1000000) + (time->tv_nsec / 1000);
> > > >     }
> > > >
> > > >     int main() {
> > > >       unsigned int j = 0;  /* unsigned, so wraparound is well defined */
> > > >
> > > >       struct timespec before, after;
> > > >
> > > >       clock_gettime(CLOCK_MONOTONIC, &before);
> > > >       /* Super complex algorithm */
> > > >       for (int i = 0; i < 100; i++) {
> > > >         j = (j+1) * (j+2);
> > > >       }
> > > >       clock_gettime(CLOCK_MONOTONIC, &after);
> > > >
> > > >       printf("time to run algorithm %ld uSecs\n",
> > > >              to_micro(&after) - to_micro(&before));
> > > >
> > > >       return 0;
> > > >     }
> > > >
> > > > $ or1k-smh-linux-gnu-gcc timer.c
> > > > $ ./glibc-build-scripts/qemu-or1k-libc ./a.out
> > > > time to run algorithm 164 uSecs
> > > >
> > > > I hope it helps.
> > > >
> > > > -Stafford
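
The to_micro() helper in the example multiplies tv_sec by 1000000 in a long, which on a target with a 32-bit long would overflow once tv_sec exceeds about 2147 seconds. A possible variant, offered only as a sketch with a hypothetical function name, takes the difference first and does the arithmetic in 64 bits:

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical replacement for to_micro(after) - to_micro(before):
 * subtract the two samples first, then scale, using 64-bit arithmetic so
 * the result does not depend on the width of long. */
static int64_t elapsed_usec(const struct timespec *start,
                            const struct timespec *end)
{
    int64_t ns = ((int64_t)end->tv_sec - (int64_t)start->tv_sec) * 1000000000LL
               + ((int64_t)end->tv_nsec - (int64_t)start->tv_nsec);
    return ns / 1000;  /* nanoseconds to microseconds */
}
```

It would be called as elapsed_usec(&before, &after) and printed with a 64-bit format such as PRId64 from <inttypes.h>.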
> > > >
> > > > On Tue, Oct 08, 2019 at 10:54:29PM +1100, Julius Baxter wrote:
> > > > > Hi,
> > > > >
> > > > > No problem.
> > > > >
> > > > > There are performance counters in the OpenRISC architecture but 
> > > > > whether they're implemented in a particular implementation is another matter.
> > > > >
> > > > > You can use these registers to measure various things the CPU is 
> > > > > doing while it's executing. If you read the ISA document it'll 
> > > > > tell you about them.
> > > > >
> > > > > I'm CCing Stafford because he's the main OpenRISC man these days 
> > > > > and probably knows about the state of the performance counter 
> > > > > registers in various simulators and RTL implementations.
> > > > >
> > > > > Cheers,
> > > > > Julius
> > > > >
> > > > > On Tue., 8 Oct. 2019, 10:43 pm, <ecalvo at 2se.es> wrote:
> > > > >
> > > > > > Hi Julius,
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sorry for bothering you again ☹. May I ask you another quick
> > > > > > question related to OpenRISC? If not, please ignore this email.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Is there any way to characterize the type of application that
> > > > > > I can run on OpenRISC? I mean, can one measure (with numbers)
> > > > > > whether an algorithm can be executed on it and the speed it
> > > > > > will achieve? Is it possible to do this using or1ksim?
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sorry if this is too basic and general ☹
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks in advance
> > > > > >
> > > > > > Elisa
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > *From:* Julius Baxter <juliusbaxter at gmail.com>
> > > > > > *Sent:* Monday, 16 September 2019 13:11
> > > > > > *To:* ecalvo at 2se.es
> > > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > > > >
> > > > > >
> > > > > >
> > > > > > Also! To let you know, if you're in Spain, we will soon be 
> > > > > > having our ORConf conference in Europe, and it's in Bordeaux, 
> > > > > > France, just across the border. :)
> > > > > >
> > > > > >
> > > > > >
> > > > > > There are several people there who can help you get up to 
> > > > > > speed, one of whom is Stafford Horne who knows most about the 
> > > > > > OpenRISC IP lately. He will be presenting. If you can attend, it'd be helpful, I'm sure.
> > > > > >
> > > > > >
> > > > > >
> > > > > > All info at https://orconf.org
> > > > > >
> > > > > >
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Julius
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, 16 Sep 2019 at 21:09, Julius Baxter 
> > > > > > <juliusbaxter at gmail.com <mailto:juliusbaxter@gmail.com> >
> > > > > > wrote:
> > > > > >
> > > > > > Hi Elisa,
> > > > > >
> > > > > >
> > > > > >
> > > > > > Sorry for the delay in this response.
> > > > > >
> > > > > >
> > > > > >
> > > > > > You should be using an SoC toplevel. FPGAs have everything you 
> > > > > > need on board like memories and IO blocks and lots of other 
> > > > > > FPGA fabric for you to implement other pieces of hardware.
> > > > > >
> > > > > >
> > > > > >
> > > > > > FuseSoC provides a really nice and easy way to build an mor1kx 
> > > > > > design for the DE0 nano I believe:
> > > > > >
> > > > > >
> > > > > >
> > > > > > https://github.com/olofk/de0_nano
> > > > > >
> > > > > >
> > > > > >
> > > > > > That github page has a rough guide to getting it going.
> > > > > >
> > > > > >
> > > > > >
> > > > > > If you need help, I recommend posting to the OpenRISC mailing
> > > > > > list; people there will probably respond more promptly than I
> > > > > > can. (I recommend getting to know how to use mailing lists:
> > > > > > https://openrisc.io/community)
> > > > > >
> > > > > >
> > > > > >
> > > > > > There are more resources here: https://openrisc.io/tutorials
> > > > > >
> > > > > >
> > > > > >
> > > > > > I hope that's helpful.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Julius
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, 11 Sep 2019 at 20:09, <ecalvo at 2se.es> wrote:
> > > > > >
> > > > > > Hi Julius,
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks a lot for the quick answer.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Yes, this is the problem: I am using the mor1kx module itself
> > > > > > as the top level. You mean that I also need to synthesize
> > > > > > these cores in reconfigurable logic, don't you? I thought that
> > > > > > I could use them as external components on a development board.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks again,
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > Elisa
> > > > > >
> > > > > >
> > > > > >
> > > > > > *From:* Julius Baxter <juliusbaxter at gmail.com>
> > > > > > *Sent:* Wednesday, 11 September 2019 12:02
> > > > > > *To:* ecalvo at 2se.es
> > > > > > *Subject:* Re: Starting with OpenRISC - IOBs
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi Elisa,
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks for getting in touch, that sounds like a cool project.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Can you tell me about the toplevel - are you using a system 
> > > > > > toplevel, or is your toplevel the mor1kx module itself?
> > > > > >
> > > > > >
> > > > > >
> > > > > > If it's the latter, then that's not the best way to do it - 
> > > > > > you need a system toplevel which instantiates memories and 
> > > > > > some reset circuitry and likely some IO (UART, GPIO, JTAG debug, etc.) to talk to the outside world.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Is that helpful?
> > > > > >
> > > > > >
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Julius
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, 11 Sep 2019 at 19:47, <ecalvo at 2se.es> wrote:
> > > > > >
> > > > > > Dear Dr. Baxter,
> > > > > >
> > > > > >
> > > > > >
> > > > > > My name is Elisa Calvo Gallego. I am writing to you because I
> > > > > > have started working with OpenRISC as part of a research
> > > > > > project at the company where I work (Space Submicron
> > > > > > Electronics, 2SE), and I am having some basic trouble. Could you help me?
> > > > > >
> > > > > >
> > > > > >
> > > > > > Although the FPGA that we are planning to use is larger, I
> > > > > > have synthesized mor1kx for a DE0 Nano board as a first step
> > > > > > (this is the board used in most guides and tutorials). My
> > > > > > problem is that the results I obtained are similar in area
> > > > > > and resources, except for IOBs: the design needs more IOBs
> > > > > > than the device has available. Do you know what I am doing wrong?
> > > > > > Should I comment out the debug lines or something like that? I
> > > > > > apologize if the answer is obvious. I could not find it and I am new to this.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks very much in advance.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > >
> > > > > >
> > > > > > Elisa
> > > > > >
> > > > > >
> > > >
> >
> 


^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-24  7:18 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <009d01d58416$7125fe80$5371fb80$@2se.es>
2019-10-17  5:13 ` [OpenRISC] PCCR and PCRM registers Stafford Horne
2019-10-17 15:01   ` ecalvo
2019-10-17 21:41     ` Stafford Horne
2019-10-18 10:51       ` ecalvo
2019-10-18 14:40         ` Stafford Horne
     [not found]           ` <013f01d58976$1c2c7700$54856500$@2se.es>
     [not found]             ` <20191023210532.GI24874@lianli.shorne-pla.net>
2019-10-23 23:05               ` Stafford Horne
2019-10-24  7:18                 ` ecalvo
