LEON-RTG4

Introduction

The GRLIB IP library supports Microsemi RTG4 devices. This support consists of a techmap layer that wraps RTG4-specific technology elements such as memory macros and pads. GRLIB also contains a template design for the RTG4 Development Kit, bridges that allow the Microsemi FDDR memory controller and SerDes IP to be used together with a LEON/GRLIB system, and infrastructure that automatically builds project files for Libero SoC. More information about GRLIB and our IP cores is available on the SoC library page.

Example designs

We provide prebuilt bitstreams of the LEON3 and LEON4 template designs for the Microsemi RTG4 Development Kit. These bitstreams are intended for evaluation of software running on a LEON3 or LEON4 SoC implemented in RTG4. To evaluate these designs, the following items are required:

  • Microsemi RTG4 Development Kit FPGA board
  • Workstation with GNU/Linux or Microsoft Windows
  • Microsemi software to program FPGA
  • Bitstream, available further down on this page
  • GRMON2 or GRMON3 debug monitor (the GRMON3 evaluation version supports LEON-RTG4-EX from version 3.0.6 onward)

Example system block diagram

The example design range is called LEON-RTG4-EX and includes the following IP cores:

  • LEON3FT and LEON4FT multicore systems with 16 KiB instruction cache and 16 KiB data cache. The processors are also implemented with MMU and FPUs.
  • LEON Debug Support Unit
  • Bridge for Microsemi FDDR DDR3 SDRAM controller
  • SPI Flash memory controller for boot-ROM
  • Interrupt controller for 15 interrupts
  • Timer module with two 32-bit timers and watchdog
  • UART with FIFO
  • General purpose I/O port

Further documentation can be found in the user's manual below.

LEON3FT Fault-tolerant processor

Introduction

The LEON3FT is a fault-tolerant version of the standard LEON3 SPARC V8 processor. It has been designed for operation in the harsh space environment, and includes functionality to detect and correct single-event upset (SEU) errors in all on-chip RAM memories. The LEON3FT processor supports most of the functionality of the standard LEON3 processor, and adds the following features:

  • Register file SEU error-correction of up to 4 errors per 32-bit word
  • Cache memory error-correction of up to 4 errors per tag or 32-bit word
  • Autonomous and software transparent error handling
  • No timing or performance impact due to error detection and correction

The following features of the standard LEON3 processor are NOT supported by LEON3FT:

  • Local scratch pad RAM (I and D)
  • Cache locking
  • LRR cache replacement algorithm

Fault-tolerance scheme

The fault-tolerance in LEON3FT is implemented using ECC coding of all on-chip RAM blocks. The ECC codes are adapted to the type of RAM blocks that are available for a given target technology, and to the type of data that is stored in the RAM blocks. The general scheme is to be able to detect and correct up to four errors per 32-bit RAM word. In RAM blocks where the data is mirrored in a secondary memory area (e.g. cache memories), the ECC codes are tuned for error-detection only. A correction cycle then consists of reloading the faulty data from the mirror location. In the cache memories, this amounts to invalidating the faulty cache line and reloading it from main memory.
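The detect-then-reload idea for mirrored data can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `ParityCache`, its methods, and the single even-parity bit per word are hypothetical simplifications chosen for illustration, and they are not the LEON3FT codes (a single parity bit only detects an odd number of flipped bits).

```python
def parity(word):
    """Even parity (0 or 1) over a 32-bit word."""
    return bin(word & 0xFFFFFFFF).count("1") & 1


class ParityCache:
    """Toy word-granular cache with detection-only protection (hypothetical)."""

    def __init__(self, memory):
        self.memory = memory   # backing store holding the mirror copy
        self.lines = {}        # addr -> (word, stored parity bit)
        self.refills = 0       # number of loads from the backing store

    def _refill(self, addr):
        word = self.memory[addr]
        self.lines[addr] = (word, parity(word))
        self.refills += 1
        return word

    def read(self, addr):
        if addr not in self.lines:
            return self._refill(addr)       # ordinary miss
        word, stored = self.lines[addr]
        if parity(word) != stored:
            del self.lines[addr]            # invalidate the faulty entry
            return self._refill(addr)       # correction = reload from mirror
        return word

    def inject_upset(self, addr, bit):
        """Flip one stored data bit to emulate an SEU."""
        word, stored = self.lines[addr]
        self.lines[addr] = (word ^ (1 << bit), stored)


memory = {0x40000000: 0xDEADBEEF}
cache = ParityCache(memory)
cache.read(0x40000000)               # first access fills the cache
cache.inject_upset(0x40000000, 5)    # corrupt the cached copy
print(hex(cache.read(0x40000000)), cache.refills)
```

The second read returns the original value because the corrupted entry is discarded and refetched, so the caller never sees the upset.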

In RAM blocks where no secondary copy of the data is available (e.g. register file), the ECC codes are tuned for both error-detection and correction. The focus is placed on fast encoding/decoding times rather than minimizing the number of ECC bits. This approach ensures that the FT logic does not affect the timing and performance of the processor, and that LEON3FT can reach the same maximum frequency as the standard non-FT LEON3. The ECC encoding/decoding is done in the LEON3FT pipeline in parallel with normal operation, and a correction cycle is fully transparent to the software without affecting the instruction timing.
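Where no second copy exists, the code itself must carry enough redundancy to repair the word. The sketch below uses a textbook single-error-correcting Hamming code over a 32-bit word purely as an illustration. It is an assumption made for this example and is weaker than the scheme described above (it repairs one flipped bit, not four); the names `encode`, `decode` and `syndrome` are hypothetical.

```python
DATA_BITS = 32
PARITY_POSITIONS = (1, 2, 4, 8, 16, 32)           # 1-based codeword positions
CODE_BITS = DATA_BITS + len(PARITY_POSITIONS)     # 38-bit codeword
DATA_POSITIONS = [p for p in range(1, CODE_BITS + 1)
                  if p not in PARITY_POSITIONS]


def syndrome(code):
    """XOR of the 1-based positions of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, CODE_BITS + 1):
        if (code >> (pos - 1)) & 1:
            s ^= pos
    return s


def encode(word):
    """Spread the data bits over non-parity positions, then zero the syndrome."""
    code = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> i) & 1:
            code |= 1 << (pos - 1)
    s = syndrome(code)
    for p in PARITY_POSITIONS:
        if s & p:
            code |= 1 << (p - 1)
    return code


def decode(code):
    """Return (word, corrected). Repairs at most one flipped bit."""
    s = syndrome(code)
    corrected = s != 0
    if s > CODE_BITS:
        raise ValueError("uncorrectable error pattern")
    if corrected:
        code ^= 1 << (s - 1)      # the syndrome is the faulty bit position
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (code >> (pos - 1)) & 1:
            word |= 1 << i
    return word, corrected


stored = encode(0xCAFEF00D) ^ (1 << 17)   # emulate one upset in the stored word
word, corrected = decode(stored)
print(hex(word), corrected)
```

Because the syndrome directly names the faulty position, the repair is a single XOR, which hints at why such decoding can run alongside normal operation.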

The ECC protection of RAM blocks is not limited to the LEON3FT processor. In a SOC design based on LEON3FT, any IP core using block RAM will have the RAM protected in a similar manner. This includes for instance the FIFOs in the SpaceWire IP core (GRSPW) and the buffer RAM in the CAN-2.0 IP core (CAN_OC).

Simulation and synthesis

The LEON3FT is simulated and synthesized in the same manner as the standard LEON3 processor. The area overhead for the FT logic is less than 15% on both ASIC and FPGA implementations. The table below shows some typical area figures for ASIC and Microchip RTAX technologies:

 Core                        RTAX cells   RTAX RAM blocks   ASIC gates
 LEON3 8 + 8 Kbyte cache     6,500        40                20,000
 LEON3FT 8 + 8 Kbyte cache   7,500        40                22,000
 LEON3FT 8 + 4 Kbyte cache   7,500        31                22,000

Distribution

The LEON3FT core is distributed as encrypted RTL, together with a special FT version of the GRLIB IP library.

Software development

Software development for LEON3FT is identical to the standard LEON3. The fault-tolerance implementation is fully software transparent, and no software drivers are necessary for its operation. See the LEON3 software page for more details. The LEON3 simulators (TSIM and GRSIM) as well as the GRMON debug monitor are fully compatible with LEON3FT.

Radiation-hardened devices based on LEON3FT

 Device                        Manufacturer          Frequency   MIPS
 GR712RC Dual-Core SOC         Gaisler               100 MHz     200 DMIPS
 UT699 Single-Core SOC         Colorado Springs      66 MHz      75 DMIPS
 LEON3FT-RTAX (discontinued)   Gaisler / Microchip   25 MHz      20 DMIPS

LEON3 Processor

The LEON3 is a synthesisable VHDL model of a 32-bit processor compliant with the SPARC V8 architecture. The model is highly configurable and particularly suitable for system-on-a-chip (SOC) designs.

LEON3 supports both asymmetric and symmetric multiprocessing (AMP/SMP). Up to 16 CPUs can be used in a multiprocessing configuration.

LEON3 is also available in a fault-tolerant version, the LEON3FT. You can find more information here.

 

Architecture

The LEON3 integer unit implements the full SPARC V8 standard, including hardware multiply and divide instructions. The number of register windows is configurable within the limit of the SPARC standard (2 - 32), with a default setting of 8. The pipeline consists of 7 stages with a separate instruction and data cache interface (Harvard architecture).
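As a worked example of what the window count means for the register file: in SPARC V8 each window exposes 8 in, 8 local and 8 out registers, adjacent windows overlap by 8 registers, and 8 globals are shared, giving 16 × NWINDOWS + 8 architectural integer registers. The helper name `integer_registers` below is hypothetical, and the count ignores any implementation-specific or floating-point registers.

```python
def integer_registers(nwindows=8):
    """Architectural integer registers for a SPARC V8 windowed register file."""
    if not 2 <= nwindows <= 32:
        raise ValueError("SPARC V8 allows 2 to 32 register windows")
    # 8 locals + 8 outs are unique to each window (its ins are the
    # neighbouring window's outs), plus 8 globals shared by all windows.
    return 16 * nwindows + 8


for n in (2, 8, 32):
    print(n, "windows ->", integer_registers(n), "registers")
```

With the default of 8 windows this gives 136 registers, of which 32 are visible to the program at any one time.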

LEON3 has a highly configurable cache system, consisting of a separate instruction and data cache. Both caches can be configured with 1 - 4 ways, 1 - 256 KiB/way, 16 or 32 bytes per line. The instruction cache maintains one valid bit per 32-bit word and uses streaming during line-refill to minimize refill latency. The data cache has one valid bit per cache line, uses write-through policy and implements a double-word write-buffer. Bus-snooping on the AHB bus can be used to maintain cache coherency for the data cache. Local scratch pad RAM can be added to either the instruction or the data cache to allow zero-wait-state access to instruction or data memory without any AHB bus access.
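To make the configuration ranges concrete, the following sketch derives the geometry of one cache from the three parameters. The function name `cache_geometry`, the 32-bit address width, and the restriction to power-of-two way sizes are assumptions made for this example rather than statements about the core.

```python
def cache_geometry(ways, way_kib, line_bytes, addr_bits=32):
    """Derive basic figures for one cache from ways, way size and line size."""
    if not 1 <= ways <= 4:
        raise ValueError("ways must be 1 to 4")
    if not 1 <= way_kib <= 256 or way_kib & (way_kib - 1):
        raise ValueError("way size must be a power of two from 1 to 256 KiB")
    if line_bytes not in (16, 32):
        raise ValueError("line size must be 16 or 32 bytes")

    lines_per_way = (way_kib * 1024) // line_bytes
    offset_bits = line_bytes.bit_length() - 1     # log2 of the line size
    index_bits = lines_per_way.bit_length() - 1   # log2 of the lines per way
    return {
        "total_kib": ways * way_kib,
        "lines_per_way": lines_per_way,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": addr_bits - offset_bits - index_bits,
        # one valid bit per 32-bit word, as described for the instruction cache
        "icache_valid_bits_per_line": line_bytes // 4,
    }


# Example: a 16 KiB cache built as 4 ways of 4 KiB with 32-byte lines.
print(cache_geometry(ways=4, way_kib=4, line_bytes=32))
```

The same 16 KiB total could also be built as 1 way of 16 KiB or 2 ways of 8 KiB; the choice changes the index and tag widths but not the capacity.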

The LEON3 integer unit provides interfaces for a floating-point unit (FPU) and a custom co-processor. Two FPU controllers are available, one for the high-performance GRFPU and one for the GRFPU-Lite core. The floating-point processors and co-processor execute in parallel with the integer unit, and do not block the operation unless a data or resource dependency exists.

 

Quick links

- Documentation

- Detailed feature set

- Software Ecosystem Overview

- Download open-source code (GPL license)

- Excel sheet for SOC area estimation

- DISCOURSE community (for open-source users)

- LEON-RTG4 example bitstreams

- GR716 - Rad-Hard LEON3FT Microcontroller

- GR712RC - Dual-core LEON3FT Processor

 

 

Availability and licensing

LEON3 is part of the GRLIB IP library. The open-source version of the library is distributed under the GNU GPL license and can be downloaded here.

The LEON3 can also be obtained under commercial licensing conditions, enabling proprietary designs and taking advantage of a support agreement. Please see the GRLIB IP Core User's Manual - Processor license overview for the license types.

Contact us if you want to use LEON3 in a commercial product.


 

Synthesis

The LEON3 processor can be synthesised with common synthesis tools from vendors such as Synopsys, Mentor, Xilinx, Microsemi, Lattice and NanoXplore.

The GRLIB IP library contains LEON3 template designs for several popular FPGA prototyping boards. Pre-synthesized FPGA programming files are also provided, see LEON-RTG4.

 

Software Ecosystem

Because LEON3 is SPARC V8 conformant, compilers and kernels for SPARC V8 can be used with it (kernels will need a LEON BSP). To simplify software development, we provide several toolchains and operating systems.

Check the software overview webpage for all the details.

Debugging is generally done using the GDB debugger and a graphical front-end such as DDD or Eclipse. The GRMON monitor interfaces to the LEON3 on-chip debug support unit (DSU), implementing a large range of debug functions as well as a GDB gateway.

The LEON3 processor is also supported by our TSIM3 and GRSIM simulators.

SPARC Conformance

LEON3 has been certified by SPARC International as being SPARC V8 conformant. The certification was completed on May 1, 2005.

 

Detailed Feature set

 The LEON3 processor has the following features:

  • SPARC V8 instruction set with V8e extensions
  • Advanced 7-stage pipeline
  • Hardware multiply, divide and MAC units
  • Hardware floating-point support
  • Separate instruction and data cache (Harvard architecture) with snooping
  • Configurable caches: 1 - 4 ways, 1 - 256 kbytes/way. Random, LRR or LRU replacement
  • Local instruction and data scratchpad RAM, 1 - 512 Kbytes
  • AMBA 2.0 AHB bus interface
  • High performance: 1.4 DMIPS/MHz, 1.8 CoreMark/MHz (gcc 4.1.2)
  • Advanced on-chip debug support with instruction and data trace buffer
  • SPARC Reference MMU (SRMMU) with configurable TLB
  • Symmetric Multi-processor support (SMP)
  • Power-down mode and clock gating
  • Robust and fully synchronous single-edge clock design
  • Up to 125 MHz in FPGA and 400 MHz on 0.13 um ASIC technologies
  • Fault-tolerant and SEU-proof version available for space applications
  • Extensively configurable
  • Large range of software tools: compilers, kernels, simulators and debug monitors
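The per-MHz figure and the clock limits in the list above combine into a rough single-core throughput estimate. The helper `peak_dmips` is a hypothetical name, and the results are upper bounds that assume the quoted DMIPS/MHz holds at the quoted maximum frequencies.

```python
LEON3_DMIPS_PER_MHZ = 1.4   # figure quoted in the feature list


def peak_dmips(clock_mhz, dmips_per_mhz=LEON3_DMIPS_PER_MHZ):
    """Single-core Dhrystone estimate: DMIPS/MHz times the clock in MHz."""
    return round(clock_mhz * dmips_per_mhz, 1)


for target, mhz in (("FPGA", 125), ("0.13 um ASIC", 400)):
    print(f"{target}: up to {peak_dmips(mhz)} DMIPS at {mhz} MHz")
```

This gives roughly 175 DMIPS at 125 MHz and 560 DMIPS at 400 MHz per core; real designs will land lower depending on memory latency and cache configuration.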

Configuration

The LEON3 processor is fully parameterizable through the use of VHDL generics and does not rely on any global configuration package. It is thus possible to instantiate several processor cores in the same design with different configurations. The LEON3 template designs can be configured using a graphical tool. This allows new users to quickly define a suitable custom configuration. The configuration tool configures not only the processor but also other on-chip peripherals such as memory controllers and network interfaces.

   

Probabilistic platform

The LEON3 processor was extended within the PROXIMA project to build a platform with hardware support that enables probabilistic timing analysis. These extensions, including extensions for GRLIB's Level-2 cache, can also be obtained from us.

   

 

Documentation

- LEON3 IP core documentation: GRLIB IP Core User's Manual, see LEON3

 

LEON4 Processor

The LEON4 is a synthesizable VHDL model of a 32-bit processor compliant with the SPARC V8 architecture. The model is highly configurable, and particularly suitable for system-on-chip (SOC) designs.

The LEON4 processor can be enhanced with fault-tolerance features against SEU errors.

The processor can be efficiently implemented on FPGA and ASIC technologies and uses standard synchronous memory cells for cache and register file.

The LEON4 processor is fully parameterizable through the use of VHDL generics, and does not rely on any global configuration package. It is thus possible to instantiate several processor cores in the same design with different configurations.
 

Architecture

The LEON4 is interfaced using the AMBA 2.0 AHB bus and supports the IP core plug&play method provided in the GRLIB IP library.  The processor supports the MUL, MAC and DIV instructions and an optional IEEE-754 floating-point unit (FPU) and Memory Management Unit (MMU).

The LEON4 cache system consists of separate I/D multi-set Level-1 (L1) caches with up to 4 ways per cache, and an optional Level-2 (L2) cache for increased performance in data-intensive applications.

The LEON4 pipeline uses 64-bit internal load/store data paths, with an AMBA AHB interface of either 64 or 128 bits. Branch prediction, 1-cycle load latency and a 32x32 multiplier result in a performance of 1.7 DMIPS/MHz, or 2.1 CoreMark/MHz.
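One practical effect of the bus width is the number of data beats needed to move a cache line. The sketch below is plain arithmetic; the 32-byte line size is an assumption for the example (the text above does not state the LEON4 line size), and `beats_per_line` is a hypothetical helper name.

```python
def beats_per_line(line_bytes, bus_bits):
    """Data beats needed to transfer one cache line over the AHB data bus."""
    if bus_bits not in (64, 128):
        raise ValueError("bus width is either 64 or 128 bits")
    bytes_per_beat = bus_bits // 8
    return -(-line_bytes // bytes_per_beat)   # ceiling division


LINE_BYTES = 32   # assumed line size, for illustration only
for width in (64, 128):
    print(f"{width}-bit bus: {beats_per_line(LINE_BYTES, width)} beats per line")
```

Under that assumption, a 128-bit interface moves a line in half as many beats as a 64-bit one.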

 

Availability and licensing

The LEON4 can be obtained under commercial licensing conditions, enabling proprietary designs and taking advantage of a support agreement. Please see the GRLIB IP Core User's Manual - Processor license overview for the license types.

Contact us if you want to use LEON4 in a commercial product.

Synthesis

The LEON4 processor can be synthesized with common synthesis tools such as Synplify, Synopsys DC and Cadence RC. The core area (pipeline, cache controllers and mul/div units) requires only 30 kgates or 4000 LUT, depending on the configuration. The LEON4 processor can also be synthesized with tools from Mentor, Xilinx or Microsemi.

Detailed Feature Set

  • SPARC V8 instruction set with V8e extensions
  • Advanced 7-stage pipeline, with branch prediction
  • 64-bit single-clock load/store operation
  • 64-bit 4-port register file
  • Hardware multiply, divide and MAC units
  • High-performance, fully pipelined IEEE-754 FPU
  • Separate instruction and data L1 cache (Harvard architecture) with snooping
  • Configurable L1 caches: 1 - 4 ways, 1 - 256 kbytes/way. Random, LRR or LRU replacement
  • Configurable L2 cache: 256-bit internal, 1-4 ways, 16 Kbyte - 8 Mbyte
  • SPARC Reference MMU (SRMMU) with configurable TLB
  • AMBA 2.0 AHB bus interface, 64- or 128-bit wide
  • Advanced on-chip debug support with instruction and data trace buffer, and performance counter
  • Symmetric Multi-processor support (SMP)
  • Power-down mode and clock gating
  • Robust and fully synchronous single-edge clock design
  • Up to 150 MHz in FPGA and 1500 MHz on 32 nm ASIC technologies
  • Extensively configurable
  • Large range of software tools: compilers, kernels, simulators and debug monitors
  • High performance: 1.7 DMIPS/MHz, 2.1 CoreMark/MHz, 0.35 (estimated) SPECint2000/MHz
 

Software Ecosystem

Because LEON4 is SPARC V8 conformant, compilers and kernels for SPARC V8 can be used with it (kernels will need a LEON BSP). To simplify software development, we provide several toolchains and operating systems.

Check the software overview webpage for all the details.

Debugging is generally done using the GDB debugger, and a graphical front-end such as DDD or Eclipse. It is possible to perform source-level symbolic debugging, either on a simulator or using real target hardware.

We provide TSIM, a high-performance LEON4 simulator that can be seamlessly attached to GDB and emulates a LEON4 system at more than 30 MIPS. The GRMON monitor interfaces to the LEON4 on-chip debug support unit (DSU), implementing a large range of debug functions as well as a GDB gateway. For multi-processor and/or advanced SOC designs, the GRSIM multi-core simulator is available for early software development.

 

Documentation

- GRLIB IP Core User's Manual, see LEON4