Epiphany Multicore IP
Epiphany – A breakthrough in parallel processing
The Epiphany multicore coprocessor is a scalable shared-memory architecture featuring up to 4,096 processors on a single chip, connected through a high-bandwidth on-chip network. Each Epiphany processor core includes a tiny, high-performance floating-point RISC processor built from scratch for multicore processing, a high-bandwidth local memory system, and an extensive set of built-in hardware features for multicore communication. The Epiphany coprocessor is ANSI-C and OpenCL programmable and works in cooperation with standard microprocessors to bring an unprecedented level of real-time processing to performance- and power-constrained mobile devices such as smartphones and tablet computers, while also raising performance levels for an array of other parallel computing platforms.
Features
- Complete multicore solution featuring a high performance microprocessor ISA, Network-On-Chip, and distributed memory system
- Fully-featured ANSI-C programmable GNU/Eclipse based tool chain
- Scalable to thousands of cores and TFLOPS of performance on a single chip
- 1GHz superscalar RISC processor cores
- IEEE Floating Point Instruction Set
- Shared memory architecture with up to 128KB memory at each processor node
- Zero startup-cost message passing
- Vector Interrupt Controller
- Distributed Multicore Multidimensional DMAs
- 32 GB/sec local memory bandwidth per core
- 8 GB/sec network bandwidth per processor
- 72 GFLOPS/Watt energy efficiency
- Processor tile size of 0.5mm^2 at 65nm, 0.128mm^2 at 28nm
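To put the per-core figures in the feature list above into chip-level terms, the following is a minimal arithmetic sketch in C. It only multiplies the quoted per-core maxima by a core count; the function and macro names are hypothetical, and the results are upper bounds implied by the headline numbers, not measured or vendor-stated totals. The area figure covers processor tiles only and assumes nothing about I/O or other on-chip logic.

```c
#include <assert.h>
#include <stdint.h>

/* Per-core figures quoted in this document (all are "up to" values). */
#define CORES_MAX           4096u    /* "up to 4,096 processors on a single chip" */
#define LOCAL_MEM_KB        128u     /* "up to 128KB memory at each processor node" */
#define LOCAL_BW_GB_PER_S   32u      /* "32 GB/sec local memory bandwidth per core" */
#define TILE_AREA_28NM_UM2  128000u  /* 0.128 mm^2 expressed in square micrometres */

/* Upper bound on total distributed memory, in KB, for a given core count. */
uint64_t total_local_mem_kb(uint32_t cores)
{
    assert(cores <= CORES_MAX);
    return (uint64_t)cores * LOCAL_MEM_KB;
}

/* Upper bound on aggregate local memory bandwidth, in GB/sec. */
uint64_t aggregate_local_bw_gb_per_s(uint32_t cores)
{
    assert(cores <= CORES_MAX);
    return (uint64_t)cores * LOCAL_BW_GB_PER_S;
}

/* Silicon area of the processor tiles alone at 28nm, in square micrometres. */
uint64_t tile_area_28nm_um2(uint32_t cores)
{
    assert(cores <= CORES_MAX);
    return (uint64_t)cores * TILE_AREA_28NM_UM2;
}
```

Under these assumptions, a 64-core configuration works out to at most 8,192 KB (8 MB) of distributed memory and about 8.2 mm^2 of tile area at 28nm, while the 4,096-core maximum works out to at most 512 MB and roughly 524 mm^2.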
Epiphany Benefits
- Out-of-the-box support for floating point C programs enables significantly faster time to market and lower development costs compared to ASIC or FPGA based solutions.
- Up to 100X advantage in energy efficiency compared to traditional multicore floating point processors offers breakthrough improvements in battery life, cost of ownership, and reliability.
- Unparalleled performance, as much as 5 TFLOPS on a single chip, enables a new set of high performance applications.
- Low-latency, zero-overhead inter-core communication simplifies parallel programming.
- Scalable architecture allows code reuse across a wide range of markets and applications, from smartphones all the way to leading edge supercomputers.
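The idea behind low-overhead message passing on a shared-memory architecture can be illustrated with a small host-side simulation in plain C: sending a message is just a series of ordinary stores into the destination core's local memory, followed by a flag write that publishes it. This is a sketch under stated assumptions, not Epiphany SDK code. The names `sim_mailbox_t`, `sim_send`, and `sim_receive` are hypothetical, the "cores" are entries in an ordinary array, and the code runs single-threaded, so it ignores the memory-ordering and synchronization details a real multicore target would require.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_CORES      4u  /* hypothetical number of simulated cores */
#define MAILBOX_WORDS  8u  /* hypothetical mailbox capacity in 32-bit words */

/* One mailbox per simulated core, standing in for a region of its local memory. */
typedef struct {
    volatile uint32_t ready;          /* 0 = empty, otherwise number of valid words */
    uint32_t payload[MAILBOX_WORDS];
} sim_mailbox_t;

static sim_mailbox_t sim_local_mem[SIM_CORES];

/* Write a message directly into the destination mailbox.
 * Returns 0 on success, -1 if the arguments are invalid or the mailbox is full. */
int sim_send(unsigned dst_core, const uint32_t *words, unsigned count)
{
    if (dst_core >= SIM_CORES || count == 0 || count > MAILBOX_WORDS)
        return -1;
    sim_mailbox_t *box = &sim_local_mem[dst_core];
    if (box->ready != 0)
        return -1;                    /* previous message not yet consumed */
    for (unsigned i = 0; i < count; i++)
        box->payload[i] = words[i];   /* plain stores into "remote" memory */
    box->ready = count;               /* publish only after the payload is in place */
    return 0;
}

/* Poll the core's own mailbox.
 * Returns the number of words copied, 0 if empty, -1 on invalid arguments. */
int sim_receive(unsigned core, uint32_t *out, unsigned capacity)
{
    if (core >= SIM_CORES)
        return -1;
    sim_mailbox_t *box = &sim_local_mem[core];
    unsigned count = box->ready;
    if (count == 0)
        return 0;
    assert(count <= MAILBOX_WORDS);
    if (count > capacity)
        return -1;
    for (unsigned i = 0; i < count; i++)
        out[i] = box->payload[i];
    box->ready = 0;                   /* mark the mailbox free for the next sender */
    return (int)count;
}
```

The point of the sketch is that the sender performs no setup call or handshake before writing; the only per-message cost beyond the payload stores is the single flag write.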
Mobile Applications
- Are your customers complaining that their mobile device runs out of battery too fast?
- Do you lack the money, team, or time needed to convert your floating point C-based reference application to a fixed point FPGA/ASIC hardware implementation?
- Do you have a killer app in mind that won’t become practical until 2016 based on existing mobile processor roadmaps?
High Performance Applications
- Would you benefit from reducing your processing latencies to microseconds and still being able to program in ANSI-C?
- Do you lack the electrical and cooling infrastructure needed to operate a state of the art high performance system?
- Are you only seeing 10-15% of the advertised maximum performance of your current vendor’s manycore solution?
- Are you frustrated with the steep learning curve and proprietary development environments of existing floating point accelerator technologies?
Example Configurations: