UK startup wins gong for accelerating CPU memory access

UK startup BlueShift Memory has won an FMS 2023 Most Innovative Memory Technology award for its memory-accelerating FPGA, which it claims offers up to a 1,000-times speed-up for structured data set access.

Fabless semiconductor company BlueShift was founded in 2016 in Cambridge, England, by CTO and ex-CEO Peter Marosan, a mathematician with experience in cloud and grid computing. He said he realized that a memory wall was developing between processors and memory: cores sit idle between memory accesses because there is not enough CPU-memory bandwidth, and caching becomes less useful as data set sizes increase.

Marosan said: “We have worked extremely hard over the past couple of years to prove the concept of our disruptive non-Von Neumann architecture and to develop our self-optimizing memory solution, and it is very rewarding to have our efforts acknowledged by the Flash Memory Summit Awards Program.”

He has devised the ‘Cambridge’ architecture to follow on from the Harvard and Von Neumann (Princeton) architectures, which specify CPU-memory interactions and underlie current cached CPU-memory systems.

BlueShift diagrams

The Cambridge architecture provides more bandwidth by having the memory system understand some data structures and stream data to the CPU cores as it is needed. This avoids the conventional cycle in which a core finishes an instruction, moves on to the next one in cache, and repeats, coping with cache misses and trips to DRAM along the way.
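
As a rough illustration of why structure-aware streaming could save bandwidth, the toy model below compares the bytes moved when a core reads one small field from each record in an array. It is a sketch under stated assumptions, not BlueShift's design: the 64-byte cache line, the record and field sizes, and both function names are hypothetical, and the model counts bus traffic only, saying nothing about latency or the company's claimed speed-ups.

```python
# Toy traffic model (all sizes and names are illustrative assumptions).
CACHE_LINE_BYTES = 64  # assumed cache-line size


def bytes_moved_cached(num_records, record_bytes, field_bytes,
                       line_bytes=CACHE_LINE_BYTES):
    """Bytes fetched when every field read pulls in whole cache lines."""
    lines = set()
    for i in range(num_records):
        start = i * record_bytes          # field sits at the start of each record
        end = start + field_bytes - 1     # last byte of the field
        # Record every cache line the field touches.
        lines.update(range(start // line_bytes, end // line_bytes + 1))
    return len(lines) * line_bytes


def bytes_moved_streamed(num_records, field_bytes):
    """Bytes moved if the memory side sends only the requested fields."""
    return num_records * field_bytes


cached = bytes_moved_cached(num_records=1000, record_bytes=64, field_bytes=4)
streamed = bytes_moved_streamed(num_records=1000, field_bytes=4)
print(cached, streamed, cached / streamed)
```

In this model the cached path moves a full line per record even though only a few bytes of each are used, while the streamed path moves just the fields. Real systems add prefetching, wider records, and protocol overheads that the sketch ignores.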

BlueShift has developed an FPGA featuring an integrated RISC-V processor with a modified Codasip core to maximise memory bandwidth and accelerate CPU core-DRAM access. It is aimed at HPC, AI, augmented and virtual reality, machine vision, 5G edge and IoT applications, where large datasets have to be processed in as short a time as possible.

There are video demos of this FPGA accelerating image processing, data sorting and computer vision in self-driving cars.

The sort demo compared two systems sorting an array of randomly coloured squares into a spectrum. One was a 100 MHz CPU/DRAM system working unaided; the other was a 100 MHz BlueShift Memory (BSM) system shipping data to its CPU. The BSM system completed the job in 2 minutes 34 seconds while the unaided CPU system took 19 minutes 34 seconds, about 7.6 times as long.
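
The ratio between the two quoted run times can be checked with a few lines of arithmetic. The helper name `to_seconds` is introduced here purely for illustration; the only inputs are the two times reported for the demo.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to total seconds."""
    return minutes * 60 + seconds


bsm_time = to_seconds(2, 34)    # BSM-assisted system
cpu_time = to_seconds(19, 34)   # unaided CPU/DRAM system

ratio = cpu_time / bsm_time
print(f"{bsm_time} s vs {cpu_time} s, ratio {ratio:.1f}x")
```

On these two figures the unaided system takes roughly 7.6 times as long as the BSM system.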

Data sort demo with BSM system completed in 2 minutes and 34 seconds.

BlueShift presented a paper titled “Novel memory-efficient computer architecture integration in RISC-V with CXL” at FMS 2023. It reported that its demonstration BSM device had achieved an acceleration factor of 16 to 128 times for processing image data, along with ultra-low power consumption.

The BlueShift Memory IP can be integrated into either memory chips or processors, or can be used in a stand-alone memory controller. The company says it can cut the energy cost of memory accesses in half and provide zero-latency memory accesses.