Overcoming barriers to automatic parallelization


Running any program on a parallel architecture

  • Running any program on a parallel architecture remains a major challenge. Sciensys proposes a general solution to this problem that combines data flow and control flow principles, providing automatic parallelization at run time.
  • The company's goal is to bring a disruptive technology to the field of automatic parallelization, in order to develop systems dedicated to high-performance computing and to achieve the energy efficiency that autonomous Internet of Things (IoT) data processing systems require.



In April 2017, the University of L’Aquila - Center of Excellence DEWS and Sciensys entered into a partnership to implement and test the Sciensys architecture on specific complex computations requiring intense parallelization. This collaboration led to the article in HiPEAC Info #50 (page 26).



Sciensys’ advanced technology is protected by various patent filings.

Data-flow machine with hardware associative memory
Patent of V.S.Burtsev

Main features

  • Multigrain parallelism: coarse grain (at the procedure level) and fine grain (at the instruction level)
  • Automatic parallelization on multicore processors at run time, based on a disruptive technology
  • Performance scales almost linearly with the number of processor cores
  • Automatic allocation to an arbitrary number of processor cores, on any SMP processor the end user chooses
  • Traditional programming
  • No cache memory required, with no loss of performance
  • No 'memory wall' problem: computing speed is limited only by the communication network
  • Easy scaling
  • Ability to work without an OS
  • Support for both control flow and data flow calculations


The system operates on two levels of concurrency

Coarse-grain, at the procedure level. A procedure can be dispatched automatically to any free processor, and the calling code does not wait for it to return but continues executing.
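The calling behavior described above can be illustrated in ordinary software. The following is only a minimal sketch, assuming a Python thread pool stands in for the pool of free processors; `partial_sum` and `caller` are hypothetical names, and the sketch shows the non-blocking call semantics only, not the Sciensys runtime or its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(values):
    """Hypothetical procedure that may run on any free worker."""
    return sum(v * v for v in values)


def caller(data, workers=4):
    """Dispatch procedures without waiting, then collect results when needed."""
    chunk = max(1, len(data) // workers)
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # submit() returns immediately: the caller is not blocked.
        futures = [pool.submit(partial_sum, c) for c in chunks]
        # The caller keeps working while the procedures run elsewhere.
        local_work = len(data)
        # The caller synchronizes only at the point where results are used.
        total = sum(f.result() for f in futures)
    return total, local_work
```

Here the programmer has to call `submit()` explicitly; the text above describes a system in which this dispatch happens automatically.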

Fine-grain, at the level of special co-processor instructions. We call this co-processor the Arithmetic Dataflow Processor (ADFP). It performs a calculation as soon as its data is available. It comprises associative storage, an arithmetic unit for operations on integers and real numbers, and a network controller that integrates multiple processors into a network.

The ADFP is a kind of “active” memory, since it can run calculations. A set of such processors can be considered a global active memory, with each CPU connecting to an array of processors, even if different CPUs have separate physical memory. This provides automatic parallelization at a low level.
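The idea of computing when the data is available can be sketched with a toy software model. This is an illustration under stated assumptions, not the ADFP design: a dictionary plays the role of the associative storage, each node takes exactly two operands, and the names `DataflowStore`, `GRAPH` and `evaluate` are hypothetical.

```python
import operator


class DataflowStore:
    """Toy associative store: operands are matched by the node they target."""

    def __init__(self, graph):
        # graph: node -> (binary function, [(destination node, port), ...])
        self.graph = graph
        self.waiting = {}   # node -> {port: value}, operands still unmatched
        self.results = {}   # node -> computed value

    def send(self, node, port, value):
        """Deliver one operand; fire the node once both operands are present."""
        slots = self.waiting.setdefault(node, {})
        slots[port] = value
        if len(slots) == 2:  # firing rule: all operands are available
            func, destinations = self.graph[node]
            result = func(slots[0], slots[1])
            del self.waiting[node]
            self.results[node] = result
            for dest_node, dest_port in destinations:
                self.send(dest_node, dest_port, result)


# (a + b) * (c - d): "add" and "sub" do not depend on each other.
GRAPH = {
    "add": (operator.add, [("mul", 0)]),
    "sub": (operator.sub, [("mul", 1)]),
    "mul": (operator.mul, []),
}


def evaluate(a, b, c, d):
    store = DataflowStore(GRAPH)
    # Operands arrive in arbitrary order; availability drives execution.
    store.send("sub", 1, d)
    store.send("add", 0, a)
    store.send("sub", 0, c)
    store.send("add", 1, b)
    return store.results["mul"]
```

No instruction order is written down anywhere in this model: `mul` runs only because both of its inputs have arrived. The two independent nodes are the ones a hardware implementation could execute concurrently.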

Breakthrough Architecture based on the concept of computation upon data availability

Calculations both in the instruction flow and data flow: hybrid architecture

Hardware concurrency of calculations at the level of arithmetic operators

Advantages of the Sciensys architecture

Ability to reach GPU performance while executing scalar operations

GPUs are single instruction, multiple data (SIMD). Our processors are multiple instruction, multiple data (MIMD). The system can be used for general-purpose calculations with unprecedented flexibility.

GPUs vs. our processor: for a given number of cores, a GPU and our system deliver the same performance. Although performance is equal, the range of applications that can run on the Sciensys co-processor is much wider than on GPUs.

A GPU operates only on vectors, whereas the Sciensys co-processor operates on scalar calculations while providing performance equivalent to a GPU. This gives much greater flexibility than GPUs for implementing complex algorithms.

IEEE CS 2022 Report

Added value of Sciensys’ hybrid architecture

Easy to implement on an FPGA board.

Minimal programmer effort: define the procedures to be run in parallel and choose the model of computation (data flow or control flow). In most cases, operator-level data flow calculations are handled automatically by the compiler.


  • Tuning: the behavior of the system is easily controlled in the program code
  • Modification: the system is easily configured via a graphical interface or VHDL code
  • Variety of platforms: any multiprocessor system can serve as a base, including asymmetric systems


  • Arbitrary number of CPUs: limited only by the density of the FPGA chip and the PCB
  • Various means of communication: hardware bus protocols, classical data transmission between computing devices, and others
  • Almost linear performance gain as CPUs are added (the gain is limited only by how far the program can be parallelized)
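The limit mentioned in the last point is commonly modeled with Amdahl's law. The sketch below is a generic textbook model, not a measurement of the Sciensys system; `amdahl_speedup` and its parameters are hypothetical names introduced for illustration.

```python
def amdahl_speedup(parallel_fraction, cpus):
    """Ideal speedup when only part of the program can run in parallel.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    cpus: number of processors sharing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)
```

A fully parallel program (`parallel_fraction=1.0`) gains linearly with the number of CPUs in this model, while any serial remainder caps the gain at `1 / serial_fraction` however many CPUs are added.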


Digital signal processing

Matrix algebra

Image processing

IoT (Internet of Things)

Cyber-Physical Systems

Medical imaging

Artificial intelligence

Image recognition

Other applications requiring high speed of numeric calculation

Internet of Things (IoT) growth


  • Over the past ten years, parallelism has become a much-discussed topic. A new factor is the rise of the Internet of Things, which will require significant computing power at the level of the elementary things (sensors and wearable devices) under very stringent energy consumption constraints. Such information systems must achieve a huge gain in energy efficiency: this kind of application requires reducing processor power by a factor of 100 to 1,000.
  • Parallelism is the only solution in the medium term to meet this target, since it allows replacing one large system running at frequency f with N smaller systems running at frequency f/N.
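Why N systems at f/N can use less power than one system at f can be seen from the standard CMOS dynamic power approximation, P = C · V² · f. The sketch below is a simplified model under explicit assumptions: static leakage and communication overhead are ignored, and `voltage_scale` (how far the supply voltage can be lowered at the reduced frequency) is a hypothetical parameter, not a Sciensys figure.

```python
def dynamic_power(capacitance, voltage, frequency):
    """CMOS dynamic power approximation: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency


def parallel_power_ratio(n, voltage_scale):
    """Power of n units running at f/n, relative to one unit running at f.

    voltage_scale: assumed supply voltage of the slow units relative to the
    fast unit (1.0 means the voltage cannot be lowered at all).
    """
    single = dynamic_power(1.0, 1.0, 1.0)
    many = n * dynamic_power(1.0, voltage_scale, 1.0 / n)
    return many / single
```

With `voltage_scale=1.0` the ratio is 1: splitting the work saves nothing by itself. In this model the gain comes entirely from the lower supply voltage that the reduced frequency permits, since power depends on the square of the voltage.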


Sciensys has developed a powerful, disruptive technology in the field of automatic parallelization

  • Sciensys is a high-technology company incorporated in France and based in Paris.
  • Sciensys has developed an advanced computing technology geared to real-time applications that demand intense computation on multi-core processors.
  • We have designed an original architecture, especially suited to dynamic applications, and a new generation of processors. We aim to work with industrial companies to help them implement their complex applications through hardware integration on GPUs, FPGAs and standard off-the-shelf multi-core processors.
