# Semiautomatic Implementation of a Bioinspired Reliable Analog Task Distribution Architecture for Multiple Analog Cores

Julius von Rosen, Markus Meissner, Lars Hedrich

Electronic Design Methodology, Department of Computer Science, University of Frankfurt/Main, Germany

Email: {vonrosen, meissner, hedrich}@em.cs.uni-frankfurt.de

Abstract—In this paper we present a silicon implementation of a bioinspired analog task distribution system that enables reliable analog multi-core systems. The increase in reliability is achieved by a dependable task distribution architecture using a hormone-based mechanism. The specifications are generated by a feasibility analysis of the algebraic description of the architecture. Starting from these specifications, an automated analog synthesis framework is used to speed up the time-consuming design of the required analog amplifiers. The complete system with the designed amplifiers has been laid out and fabricated. We present measurements of two different architectures of the task distribution system on silicon, showing the full functionality of the system and the design methodology.

## I. INTRODUCTION

The rising demand for embedded systems that interact with their environment points to the need for reliable architectures. Such systems are meant to be robust and to execute their assigned tasks dependably and in real time. This paper evolves the principal idea of an Artificial Hormone System (AHS) [3] into the design of an analog task distribution system, the Analog Artificial Hormone System (AAHS). In [12] an AAHS approach is presented that adopts the robustness, real-time capability and decentralization of the AHS and designs a dependable task distribution system as a reliability-aware architecture. The reliability is established by a slight overhead in functional cores of the multi-core system and a self-organizing, dependable task distributor on these cores. Until now, that approach has lacked proof of its functionality and reliability under real-world conditions.

The major contribution of this paper is a reliable analog architecture designed completely from scratch. The design starts by generating the specifications, continues with synthesizing all components, and ends with laying out the full system, with a focus on performance and robustness. Measurements show the full functionality of the synthesis process and of the task distribution architecture, with good performance of the silicon. The rest of the paper is organized as follows: Section II introduces the state of the art. Section III describes the decentralized analog task distribution. In Section IV the newly designed components are presented according to the given specifications. The laid-out system is presented in Section V, followed by the measurements in Section VI. The paper closes with a conclusion.

## II. STATE OF THE ART

### A. Reliable Architectures

Failures of analog and/or digital circuits during operation are of increasing interest. These failure modes are driven by shrinking technology sizes, increased temperature and power requirements, and constantly increasing process variations. The works [4], [6], [11] examine failure mechanisms such as "Hot Carrier Injection" (HCI), "Negative Bias Temperature Instability" (NBTI) or "Electromigration" (EM) and show, for instance, threshold voltage drifts that affect circuits over time. Additional transient failure or degradation modes are high temperatures caused by, e.g., a digital core with a high clock frequency or an analog power stage with a high load.

In this context many approaches to constructing reliable systems have been proposed, mainly for digital Systems-on-Chip (SoCs). For example, [2] depends on an agent-based distribution system in which the distribution of tasks is done by a scheduling algorithm. One agent, acting as a centralized unit, assigns the tasks to the different cores, but each such agent is a single point of failure. In contrast to agent-based approaches, and as one of the very few analog approaches, [1] presents an analog voting algorithm whose purpose is to raise the robustness of the circuit. However, removing the single point of failure by an N-modular redundancy methodology leads to a major overhead. Since the cores are specialized, this approach also prevents cores from being reused for different tasks.

Our approach allows cores to be used for different tasks. For example, having 3 ADCs for 2 measurements gives only a 50% overhead, compared to a 200% overhead with 3-modular redundancy. Compared to the agent-based approach, our architecture has far fewer single points of failure.
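The overhead figures in this comparison follow from simple counting. The sketch below reproduces them under the assumption that overhead is measured as the number of additional cores relative to the number of cores strictly needed; this definition and all function names are chosen here for illustration and are not given in the text.

```python
def overhead_percent(cores_provided: int, cores_needed: int) -> float:
    """Relative overhead of additional cores, in percent (assumed definition)."""
    if cores_needed <= 0:
        raise ValueError("cores_needed must be positive")
    return 100.0 * (cores_provided - cores_needed) / cores_needed


def shared_pool_overhead(tasks: int, spares: int) -> float:
    """Overhead when all tasks share one pool of interchangeable cores."""
    return overhead_percent(tasks + spares, tasks)


def n_modular_overhead(tasks: int, n: int) -> float:
    """Overhead when every task gets its own group of n identical cores."""
    return overhead_percent(tasks * n, tasks)


if __name__ == "__main__":
    # 3 ADCs for 2 measurements: one spare core shared by both tasks.
    print(shared_pool_overhead(tasks=2, spares=1))
    # 3-modular redundancy: each of the 2 measurements gets 3 ADCs.
    print(n_modular_overhead(tasks=2, n=3))
```

With these assumptions the shared pool yields 50% and the 3-modular scheme 200%, matching the numbers quoted in the example.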

### B. Automatic Synthesis of Analog Circuits

Any mixed-signal approach suffers from the design bottleneck created by the need to design the analog parts from scratch. Until now, analog designers have had no technology-independent characterized libraries, because the specifications of analog circuits vary widely. The specification depends strongly on the technology, the required accuracy and the defined ranges. Reliability suffers from both high accuracy and wide ranges. In recent years, however, research has identified the parts of analog circuits that are sensitive with respect to reliability [8], [13]. Additionally, recent design flows have slightly increased the amount of automation in analog design, supporting the engineers [5]. This paper uses a technique described in [10] to synthesize many different OPs and OTAs from scratch.



## III. DECENTRALIZED AND ANALOG TASK DISTRIBUTION

Mapping the AHS to analog circuits requires remodeling the concept of the hormone system in order to use analog components (see Fig. 1). Such a system is called an Analog Artificial Hormone System (AAHS). Task allocations are not synchronized with a clock and a waiting period; they happen instantaneously. Whenever the continuously evaluated hormone level $G_i$ meets the task-taking trigger $\theta_{\gamma,i}$ of core $\gamma$, the core takes task $i$. Once the task is taken, the core has to suppress the global hormone level. Locally, however, core $\gamma$ needs a feedback loop that keeps the hormone level just above $-\theta_{\gamma,i}$ so that it keeps working on task $i$. Without the local loop, core $\gamma$ would drop the task again, leading to an oscillation. $G_i$ represents the sum of all suppressors concerning this very task. For now it is a single point of failure, though replicating this hormone bus would eliminate it, trading area for reliability. It is important to note that this process of deciding on, keeping and/or dropping tasks is carried out for every hormone the core is in contact with. One hormone loop corresponds to one task and a second task to a second hormone loop, so the required control hardware scales linearly with the number of hormones and tasks.
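The role of the local feedback loop can be illustrated with a deliberately simplified, discrete-time toy model of a single core and a single hormone. The numeric values, the sign convention (the core holds the task while its perceived level is at or above the threshold) and the function name are assumptions made for illustration only; the actual system is continuous-time and analog.

```python
def simulate_core(steps: int, local_feedback: bool,
                  base_level: float = 1.0,
                  threshold: float = 0.6,
                  suppressor: float = 0.8) -> list:
    """Return, for each time step, whether the core holds the task."""
    holding = False
    history = []
    for _ in range(steps):
        # Global hormone level: lowered by the core's suppressor while it holds the task.
        global_level = base_level - (suppressor if holding else 0.0)
        # The local loop compensates the core's own suppressor (assumed ideal).
        compensation = suppressor if (holding and local_feedback) else 0.0
        perceived_level = global_level + compensation
        # Decision for the next step: hold the task only if the trigger is met.
        holding = perceived_level >= threshold
        history.append(holding)
    return history


if __name__ == "__main__":
    print("without local loop:", simulate_core(6, local_feedback=False))
    print("with local loop:   ", simulate_core(6, local_feedback=True))
```

In this toy model the core without the local loop alternates between taking and dropping the task, while the core with the local loop keeps the task once it has taken it.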

Fig. 2(a) illustrates an implementation of the AAHS core decision block (see Fig. 1) for each core and each hormone using operational amplifiers (OPs) and voltages as hormone signals. The slightly different architecture using currents as hormone signals is given in Fig. 2(b) and uses operational transconductance amplifiers (OTAs). The left block in Fig. 2(a) is an analog adder consisting of an OP and a resistive feedback network. The middle block is a Schmitt trigger circuit with predefined threshold voltages. This block is also used in the OTA implementation with slightly different threshold voltages. The desired specifications for the automated synthesis framework are generated by a feasibility analysis of the algebraic description of the task distribution architecture. The feasible set defines the dependencies between the parameters of the hormone system. These dependencies allow the derivation of system specifications to automate the system design. They are generated for a technology with $V_{\text{DD}} = 3.3V$. The resulting specifications are listed in Table I. For stability we require a phase margin of at least 35°. A desired task switching time of $0.5\mu s$ leads to a minimum slew rate SR<sub>min</sub> of $27.5 \frac{V}{\mu s}$ for the Schmitt trigger, while the slew rate of the outer adder should be below $27.5 \frac{V}{\mu s}$. The analysis has been done with Maple [7].

Table II: Simulation results of the synthesized OPs

| Performance            | Outer adder            | Inner adder             |
|------------------------|------------------------|-------------------------|
| Gain                   | 35.2dB                 | 27.3dB                  |
| Overshoot falling      | 0.017%                 | 0.015%                  |
| Overshoot rising       | 0.009%                 | 0.027%                  |
| OVR                    | 2.48V                  | 2.6V                    |
| Offset                 | $36.6\mu V$            | $-2.06 \mu V$           |
| Slew rate falling      | $14.1 \frac{V}{\mu s}$ | $74.78 \frac{V}{\mu s}$ |
| Slew rate rising       | $13.6 \frac{V}{\mu s}$ | $67.45 \frac{V}{\mu s}$ |
| Phase margin           | 70.8°                  | 66.1°                   |
| Power (supply current) | 1.16mA                 | 0.31mA                  |

Table III: Simulation results of the synthesized OTAs

| Performance            | Shunt OTA             | Measure OTA            | Res. OTA               |
|------------------------|-----------------------|------------------------|------------------------|
| $G_m$                  | $9.76\mu S$           | $10.03 \mu S$          | $18.09 \mu S$          |
| IVR                    | 1.17V                 | 1.18V                  | 1.4V                   |
| Output Resistance      | $56.23M\Omega$        | $40.02M\Omega$         | $15.2M\Omega$          |
| Offset                 | 0.25mV                | 0.09mV                 | $-9.5\mu V$            |
| Slew rate falling      | $0.4 \frac{V}{\mu s}$ | $0.41 \frac{V}{\mu s}$ | $80.9 \frac{V}{\mu s}$ |
| Slew rate rising       | $0.4 \frac{V}{\mu s}$ | $0.42 \frac{V}{\mu s}$ | $80.8 \frac{V}{\mu s}$ |
| Phase margin           | 84.4°                 | 81.4°                  | 68.9°                  |
| Power (supply current) | 0.24mA                | 0.15mA                 | 0.8mA                  |

## IV. SEMI-AUTOMATED DESIGN

In order to verify the technical feasibility of the proposed AAHS and of the specifications generated for it, a fully automated analog synthesis framework [10] was used to avoid the extremely time-consuming task of designing amplifiers from scratch. Given a specification derived from the analysis of the AAHS, a fully sized transistor-level circuit is automatically synthesized for a given process node. In this contribution the synthesis framework, illustrated in [10, Fig. 2], is described only roughly; the reader is encouraged to consult [9], [10] for further information. The synthesis runs were carried out on an 8-core Intel Xeon E5520 with 16GB RAM and completed within three hours for each specification. The final circuits were chosen for the smallest offset. The process technology used is the AMS Design Hitkit v4.10 for a $0.35\mu m$ bulk CMOS process with a supply voltage of 3.3V. The challenging specifications are the low overshoot, which is needed to prevent spurious switching, and the relatively low load resistance for the outer adder.

Given the set of specifications in Table I, the automated analog synthesis framework is able to produce a wide range of usable OPs and OTAs for the two methodologies. The simulation results of the synthesized OPs and OTAs are shown in Table II and Table III, respectively. Since the common-mode input voltage range (CMR) is less than 10% of $V_{\rm DD}$ due to the closed-loop operation, it is negligible for the evaluation of the OPs. In contrast, the IVR and the output resistance of the OTAs are important, while overshoot and the output voltage range are of less interest. The $G_m$ values of the Shunt OTA and the Measure OTA are closely related.

Overall we can state that in simulation all desired performances were fulfilled. A direct on-chip measurement of the performances is not possible in our case because of the closed-loop operation and a pin limitation. However, if some of the performances failed significantly, we would notice this immediately through a failing behavior of the AAHS.

Figure 2: Schematics of the AAHS decision block of the OP-based (a) and of the OTA-based (b) hormone system

Table I: Set of derived specifications

|                   | Outer adder                    | Inner adder                    | Shunt OTA                   | Measure OTA                 | Res. OTA                     |
|-------------------|--------------------------------|--------------------------------|-----------------------------|-----------------------------|------------------------------|
| Gain / $G_m$      | $\geq 26.2dB$                  | $\geq 26.2dB$                  | $[8.891\mu S, 10.075\mu S]$ | $[8.891\mu S, 10.075\mu S]$ | $[16.443\mu S, 19.989\mu S]$ |
| R<sub>Load</sub>  | $12.5k\Omega$                  | $100k\Omega$                   | $55.2k\Omega$               | $102.25k\Omega$             | $55.2k\Omega$                |
| C<sub>Load</sub>  | 10 fF                          | 2.5 pF                         | 500 fF                      | 500 fF                      | 500 fF                       |
| Overshoot         | $\leq 0.03\%$                  | $\leq 0.03\%$                  | -                           | -                           | -                            |
| CMR(OP)/IVR(OTA)  | $\geq 1.09V$                   | $\geq 1.09V$                   | $\geq 1.11V$                | $\geq 1.11V$                | $\geq 1.11V$                 |
| OVR               | $\geq 1.95V$                   | $\geq 1.95V$                   | -                           | -                           | -                            |
| Output Resistance | -                              | -                              | $\geq 4.14M\Omega$          | $\geq 7.66M\Omega$          | $\geq 4.14M\Omega$           |
| Offset            | $\leq 40.3mV$                  | $\leq 40.3mV$                  | $\leq 0.45mV$               | $\leq 0.45mV$               | $\leq 0.24mV$                |
| Slew rate         | $SR \leq 27.5 \frac{V}{\mu s}$ | $SR \geq 27.5 \frac{V}{\mu s}$ |                             |                             |                              |
| Phase margin      | $\geq 35^{\circ}$              | $\geq 35^{\circ}$              | $\geq 35^{\circ}$           | $\geq 35^{\circ}$           | $\geq 35^{\circ}$            |

OVR: Output voltage range, CMR: Common mode range, IVR: Input voltage range

Figure 3: The fully laid-out AAHS built with OTAs and OPs
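The statement that all simulated performances meet their specifications can be checked mechanically against Table I, Table II and Table III. The following sketch is one way to do this; the data layout and all names are hypothetical, the offset specification is assumed to bound the absolute value, and for overshoot the worse of the rising and falling values is used. Slew rate is left out because the corresponding row of Table I is only partly recoverable.

```python
INF = float("inf")

# (lower bound, upper bound) per performance, taken from Table I.
OP_SPEC = {
    "gain_dB": (26.2, INF),
    "overshoot_pct": (0.0, 0.03),
    "ovr_V": (1.95, INF),
    "abs_offset_mV": (0.0, 40.3),   # assumption: spec bounds |offset|
    "phase_margin_deg": (35.0, INF),
}


def ota_spec(gm_range, min_rout, max_offset):
    """Build an OTA specification; IVR and phase margin are common to all OTAs."""
    return {
        "gm_uS": gm_range,
        "ivr_V": (1.11, INF),
        "rout_MOhm": (min_rout, INF),
        "abs_offset_mV": (0.0, max_offset),
        "phase_margin_deg": (35.0, INF),
    }


SPECS = {
    "Outer adder": OP_SPEC,
    "Inner adder": OP_SPEC,
    "Shunt OTA": ota_spec((8.891, 10.075), 4.14, 0.45),
    "Measure OTA": ota_spec((8.891, 10.075), 7.66, 0.45),
    "Res. OTA": ota_spec((16.443, 19.989), 4.14, 0.24),
}

# Simulation results from Table II and Table III (offsets converted to mV).
RESULTS = {
    "Outer adder": {"gain_dB": 35.2, "overshoot_pct": 0.017, "ovr_V": 2.48,
                    "abs_offset_mV": 0.0366, "phase_margin_deg": 70.8},
    "Inner adder": {"gain_dB": 27.3, "overshoot_pct": 0.027, "ovr_V": 2.6,
                    "abs_offset_mV": 0.00206, "phase_margin_deg": 66.1},
    "Shunt OTA": {"gm_uS": 9.76, "ivr_V": 1.17, "rout_MOhm": 56.23,
                  "abs_offset_mV": 0.25, "phase_margin_deg": 84.4},
    "Measure OTA": {"gm_uS": 10.03, "ivr_V": 1.18, "rout_MOhm": 40.02,
                    "abs_offset_mV": 0.09, "phase_margin_deg": 81.4},
    "Res. OTA": {"gm_uS": 18.09, "ivr_V": 1.4, "rout_MOhm": 15.2,
                 "abs_offset_mV": 0.0095, "phase_margin_deg": 68.9},
}


def violations(name):
    """List the performances of one amplifier that fall outside their bounds."""
    return [key for key, (low, high) in SPECS[name].items()
            if not low <= RESULTS[name][key] <= high]


def check_all():
    """Map every amplifier to its list of violated specifications."""
    return {name: violations(name) for name in SPECS}


if __name__ == "__main__":
    for name, failed in check_all().items():
        print(name, "OK" if not failed else "violates " + ", ".join(failed))
```

For the tabulated values every amplifier passes the checked specifications, which agrees with the statement made for the simulation results.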

## V. LAYOUT

After simulation and yield analysis, both versions of the task distribution system have been laid out manually using a straightforward place-and-route methodology. Fig. 3 shows the fully laid-out architecture for three cores able to apply for two tasks. According to the enumeration in the figure, the components are:

- 1) The AAHS decision block of one core applying for two tasks built with OTAs, containing the Measure OTA, the Res. OTA and a Schmitt trigger,
- 2) The AAHS decision block of one core applying for two tasks built with OPs, containing the Local Adder and a Schmitt trigger,
- 3) The two Shunt OTAs, with an area of  $86.5\mu m \cdot 98.5\mu m$  each,
- 4) The two Global Adders, with an area of  $158.8\mu m \cdot 91.5\mu m$  each,
- 5) The current mirrors for the bias sources, which are greyed out (property of AMS).

The OTA implementation of the whole task distribution system has a total of 14 OTAs and 6 Schmitt trigger circuits, while the OP implementation uses 8 OPs and 6 Schmitt trigger circuits. Because the number of pins of the prototype chip is limited, the layout is framed at the top by 18 transmission gates that switch between the two task distribution systems. Fig. 4 shows a photograph of the used part of the multi-project test chip in an AMS  $0.35\mu m$  analog technology, bonded and ready for measurements. The OP variant uses less area.
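The component counts are consistent with one decision block per core and task plus one shared block per task. The sketch below makes this counting explicit; the per-block composition is inferred from the enumeration of Fig. 3 (Measure OTA and Res. OTA per OTA-based block, one Local Adder per OP-based block, one Shunt OTA or Global Adder per task), and the function names are hypothetical.

```python
def ota_system_components(cores: int, tasks: int) -> dict:
    """Amplifier and Schmitt trigger counts for the OTA-based system (inferred)."""
    blocks = cores * tasks  # one decision block per core and task
    return {
        # Measure OTA + Res. OTA per decision block, one Shunt OTA per task.
        "otas": 2 * blocks + tasks,
        "schmitt_triggers": blocks,
    }


def op_system_components(cores: int, tasks: int) -> dict:
    """Amplifier and Schmitt trigger counts for the OP-based system (inferred)."""
    blocks = cores * tasks
    return {
        # One Local Adder per decision block, one Global Adder per task.
        "ops": blocks + tasks,
        "schmitt_triggers": blocks,
    }


if __name__ == "__main__":
    print(ota_system_components(cores=3, tasks=2))
    print(op_system_components(cores=3, tasks=2))
```

For three cores and two tasks this gives 14 OTAs or 8 OPs and 6 Schmitt triggers in either variant, and the counts grow linearly with the number of tasks for a fixed number of cores.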



Figure 4: Photograph of test chip. For the layout see Fig. 3.

## VI. MEASUREMENTS

The measurements of the silicon show that the proposed architectures are robust, reliable and functional. Within the scope of this paper, we show

- the reliability and dependability,
- the speed of allocation,
- the power consumption.

Figure 5: Measured task migration from Core 1 to Core 3 and later from Core 2 to the recovered Core 1

However, due to the limited scope, the results are shown only for the current-based architecture. The reliability and dependability, which also apply to the voltage-based approach and confirm the assumption indicated by the simulation results of [12], are shown in Fig. 5. Four of the six core enable signals (see Fig. 1) are plotted. The right side shows the loss of task 1 at core 1, due to the loss of the eager value (not shown). Core 3, as a still available resource, allocates task 1. Shortly afterwards, the AAHS decision block of core 2 decreases its eager value and the task transfer is issued, since core 1 has recovered in the meantime. The left image shows the reallocation process of a task between two AAHS decision blocks. The solid red box at the upper left side marks the (re-)allocation speed of the distribution system. Less than 500 ns pass between the task drop by the AAHS decision block of core 1 and the allocation by the AAHS decision block of core 3.

We also measured the other combinations of task transfers. We could show that the prohibited taking of one task by two cores does not occur, or results in an immediate unloading of the task from one of the cores. Additionally, the same tests were successfully conducted for the OP-based task distribution system. Hence both task distribution systems work as desired.

The power consumption measured on the silicon coincides perfectly with the simulation. The complete OTA-based task distribution system for 3 cores and 2 tasks draws a total of 2.9 mA, resulting in a total power consumption of 9.57 mW. The OP-based AAHS draws 2.1 mA, resulting in 6.93 mW in total. Further, the chip area of the AAHS decision blocks, built in the AMS  $0.35 \mu m$  bulk CMOS technology, for 3 cores and 2 tasks each is:

- The OP-based AAHS:  $0.3109 mm^2$ ,
- The OTA-based AAHS:  $0.4571 mm^2$ .
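The reported power figures follow from the measured supply currents and the 3.3V supply voltage via $P = V_{\text{DD}} \cdot I$. The following sketch reproduces this arithmetic and additionally compares the two chip areas; the function names are hypothetical, and the area comparison is a derived figure that is not stated in the text.

```python
SUPPLY_VOLTAGE_V = 3.3  # supply voltage of the 0.35 um process used


def power_mw(current_ma: float, supply_v: float = SUPPLY_VOLTAGE_V) -> float:
    """Power in mW from supply current in mA and supply voltage in V."""
    return current_ma * supply_v


def area_saving_percent(smaller_mm2: float, larger_mm2: float) -> float:
    """Relative area saving of the smaller design, in percent."""
    return 100.0 * (1.0 - smaller_mm2 / larger_mm2)


if __name__ == "__main__":
    print("OTA-based:", round(power_mw(2.9), 2), "mW")
    print("OP-based: ", round(power_mw(2.1), 2), "mW")
    print("OP-based area saving:",
          round(area_saving_percent(0.3109, 0.4571), 1), "%")
```

Under these assumptions the OP-based variant needs roughly a third less area than the OTA-based variant for the same number of cores and tasks.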

## VII. CONCLUSION AND SUMMARY

With the measurements of the prototype chip we showed that an analog task distribution system based on an artificial hormone system is realizable with medium overhead. The measurements reveal quite robust systems of medium complexity, which could help to build reliable systems in the presence of manifold degradations in the near future. Additionally, we could show that the synthesis tools used significantly reduce the design time for larger projects and give reliable results. Hence we could show that an overall system optimization can easily be carried out, as we do not use the same over-designed OPs or OTAs for all tasks in such an analog system.

In future work we plan to connect the AAHS task distribution system to off-chip analog cores to show an overall reliable system and measure its performance.

## REFERENCES

- [1] S. Askari and M. Nourani, "Highly reliable analog filter design using analog voting," in *Electronics, Communications and Photonics Conference (SIECPC), 2011 Saudi International*, April 2011, pp. 1–6.
- [2] L. F. Bittencourt, E. R. M. Madeira, F. R. L. Cicerre, and L. E. Buzato, "A path clustering heuristic for scheduling task graphs onto a grid," in *Middleware for Grid Computing (MGC05)*, 2005.
- [3] U. Brinkschulte, M. Pacher, and A. von Renteln, "An artificial hormone system for self-organizing real-time task allocation," *Springer*, 2008.
- [4] D. Lorenz, G. Georgakos, and U. Schlichtmann, "Aging analysis of circuit timing considering NBTI and HCI," *IOLTS09*, 2009.
- [5] G. Gielen, "CAD tools for embedded analogue circuits in mixed-signal integrated systems on chip," *IEE Proceedings - Computers and Digital Techniques*, vol. 152, no. 3, pp. 317–332, May 2005.
- [6] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, "An analytical model for negative bias temperature instability," in *Proceedings of the 2006 IEEE/ACM international conference on Computer-aided design*, ser. ICCAD '06. ACM, 2006, pp. 493–496.
- [7] Maplesoft, "www.maplesoft.com."
- [8] E. Maricau, D. De Jonghe, and G. Gielen, "Hierarchical analog circuit reliability analysis using multivariate nonlinear regression and active learning sample selection," in *Proceedings of the Conference on Design, Automation and Test in Europe*, ser. DATE '12, 2012, pp. 745–750.
- [9] M. Meissner, O. Mitea, L. Luy, and L. Hedrich, "Fast isomorphism testing for a graph-based analog circuit synthesis framework," in *Design, Automation Test in Europe Conference Exhibition (DATE), 2012*, March 2012, pp. 757–762.
- [10] O. Mitea, M. Meissner, and L. Hedrich, "Automated Constraint-driven Topology Synthesis for Analog Circuits," in *Proc. of the Conference on Design, Automation and Test in Europe*, 2011.
- [11] F. Salfelder and L. Hedrich, "An NBTI model for efficient transient simulation of analogue circuits," in *Proc. edaWorkshop 11*. VDE Verlag, 2011, pp. 27 – 32.
- [12] J. von Rosen, F. Salfelder, L. Hedrich, B. Betting, and U. Brinkschulte, "A highly dependable self-adaptive mixed-signal multi-core system-on-chip architecture," *Integration, the VLSI Journal*, vol. 48, no. 0, pp. 55–71, 2015.
- [13] B. Yan, Q. Fan, J. Bernstein, J. Qin, and J. Dai, "Reliability simulation and circuit-failure analysis in analog and mixed-signal applications," *IEEE Trans. on Device and Materials Reliability*, vol. 9, no. 3, pp. 339–347, Sept. 2009.