Slide 1 (10/14/2005)
Caltech
Reliable State Machines
Dr. Gary R Burke
California Institute of Technology
Jet Propulsion Laboratory
Slide 2
Outline
Background
JPL MER example
JPL FPGA/ASIC Process
Procedure
Guidelines
State machines
Traditional
Highly Reliable
Comparison
Slide 6
MER Mission example
Large number of FPGAs
Mostly fuse programmable, but at least one RAM programmable FPGA
Several ASICs
Many standard parts, e.g. microprocessor and RAM chips
Slide 14
FPGA/ASIC Process
JPL needs to ensure design process is sound
A bug in an FPGA/ASIC can halt a billion-dollar mission
Tight schedules can result in inadequate testing
Inadequate version control can result in the wrong code
First-pass success is important for ASIC design
Slide 15
FPGA/ASIC Process
To ensure a quality product:
Requirements are correct and do not change
Specification is complete
Design will meet the specification and requirements
Testing has covered all possible cases
Slide 16
FPGA/ASIC Process
Peer reviews by experts to check the design and design approach
Formal Reviews to ensure design process is adequate, and to sign off on the design
Documentation for review and archiving
Check-lists to ensure all problems are fixed
Slide 17
FPGA/ASIC Process
Configuration Management to ensure correct versions are used
Verification Matrix, which documents all testing
Checking tools, e.g. Lint and DRC; all errors and warnings are documented
Slide 20
Guidelines
Define set of rules for HDL design
Reduce ambiguity
Clarify the design so that it can be easily checked and reviewed
Implement most reliable design techniques
Slide 21
Fault Tolerant State Machines
The state machine needs to be tolerant of single event upsets (SEUs)
State machine should not hang
State machine should always be in a defined state
No asynchronous inputs to state machine
Default state must be specified
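The "defined state" and "default state" rules above can be illustrated with a minimal sketch. The three state names and the 2-bit codes are hypothetical and are not taken from the slides; the point is only that the unused code (0b11) has an explicit way out, so an upset cannot leave the machine stuck.

```python
# Hypothetical 3-state machine held in a 2-bit register; code 0b11 is unused.
IDLE, RUN, DONE = 0b00, 0b01, 0b10

def next_state(state, go):
    """Next-state logic with an explicit default branch."""
    if state == IDLE:
        return RUN if go else IDLE
    if state == RUN:
        return DONE
    if state == DONE:
        return IDLE
    # Default branch: an undefined code (here 0b11, e.g. after an upset)
    # is forced back to a known state so the machine cannot hang.
    return IDLE
```

Without the final branch, the behavior for code 0b11 would be left to whatever the implementation happens to produce.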
Slide 22
State Machines
A state machine is a sequential machine that, when built into an FPGA or ASIC, controls the sequencing of actions in the digital logic
The current state of a machine is held in a state register which is updated on a clock
The next value of the state register (next state) is derived from the current state and the inputs
Outputs from the state machine are decoded from the state register and can also be combined with the inputs
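As a rough software model of the description above (not HDL, and not the author's design), the sketch below separates the three pieces: a state register, next-state logic driven by the current state and inputs, and output decoding. The states `IDLE`, `RUN`, `DONE` and the inputs `start` and `finished` are invented for illustration.

```python
from enum import Enum

class State(Enum):  # hypothetical states
    IDLE = 0
    RUN = 1
    DONE = 2

def next_state(state, start, finished):
    """Next-state logic: derived from the current state and the inputs."""
    if state is State.IDLE:
        return State.RUN if start else State.IDLE
    if state is State.RUN:
        return State.DONE if finished else State.RUN
    return State.IDLE  # DONE always returns to IDLE

def outputs(state, start):
    """Output decode: 'busy' uses the state only, 'accept' also uses an input."""
    busy = state is State.RUN
    accept = state is State.IDLE and start
    return busy, accept

def clock(state, start, finished):
    """One clock edge: the state register loads the next state."""
    return next_state(state, start, finished)
```

Calling `clock` repeatedly with the current inputs plays the role of the clocked state register.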
Slide 23
State-Machine (SM) Encoding
Each distinct state of the SM is represented by a unique code
The allocation of these binary codes to states is the encoding
The simplest encoding is Binary
In Binary encoding each state is given the next available binary number in sequence.
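A minimal sketch of Binary encoding as described above: states are numbered in order and the register is just wide enough to hold the largest number. The function name and the example state names are illustrative only.

```python
def binary_encoding(states):
    """Assign each state the next binary number, using the minimum width."""
    width = max(1, (len(states) - 1).bit_length())
    return {name: format(i, f"0{width}b") for i, name in enumerate(states)}
```

For four hypothetical states `["IDLE", "LOAD", "RUN", "DONE"]` this gives the 2-bit codes 00, 01, 10 and 11; a fifth state would widen the register to 3 bits.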
Slide 24
Other SM Encoding
1-hot encoding
The number of bits in the code is equal to the number of states. Each encoded state has exactly one bit of the encoded word set to 1 (the rest are 0)
The advantage is that, when optimized for non-reliable use, it needs less logic than Binary encoding and can be faster. A single-bit change caused by an SEU produces an invalid code, which can be detected.
The disadvantage is that the larger number of bits requires more flip-flops and therefore presents more targets for SEUs. The SEU-detection advantage is lost when the 1-hot encoding is optimized.
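The detection property of un-optimized 1-hot encoding can be checked with a short sketch. The helper names are hypothetical; the check simply confirms that flipping any single bit of a valid 1-hot word yields a word with zero or two bits set, which is not a valid code.

```python
def one_hot_encoding(states):
    """One bit per state: state i gets the word with only bit i set."""
    return {name: 1 << i for i, name in enumerate(states)}

def is_valid_one_hot(word):
    """True when exactly one bit of the word is set."""
    return word != 0 and (word & (word - 1)) == 0

def single_bit_flips(word, width):
    """All words reachable from 'word' by one upset in a 'width'-bit register."""
    return [word ^ (1 << b) for b in range(width)]
```

The check only helps if the implemented logic actually examines all bits of the register; an optimizer that looks only at the single bit for each state discards it, which matches the caveat above.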
Slide 25
Other SM Encoding (cont.)
Gray code
Similar to Binary encoding, except the codes are chosen so that in the main state-machine sequence only 1 bit changes at a time
This code has no major advantage over Binary. However, because only one bit changes per transition, outputs decoded from the state register can more easily be made glitch-free.
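The single-bit-change property can be sketched with the standard reflected Gray code, where state number i gets the code i XOR (i >> 1). Whether the original design used this particular Gray sequence is an assumption.

```python
def gray_code(i):
    """Reflected binary Gray code of state number i."""
    return i ^ (i >> 1)

def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def sequence_distances(num_states):
    """Bit changes between consecutive states, including the wrap-around."""
    codes = [gray_code(i) for i in range(num_states)]
    return [hamming_distance(codes[i], codes[(i + 1) % num_states])
            for i in range(num_states)]
```

For a 16-state sequence every entry of `sequence_distances(16)` is 1. Transitions that leave the main sequence (branches) are not covered by this property.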
Slide 26
Other SM Encoding (cont.)
H2 code
This variation on Binary encoding uses one extra bit to ensure all codes are separated by a Hamming distance of 2. That is, it will take 2 changes in the state register to reach another known state.
The advantage is that it has fewer bits, and so fewer SEU targets, than 1-hot, but retains the fault tolerance of the un-optimized 1-hot encoding.
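One common way to obtain a Hamming distance of 2 with a single extra bit is to append a parity bit to the binary state number. The slides do not say which construction was used, so the sketch below is an assumption, and the function names are hypothetical.

```python
def h2_encoding(num_states):
    """Binary state number plus one even-parity bit (assumed construction)."""
    codes = []
    for i in range(num_states):
        parity = bin(i).count("1") & 1
        codes.append((i << 1) | parity)
    return codes

def min_distance(codes):
    """Smallest Hamming distance between any two distinct codes."""
    return min(bin(a ^ b).count("1")
               for idx, a in enumerate(codes) for b in codes[idx + 1:])
```

Because every valid word has even parity, any single upset produces an odd-parity word, which is never a valid state. For 16 states this needs 5 bits rather than the 16 of 1-hot.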
Slide 27
Other SM Encoding (cont.)
H3 code
This extension of H2 encoding uses additional bits to ensure all codes are separated by a Hamming distance of 3. That is, it will take 3 changes in the state register to reach another known state.
The advantage is that the SM can be designed such that a single change in the state register has no effect on the state.
The disadvantage is that it requires more logic to implement.
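To illustrate why a distance of 3 lets a single upset be ignored, the sketch below uses the classic Hamming(7,4) code for a 16-state machine and maps a corrupted word back to the nearest valid code. The slides do not specify the H3 code actually used, so Hamming(7,4) and the helper names are assumptions.

```python
def hamming74_encode(value):
    """Encode a 4-bit state number as a 7-bit Hamming(7,4) codeword."""
    d1, d2, d3, d4 = [(value >> k) & 1 for k in (3, 2, 1, 0)]
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    word = 0
    for bit in (p1, p2, d1, p3, d2, d3, d4):
        word = (word << 1) | bit
    return word

def nearest_code(word, codes):
    """Map a (possibly upset) register value to the closest valid code."""
    return min(codes, key=lambda c: bin(c ^ word).count("1"))

CODES = [hamming74_encode(i) for i in range(16)]
```

Since valid codes are at least 3 bits apart, a word with one flipped bit is still closer to its original code than to any other, so `nearest_code` recovers it. In hardware this correction is part of the extra logic noted above.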
Slide 28
Synthesis
To check the overhead of each of the state machines, they were individually synthesized
Finite state machine optimization is turned off
A clock frequency of 50 MHz is used
Target device is a Xilinx Spartan 2, speed grade 6
Error injection circuitry is not included
Slide 29
Synthesis Results
Slide 30
Four Bit State Encoding
Slide 31
Eight Bit State Encoding
Slide 32
Twelve Bit State Encoding
Slide 33
Sixteen Bit State Encoding
Slide 34
Twenty-Four Bit State Encoding
Slide 35
Thirty-Two Bit State Encoding
Slide 36
Fault Injection Test
A test circuit is generated with an example of each state machine executing the same task, plus a reference state machine
The task chosen requires a 16-state state machine, to detect a 16-bit pattern in a serial input stream
An error generator injects faults into all state machines except the reference state machine
Slide 37
Error Injection Test Continued
The outputs of each state machine are compared to the reference output
A set of counters tallies the comparison outputs
Two types of failure are logged for each state machine:
Failure to detect pattern
False detection of pattern (false-positive)
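The test arrangement can be modeled in software as a rough sketch: a reference pattern detector and a copy whose binary-encoded state register is upset periodically, with counters for the two failure types. The key value, the stream generator and all function names are hypothetical; only the 16-state, 16-bit pattern task and the comparison against a reference come from the slides.

```python
import random

KEY = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]  # hypothetical key

def longest_border(seen, key, limit):
    """Longest suffix of 'seen' (at most 'limit' bits) that is a key prefix."""
    k = min(len(seen), limit)
    while k > 0 and seen[-k:] != key[:k]:
        k -= 1
    return k

def build_table(key):
    """table[state][bit] -> (next_state, hit); state = matched prefix length."""
    n = len(key)
    table = []
    for s in range(n):
        row = []
        for bit in (0, 1):
            seen = key[:s] + [bit]
            row.append((longest_border(seen, key, n - 1), seen == key))
        table.append(row)
    return table

def simulate(num_words, fault_period, seed=0):
    """Return (missed detections, false positives) of the faulted machine."""
    rng = random.Random(seed)
    table = build_table(KEY)
    ref = dut = 0
    missed = false_pos = clock = 0
    for _ in range(num_words):
        word = list(KEY)
        if rng.random() < 0.5:          # send a non-key word, 1 bit off the key
            word[rng.randrange(16)] ^= 1
        for bit in word:
            ref, ref_hit = table[ref][bit]
            dut, dut_hit = table[dut][bit]
            clock += 1
            if fault_period and clock % fault_period == 0:
                dut ^= 1 << rng.randrange(4)   # upset one bit of the 4-bit state
            if ref_hit and not dut_hit:
                missed += 1
            if dut_hit and not ref_hit:
                false_pos += 1
    return missed, false_pos
```

With `fault_period=0` no faults are injected and both counters stay at zero; `simulate(1000, 199)` mirrors the 1:199 rate quoted in the example. This models only the Binary-encoded machine; the other encodings would need their own decode and recovery behavior.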
Slide 38
Error Injection Test Continued
Non-key patterns are 1 bit different from the key pattern, to increase the likelihood of a false match
The error rate can be varied; it is set to 1:199 clocks in this example
Errors are weighted by distributing them pseudo-randomly over 16 bits. A state machine with a word size of n receives n/16 of the total faults
Synchronous fault injection is before the state register
Asynchronous fault injection is after the state register
All results are from actual implementation of the test circuits in a Spartan 2 FPGA
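The n/16 weighting can be sketched as follows: each fault picks one of 16 bit positions at random, and a machine is only affected if the chosen position falls inside its own state word. This is one plausible reading of the slide, and the function name is hypothetical.

```python
import random

def inject_fault(state_word, width, rng):
    """Pick one of 16 bit positions; flip it only if it lies within 'width'.

    Returns (new_word, flipped). Assumes width <= 16.
    """
    bit = rng.randrange(16)
    if bit < width:
        return state_word ^ (1 << bit), True
    return state_word, False
```

Under this scheme a 4-bit Binary register is hit by about a quarter of the faults, while a 16-bit 1-hot register is hit by all of them, which is how the wider encodings pay for their extra flip-flops in the test.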
Slide 39
Error Rate: Synchronous Faults
Slide 40
Error Rate: Asynchronous Faults
Slide 41
Error Rate: Asynchronous Pulse Faults
Slide 42
Results: Binary Encoding
Lowest resources used
Second fastest speed after One Hot
Fastest for small number of states
Second-most sensitive to errors
Generates false-positive errors i.e. reports false pattern matches
Slide 43
Results: One Hot Encoding
No false-positive errors (single faults)
Fastest speed, except for small and large numbers of states
Uses more resources than Binary
Inefficient for large number of states
Worst fault tolerance of all encodings tested
Has 2x the error rate of binary encoding
Slide 44
Results: Hamming Distance of 2 (H2) Encoding
No false-positive errors (single faults)
Better fault tolerance than Binary
More resources needed than One Hot, except for large number of states
Slide 45
Results: Hamming Distance of 3 (H3) Encoding
Zero single-fault errors
Immune to synchronous and asynchronous errors
Lowest double-fault errors
Most resources used (*)
~2x binary encoding
Slowest speed (*)
(*) Except for large number of states