A carry-skip adder (also known as a carry-bypass adder) is an adder implementation that improves on the delay of a ripple-carry adder with little effort compared to other adders. The improvement of the worst-case delay is achieved by using several carry-skip adders to form a block-carry-skip adder.
Unlike other fast adders, a carry-skip adder is faster only for some combinations of input bits, so the speed improvement is only probabilistic.
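The worst case for a simple one-level ripple-carry adder occurs when the propagate condition is true for each digit pair $(a_i, b_i)$. The carry-in then ripples through the $n$-bit adder and appears as the carry-out after $\tau_{CRA}(n) \approx n \cdot \tau_{VA}$, where $\tau_{VA}$ is the delay of a single full-adder stage.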
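For each operand input bit pair $(a_i, b_i)$, the propagate condition $p_i = a_i \oplus b_i$ is determined using an XOR gate. When all propagate conditions are true, the carry-in bit $c_0$ determines the carry-out bit.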
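The $n$-bit carry-skip adder consists of an $n$-bit carry-ripple chain, an $n$-input AND gate and one multiplexer. Each propagate bit $p_i$ provided by the carry-ripple chain is connected to the $n$-input AND gate. The resulting bit $s$ is used as the select bit of a multiplexer that switches either the last carry bit $c_n$ or the carry-in $c_0$ to the carry-out signal $c_{out}$:
$$s = p_{n-1} \wedge p_{n-2} \wedge \dots \wedge p_1 \wedge p_0 = p_{[0:n-1]}$$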
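The structure described above can be sketched as a small bit-level model. This is only an illustrative behavioural sketch, not a gate-accurate design: the function name `carry_skip_block` and the least-significant-bit-first list layout are assumptions made for the example.

```python
def carry_skip_block(a_bits, b_bits, c0):
    """Behavioural model of one n-bit carry-skip adder.

    a_bits, b_bits: equal-length lists of 0/1, least significant bit first
    (an assumed layout). Returns (sum_bits, c_out, s).
    """
    carry = c0
    sums, propagates = [], []
    for a, b in zip(a_bits, b_bits):
        p = a ^ b                      # propagate condition p_i = a_i XOR b_i
        sums.append(p ^ carry)         # sum bit of this full adder
        carry = (a & b) | (p & carry)  # carry rippling into the next stage
        propagates.append(p)
    s = int(all(propagates))           # n-input AND gate over all p_i
    c_out = c0 if s else carry         # multiplexer: c_0 if s = 1, else c_n
    return sums, c_out, s


# 0101 + 1010 with carry-in 1: every bit pair propagates, so c_0 is skipped to c_out.
print(carry_skip_block([1, 0, 1, 0], [0, 1, 0, 1], 1))
```

When every $p_i$ is 1 the rippled carry $c_n$ equals $c_0$ anyway, so the multiplexer never changes the result; it only shortens the path the carry has to take.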
The skip path greatly reduces the latency of the adder through its critical path, since the carry bit for each block can now "skip" over blocks with a group propagate signal set to logic 1 (as opposed to a long ripple-carry chain, which would require the carry to ripple through each bit in the adder).
The number of inputs of the AND gate is equal to the width of the adder. For a large width, this becomes impractical and leads to additional delays, because the AND gate has to be built as a tree. A good width is achieved when the sum logic has the same depth as the $n$-input AND gate and the multiplexer.
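The critical path of a carry-skip adder begins at the first full adder, passes through all adders and ends at the sum bit $s_{n-1}$. Carry-skip adders are chained into block-carry-skip adders to reduce the overall critical path, since a single $n$-bit carry-skip adder has no real speed benefit compared to an $n$-bit ripple-carry adder:
$$\tau_{CSA}(n) = \tau_{CRA}(n)$$
The skip logic consists of an $m$-input AND gate and one multiplexer:
$$T_{SK} = T_{AND}(m) + T_{MUX}$$
As the propagate signals are computed in parallel and are available early, the critical path for the skip logic in a carry-skip adder consists only of the delay imposed by the multiplexer:
$$T_{CSK} = T_{MUX} = 2D$$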
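Block-carry-skip adders are composed of a number of carry-skip adders. There are two types of block-carry-skip adders: those with fixed block sizes and those with variable block sizes. The two operands $A = (a_{n-1}, a_{n-2}, \dots, a_1, a_0)$ and $B = (b_{n-1}, b_{n-2}, \dots, b_1, b_0)$ are split into $k$ blocks of $(m_k, m_{k-1}, \dots, m_2, m_1)$ bits.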
Fixed size block-carry-skip adders split the $n$ bits of the input operands into blocks of $m$ bits each, resulting in $k = \frac{n}{m}$ blocks.
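The critical path consists of the ripple path and the skip element of the first block, the skip paths of the blocks between the first and the last block, and finally the ripple path of the last block:
$$T_{FCSA}(n) = T_{CRA_{[0:c_{out}]}}(m) + T_{CSK} + (k-2) \cdot T_{CSK} + T_{CRA}(m) = 3D + m \cdot 2D + (k-1) \cdot 2D + (m+2) \cdot 2D = (2m+k) \cdot 2D + 5D$$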
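With $k = \frac{n}{m}$, the optimal block size for a given adder width $n$ is found by setting the derivative with respect to $m$ to zero:
$$\frac{dT_{FCSA}(n)}{dm} = 0$$
$$2D \cdot \left(2 - n \cdot \frac{1}{m^2}\right) = 0$$
$$\Rightarrow m_{1,2} = \pm\sqrt{\frac{n}{2}}$$
Only positive block sizes are realizable:
$$\Rightarrow m = \sqrt{\frac{n}{2}}$$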
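As a quick numerical check of this result, the delay expression $(2m + k) \cdot 2D + 5D$ with $k = n/m$ can be evaluated for a few block sizes. The sketch below measures delay in units of $D$; the function names are hypothetical and the model ignores the requirement that $m$ divide $n$ evenly.

```python
import math


def fixed_block_delay(n, m):
    """Worst-case delay in units of D: (2m + k) * 2 + 5, with k = n / m."""
    k = n / m
    return (2 * m + k) * 2 + 5


def optimal_block_size(n):
    """Continuous optimum m = sqrt(n / 2), obtained from dT/dm = 0."""
    return math.sqrt(n / 2)


n = 32
print("optimal m:", optimal_block_size(n))
for m in (2, 4, 8, 16):
    print(f"m = {m:2d}: delay = {fixed_block_delay(n, m)} D")
```

For $n = 32$ the optimum is $m = 4$, where the model gives $37D$, compared with $45D$ for both $m = 2$ and $m = 8$.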
The performance can be improved, i.e. all carries can be propagated more quickly, by varying the block sizes. The initial blocks of the adder are made smaller so as to quickly detect carry generates that must be propagated the furthest, the middle blocks are made larger because they are not the problem case, and the most significant blocks are again made smaller so that the late-arriving carry inputs can be processed quickly.
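A behavioural sketch of a block-carry-skip adder with arbitrary block widths is given below. It models only the logic, not the timing, so it shows that the block partition does not affect the result. The function name `block_carry_skip_add` and the block widths `(2, 4, 4, 4, 2)` are assumptions chosen for illustration, not an optimized partition.

```python
def block_carry_skip_add(a, b, block_sizes, c_in=0):
    """Add integers a and b with chained carry-skip blocks.

    block_sizes lists the block widths, least significant block first.
    Returns (sum, carry_out) for an adder of sum(block_sizes) bits.
    """
    result, shift, carry = 0, 0, c_in
    for m in block_sizes:
        block_carry_in = carry
        all_propagate = 1
        for i in range(shift, shift + m):
            ai, bi = (a >> i) & 1, (b >> i) & 1
            p = ai ^ bi                        # bit propagate
            result |= (p ^ carry) << i         # sum bit
            carry = (ai & bi) | (p & carry)    # ripple within the block
            all_propagate &= p                 # block propagate (AND of all p)
        # Skip multiplexer: forward the block's carry-in if every bit propagates.
        carry = block_carry_in if all_propagate else carry
        shift += m
    return result, carry


# 16-bit adder with small outer blocks and larger middle blocks.
print(block_carry_skip_add(12345, 54321, (2, 4, 4, 4, 2)))
```

Only the delay depends on the partition, which is why the block sizes can be tuned freely.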
By using additional skip blocks in an additional layer, the block-propagate signals $p_{[i:i+3]}$ are further combined and used to perform larger skips:
$$p_{[i:i+15]} = p_{[i:i+3]} \wedge p_{[i+4:i+7]} \wedge p_{[i+8:i+11]} \wedge p_{[i+12:i+15]}$$
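The second-level propagate signal can be sketched by combining four 4-bit block-propagate signals, as in the expression above. The helper names below are hypothetical and the bit lists are assumed to be least significant bit first.

```python
def propagate_bits(a, b, n):
    """Bit propagates p_i = a_i XOR b_i for an n-bit addition of integers a and b."""
    return [((a >> i) ^ (b >> i)) & 1 for i in range(n)]


def block_propagate(p_bits, start, width=4):
    """First-level signal: AND of `width` propagate bits beginning at `start`."""
    return int(all(p_bits[start:start + width]))


def second_level_propagate(p_bits, i):
    """p[i:i+15], the AND of the four 4-bit block propagates starting at bit i."""
    return int(all(block_propagate(p_bits, i + offset) for offset in (0, 4, 8, 12)))


p = propagate_bits(0x5555, 0xAAAA, 16)   # every bit pair differs, so all p_i = 1
print(second_level_propagate(p, 0))
```

When this signal is 1, a carry entering at bit $i$ can bypass all sixteen bit positions through a single second-level multiplexer.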
The problem of determining the block sizes and number of levels required to make the physically fastest carry-skip adder is known as the 'carry-skip adder optimization problem'. This problem is made complex by the fact that carry-skip adders are implemented with physical devices whose size and other parameters also affect addition time.
The carry-skip optimization problem for variable block sizes and multiple levels for an arbitrary device process node was solved by Thomas W. Lynch. That work also shows that carry-skip addition is the same as parallel prefix addition and is thus related to, and for some configurations identical to, the Han–Carlson, Brent–Kung and Kogge–Stone adders and a number of other adder types.
Breaking this down into more specific terms, in order to build a 4-bit carry-bypass adder, 6 full adders would be needed. The input buses would be a 4-bit A and a 4-bit B, with a carry-in (CIN) signal. The output would be a 4-bit bus X and a carry-out signal (COUT).
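The first two full adders would add the first two bits together. The carry-out signal from the second full adder ($C_1$) would drive the select signal for three 2-to-1 multiplexers. The second set of two full adders would add the last two bits assuming $C_1$ is a logical 0, and the final set of full adders would assume that $C_1$ is a logical 1.
The multiplexers then control which output signal is used for COUT, $X_2$ and $X_3$.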
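The six-adder arrangement described above can be sketched as a behavioural model. The names `full_adder`, `mux2` and `four_bit_adder` are hypothetical, and the buses are represented as lists with the least significant bit first, which is an assumption of this sketch.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (c & (a ^ b))


def mux2(sel, in0, in1):
    """2-to-1 multiplexer: in0 when sel = 0, in1 when sel = 1."""
    return in1 if sel else in0


def four_bit_adder(a, b, cin):
    """a, b: 4-bit buses as lists. Returns (X, COUT)."""
    # Full adders 1 and 2 add the lower two bits.
    x0, c0 = full_adder(a[0], b[0], cin)
    x1, c1 = full_adder(a[1], b[1], c0)
    # Full adders 3 and 4 add the upper two bits assuming C1 = 0.
    x2_0, c2_0 = full_adder(a[2], b[2], 0)
    x3_0, cout_0 = full_adder(a[3], b[3], c2_0)
    # Full adders 5 and 6 add the upper two bits assuming C1 = 1.
    x2_1, c2_1 = full_adder(a[2], b[2], 1)
    x3_1, cout_1 = full_adder(a[3], b[3], c2_1)
    # Three multiplexers, all selected by C1, choose X2, X3 and COUT.
    x2 = mux2(c1, x2_0, x2_1)
    x3 = mux2(c1, x3_0, x3_1)
    cout = mux2(c1, cout_0, cout_1)
    return [x0, x1, x2, x3], cout


# 7 + 9 = 16: X = 0000 with COUT = 1.
print(four_bit_adder([1, 1, 1, 0], [1, 0, 0, 1], 0))
```

Both candidate results for the upper bits are computed while the lower bits are still rippling, so once $C_1$ is known only a multiplexer delay remains.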