Repetition Vs FEC in General
One way of increasing the reliability of a communication network is to repeat the same packet multiple times. In this post we compare two cases:

- Repetition, where the same packet is transmitted multiple times in order to increase the probability of reception.
- ECC/FEC, where we use erasure coding over a block of packets to increase the probability of receiving all of them.
TL;DR: The results
In the table below, we calculate the residual packet loss (i.e. the loss after either repetition or ECC/FEC) where both strategies use the same bandwidth.
Residual Loss (lower is better)

| | Repetition (k=3, i.e. repeating each packet 3 times) | ECC/FEC (n=5, m=10, i.e. for every 5 data packets we send 10 repair packets) |
---|---|---|
| 1% network loss | 0.0001% | 0.000000000000022% |
| 5% network loss | 0.0125% | 0.00000055% |
| 10% network loss | 0.1% | 0.000093% |
A few key takeaways:

- In this case the Repetition and ECC/FEC strategies use the same bandwidth.
- ECC/FEC works over a block of packets, which may cause latency build-up, but this is typically not an issue in applications like video streaming where a single video frame often spans several network packets. (Steinwurf’s modern FEC codes based on RLNC encoding and decoding do not have this issue.)
- Using the same bandwidth we can obtain significantly improved reliability of the communication.
- The reliability “gained” by using ECC/FEC can instead be used to obtain bandwidth savings, i.e. we can operate at the same reliability as the Repetition strategy but using less bandwidth. We show some results of this further down in the “Overhead vs. Residual loss” graph.
- The results obtained here are for a block-based ECC/FEC (since these are simpler to analyze). However, Steinwurf’s sliding window based ECC/FEC can increase the reliability of the system significantly further.
As a final comment it may be worthwhile to consider how different levels of residual packet loss affect the actual application. In the case of video we can estimate this by assuming that every packet loss will result in some form of glitch in the video. Based on this, for a compressed video of 4 Mbit/s using 1360-byte packets, the table below maps the residual packet loss to the time between glitches.
| Residual loss | Dropping one packet in | Produces a glitch every |
---|---|---|
| 0.1% | 1,000 packets | 2.6 seconds |
| 0.01% | 10,000 packets | 26 seconds |
| 0.001% | 100,000 packets | 4 minutes and 23 seconds |
| 0.0001% | 1,000,000 packets | 44 minutes |
| 0.00001% | 10,000,000 packets | 7 hours 19 minutes |
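These conversions are simple to verify. Below is a minimal sketch (plain Python; interpreting “4 Mbit/s” as 4 × 1024 × 1024 bit/s is my assumption, chosen because it best matches the table’s rounding):

```python
BITRATE = 4 * 1024 * 1024    # "4 Mbit/s", assumed binary-prefixed
PACKET_BITS = 1360 * 8       # 1360-byte packets
PACKETS_PER_SECOND = BITRATE / PACKET_BITS  # ~385 packets/s

def seconds_between_glitches(residual_loss: float) -> float:
    """Assuming one glitch per lost packet, the expected time between
    glitches is 1 / (packet rate * residual loss probability)."""
    return 1 / (PACKETS_PER_SECOND * residual_loss)

for loss in (1e-3, 1e-4, 1e-5, 1e-6, 1e-7):
    print(f"{loss:.5%} residual loss -> a glitch every "
          f"{seconds_between_glitches(loss):,.0f} seconds")
# Matches the table above up to rounding:
# ~2.6 s, ~26 s, ~4.3 min, ~43 min, ~7.2 h
```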
We map this to the comparison between repetition and ECC/FEC in the table below:
Regularity of Video Glitches

| | Repetition (k=3) | ECC/FEC (n=5, m=10) |
---|---|---|
| 1% network loss | 44 minutes | 379,113 years |
| 5% network loss | 21 seconds | 5 days, 12 hours and 55 minutes |
| 10% network loss | 2.6 seconds | 47 minutes and 10 seconds |
Single or Multiple Network Paths
In both the repetition and ECC/FEC case we may use either a single path or multiple network paths.
In the case of multiple network paths we may decide on different strategies for how to transmit our packets:
- Send all packets over all links
- Send a fraction of the packets over each link
- Only send over the link with minimum loss or latency
- Etc.
The "right" choice here really depends on the specific use-case and application requirements such as residual packet loss, bandwidth-usage, latency etc.
To simplify the analysis we may start with the following basic assumptions:
- All links have the same packet loss probability, p, and
- packet losses are independent and identically distributed.
In this case the actual strategy for how to allocate transmissions over the different links becomes less important, since the residual packet loss and bandwidth usage are identical; i.e. the same number of packets will, on average, be lost whether we send all packets over one link or 1/4 of the packets over each of four links. Latency may improve when sending over multiple links in parallel, but we'll ignore this benefit for now. A small simulation of this is sketched below.
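To illustrate that the allocation does not matter under these assumptions, here is a minimal Monte Carlo sketch (plain Python; the parameters are illustrative):

```python
import random

def simulate(num_packets: int, link_loss: list[float],
             trials: int = 100_000) -> float:
    """Average fraction of packets lost when packets are assigned
    round-robin to links; link_loss[j] is link j's loss probability."""
    lost = 0
    for _ in range(trials):
        for i in range(num_packets):
            # Under the i.i.d. assumption every transmission is the same
            # biased coin flip, regardless of which link carries it.
            if random.random() < link_loss[i % len(link_loss)]:
                lost += 1
    return lost / (trials * num_packets)

# One link vs. four links, all with the same 10% loss probability.
print(simulate(20, [0.10]))                    # ~0.10
print(simulate(20, [0.10, 0.10, 0.10, 0.10]))  # ~0.10 as well
```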
With these basic assumptions in place we can now compare the two cases: (1) repetition and (2) ECC/FEC.
Repetition
Let's calculate the residual packet loss $p_k$ after repeating each packet $k$ times.
Each transmission is essentially a biased coin flip where the packet is received with probability $1-p$ and lost with probability $p$. Since the $k$ transmissions are independent, the packet is only lost if all $k$ copies are lost, so the probability that it is not received in any of the $k$ attempts is:

$$p_k = p^k$$
Example
If k = 3 we would get the following residual packet loss for different values of p:
| p | 10% | 5% | 1% |
---|---|---|---|
| $p_k$ | 0.1% | 0.0125% | 0.0001% |
To put the numbers into perspective, this means that for p = 10% we will observe a single packet loss for every 1,000 packets sent. For p = 1% we see one loss for every million packets sent.
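As a quick sanity check, here is a minimal sketch (plain Python; the function name is mine) that reproduces the table from the formula above:

```python
def repetition_residual_loss(p: float, k: int) -> float:
    """Residual loss after sending each packet k times over a channel
    with i.i.d. loss probability p: the packet is only lost if all
    k copies are lost."""
    return p ** k

for p in (0.10, 0.05, 0.01):
    print(f"p = {p:.0%}: residual loss = "
          f"{repetition_residual_loss(p, 3):.6%}")
# p = 10%: 0.100000%, p = 5%: 0.012500%, p = 1%: 0.000100%
```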
ECC/FEC
A different strategy from repetition is to use an ECC/FEC algorithm. Many different ECC/FEC algorithms exist with different properties, but their fundamental function is the same: from a set of original packets they generate a certain number of repair packets. If any of the original packets are lost, the receiver tries to use the repair packets to recover them.
Certain algorithms (e.g. Reed-Solomon) may impose limitations on the generation of repair packets, e.g. no repair packets can be generated until all original packets of a block are available. Other algorithms (e.g. sliding window RLNC) do not impose such limitations on the application. It is therefore important to carefully consider the choice of algorithm for a given application.
To simplify the analysis we'll consider a simple block code here (although in practice sliding window based algorithms will often perform better).
A block code is an ECC/FEC code that takes a fixed number of packets - called a block - and creates new packets from these by combining them with mathematical operations (encoding). These repair packets contain no new information; however, if one of the original packets is lost, one can use them to recover the lost packet by inverting the combination process (decoding).
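As an illustrative toy example (not one of Steinwurf’s codes), the simplest possible block code adds a single XOR parity packet, which can repair exactly one lost packet per block. A sketch in Python, assuming equal-length packets:

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(block: list[bytes]) -> bytes:
    """Create one repair packet as the XOR of all packets in the block."""
    repair = bytes(len(block[0]))  # all-zero packet
    for packet in block:
        repair = xor_bytes(repair, packet)
    return repair

def decode(received: list[bytes], repair: bytes) -> bytes:
    """Recover the single missing packet: XOR-ing the repair packet with
    every packet that did arrive cancels them out, leaving the lost one."""
    missing = repair
    for packet in received:
        missing = xor_bytes(missing, packet)
    return missing

block = [b"aaaa", b"bbbb", b"cccc", b"dddd", b"eeee"]  # n = 5, m = 1
repair = encode(block)
received = [pkt for i, pkt in enumerate(block) if i != 2]  # packet 2 lost
assert decode(received, repair) == b"cccc"
```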
Residual packet loss
Let's look at the probabilities involved.
We consider a block code that takes n original packets, adds m repair packets, and transmits all n + m packets over a channel with packet loss probability p.
This is a series of n + m of the biased coin flips from before. In general we call such a series of biased coin flips a binomial experiment. What we need, to ensure that no packet is lost, is that no more than m of the n + m transmitted packets are lost. With $X$ denoting the number of lost packets, the probability of this is:

$$P(X \le m) = \sum_{i=0}^{m} \binom{n+m}{i} p^i (1-p)^{n+m-i}$$

The probability of more than $m$ packets being lost would then be the remaining probability:

$$P(X > m) = 1 - P(X \le m)$$
Example
If we choose n = 5, m = 10, which has the same bandwidth consumption as the repetition code in the example above (15 transmitted packets for every 5 data packets), we get for different values of p:
| p | 10% | 5% | 1% |
---|---|---|---|
| P(X > m) | 0.000093% | 0.00000055% | 0.000000000000022% |
Clearly we have a much lower residual packet loss compared to the repetition code.
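The tail probability $P(X > m)$ can be computed directly from the formula above; a minimal sketch (plain Python; the function name is mine, and the upper tail is summed directly to avoid floating-point cancellation at very small probabilities):

```python
from math import comb

def block_residual_loss(n: int, m: int, p: float) -> float:
    """P(X > m): the probability that more than m of the n + m
    transmitted packets are lost, so the block cannot be recovered."""
    total = n + m
    return sum(comb(total, i) * p**i * (1 - p) ** (total - i)
               for i in range(m + 1, total + 1))

print(block_residual_loss(n=5, m=10, p=0.10))
```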
Comparison: Repetition and ECC/FEC
In the above examples we saw that the ECC/FEC approach seems to provide a significant improvement. To compare the two approaches let's look at two extremes:
1. Maximize reliability given the same bandwidth consumption.
2. Minimize bandwidth consumption given the same target residual packet loss.
Note that using ECC/FEC we can also target different trade-offs between reliability and bandwidth-usage.
The chart below plots these trade-offs. The parameters of the block code are adapted to the chosen parameters of the repetition code. The residual packet loss is the probability of failure we modelled in the two cases above:
The plot shows that for all settings of the repetition code, the block code curve lies below it. So the block code provides better reliability for the same overhead.
If we draw a horizontal line parallel to the x-axis and compare the x-values of the two points where it intersects the curves, we see for example that 50% block overhead provides the same reliability as ~75% repetition overhead. Since the two curves never intersect, the block code will always need less overhead than the repetition code for the same reliability. You can pick any y-value to draw the horizontal line from and use the same approach to compare.
Example
Let’s choose $y = 10^{-6}$. The horizontal line would intersect the block curve at around $x_1 \approx 61\%$ and the repetition curve at $x_2 \approx 87\%$.
So you can save about 26 percentage points of overhead by using the block code at this reliability.
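The plot itself is not reproduced here, but the search it represents can be sketched directly (plain Python; I assume overhead is measured as the redundant share of all transmitted packets, and the channel loss p and reliability target are illustrative):

```python
from math import comb

def block_residual_loss(n: int, m: int, p: float) -> float:
    """P(X > m) for n + m transmissions with loss probability p."""
    total = n + m
    return sum(comb(total, i) * p**i * (1 - p) ** (total - i)
               for i in range(m + 1, total + 1))

def min_repetition_overhead(p: float, target: float) -> float:
    """Smallest k with residual loss p**k <= target, reported as the
    redundant share of transmitted packets, (k - 1) / k."""
    k = 1
    while p**k > target:
        k += 1
    return (k - 1) / k

def min_block_overhead(n: int, p: float, target: float) -> float:
    """Smallest m with block residual loss <= target, reported as the
    redundant share of transmitted packets, m / (n + m)."""
    m = 0
    while block_residual_loss(n, m, p) > target:
        m += 1
    return m / (n + m)

p, target = 0.10, 1e-6  # illustrative loss and reliability target
print(f"repetition overhead: {min_repetition_overhead(p, target):.0%}")
print(f"block (n=5) overhead: {min_block_overhead(5, p, target):.0%}")
```

With these assumptions the block code consistently needs noticeably less overhead than repetition for the same target, mirroring the gap between the two curves.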
What about the latency?
One of the metrics we've ignored is the latency incurred by the two different approaches. Let's have a look at the implications on latency.
Repetition
No latency overhead. We are essentially just repeating the same packet multiple times.
ECC/FEC
The latency overhead of ECC/FEC depends largely on the parameters chosen and the application's traffic pattern. In general, if an ECC/FEC block spans n packets, we need to wait for those n packets to arrive before repair can be generated. Depending on the application, however, this wait may be very short. If we consider something like a video streaming application, we'll typically have very bursty traffic where each frame is produced as a burst of packets.
Video Case Example
Let's have a look at a video-stream using block codes. Each frame produced consists of several packets. Let's say there are 15 packets in each frame and a frame is produced at 60 fps so about every 17 milliseconds.
The repetition code will add no extra latency to each packet, since a packet can be repeated as soon as it arrives at the sender.
The block code will not add extra latency either, if the block size is chosen such that the blocks exactly fit the frame in packets. (For example, a block size of 5 makes 3 blocks fit perfectly in each 15-packet frame.) This is not always possible for block codes; however, it can be achieved with, for example, a Sliding-Window Code. See content aware coding for more on this. The sketch below makes the frame-fitting argument concrete.
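A small sketch (plain Python; the numbers come from the example above, and the helper name is mine):

```python
FRAME_PACKETS = 15             # packets per video frame
FRAME_INTERVAL_MS = 1000 / 60  # ~16.7 ms between frame bursts at 60 fps

def last_block_extra_wait_ms(block_size: int) -> float:
    """Extra wait before the final block of a frame can be encoded.
    If the blocks exactly fit the frame, the encoder never waits;
    otherwise the leftover packets sit in a partially filled block
    until the next frame's burst (~one frame interval later) fills it."""
    if FRAME_PACKETS % block_size == 0:
        return 0.0
    return FRAME_INTERVAL_MS

for size in (5, 6):
    print(f"block size {size}: extra wait "
          f"~{last_block_extra_wait_ms(size):.1f} ms")
# block size 5: 0.0 ms (3 blocks fit exactly); block size 6: ~16.7 ms
```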
Uniform Traffic Case Example
Say that we send n = 5 packets over a channel, and that the sender produces the packets at a uniform interval, say 10 ms apart. Let's say packet 3 is lost. Repair packets can only be generated once the whole block is available, so packet 3 cannot be repaired until packets 4 and 5 have been produced. Hence packet 3 is delayed by 2 x 10 ms = 20 ms.
In this case, the ECC/FEC algorithm does add latency.
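The same reasoning applies to any lost packet in the block; as a tiny sketch (plain Python; names are mine):

```python
PACKET_INTERVAL_MS = 10  # one packet produced every 10 ms
N = 5                    # block size

def repair_delay_ms(lost_index: int, n: int = N) -> int:
    """Added latency for a lost packet (1-based index within the block):
    repair can only be generated once the remaining n - lost_index
    packets of the block have been produced."""
    return (n - lost_index) * PACKET_INTERVAL_MS

print(repair_delay_ms(3))  # packet 3 lost -> delayed by 2 x 10 ms = 20 ms
print(repair_delay_ms(1))  # the earlier the loss, the longer the wait: 40 ms
```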
Conclusion
Compared to a repetition code, a block code:
- Is more reliable given the same overhead
- Requires less overhead for the same reliability
- Sometimes adds latency to packets, depending on the traffic type and loss pattern
As an even better alternative, one could use a Sliding-Window code for less added latency and even better reliability than the block codes.
If you're interested in Sliding-Window codes, you should definitely check out our low latency FEC solution, Rely.