Guest post - Investigating reliability and playback delay for live video multicasting from a drone
Foreword
For this post we invited Martin Eriksen, Rasmus Suhr Mogensen, Rasmus Liborius Bruun and Kasper Wissing Mortensen to present the work they did in their 7th-semester project in fall 2016 at Aalborg University, with Steinwurf providing co-supervision.
The project investigated some of the technical challenges in providing live and reliable multicast video streaming from a flying drone to multiple users on the ground. Such a system has a multitude of exciting applications, from giving spectators at e.g. sports events a much better overview of the action, to civil services such as police and firefighters, where drones can provide a live view of a situation. At Steinwurf we see great potential in live streaming from drones and encourage people interested in such applications to check out our Score (reliable multicast protocol) and Kodo (erasure correcting codes) software libraries.
Watching football at the stadium is a captivating experience that draws thousands of people, despite the fact that the viewing angles and replays you get from your TV at home are often superior. But it does not need to be this way. It is quite easy to imagine a drone spotting the right angle and feeding the video directly to every spectator's smartphone. This requires multicasting, which in its standard form is inherently unreliable and therefore unfit for applications such as live video streaming.
During the past semester, we examined how a reliable multicasting scheme for 802.11g can be obtained. The setup we propose, and how it maps onto the TCP/IP stack, is shown below. In this setup, the transmission rate (TxR) and the Random Linear Network Coding (RLNC) rate are adjusted based on the packet error rate at the receivers to achieve reliability. The settings of these two components impose limitations on the achievable video data throughput, which in turn affects the viewing experience in terms of playback delay and quality. Since the requirements for these two parameters can vary depending on the application, our goal was to devise policies for adapting the video, transmission and erasure coding rates such that the delay is minimized while the best possible video quality is maintained.
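To give a feel for the trade-off, here is a rough back-of-the-envelope sketch in Python (our own illustration, not the project's adaptation logic): to decode k source packets over a channel with packet error rate PER, a sender using RLNC must transmit on the order of k / (1 - PER) coded packets, so the redundancy directly eats into the throughput available for video.

```python
import math

def coded_packets_needed(k, per, margin=1.05):
    """Coded packets to transmit so that, on average, k source packets
    can be decoded over a channel with packet error rate `per`.
    `margin` is a small safety factor. A rule of thumb, not the
    project's actual adaptation rule."""
    return math.ceil(margin * k / (1.0 - per))

def usable_video_rate(tx_rate_bps, per, margin=1.05):
    """Throughput left for video data once redundancy is accounted for."""
    return tx_rate_bps * (1.0 - per) / margin

# Example: 36 Mbit/s transmission rate with 10% packet loss
print(coded_packets_needed(100, 0.10))  # -> 117 coded packets per 100 source packets
print(usable_video_rate(36e6, 0.10))    # -> ~30.9 Mbit/s left for video
```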
To form these policies, we used traces describing the actual data throughput to the receivers, obtained during flight sessions with the drone. One such flight session can be seen in the video below, with data traces for (from left to right) throughput, elevation, distance to the receivers, and the flight route. For our specific scenario, an open field, we discovered that adapting the transmission rate would not improve reliability, and therefore only the 36 Mbit/s rate is used in the further analysis.
When we did the analysis, we chose to simplify our system, since a semester is only about three months and consists of both courses and a project. The simplified system model can be seen below.
This simplification implies no RLNC overhead, no switching delay for settings, and no feedback. These assumptions are of course not valid in a real system; however, the policies based on the simplified model still provide useful information. Using this model, we took a 36 Mbit/s throughput trace from a single receiver and used it as the service rate for the video data.
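With this simplified model, the per-frame delay can be computed by draining a queue of equally sized frames with the measured throughput trace as the service rate. Below is a minimal sketch of that computation in Python; the function and parameter names, the 30 fps frame rate and the 10 ms trace granularity are our assumptions, not values from the project's worksheets.

```python
def per_frame_delay(trace_bps, video_rate_bps, fps=30, dt=0.01):
    """Per-frame delivery delay under the simplified model: frames of
    size video_rate_bps / fps bits are generated every 1/fps seconds
    and drained by the measured throughput trace, where trace_bps[i]
    is the service rate (bit/s) during the i-th dt-second slot.
    Returns a list of (frame_number, delay_seconds)."""
    frame_bits = video_rate_bps / fps
    backlog = []                      # queued frames: [capture_time, bits_left]
    delays = []                       # completed frames: (frame_no, delay_s)
    next_frame_t, frame_no, t = 0.0, 0, 0.0
    for rate in trace_bps:
        while next_frame_t < t + dt:  # enqueue frames captured in this slot
            backlog.append([next_frame_t, frame_bits])
            next_frame_t += 1.0 / fps
        budget = rate * dt            # bits the channel can carry this slot
        while backlog and budget > 0:
            sent = min(budget, backlog[0][1])
            budget -= sent
            backlog[0][1] -= sent
            if backlog[0][1] <= 0:    # frame fully delivered
                delays.append((frame_no, t + dt - backlog[0][0]))
                frame_no += 1
                backlog.pop(0)
        t += dt
    return delays
```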
The outcome of this can be seen in the large figure below, where the y-axis is the per-frame delay and the x-axis is the frame number. In the analysis we considered two types of video playback: Continuous playback and Delay-bounded playback.
We define Continuous playback as a method where every frame captured by the drone is played. This means that if a new frame is delayed more than the previous one, the overall playback delay perceived by the user increases. The magnitudes of these increases are illustrated by the vertical boxes in the figure, and the accumulated delay by the red dotted line. Delay-bounded playback is defined by a requirement on the real-timeliness of the video, i.e. the maximum allowable lag from a frame being captured to it being displayed to the user. All frames that exceed this maximum are dropped, which results in periods where no video frames are available. These periods are denoted by the horizontal boxes, and the bound is given by the black dotted line, which in our case is 500 ms.
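Both metrics follow directly from the per-frame delays; here is a small sketch continuing the simulation above (again with names of our choosing):

```python
def playback_metrics(frame_delays, fps=30, bound=0.5):
    """Compute both playback metrics from a list of per-frame delays (s).

    Continuous playback: every frame is shown, so the perceived playback
    delay is the running maximum of the per-frame delays (the red dotted
    line); we return its final value.

    Delay-bounded playback: frames later than `bound` (500 ms, the black
    dotted line) are dropped, each contributing one frame period to the
    total freeze time (the horizontal boxes)."""
    accumulated_delay = max(frame_delays, default=0.0)
    dropped = sum(1 for d in frame_delays if d > bound)
    total_freeze = dropped / fps
    return accumulated_delay, total_freeze

# e.g. feeding in the simulated delays from above:
# delays = [d for _, d in per_frame_delay(trace_bps, video_rate_bps)]
# acc, freeze = playback_metrics(delays)
```

Using the two playback methods, the following results were obtained.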
From these results, we saw that by switching rates at critical moments it is possible to obtain the lowest total freeze period and accumulated delay while still delivering the best video quality most of the time. In the case of Continuous playback, we switch to the video rate with the lowest contribution to the accumulated playback delay. For Delay-bounded playback, we switch to the lowest rate that does not exceed the 500 ms bound; if none of the available rates fulfill this requirement, frames are dropped until it is fulfilled. Both policies can be seen in Figures e-f.
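A minimal sketch of both selection rules (the callbacks `predict_delay_increase` and `predict_delay` are hypothetical, e.g. fed by the `per_frame_delay()` simulation above; the exact rules are in the paper):

```python
def continuous_policy(candidate_rates, predict_delay_increase):
    """Continuous playback: switch to the video rate whose predicted
    contribution to the accumulated playback delay is smallest."""
    return min(candidate_rates, key=predict_delay_increase)

def delay_bounded_policy(candidate_rates, predict_delay, bound=0.5):
    """Delay-bounded playback: keep the rates whose predicted frame
    delay stays within the 500 ms bound and, as stated above, switch
    to the lowest of them (max(feasible) would instead favor quality).
    If no rate is feasible, frames must be dropped until the
    requirement is fulfilled again."""
    feasible = [r for r in candidate_rates if predict_delay(r) <= bound]
    if not feasible:
        return None  # drop frames until the bound can be met
    return min(feasible)
```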
For more information, refer to our AAU conference paper and worksheets: “Drone for live video multicasting: Understanding how reliability parameters affect playback delay”.