“A Thousand and One Problems with Live Streaming”
Thanh Nguyen Huu - Senior Software Engineer (US)
I am currently working on the Video Infra team, responsible for the infrastructure of Livestream and Video.
I previously shared an open-source Cloud Gaming project that received a lot of interest. Much of what I learned from that side project also applies to my current main job in video. Tech enthusiasts may have heard of Cloud Gaming, which has technical similarities to live streaming. Cloud Gaming was expected to be the gaming platform of the future, with games hosted on remote servers instead of on our own computers. When we play, the images rendered on the remote server are streamed to our computer, while our input is sent back to the server over the network.
This is an extremely challenging technical task because game actions happen in real time, and the latency between the user and the server must be small enough to avoid lag and provide a smooth gaming experience. For example, it is unacceptable if you press a button and your character takes 2-3 seconds to respond. The average human reaction time, from seeing an image to acting on it, is around 250 ms, and at the standard frame rate of 30 fps a new frame arrives roughly every 33 ms. Cloud Gaming is the problem of minimizing latency while still ensuring good image quality. In addition, designing a Cloud Gaming system requires knowledge of the kernel, sandboxing, and the Windows API to create a separate virtual game environment for each player, as well as the ability to scale and optimize infrastructure costs, because games require a large amount of resources. I wrote CloudRetro before cloud gaming platforms were introduced, and at the time I thought Cloud Gaming was nearly impossible under such stringent constraints.
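To make these numbers concrete, here is a minimal Python sketch of the arithmetic. The 250 ms and 30 fps figures come from the text; the function names and the sample latencies are assumptions chosen only for illustration.

```python
REACTION_TIME_MS = 250  # average human reaction time cited in the text
FRAME_RATE_FPS = 30     # standard frame rate cited in the text


def frame_interval_ms(fps: int) -> float:
    """Time between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def frames_elapsed(latency_ms: int, fps: int) -> int:
    """Number of whole frames shown while a given latency passes."""
    # Integer arithmetic avoids floating-point rounding at exact boundaries.
    return (latency_ms * fps) // 1000


if __name__ == "__main__":
    interval = frame_interval_ms(FRAME_RATE_FPS)
    print(f"One frame every {interval:.1f} ms at {FRAME_RATE_FPS} fps")
    # Sample latencies (illustrative): a fast round trip, the reaction
    # time from the text, and the 2-second delay called unacceptable.
    for latency in (50, REACTION_TIME_MS, 2000):
        count = frames_elapsed(latency, FRAME_RATE_FPS)
        print(f"{latency} ms of latency spans {count} frames")
```

Seen this way, a 2-second delay means dozens of frames go by before the player sees the result of a button press, which is why latency dominates the design.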
Similarly, live streaming is a trade-off between latency, stability, and video quality. A fundamental problem, for example, is the video format. Current video formats compress video as efficiently as they can: small enough to be transmitted over the network, and simple enough for the viewer's device to decode within an acceptable time while maintaining the desired image quality. These formats save space largely by encoding only what changes from one frame to the next, so frames with a lot of motion significantly affect the output size and quality. For example, if you continuously switch scenes during a video call or live stream, the amount of video data increases significantly and may cause lag. This is just one of the many problems encountered in live streaming.
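The effect of motion on data size can be illustrated with a toy frame-differencing sketch in Python. This is not how a real codec works (real codecs add motion compensation, transforms, and entropy coding); the `Frame` type and both function names are hypothetical, and the only point is that a frame that changes more produces more data.

```python
from typing import List, Tuple

Frame = List[int]  # toy model: a frame flattened to a list of pixel values


def encode_delta(prev: Frame, curr: Frame) -> List[Tuple[int, int]]:
    """Keep only (index, new_value) pairs for pixels that changed."""
    return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]


def apply_delta(prev: Frame, delta: List[Tuple[int, int]]) -> Frame:
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = list(prev)
    for index, value in delta:
        frame[index] = value
    return frame


if __name__ == "__main__":
    base = [10] * 16
    small_motion = list(base)
    small_motion[3] = 99       # one pixel changes
    scene_switch = [200] * 16  # every pixel changes

    print("small motion:", len(encode_delta(base, small_motion)), "changed pixels")
    print("scene switch:", len(encode_delta(base, scene_switch)), "changed pixels")
```

In this toy model a scene switch forces every pixel to be sent again, while a small movement costs almost nothing.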
Recently, I have also been responsible for the Video on Demand (VOD) side. VOD does not prioritize latency; instead, the problem is optimizing file size and the experience of uploading and viewing videos. A video file can be tens of GB in size, and on our side we have to optimize how these videos are uploaded, for example by applying machine learning to find the best upload configuration. I also work with network protocols to improve video upload speed at the network layer. The viewer side is a challenging problem as well. Depending on the user's conditions, different video configurations are required so that playback neither freezes nor suffers from poor image quality.
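One common way to match video configuration to a viewer's conditions is to prepare several renditions and choose among them based on measured bandwidth. The Python sketch below illustrates that general idea only; the ladder values, the `Rendition` type, `pick_rendition`, and the 0.8 safety factor are all assumptions for illustration, not the configuration of any actual system.

```python
from typing import List, NamedTuple


class Rendition(NamedTuple):
    name: str
    bitrate_kbps: int


# Hypothetical bitrate ladder; the values are for illustration only.
LADDER = [
    Rendition("240p", 400),
    Rendition("480p", 1200),
    Rendition("720p", 2800),
    Rendition("1080p", 5000),
]


def pick_rendition(
    bandwidth_kbps: float, ladder: List[Rendition], safety: float = 0.8
) -> Rendition:
    """Pick the highest-bitrate rendition that fits the bandwidth budget.

    Only a fraction (safety) of the measured bandwidth is used, to leave
    headroom against freezing. If nothing fits, fall back to the lowest
    rendition.
    """
    budget = bandwidth_kbps * safety
    ordered = sorted(ladder, key=lambda r: r.bitrate_kbps)
    best = ordered[0]
    for rendition in ordered:
        if rendition.bitrate_kbps <= budget:
            best = rendition
    return best


if __name__ == "__main__":
    for bandwidth in (100, 4000, 10000):
        choice = pick_rendition(bandwidth, LADDER)
        print(f"{bandwidth} kbps -> {choice.name}")
```

The safety factor expresses the trade-off described above: a lower value reduces the risk of freezing at the cost of image quality.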
What I have shared is only a small, simplified part of the vast range of problems in video infrastructure. Algorithms and infrastructure are continuously being developed to serve users' video needs. The video space is becoming livelier with the popularity of short-form video: more people want to watch videos, there are more streaming platforms, and more content creators are producing content. Video has become a fundamental part of modern life, and there is always infrastructure ready to embrace new waves.