“If someone had told me more than 10 years ago that one day I would be at the forefront of global AI development, I would have thought they were crazy.”
Tuan Vu - Senior Software Engineer at Quora & Admin at Viet Tech
After leaving Meta, I joined Quora to pioneer the emerging field of machine learning infrastructure, or MLOps. Because the field is new, it requires me to combine two areas of expertise: machine learning from my first job and infrastructure from my previous job at Meta. My daily work involves optimizing machine learning operations and custom-building infrastructure specifically for ML workflows. Currently, I am developing a system to serve large language models, through products such as Poe (https://poe.com/) or ChatGPT, to billions of users in a fast, easy, and cost-effective manner.
Although Meta earned enormous profits, bankruptcy remained a possible “scenario” if the company was not managed carefully. Meanwhile, ML companies like OpenAI still lacked fully realized business models: they kept burning through investors’ capital, accepting ongoing losses in an effort to expand their market share. I am an expert in “cost efficiency” for technical infrastructure, which means optimizing models to run faster and building cheaper, more cost-effective infrastructure. Increasing user numbers is not my strong suit. My specialty is reducing infrastructure costs, creating cheaper technical products, serving more users through efficient systems, and lowering operating expenses to maximize the company’s profitability.
This is a difficult challenge because the requirements of large language models (LLMs) grow exponentially as they take in more data and compute. For example, answering a 30-word question costs ChatGPT approximately one cent. With 10 million users asking an average of 10 questions daily, the cost is $1 million per day, or $30 million per month, which is extremely expensive. At 1 billion users, annual costs could exceed $36 billion, excluding model training, research, and development. As a result, OpenAI reportedly lost $540 million in 2022, and costs multiply as the number of users and the complexity of the models increase. Optimizing both hardware and software while reducing costs, increasing speed, and balancing user load against budget constraints is immensely complex. This explains why ChatGPT sometimes responds slowly or hits capacity limits. These challenges keep me up at night, yet they also excite me to go to work each day. Whoever solves this problem will pave the way for deploying advanced AI technologies globally.
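The arithmetic behind these estimates can be sketched in a few lines of Python. This is only a back-of-the-envelope model: the one-cent cost per question and the 10 questions per user per day come from the figures above, while the function names, the 30-day month, and the 365-day year are illustrative assumptions. Real serving costs depend on many more factors.

```python
# Back-of-the-envelope inference cost, using the figures quoted in the text.
# Costs are kept in whole cents to avoid floating-point rounding surprises.
COST_PER_QUESTION_CENTS = 1        # ~1 cent per 30-word question (from the text)
QUESTIONS_PER_USER_PER_DAY = 10    # average usage assumed in the text


def daily_cost_usd(users: int) -> float:
    """Estimated daily serving cost in US dollars for a given user count."""
    total_cents = users * QUESTIONS_PER_USER_PER_DAY * COST_PER_QUESTION_CENTS
    return total_cents / 100


def monthly_cost_usd(users: int, days: int = 30) -> float:
    """Estimated monthly cost, assuming a 30-day month by default."""
    return daily_cost_usd(users) * days


def annual_cost_usd(users: int, days: int = 365) -> float:
    """Estimated annual cost, assuming a 365-day year by default."""
    return daily_cost_usd(users) * days


if __name__ == "__main__":
    for users in (10_000_000, 1_000_000_000):
        print(
            f"{users:,} users: "
            f"${daily_cost_usd(users):,.0f} per day, "
            f"${monthly_cost_usd(users):,.0f} per month, "
            f"${annual_cost_usd(users):,.0f} per year"
        )
```

Under these assumptions, 10 million users come to $1 million per day and $30 million per month, and 1 billion users come to $36.5 billion per year, in line with the figure of more than $36 billion.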
If someone had told me more than 10 years ago that one day I would be at the forefront of global AI development, I would have thought they were crazy. This reminds me of Steve Jobs’s quote: “You can’t connect the dots looking forward; you can only connect them looking backward.” My career has been a journey full of surprises. I have no specific goals for the next 10 years. But every day, I will continue to challenge assumptions, to look at the world with fresh eyes, to experiment, to fail, to plot my own path, and to test the limits of my abilities. While the road ahead holds many unknowns and challenges, I have faith that in time it will come together to form a meaningful journey. Most importantly, each new day offers the opportunity to follow my own path and freely engage in the pursuits that bring me the greatest joy and fulfillment.
If you would like to connect, you can follow me on:
Blog: https://www.tuanavu.com/
YouTube: https://www.youtube.com/@tuan-vu
Profile: https://www.linkedin.com/in/tuanavu/