“In the data industry, having a background in Computer Science is certainly a bit better. But there are lots of people who don’t have a tech background, and they are ready to live and die with the profession…”
I graduated in Finance and Credit in 2012. I did not feel suited to the slow, heavy corporate environment, though, so I decided to find my own way and left my university degree behind. At that time, I was inspired by the books Steve Jobs and Hooked and wanted to create products that were just as valuable and functional. I dabbled in a few startups, which also marked my first step into the technology world. After a few years of still feeling out of place, I spent months focusing on myself and thought more carefully about choosing a career rather than a workplace. I tried everything: personality tests, MBTI, psychological counseling, and more. I found that I was suited to numerical and logical work, so I set out to learn about the data field. At that time, the data industry in our country was still pretty new. I got my feet wet and then threw myself into studying from morning to night, learning probability and statistics, math, and machine learning from the ground up. When I finished studying, I figured I needed real data to practice on, and fate had me stumble across a Facebook post by Cho Tot’s current CEO, so I applied. I have worked there ever since. However, my job at that time was Business Analyst, not data specialist. So every year at Performance Review, I reminded the company that I wanted to work on data full time 😈. Finally, after three years, I got my wish. We started as a group of friends who often worked with data, called Data Chapter.
Regarding the process of building a data team, I was fortunate to go through most of the stages of the hybrid decentralized data model and of creating a data-driven culture. Most companies have to go through these phases; it is hard to skip any of them.
- When starting to build a Data Chapter team, there are many internal problems. Each group uses different data, the data warehouse system and datamarts are cumbersome, and each person uses different metrics. The task at this point is to get everyone in sync while still preserving a level of initiative. We first review the datamart systems; then we make sure every metric uses a single, exact definition. A data glossary must be built so that each metric has only one name, a business definition, and a technical definition, and this glossary must be public for all data users to refer to.
- The second is to build a data access-control system so that each type of data is properly authorized and protected from leaks, while everyone can still reach the data they need easily.
- Step 3 is to build a knowledge repository, where we encourage everyone to share how they use data and report back to each other. For example, we hold talk sessions to share insights, data processing techniques, and so on.
- Step 4, we build out the data platforms that serve the company’s analytical capabilities. For example, in 2021, the team built an A/B testing system, making it easier for Cho Tot to run tests. The system also supports automatic analysis of results in different areas, helping people run A/B tests on their own and get insights faster. This has enabled Cho Tot both to analyze the cause and impact of features better and to analyze them much more quickly than before.
- Once the analysis and usage needs are covered, the next step focuses more on applied AI/ML to take advantage of the data already collected. We continue building data products to help solve business problems and data infra systems to serve the company’s data usage needs. At my company, one of the first solutions the team focused on was predicting the selling price of a product, so that buyers could more confidently determine, for example, a car’s value. The next products were the recommendation system and the auto-review system.
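To make the "automatic analysis of results" in Step 4 a little more concrete, here is a minimal Python sketch of one common way to compare two variants: a two-proportion z-test on conversion counts. This is an illustration only, not Cho Tot’s actual system; the function name and the numbers in the usage note are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (lift, z, two-sided p-value) for variant B versus variant A.

    conv_* are conversion counts and n_* are the users exposed to each variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value
```

For example, `two_proportion_z_test(100, 1000, 130, 1000)` compares a 10% and a 13% conversion rate on 1,000 users each. A real platform would also need to handle sample-size planning, multiple metrics, and segment breakdowns, which this sketch leaves out.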
In terms of people, the first difficulty is how to build a strong team. I have interviewed a lot of candidates and seen many data scientists and AI engineers who focus only on models. I often meet people with a lot of modeling knowledge; almost all of them can talk precisely about deep learning, transformers, RNNs, LSTMs, and so on. However, very few of them know project management, know how to understand and clarify the problem, or understand an ML product’s life cycle. When applying AI in a business, modeling is usually only a small factor, contributing only about 15% - 20% of a successful data product. Much more depends on how well we understand the problem, what data is at hand, how clean that data is, how to design the operating system to meet the needs of the business, how to monitor it and handle problems when they occur, and how to make sure the data and the model do not drift. These skills are highly valued when recruiting because they reflect the reality and the needs of Cho Tot in particular and other companies in general.
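As one illustration of what "making sure the data does not drift" can look like in practice, the sketch below computes a Population Stability Index (PSI) between a reference sample of a feature and a recent one. PSI is a technique chosen here for illustration, not something the team is stated to use; the function name, the bin count, and the small floor for empty bins are all assumptions.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a recent sample."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the reference range; fall back to 1.0 if all values match.
    width = (hi - lo) / bins or 1.0

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A commonly cited rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift worth investigating, but the right threshold depends on the feature and the business.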
A second problem is the team’s introverted personality. Our mission is to make sure everyone stays fired up, always feels inspired at work, and builds meaningful and valuable products. The hardest part, I think, is quantifying the output of the data team with specific metrics. For a data engineering team, for example, these could be the response time to request tickets, the number of data incidents, the time to resolve an incident, the adoption rate of new datasets, the awareness rate, and so on. For the AI/ML team, they can be metrics tied directly to user behavior, translated into direct results such as a higher retention rate, higher user engagement, and higher DAU…
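Two of the data engineering metrics above are simple enough to sketch in Python. The definitions below are assumptions made for illustration (mean time to resolve in hours, and adoption as the share of data users who have used a new dataset at least once); the function names and input shapes are hypothetical.

```python
from datetime import datetime
from statistics import mean


def mean_hours_to_resolve(incidents):
    """Average hours between an incident being opened and being resolved.

    incidents is a list of (opened_at, resolved_at) datetime pairs.
    """
    return mean(
        [(resolved - opened).total_seconds() / 3600 for opened, resolved in incidents]
    )


def adoption_rate(dataset_users, all_data_users):
    """Share of all data users who have used the new dataset at least once."""
    everyone = set(all_data_users)
    return len(set(dataset_users) & everyone) / len(everyone)
```

Given two incidents that took 2 and 4 hours to close, `mean_hours_to_resolve` returns 3.0; if 2 of 4 data users have queried a dataset, `adoption_rate` returns 0.5.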
In the end, I just want to say that I studied a major unrelated to technology. My studies and my career were not smooth either, but perseverance led me to where I am today. In the data industry, having a background in Computer Science is, of course, a bit better. But there are still many people who don’t have one, and they are always ready to live and die with the profession. So be patient, don’t give up!