This article is cross-posted from Medium.
We have Vladimir Iglovikov for today’s interview. Vladimir is currently working at Lyft Level 5 as a Senior Computer Vision Engineer. He has research experience in both industry and academia.
Vladimir is also an ardent Kaggler, in fact, a Kaggle Grandmaster. He holds a Ph.D. in Physics and previously served in the Russian Airborne Forces. To learn more about him, you can follow him on Twitter and LinkedIn.
Sayak: Hi Vladimir! Thank you for doing this interview. It’s a pleasure to have you here today.
Vladimir: Thank you for your questions. It is a pleasure to answer them.
Sayak: Maybe you could start by introducing yourself: what your current interests are, what your current job is, and what your responsibilities are there?
Vladimir: For the past 2.5 years, I have been working at Lyft. Lyft is a ride-sharing company that operates in the United States and in a few cities in Canada. Right now, all taxi and ride-sharing vehicles are operated by human drivers. It would be nice to extend this fleet with self-driving cars. Hence all big ride-sharing companies are investing in self-driving technology. Lyft is no exception: the department that works on this technology is called Level 5, and that is where I work now.
Initially, I joined Lyft as a Research Scientist, but after six months moved to a Machine Learning engineer position. Both are about Machine Learning, but in the former, you work in Jupyter Notebooks and deal with the data. In the latter, you are closer to production and work both on models and on the infrastructure. One of the reasons why I moved from academia to the industry was the frustration with the code that I was writing and systems that I was building. It was an excellent opportunity to improve my software engineering skills.
I will not go into the details of what I am doing at work right now; I will just mention that it is Deep Learning for various tasks :)
Years in academia made me value publications. I try not to forget how to write papers. Hence I also invest time in publications and blog posts.
I have not done any Machine Learning competitions for a while, apart from helping Lyft host a competition on Kaggle, but I think that now could be an excellent time to return and get my Machine Learning muscles back in shape. :)
Sayak: That was very detailed, thanks, Vladimir! You were in the forces. How did you become interested in pursuing machine learning?
Vladimir: These are two different things :)
My service was in 2002–2004, and I only learned that Machine Learning existed in 2015, many years later, when I was finishing my doctoral program at UC Davis.
In early 2015 my graduation date was approaching fast, and I needed to decide what to do next. I was weighing different directions and attended many career events.
At one of them, the speaker talked about scientific paradigms and how they evolved over time. As a graduate student, I was comfortable with the first three paradigms: empirical science, model-based theoretical science, and computational science. During my academic journey, I learned one paradigm after another, and each of these steps was mind-blowing. Data Science promised to be the fourth one, and acquiring data science skills and a data science mindset looked like the only possible next step in self-development.
From a more practical perspective, I needed to decide what to do after graduation. Staying in academia did not look very exciting. Becoming a software engineer did not attract me, either. Data Science looked good. Hence I started looking for Data Science jobs.
In one of the online courses at Coursera, the lecturer mentioned Kaggle as a platform for practicing Machine Learning skills. I joined the same day and got hooked.
Sayak: The way you combine your academic background and software engineering skills is very appealing. When you were starting out, what kind of challenges did you face? How did you overcome them?
Vladimir: Machine Learning is an applied science. Unlike Physics, where laws are unbreakable, in Machine Learning, there is an exception for every rule. Machine Learning is also a relatively new and rapidly evolving field.
When I decided to learn what machine learning is and how to tame this beast, the first problem that I faced was the lack of a clear starting point. You read the internet, and the volume of unfamiliar terms is enormous. I took a couple of introductory online classes, and they gave me an initial structure. From that point, it became easier.
That was five years ago, and the amount and quality of the information that you can find online are much better now. I hope people who try to move into machine learning do not struggle as much as I did.
The next problem that I faced was how to practice my theoretical knowledge. I tend to forget what I learn if I do not apply it somewhere. There is a big gap between understanding how an SVM works and knowing how to train it on a given dataset to get the highest possible generalization score. Here Kaggle came in very handy. I believe machine learning competitions are the best way to develop an intuition for different algorithms and their pros and cons.
The next thing that I needed to learn in my machine learning journey was how to write my ML pipelines in such a way that my iteration speed would be high. Since ML is an applied discipline, the quality of your final solution is proportional to the number of ideas that you check per unit of time. Hence any part of the pipeline should be easily replaceable. This is closer to system design than to ML, and it was not something that we were taught in the Physics department, but for ML, it is an essential skill.
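As a rough illustration of the "easily replaceable" idea, here is a minimal Python sketch. It is not Vladimir's actual code: the `Pipeline` class, the stage functions, and the toy data are all hypothetical names invented for this example. It only shows one way to structure a pipeline so that a single stage can be swapped without touching the rest.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence

# All names here (Pipeline, identity, scale_minmax, mean_model) are
# hypothetical and exist only for this illustration.
Preprocess = Callable[[Sequence[float]], List[float]]
Model = Callable[[Sequence[float]], float]


@dataclass(frozen=True)
class Pipeline:
    """A pipeline whose stages are plain callables, so each can be swapped."""
    preprocess: Preprocess
    model: Model

    def run(self, data: Sequence[float]) -> float:
        # Apply the preprocessing stage, then the model stage.
        return self.model(self.preprocess(data))


def identity(data: Sequence[float]) -> List[float]:
    # Baseline preprocessing: leave the data unchanged.
    return list(data)


def scale_minmax(data: Sequence[float]) -> List[float]:
    # Alternative preprocessing: rescale values into the range [0, 1].
    lo, hi = min(data), max(data)
    span = (hi - lo) or 1.0  # avoid division by zero on constant input
    return [(x - lo) / span for x in data]


def mean_model(data: Sequence[float]) -> float:
    # Toy "model": the mean of the preprocessed values.
    return sum(data) / len(data)


baseline = Pipeline(preprocess=identity, model=mean_model)
# Test a new idea by replacing one stage; the model stage is reused as-is.
variant = replace(baseline, preprocess=scale_minmax)

data = [2.0, 4.0, 6.0]
baseline_score = baseline.run(data)
variant_score = variant.run(data)
```

Because every stage is just a function with a fixed input and output shape, trying a new idea costs one line (`replace(...)`), which is the property that keeps iteration speed high.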
Finally, I got a job where I needed to use Machine Learning not for fun, but as a part of my work responsibilities. There are many differences, and I needed to learn them.
Sayak: Those challenges helped you navigate your journey, I would assume. You have engaged yourself in a number of different domains. How have you managed to keep your sanity?
Vladimir: I have papers in Physics, Medical Imaging, Forensics, Satellite imagery, and some other domains. The main reason why it is possible to get results in different areas is that Machine Learning is a potent tool that could be applied to various tasks without having a deep understanding of the domain.
At the same time, I could answer a slightly broader question: “What makes me try to engage with different domains?”
I would say my main driver is curiosity. I want to know how things work, why things work this way and not another, and why people do things the way they do them. I have a definite hunger for information that I need to satisfy.
When I was 18, I decided to join the airborne forces for two years just because I had seen a lot of war movies, and I wanted to know how that part of life looks in reality.
We live in a very interesting time. A lot of available information, a lot of places in the world to visit, a lot of new people to meet.
None of my friends approved of my decision to join the military, but I think those two years were worth it. That experience changed me a lot, plus it was the most interesting part of my life.
Getting out of your traditional comfort zone is essential for progress. And in addition, it would be so boring to fully focus on one thing and ignore everything else.
Sayak: So far, I have been focused on one direction and on what it takes to stay focused. I would not say it has been boring, but yes, it varies! You are an ardent Kaggler. Would you like to share a favorite ML hack of yours?
Vladimir: In competitions: if you are stuck and cannot climb higher on the leaderboard, share your solution publicly. People will analyze it, modify it, and generate ideas that you could develop further. For ML problems that you face at work: try to find a machine learning competition that tackled a similar problem. The knowledge and ideas that you will get from the winning solutions will be an excellent starting point for your work problem.
Sayak: This is nothing less than gold, thanks for sharing! What’s your opinion on doing machine learning on Kaggle as opposed to doing it out there in the wild?
Vladimir: There is some overlap between ML in academia, industry, and competitions, but there are many differences. I like to think about Kaggle as a weightlifting gym, but for machine learning muscles. People go to the gym because it is the best way to get stronger. Same here. Kaggle is really good for ML muscles, and I feel slightly out of shape because I have not participated in competitions for a while.
I would not say that doing only competitions is enough, but as an addition to your industrial or scientific work, it could be highly beneficial. In precisely the same way, being fit helps you with your life in general.
Sayak: I am in agreement. Fields like machine learning are rapidly evolving. How do you manage to keep track of the latest relevant happenings?
Vladimir: I would not say that I keep up. The approach that works well for me is to focus on the problem at hand and dig deep into it. After the project is over, I move to the next one and dig there. The field is too broad and is growing so fast that trying to keep up with everything is infeasible. For example, I do not really know what has happened in Natural Language Processing in the last couple of years, but I am planning to find a project that will make me fill this gap.
Sayak: That is nice, the idea of project-based learning. Being a practitioner, one thing that I often find myself struggling with is learning a new concept. Would you like to share how you approach that process?
Vladimir: When I need to learn a new concept, I read everything related to it. I write down all the questions that I have. After this, I find someone who understands the topic really well and ask them those questions. In academia, I tried to figure out everything by myself. It is not the most efficient approach. Collaboration with other people is the key. We have strengths in different fields. Someone helps me today, and tomorrow I will help someone else with the topics that I know.
Sayak: So true! Collaborations make it much easier, and I appreciate the communities that come together to support this cause. Any advice for beginners?
Vladimir: Later often means never. If you are asking yourself when you should join Kaggle, when you should start looking for a machine learning job, or something similar, the answer is: “Yesterday.” Do not wait for the perfect moment. Start now. You are losing time. You can start making mistakes and learning from them today.
Sayak: Thank you so much, Vladimir, for doing this interview and for sharing your valuable insights. I hope they will be immensely helpful for the community.
Vladimir: Thank you for organizing it, and thanks to everyone who has the patience to read to the end.