Exploring Data Science with Microsoft's Applied AI Engineer
Cameron Wilson
Posted on December 20, 2019
Originally posted on Educative.io
At Educative, we get to chat with developers from all over the world, get to know their story, who they are, and what inspired them to become developers and teach those around them. Today, we sat down with Samia Khalid and got to learn more about her career and the exciting world of data science, applied AI, and machine learning.
About: Samia Khalid
Samia is a Sr. Applied AI Engineer at Microsoft. She is working with large-scale distributed systems to power new intelligent experiences in Office 365 and Outlook. She believes that learning should be easy, fun and intuitive. Seeing the challenges faced by her friends and colleagues in learning data science and machine learning led her to start her own machine learning blog as well (towardsml.com). She is passionate about education and sharing what she learns to pave the way for others.
In her free time, she can be found blogging about Machine Learning, leading women’s empowerment efforts, at the gym, learning some new skill, or around the world trying out some crazy adventure!
Samia Khalid is the creator of our latest course, Grokking Data Science.
Tell me a little about yourself. How did you get your start in programming and what led to your love of data science and machine learning?
When I first learned about Machine Learning, I was totally awestruck. It seemed like a magic wand that can help us find answers to many complex problems. And the best thing is that it has so many diverse applications, from cancer treatments to flying rockets to providing movie recommendations to predicting earthquakes. We are bound only by our creativity! No matter which field you are interested in, it’s waiting for you to create some magic. Isn’t this fascinating?! I really believe that with a sprinkle of creativity, some data skills, and the willingness to make a difference, we can make this world a better place for everyone.
I got my first real taste of Data Science and Machine Learning during my Master’s thesis project. I worked on predictive maintenance of heavy vehicles at Scania. During that time, I learned hands-on that data has so much potential. And as I kept diving deeper and deeper into the field, my fascination has only kept increasing.
What exciting projects are you getting to work on at Microsoft? What problems are these projects designed to solve?
I am working with large-scale distributed systems to power new intelligent experiences in Office 365 and Outlook. Some of the problems I get to address are, for example, “How can we infuse intelligence into our existing tech stack?” or “How can we make our current experiences more and more personalized?”
Could you go into a little detail on what Applied AI is and what problems are being solved in this space?
My day-to-day job revolves around contributing to Microsoft’s next wave of AI-driven productivity solutions based on our strategic advantage of having tons of enterprise data. This is already a hard problem, but to make things even more exciting, we have an additional challenge on top of it: we ensure that our solutions are compliant and that we give the utmost importance to our users’ privacy.
Basically, I deal with the practical aspects of AI. For example, while an AI researcher’s job might be to come up with new language models, my job is to experiment with them to create deeply personalized experiences.
What is one tip you’d give to someone who’s starting their career in data science?
Always keep learning!
Where do you see the field of machine learning and AI going over the next 5-10 years?
Companies will increasingly be wrestling with data in varieties and volumes never encountered before. This means they will need more and more professionals who can dive into those “data oceans” to extract pearls from their depths, which is why data scientists are going to remain the most sought-after breed of professionals.
ML and AI have tremendous potential for value creation. The return on investment is really high, so large companies are already investing heavily in democratizing AI and they are going to double down on their efforts in the coming years. This is to ensure that the barrier to entry in this field is reduced. People with the right drive to create great things can come from diverse backgrounds and they do not need to know all the nitty-gritty details of AI. For example, from floor lamps to lighting in the oven, you can create great things with a light bulb. But do you really need to know how a bulb operates in order to do so? Nope.
One more really important thing that I want to mention: “party” with the data, but do NOT forget about privacy, because the future is privacy + AI/ML.
What is the biggest hurdle developers should be aware of when they start their data science career?
Every other person is trying to put together the pieces of the puzzle that can land them the hottest job of the century. I mean, who doesn’t want to be the most sought-after professional in the industry? There are tons of data science resources out there too. Still, not many succeed. Why?
- Information overload
- Siloed resources
- Non-methodical approach
There is so much information. But chances are that it’s either written as if for an academic publication (“I need to understand a concept, not do research on the topic. Keep it simple and intuitive!”) or written just for the sake of writing an article, with some fancy jargon thrown in here and there. Do you know how many articles I had to read before understanding something as simple as a p-value in an intuitive way?
Of course, there are many excellent resources as well. But the problem there is siloed information: one ends up wasting too much time browsing from one article to another. Having all the relevant information in one place makes learning way faster.
What has inspired you to teach those around you?
Many of my friends and colleagues have either been trying to learn Machine Learning/Data Science or are planning to start doing so. What I have observed from the challenges they have been facing is that there are so many great resources out there but, more often than not, their explanations tend to be too academic or theoretical, which can be intimidating for beginners. So I asked myself, “What am I doing to help out?” This introspection led me to start my own blog, but a practical and ‘inspire to make a difference’ blog!
The practical part: I want to share what I learn and what I have learned so far. Basically, I want them to save time in figuring out things where I have already spent hours and hours. Also, as the famous saying goes: “The best way to learn is to teach”.
The inspiring part: Machine Learning can be applied to solve so many real-world problems; it can be used as a tool that can help change lives in a positive way. I want to highlight inspiring positive developments because, I will say this again, I really believe that with a sprinkle of creativity, some data skills, and the willingness to make a difference, we can make this world a better place for everyone.
“Don’t try to learn everything!” What are the fundamental skills and technologies that one should be familiar with before they start applying for data science jobs? What can they learn on the job?
Having a strong foundation is fundamental. If you have the basics covered, you can keep adding the rest along the way. Think about buildings. If a building has strong foundations, we can keep adding more stories to it. But if the foundations are weak then we can’t keep on building on top of it.
When starting your data science journey, you found a lot of the info out there to be too abstract and wanted to make it more accessible. Why is it important to make this material more accessible to more people?
I love learning and then I love sharing what I learn in a way that makes it easier for the next person in line. Helping others and giving back brings true fulfillment in life.
In your course Grokking Data Science, could you explain why you chose the material to include and what material is not included that aspiring data scientists should explore further?
This course covers the fundamentals you need to kick-start your journey in the Data Science world. It’s definitely not “everything” you need to know about DS (that’s not even possible). The field of DS is very new and rapidly evolving. New, cool, and simplified APIs, models, and approaches are being introduced “every other day”. The goal of this course is to give you a strong foundation. From here onward, it’s about diving into the more advanced concepts, getting your hands dirty with real projects, and learning along the way. I have also included some recommendations in the “Further Study material” and “Getting that high-paying job” sections at the end of the course.
Here are two things that I really want my students to remember as they continue on their journey as a Data Scientist:
- “The path to becoming a great Data Scientist is not a sprint, but a marathon.”
- “Don’t be a know-it-all; be a learn-it-all.”
What’s next? Keep learning
A career in data science requires constant learning, but you’ll also need to focus your time and energy on the right material, especially if you want to kickstart your data science career as fast as possible.
Samia has taken her years of industry knowledge and condensed them into a comprehensive course, Grokking Data Science. This is the core material you’ll need to learn to start your career in data science.
You’ll explore concepts like:
- Python fundamentals for data science
- The fundamentals of statistics
- Machine learning 101
At the end, you’ll get to work on end-to-end machine learning projects.