My hardest week yet as a Data Scientist. Ever heard of Machine Learning?

Otema Yirenkyi
3 min read · Dec 10, 2020

1 month, 3 weeks and 2 days. That’s how long I’ve been a trainee data scientist. But who’s counting? No one. Right?

Let me rewind to about a week or two ago. My work leaned heavily on machine learning. I was identifying trends and generating predictions to make sense of data that will guide business decisions. Of course, I didn’t have to do this alone. I got help from some trusted, and dare I say friendly, machine learning algorithms. In the span of a few days, I had to read up on and master algorithms that I had only heard about and whose depths I did not know. Some of them I was hearing of for the first time.

I’ll tell you how I got through my week and the methods I’m putting in place to make this ride enjoyable. Look at your workplace as a canvas, a learning canvas: everything from the office building to your desk. This environment is supposed to provide a conducive space for you to learn and add to your knowledge. You, your colleagues, and your supervisors are the paintbrushes. Your mind, your suggestions, and your mistakes, together with their minds, their suggestions, and their experiences, are the paint buckets of different colours. Bringing all these parts together creates a beautiful picture. Maybe not a da Vinci, but it’s worth a try.

[Image: a tired dog. Photo by Lauren Kay on Unsplash]

What am I driving at? You should see everywhere you are as a chance to learn. You’ll encounter challenges because you do not know it all. There will be people who came before you and therefore know more than you do, through continuous practice or learning. Don’t be afraid to ask questions.

It is paramount to understand the problem you’re trying to solve as a team. Understand the fundamentals of the problem before you jump into using mathematical models just because you can. Know what you’re trying to achieve, and then you can decide whether a RandomForestRegressor will solve your problem.
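One hedged way to make that decision concrete is to check whether the model beats a trivial baseline before committing to it. The sketch below is only an illustration under stated assumptions: the data is synthetic (generated with scikit-learn’s make_regression), and the variable names and settings are made up for the example rather than taken from any real project.

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for real business data (an assumption for illustration).
X, y = make_regression(
    n_samples=300, n_features=8, n_informative=4, noise=10.0, random_state=0
)

# Baseline that always predicts the mean of the training target.
baseline = DummyRegressor(strategy="mean")
# Candidate model; the hyperparameters here are arbitrary example values.
forest = RandomForestRegressor(n_estimators=100, random_state=0)

# Mean 5-fold cross-validated R^2 for each model.
baseline_r2 = cross_val_score(baseline, X, y, cv=5, scoring="r2").mean()
forest_r2 = cross_val_score(forest, X, y, cv=5, scoring="r2").mean()

print(f"Baseline R^2:      {baseline_r2:.3f}")
print(f"Random forest R^2: {forest_r2:.3f}")
```

If the forest barely beats the mean-only baseline on your own data, that is a hint to revisit the problem framing or the features before reaching for a fancier model.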

I was trying regression analysis and ensemble machine learning methods, among other algorithms. Now, what is an Otema Medium article without useful resources being shared? Scroll to the bottom of this article for the juice.

Throughout the week, we go to the drawing board, and the boss suggests,

“Why don’t you take out these variables and do X, Y, Z?”

My team and I do exactly that, present the new insights and then we get hit with,

“Okay now that the data is behaving this way, let’s take out Z and observe what the new behaviour will be.”

It’s a back-and-forth process between team members. There’s a point to all this iteration because it leads somewhere. At the end of the day, the team and the clients have goals that need to be achieved. All this iteration has a direction, even if at some point it doesn’t seem like it.
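The “take out Z and observe the new behaviour” loop can be sketched in a few lines. This is a minimal illustration, not the actual analysis from that week: the data is synthetic, the feature names X, Y, Z and W are hypothetical, and a random forest scored with cross-validated R² is just one assumed way of measuring how the model behaves after each removal.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Hypothetical feature names; real ones would come from the business problem.
feature_names = ["X", "Y", "Z", "W"]
# Synthetic data standing in for the real dataset (assumption).
features, target = make_regression(
    n_samples=200, n_features=4, n_informative=3, noise=5.0, random_state=1
)


def cv_score(columns):
    """Mean 5-fold R^2 of a random forest trained only on the given columns."""
    idx = [feature_names.index(c) for c in columns]
    model = RandomForestRegressor(n_estimators=50, random_state=1)
    return cross_val_score(model, features[:, idx], target, cv=5, scoring="r2").mean()


# Score with every variable included, as a reference point.
full_score = cv_score(feature_names)
print(f"All variables: R^2 = {full_score:.3f}")

# Take out one variable at a time and observe the new behaviour.
scores_without = {}
for name in feature_names:
    kept = [c for c in feature_names if c != name]
    scores_without[name] = cv_score(kept)
    print(f"Without {name}: R^2 = {scores_without[name]:.3f}")
```

A large drop after removing a variable suggests the model leaned on it; little or no change suggests it may be safe to leave out, though that is a judgment for the team and the problem at hand.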

I guess my main headache was wondering how I was expected to understand different machine learning models and implement them in a day. It took a bit of patience and having my basics right. This is very important. You need to know your fundamentals so that building on them does not become hell when you have to implement new solutions in a matter of hours.

But you’ve got this! Know your stuff and you’ll do just fine.

Ask your colleagues and supervisors questions and you’ll do just fine.

Take a breather, and you’ll do just fine.

See you later!

Useful resources:

A practical guide to starting with machine learning

Randomforest introduction

Randomforest

Scikit-learn Machine learning documentation

Introduction to Support Vector Regression

Confidence, Prediction, Tolerance Intervals

Improving Random Forest in Python
