If you are reading this, I assume you are either (1) a computer scientist who is considering entering the world of machine learning (ML) or (2) a practitioner who is considering dipping their toes in the ML waters for the first time.
In either case, welcome to the ML fray!
What is ML?
ML is a field of computer science that focuses on building systems that learn to perform tasks from data rather than from hand-written rules. In other words, ML is about creating algorithms that improve with experience (without being explicitly programmed for the task by humans) and then apply what they have learned to new inputs on their own (without explicit instruction).
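To make "learning without being explicitly programmed" concrete, here is a minimal sketch in Python. The data, the function names, and the midpoint rule are all illustrative assumptions of mine, not a standard algorithm from any library. The point is only that the cutoff is computed from examples instead of being typed in by hand.

```python
# Hypothetical toy data: (hours_studied, passed_exam) pairs.
examples = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]


def learn_threshold(data):
    """Learn a cutoff as the midpoint between the two class means."""
    negatives = [x for x, label in data if label == 0]
    positives = [x for x, label in data if label == 1]
    mean_negative = sum(negatives) / len(negatives)
    mean_positive = sum(positives) / len(positives)
    return (mean_negative + mean_positive) / 2


def predict(threshold, x):
    """Apply the learned rule to a new, unseen input."""
    return 1 if x >= threshold else 0


# The rule comes from the data, not from a hand-written constant.
threshold = learn_threshold(examples)
prediction_for_five_hours = predict(threshold, 5.0)
prediction_for_four_hours = predict(threshold, 4.0)
```

If you change the examples, the learned threshold changes with them, which is the essential difference from a hard-coded `if` statement.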
Many might say that ML is not a single field but rather a collection of subfields that explore the theory and practice of training machines to perform tasks. These subfields include (but are not limited to):
- Machine Learning Theory
- Neural Networks
- Evolutionary Algorithms
- Machine Learning Algorithms
- Support Vector Machines
- Reinforcement Learning
- Optimization
- Random Forests
- Fuzzy Logic
What is the relationship between ML and data science?
Well, data science brings all of the above together (and more!) by applying ML techniques to extract value from data. And just like that, you have entered the thrilling world of data science. Who knows, maybe one day you will work for me… 😉
For the rest of this article, we will explore the various facets of ML, including theory, methods, and applications.
Theory
When doing research in any field, especially one as broad as ML, it is crucial to understand the theory and concepts underlying it. After all, without a strong theoretical foundation, it is extremely hard (if not impossible) to develop accurate models or make informed decisions about them.
It is critical to understand how ML relates to other fields, including but not limited to:
- Statistics
- Probability
- AI
- Neuroscience
- Web Science
- Computer Science
- Operations Research
- Management Science
- Bioinformatics
If you are completely new to the world of ML, then it would be best to start by taking a course in theoretical statistics. A good course will not only teach you the theory but also show you how to apply it, and that foundation will prepare you for more advanced courses in the field. Of course, you don’t have to take my word for it: just visit the Coursera website to see what universities are offering.
Methods
Now that you have a basic understanding of the field, it’s time to learn some methods. There are many different techniques used by data scientists (and researchers in general) to explore, test, and analyze data. These methods range from classical techniques (e.g., linear and logistic regression) to more modern ones (e.g., deep learning).
A quick note: for the remainder of this article, when I use the word method, I am referring to the technique(s) used to explore and analyze data. For instance, regression analysis (one of the most basic and common methods) is a statistical technique that involves fitting a line (or curve) to a set of data.
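As a rough illustration of what "fitting a line" means, here is a minimal sketch of ordinary least squares for a single input variable, written in plain Python. The function name `fit_line` and the toy data are my own assumptions for illustration. In practice you would normally reach for a library rather than write this by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data that lies exactly on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(xs, ys)

# Use the fitted line to predict y for an x that was not in the data.
predicted_y = slope * 6.0 + intercept
```

With real, noisy data the points will not sit exactly on the line, and the fit returns the slope and intercept that minimize the sum of squared vertical distances to it.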
Learning objectives: By the end of this article, you will know the following about ML methods:
- The different types of ML models (e.g., regression, classification, clustering) and what each one is used for
- How to perform certain tasks (e.g., prediction, feature selection) using the different ML models
- How to perform certain tasks (e.g., prediction, feature selection) using specific ML algorithms (e.g., logistic regression, k-means clustering)
- How to choose the right ML algorithm for a particular problem (e.g., linear regression vs. Gaussian Processes)
- How to interpret and visualize the results of your analysis (e.g., a bar chart or line graph)
- How to choose the right ML model for your data (e.g., a Linear Regression model vs. a Neural Network model)
- The pros and cons of each method (e.g., linear regression vs. neural networks)
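To give a feel for one of the algorithms named in the list above, here is a minimal sketch of k-means clustering on one-dimensional data, in plain Python. The function name `k_means_1d`, the fixed iteration count, and the hand-picked starting centroids are simplifying assumptions on my part. Real implementations handle initialization and convergence checks more carefully.

```python
def k_means_1d(points, centroids, iterations=10):
    """Cluster 1-D points by alternating assignment and centroid updates."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: abs(point - centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical toy data with two obvious groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, final_clusters = k_means_1d(points, centroids=[1.0, 12.0])
```

Note that no labels are supplied here: unlike regression or classification, clustering groups the data using only the distances between points.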
A good place to start would be the Python library scikit-learn, which provides a rich array of machine learning algorithms for you to choose from. Alternatively, you could use one of the many web service APIs (application programming interfaces) that can perform various ML analyses for you (e.g., Bing’s Machine Learning API). Google’s Prediction API was another example, but it has since been shut down, so check what is currently available before committing to a hosted service.
Applications
Once you understand the theory and know the methods, it’s time to put them into practice. This is arguably one of the most exciting parts of the ML journey, as you get to see the methods you have learned about in action. Here are some examples of tasks that can be performed using the different ML models and methods mentioned above: