Apr 17, 2017 · 2 min read

How Things Work: Machine Learning

1. Pick 20 cases from the pile, read through them carefully, and manually label each one as either BGIE or FIN2.
2. Feed those 20 cases to a computer, and set an initial cutoff page count (say, 15). Any case with fewer than 15 pages will be classified as FIN2, otherwise BGIE.
3. Allow the computer to update the cutoff value (doing so is trivial in the machine learning field) so that the highest percentage of the 20 cases is classified correctly, using your manual labels from Step 1 as the answer key.
4. Let’s say the computer settles on 10 as the optimal cutoff page count.
5. Starting from the 21st case, you can just blindly feed each case to your computer algorithm. Any case with fewer than 10 pages will be automatically classified as FIN2, otherwise BGIE.

You have just implemented a machine learning algorithm for classifying HBS cases! It should be quite accurate at classifying most BGIE vs. FIN2 cases, barring some outliers (e.g., the 20-page FIN2 case on Burger King would be incorrectly classified as BGIE). By labeling the first 20 cases, you were essentially injecting human insight into the machine learning algorithm, which can then deduce the optimal cutoff page count and automate the remaining classification tasks. In other words, you were “supervising” the machine as it learned from the first 20 labeled samples. This machine learning paradigm is called, unsurprisingly, supervised learning.
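
For the curious, here is a rough Python sketch of the steps above. It is only an illustration, not a real classifier: the 20 page counts and labels are invented for this example, and the function names (`classify`, `accuracy`, `fit_cutoff`) are hypothetical. The "learning" is nothing more than trying every candidate cutoff and keeping the one that agrees most often with the manual labels.

```python
from typing import List, Tuple

# Hypothetical stand-in for the 20 hand-labeled cases: (page_count, label).
# These numbers are made up purely for illustration.
LABELED_CASES: List[Tuple[int, str]] = [
    (4, "FIN2"), (5, "FIN2"), (5, "FIN2"), (6, "FIN2"), (6, "FIN2"),
    (7, "FIN2"), (7, "FIN2"), (8, "FIN2"), (8, "FIN2"), (9, "FIN2"),
    (10, "BGIE"), (11, "BGIE"), (12, "BGIE"), (13, "BGIE"), (14, "BGIE"),
    (15, "BGIE"), (16, "BGIE"), (18, "BGIE"), (20, "BGIE"), (22, "BGIE"),
]


def classify(page_count: int, cutoff: int) -> str:
    """Fewer pages than the cutoff means FIN2, otherwise BGIE."""
    return "FIN2" if page_count < cutoff else "BGIE"


def accuracy(cases: List[Tuple[int, str]], cutoff: int) -> float:
    """Fraction of labeled cases that a given cutoff classifies correctly."""
    correct = sum(1 for pages, label in cases if classify(pages, cutoff) == label)
    return correct / len(cases)


def fit_cutoff(cases: List[Tuple[int, str]]) -> int:
    """Try every plausible cutoff and keep the one with the best accuracy."""
    page_counts = [pages for pages, _ in cases]
    candidates = range(min(page_counts), max(page_counts) + 2)
    return max(candidates, key=lambda cutoff: accuracy(cases, cutoff))


cutoff = fit_cutoff(LABELED_CASES)
print("learned cutoff:", cutoff)
print("accuracy on the labeled cases:", accuracy(LABELED_CASES, cutoff))

# A 20-page FIN2 case (like the Burger King outlier) still lands in BGIE.
print("20-page case classified as:", classify(20, cutoff))
```

With this made-up data the search lands on a cutoff of 10, matching the story above. Real labeled cases would rarely separate this cleanly, which is why the goal is the highest percentage correct rather than a perfect score.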

Ask about the input data. Data is the new oil. If a machine learning algorithm is a jet engine, high-volume, high-quality data is the premium oil that lets the algorithm unleash its maximum power. In the technology industry, more and more strategic acquisitions focus on acquiring “big data” assets with long-lasting business value. Given the mission-critical nature of input data, you can never go wrong by prompting your counterparts to talk more about their data sources. Commonly used jargon in data-related discussions includes sampling, bias, variance, noise, normalization, and dimensionality reduction. Wikipedia provides excellent information on each concept.
Ask about how they measure the effectiveness of their machine learning algorithms.

“Do you believe in Symbolism or Neural Networks?” Our HBS case classifier belongs to the Symbolism tribe, where each HBS case is reduced to a human-readable symbolic representation (i.e., total page count). The Neural Network tribe, on the other hand, tackles machine learning problems by modeling the human brain and nervous system. The tribes of Symbolism and Neural Networks are like the Democratic and Republican parties of AI. Bonus points if you run into a Symbolist and a Neural Network devotee at the same dinner table: ask the restaurant if they have some popcorn.

Now that you know machine learning…

How would you teach a machine to differentiate between FIN1 and FIN2 cases? Shallow symbolic features such as total page count may no longer suffice. Instead, we will need to leverage the semantics of our input data: a FIN1 case is more likely to talk about CAPM and portfolio theory, while a FIN2 case will probably touch upon debt financing and options. Unfortunately, we are running out of space.
