Introduction to Hierarchical Clustering with College Scorecard Data
Clustering is an unsupervised machine learning technique, meaning the data need not be labeled. Its goal is to group like items, such as similar customers, similar products, or similar students. Popular clustering algorithms include k-means and hierarchical clustering, as well as DBSCAN and PAM (partitioning around medoids).
In this session, participants will learn how hierarchical clustering methods build clusters from the ground up, starting from individual observations and successively merging the most similar groups. We will cover the benefits of a hierarchical approach along with a few of its disadvantages. Our example use case applies College Scorecard data to group similar schools based on demographic and institutional characteristics, helping students identify comparable schools as they explore their options.
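To make the "ground up" idea concrete, here is a minimal sketch of bottom-up (agglomerative) clustering in plain Python. It is not the session's actual code. The school names and feature values are invented for illustration and are not real College Scorecard records, and average linkage with Euclidean distance is just one reasonable choice among several.

```python
from math import dist

# Hypothetical, made-up feature vectors (NOT real College Scorecard values):
# (admission rate, share of students receiving Pell grants), both on a 0-1 scale.
schools = {
    "School A": (0.10, 0.15),
    "School B": (0.12, 0.18),
    "School C": (0.80, 0.45),
    "School D": (0.85, 0.50),
    "School E": (0.50, 0.30),
}


def average_linkage(c1, c2, points):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    total = sum(dist(points[a], points[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    # Start from the ground up: every school is its own cluster.
    clusters = [[name] for name in points]
    merges = []  # history of (left cluster, right cluster, distance)
    while len(clusters) > n_clusters:
        best = None
        # Exhaustively search for the closest pair of clusters.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], points)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges


clusters, merges = agglomerate(schools, n_clusters=2)
for left, right, d in merges:
    print(f"merged {left} + {right} at distance {d:.3f}")
print(clusters)
```

The recorded merge history is the information a dendrogram displays. With real data, features on different scales would normally be standardized first so that no single variable dominates the distance.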
Senior Data Scientist at Nelnet