Data Science
Using the Past to Predict the Future

Analytics is generally divided into two primary categories: Descriptive Analytics and Predictive Analytics. Descriptive Analytics is backward-looking and is typically the work of Business Intelligence teams, while Predictive Analytics is done by Data Scientists who use data to forecast the future. Data Science is the practice of taking historical data, organizing it in a specific way, and feeding it into mathematical models. The models are trained on that data to recognize patterns and can then take new data and predict outcomes with varying levels of certainty. Used correctly, the impact can be very powerful.
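To make the train-then-predict idea concrete, here is a minimal sketch in plain Python. It assumes a made-up historical data set (number of site visits per customer and whether that customer bought) and uses a small logistic regression fitted by gradient descent. The data, the function names, and the learning-rate settings are all illustrative assumptions, not taken from any real project.

```python
import math

def sigmoid(z):
    """Squash any real number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit a one-feature logistic regression with batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every historical example.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        # Average gradient of the log-loss with respect to w and b.
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_proba(x, w, b):
    """Estimated probability of a positive outcome for a new observation."""
    return sigmoid(w * x + b)

# Hypothetical historical data: visits per customer, and 1 if they bought.
visits = [1, 2, 3, 4, 5, 6, 7, 8]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = train_logistic(visits, bought)

# New, unseen customers: the model returns a probability, not a yes/no.
p_low = predict_proba(2, w, b)
p_high = predict_proba(7, w, b)
print(f"2 visits -> {p_low:.0%} chance of buying")
print(f"7 visits -> {p_high:.0%} chance of buying")
```

The output is a probability rather than a hard answer, which is what "different levels of certainty" means in practice: a business can choose to act only on predictions above a threshold it is comfortable with.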

The terms Artificial Intelligence (AI) and Machine Learning (ML) have become ubiquitous buzzwords. Ironically, most of the underlying models have not changed much in decades. Deep Neural Networks (DNNs) were originally conceptualized in the 1960s, but it wasn’t until modern computing that their full potential was unlocked. These tools are so powerful that even 60 years later, the world is still learning how they can be used. ChatGPT and other Large Language Models are the latest iteration of deep learning combined with modern computing.


Use Cases

Stock Market Predictions

The stock market is inherently unpredictable, but that doesn’t mean people haven’t tried!

As a form of practice and education, Tony spent several months building different Machine Learning models to try to identify trading strategies that could prove profitable. He wrote custom software using APIs to automate the daily trading activity, and then made this software available to the general public for both ETrade and TD Ameritrade. He used Linear Regression, Logistic Regression, Decision Trees, and Deep Neural Networks. By tracking the results daily, he learned that none of his strategies was doing better than 51%.

Before he shut down the strategies, he realized that his overnight positions were doing quite well, even though they were not part of his ML models. He redirected his resources to the overnight positions, but that is a story for another day!

Customer Clustering

A major asset manager was having trouble cross-selling its products to customers. It was hard to know which customers would be open to which products.

Tony built a K-means clustering analysis using existing customer data to see which product pairs appeared together most frequently. Recommendation engines, such as the one Netflix uses to suggest new movies and TV shows to its customers, build on a similar idea. The clustering analysis for the asset manager could then be sent to sales teams, helping them approach clients with better and more relevant information.
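The sketch below shows one way a K-means analysis like this could work, in plain Python. It is an illustration under stated assumptions, not the analysis that was actually built: the customers, the product names, the choice of two clusters, and the 50% threshold are all hypothetical. Each customer is a row of 0/1 flags for the products they hold, customers are grouped by similar holdings, and a product is flagged as a cross-sell idea when most of a customer's cluster holds it but the customer does not.

```python
from math import dist

def kmeans(points, k, max_iter=100):
    """Plain k-means with a deterministic farthest-point initialization."""
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(tuple(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return labels, centroids

# Hypothetical products and holdings (1 = customer holds the product).
products = ["equity_fund", "bond_fund", "etf", "retirement_plan"]
holdings = {
    "A": (1, 0, 1, 0),
    "B": (1, 0, 1, 0),
    "C": (1, 0, 0, 0),
    "D": (0, 1, 0, 1),
    "E": (0, 1, 0, 1),
    "F": (0, 0, 0, 1),
}

names = list(holdings)
labels, centroids = kmeans([holdings[n] for n in names], k=2)

# With 0/1 data, a centroid value is the share of the cluster holding
# that product. Suggest products held by at least half of the cluster.
suggestions = {}
for name, label in zip(names, labels):
    suggestions[name] = [
        product
        for product, held, share in zip(products, holdings[name], centroids[label])
        if not held and share >= 0.5
    ]

for name in names:
    print(name, "cluster", labels[name and names.index(name)], suggestions[name])
```

A sales team would read the result as a short list per customer: clients whose peers commonly hold a product they lack are the natural first conversations. In a real setting the number of clusters and the threshold would need to be tuned against the firm's own data.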