Data Science Portfolio
Project 1: Breast Cancer Analysis: Project Overview
Objective
Classify breast cancer tumors as malignant or benign using real-valued features such as radius, texture, and perimeter.
Methodology
PART 1
Data cleaning and exploration are done before applying the machine learning models. KNN (1NN and 3NN) is used to classify the two classes in the dataset, and the accuracy of each model is recorded. Further, linear discriminant classification and FDA are applied and compared with the KNN models, and the detailed observations are noted in the report.
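The KNN step can be illustrated with a minimal, standard-library sketch. This is not the project's actual code: the function name `knn_predict` and the six two-feature points are made up for illustration, and the real analysis works on the full feature set of the dataset.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query."""
    # Sort training samples by Euclidean distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up two-feature points, for illustration only (not taken from the dataset).
train = [
    ((1.0, 1.0), "benign"),
    ((0.9, 1.1), "benign"),
    ((1.2, 0.8), "benign"),
    ((2.0, 2.0), "malignant"),
    ((3.0, 3.2), "malignant"),
    ((3.1, 2.9), "malignant"),
]
query = (1.7, 1.7)

pred_1nn = knn_predict(train, query, k=1)  # decided by the single closest point
pred_3nn = knn_predict(train, query, k=3)  # decided by a vote of the three closest
print(pred_1nn, pred_3nn)
```

On this toy data the two settings disagree: the single nearest neighbor is malignant, while two of the three nearest neighbors are benign. This is the kind of difference that comparing 1NN and 3NN accuracy is meant to surface.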
PART 2
Unsupervised machine learning algorithms, namely PCA and K-Means clustering (k = 2, 3, 5), are applied to the dataset. The Davies-Bouldin index is calculated for each set of clusters to select the appropriate model. The results are compared and detailed observations are recorded in the report.
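As a rough sketch of how the Davies-Bouldin index ranks clusterings, the standard-library example below computes it for clusters that are already assigned. The helper names and the toy points are assumptions for illustration; the project itself would compute the index on the K-Means output for each k. A lower value indicates more compact, better separated clusters.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of points).

    Assumes at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each point to its own cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case (largest) similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k

# Made-up points: the same four points grouped in two different ways.
compact = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
mixed = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

db_compact = davies_bouldin(compact)
db_mixed = davies_bouldin(mixed)
print(db_compact, db_mixed)
```

The grouping that keeps nearby points together gets the smaller index, so it would be the one selected.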
Learning Outcome
Understanding the mechanics behind supervised and unsupervised algorithms for a simple two-class dataset.
Project 2: Cryptocurrency Time-Series Analysis: Project Overview
Datasets cover the period from 2018 to May 2021:
- Tesla
- Ethereum
- Bitcoin
- Dogecoin
Datasets are obtained from the Yahoo Finance website.
Objective
Perform exploratory data analysis (EDA) on the time series and measure the correlation between several cryptocurrencies. Determine whether Tesla's stock price has an effect on cryptocurrencies. Explore ways to predict the future prices of cryptocurrencies.
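One possible way to measure the correlation between two assets is sketched below, assuming the Pearson coefficient on daily returns. The project does not state which measure it uses, and the function names and price lists here are made up, not real Yahoo Finance data. Returns are used instead of raw prices because two series that merely trend in the same direction can look correlated even when their day-to-day moves are unrelated.

```python
import math

def daily_returns(prices):
    """Simple returns between consecutive closing prices."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up closing prices, for illustration only.
asset_a = [100.0, 110.0, 105.0, 120.0]
asset_b = [200.0, 220.0, 210.0, 240.0]  # moves in step with asset_a
asset_c = [50.0, 45.0, 47.5, 40.0]      # moves against asset_a

corr_ab = pearson(daily_returns(asset_a), daily_returns(asset_b))
corr_ac = pearson(daily_returns(asset_a), daily_returns(asset_c))
print(corr_ab, corr_ac)
```

A coefficient near 1 means the two assets tend to move together, and a negative coefficient means they tend to move in opposite directions. Correlation alone does not show that one asset's price affects the other.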
Results are yet to be documented
Project 3: PUBG’s Downfall and E-Sports Industry
Abstract
This project focuses on growing the active player count of an E-Sports title. The study offers game developers strategies to make the game more interesting, and gives new gamers entering an E-Sports title an introductory guide. The main objective is to address the skill gap between players and to keep gamers satisfied while they play.
Proposed Idea
- Recommendations and suggestions for game developers to keep the game entertaining.
- An introductory game guide for newcomers, helping them hold their own in gunfights.
Interactive Visualizations
Future Scope and Conclusion
- Comparing the player skill gap in a trending game, such as Fortnite or Apex Legends, with that in PUBG can help us understand the real reason behind PUBG’s downfall.
- Analytics can help an E-Sports title retain its players and can inform gameplay changes that make the title more welcoming.
Results are documented in GitHub.
Project 4: Financial Services: Analysis of CAPEX and OPEX of a Company’s Software Investment
Objective
Report
Project 5: New York Airbnb Data Exploration: Project Overview
Since 2008, guests and hosts have used Airbnb to expand on traveling possibilities and present more personalized ways of experiencing the world. This dataset contains information on 2019 listings in New York and its geographical information, prices, number of reviews, and more.
- Which hosts are the busiest and why?
- What areas have more traffic than others and why is that the case?
- Are there any relationships between prices, number of reviews, and the number of days that a given listing is booked?
Project 6: Electric Load Forecasting: Project Overview
Objective
We depend heavily on electricity, and it is our responsibility to prevent catastrophic power failures. Doing so requires some precautionary measures:
- Power consumption and production must stay in balance; when this equilibrium breaks down, the affected locality experiences a power failure.
- It is important to keep customer usage steady to prevent faults from occurring in the power system.