Restaurant Ratings Data Analysis Based on Proximity
Project type
Exploratory Data Analysis
Date
Mar. 2024
GitHub link
Tools
Python, pandas, NumPy, scikit-learn
This project investigates whether a restaurant's proximity to a university campus influences its Yelp rating, and whether student reviewers rate these restaurants differently from non-student reviewers.
-Data Collection:
Used the Yelp Open Dataset, which contains real-world data on businesses, reviews, and users.
-Data Processing and Analysis:
Used the Haversine formula to compute great-circle distances between universities and restaurants, classifying a restaurant as near a campus if it lies within 5 kilometers.
Conducted linear regression analysis to assess the effect of distance from universities on restaurant ratings.
Performed permutation tests to evaluate differences in ratings between student and non-student reviewers.
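The distance and permutation-test steps above can be sketched roughly as follows. This is an illustrative sketch in plain Python, not the project's actual code: the function names (`haversine_km`, `is_near_campus`, `permutation_test`), the choice of a difference in mean ratings as the test statistic, and the number of permutations are all assumptions. Only the Haversine formula and the 5-kilometer threshold come from the description above.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_near_campus(restaurant, campuses, threshold_km=5.0):
    """True if the restaurant (lat, lon) is within threshold_km of any campus."""
    return any(haversine_km(*restaurant, *campus) <= threshold_km
               for campus in campuses)


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean ratings.

    Returns (observed_difference, p_value). The test statistic is an
    assumption; the write-up does not say which one was used.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(group_a) - mean(group_b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

On the full dataset, a vectorized NumPy version of the distance computation would likely be preferable to calling `haversine_km` pair by pair.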
-Review Classification:
Developed a keyword-based approach to identify reviews likely authored by students, facilitating a comparative analysis of rating behaviors.
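A keyword-based classifier of this kind might look like the sketch below. The keyword list is purely hypothetical (the project's actual keywords are not given here), as are the names `is_likely_student_review` and `split_ratings`. Matching on whole words, case-insensitively, is one reasonable design choice, not necessarily the one used in the project.

```python
import re

# Hypothetical keywords; the project's real list is not stated.
STUDENT_KEYWORDS = ("student", "campus", "dorm", "finals", "professor", "semester")

# Whole-word, case-insensitive match, allowing a simple plural "s".
_STUDENT_PATTERN = re.compile(
    r"\b(?:" + "|".join(map(re.escape, STUDENT_KEYWORDS)) + r")s?\b",
    re.IGNORECASE,
)


def is_likely_student_review(text):
    """True if the review text mentions any student-related keyword."""
    return _STUDENT_PATTERN.search(text) is not None


def split_ratings(reviews):
    """Split (text, stars) pairs into (student_stars, other_stars) lists."""
    student_stars, other_stars = [], []
    for text, stars in reviews:
        if is_likely_student_review(text):
            student_stars.append(stars)
        else:
            other_stars.append(stars)
    return student_stars, other_stars
```

The two resulting lists of star ratings are the kind of input a two-sample comparison between student and non-student reviewers would take. A keyword match only suggests student authorship, so some reviews will be misclassified in both directions.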