Featured projects
Project: Climate Analysis and Exploration
A basic climate analysis and data exploration of a Hawaii climate database.
Features: Python, SQLAlchemy ORM, SQLite, Pandas, Matplotlib, Flask (JSON API endpoints)
The analysis includes the following:
- Data Preparation: Climate data for Hawaii was provided in two CSV files. The content of these files was scrubbed.
- The Jupyter notebook WeatherPy_w_comments.ipynb takes care of the data preparation and cleanup tasks. Pandas DataFrames are created from the measurement and station CSV files. NaNs and other missing values are cleaned from the data, and the cleaned CSV files are saved.
- Database Engineering: SQLAlchemy is used to model the database schema, and SQLite tables for "measurements" and "stations" are created.
- The Jupyter notebook WeatherPy_w_comments.ipynb is used for the database engineering work. Pandas reads the cleaned measurements and stations CSV data. A database called hawaii.sqlite is created, using declarative_base to define an ORM class for each table and create_all to create the tables before the data is loaded.
- The Jupyter notebook VacationPy.ipynb is used to complete the climate analysis and data exploration. A start date and an end date determine the "vacation" range. SQLAlchemy create_engine connects to the SQLite database, and automap_base() reflects the tables into classes, which are referenced as Station and Measurement.
- Flask Web Application: A Flask web app with routes (endpoints) that return JSON results from each of the above queries.
- Queries dates and temperature observations from the last year. Converts the query results to a dictionary using date as the key and tobs as the value. Returns the JSON representation of the dictionary.
- /api/v1.0/stations: Returns a JSON list of stations from the dataset.
- /api/v1.0/tobs: Returns a JSON list of temperature observations (tobs) for the previous year.
- /api/v1.0/ and /api/v1.0//: Returns a JSON list of the minimum temperature, the average temperature, and the maximum temperature for a given start or start-end range. Given the start only, calculates TMIN, TAVG, and TMAX for all dates greater than or equal to the start date. Given the start and the end date, calculates TMIN, TAVG, and TMAX for dates between the start and end date, inclusive.
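
The data preparation step described above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the column names (station, date, prcp, tobs), the sample rows, and the helper clean_csv are assumptions made for illustration, and it drops incomplete rows, which is only one way to clean missing values. An in-memory buffer stands in for the CSV files.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the measurements CSV.
# Column names and values are assumed for illustration only.
RAW_CSV = """station,date,prcp,tobs
ST001,2017-01-01,0.08,65
ST001,2017-01-02,,63
ST002,2017-01-01,0.0,70
"""


def clean_csv(source, destination):
    """Read a CSV, drop rows with any missing value, and save the result."""
    df = pd.read_csv(source)
    cleaned = df.dropna(how="any").reset_index(drop=True)
    cleaned.to_csv(destination, index=False)
    return cleaned


# File paths work the same way as these in-memory buffers.
cleaned_buffer = io.StringIO()
cleaned_df = clean_csv(io.StringIO(RAW_CSV), cleaned_buffer)
```

Dropping rows is the simplest choice; filling gaps with a default or an interpolated value would be an alternative if too much data were lost.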
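
For the database engineering step, a sketch along these lines shows how declarative_base and create_all fit together. It assumes SQLAlchemy 1.4 or later, and the column definitions and sample rows are assumptions rather than the project's real schema. An in-memory SQLite database is used here in place of hawaii.sqlite.

```python
from sqlalchemy import Column, Float, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Measurement(Base):
    # Columns are assumed for illustration; the real schema may differ.
    __tablename__ = "measurements"
    id = Column(Integer, primary_key=True)
    station = Column(String)
    date = Column(String)
    prcp = Column(Float)
    tobs = Column(Float)


class Station(Base):
    __tablename__ = "stations"
    id = Column(Integer, primary_key=True)
    station = Column(String)
    name = Column(String)


# "sqlite://" is an in-memory database; a file URL such as
# "sqlite:///hawaii.sqlite" would write to disk instead.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)  # creates the tables, not the rows

# Hypothetical cleaned rows, for example from DataFrame.to_dict(orient="records").
cleaned_rows = [
    {"station": "ST001", "date": "2017-01-01", "prcp": 0.08, "tobs": 65.0},
    {"station": "ST002", "date": "2017-01-01", "prcp": 0.0, "tobs": 70.0},
]

with Session(engine) as session:
    session.add_all([Measurement(**row) for row in cleaned_rows])
    session.commit()
    row_count = session.query(Measurement).count()

table_names = sorted(inspect(engine).get_table_names())
```

Note that create_all only issues the CREATE TABLE statements; inserting the cleaned data is a separate step, done here through a session.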
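
The analysis step, reflecting the tables with automap_base() and querying a "vacation" range, might be sketched as follows. It assumes SQLAlchemy 1.4 or later; the table layout, the sample rows, and the helper temperature_stats are hypothetical. The sketch builds a small in-memory database first so that there is something to reflect.

```python
from sqlalchemy import create_engine, func, text
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

# Build a tiny stand-in database (assumed layout, made-up rows).
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE measurements ("
        "id INTEGER PRIMARY KEY, station TEXT, date TEXT, tobs REAL)"
    ))
    conn.execute(text(
        "INSERT INTO measurements (station, date, tobs) VALUES "
        "('ST001', '2017-01-01', 60.0), ('ST001', '2017-01-02', 70.0), "
        "('ST001', '2017-01-03', 80.0), ('ST001', '2017-02-01', 90.0)"
    ))

# Reflect the existing tables into mapped classes.
# Automap only maps tables that have a primary key.
Base = automap_base()
Base.prepare(autoload_with=engine)
Measurement = Base.classes.measurements


def temperature_stats(session, start, end=None):
    """Return (TMIN, TAVG, TMAX) from start onward, or between start and end inclusive."""
    query = session.query(
        func.min(Measurement.tobs),
        func.avg(Measurement.tobs),
        func.max(Measurement.tobs),
    ).filter(Measurement.date >= start)
    if end is not None:
        query = query.filter(Measurement.date <= end)
    return tuple(query.one())


with Session(engine) as session:
    trip_stats = temperature_stats(session, "2017-01-01", "2017-01-03")
    open_ended_stats = temperature_stats(session, "2017-01-02")
```

Storing dates as ISO-formatted text keeps string comparison consistent with chronological order, which is what makes the simple >= and <= filters work.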
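
The Flask endpoints could be sketched as below. The stations and tobs routes are taken from the list above; the <start> and <start>/<end> placeholders in the range routes are an assumption, since the route text above does not show them. To stay self-contained, the sketch serves a small made-up in-memory dataset instead of querying the database, and the helper previous_year is hypothetical.

```python
from datetime import date, timedelta

from flask import Flask, jsonify

app = Flask(__name__)

# Made-up observations standing in for database query results.
OBSERVATIONS = [
    {"station": "ST001", "date": "2016-01-15", "tobs": 55.0},
    {"station": "ST001", "date": "2017-01-01", "tobs": 60.0},
    {"station": "ST001", "date": "2017-01-02", "tobs": 70.0},
    {"station": "ST002", "date": "2017-01-03", "tobs": 80.0},
    {"station": "ST002", "date": "2017-02-01", "tobs": 90.0},
]


def previous_year(rows):
    """Keep rows dated within 365 days of the most recent observation."""
    latest = max(date.fromisoformat(row["date"]) for row in rows)
    cutoff = latest - timedelta(days=365)
    return [row for row in rows if date.fromisoformat(row["date"]) > cutoff]


@app.route("/api/v1.0/stations")
def stations():
    return jsonify(sorted({row["station"] for row in OBSERVATIONS}))


@app.route("/api/v1.0/tobs")
def tobs():
    return jsonify([row["tobs"] for row in previous_year(OBSERVATIONS)])


# Assumed URL placeholders; static routes such as /stations still take priority.
@app.route("/api/v1.0/<start>")
@app.route("/api/v1.0/<start>/<end>")
def temperature_range(start, end=None):
    # ISO date strings compare in chronological order.
    temps = [
        row["tobs"]
        for row in OBSERVATIONS
        if row["date"] >= start and (end is None or row["date"] <= end)
    ]
    if not temps:
        return jsonify({"error": "no observations in range"}), 404
    return jsonify([min(temps), sum(temps) / len(temps), max(temps)])
```

With Flask's built-in test client, app.test_client().get("/api/v1.0/2017-01-01/2017-01-03").get_json() returns the list of minimum, average, and maximum temperatures for that range.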