Featured projects
Project: ETL
Perform ETL on data scraped from the Quotes to Scrape website: extract the quotes,
transform the data (cleaning, joining, filtering, and aggregating as needed), and load it
into a production database (relational or non-relational) as a set of final tables or
collections.
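The transformations named above can be sketched with pandas; the records and column names here are illustrative stand-ins, not the project's actual data:

```python
import pandas as pd

# Illustrative scraped records before transformation (hypothetical data)
raw = pd.DataFrame({
    "text": ["  “Quote one”  ", "“Quote two”", "“Quote two”"],
    "author": ["Oscar Wilde", "Jane Austen", "Jane Austen"],
    "tags": [["life"], ["books", "life"], ["books", "life"]],
})

# Cleaning: strip stray whitespace; filtering: drop duplicate quotes
clean = raw.assign(text=raw["text"].str.strip()).drop_duplicates(subset="text")

# Aggregating: count quotes per tag (explode the tag lists first)
tag_counts = clean.explode("tags").groupby("tags")["text"].count()
print(tag_counts.to_dict())
```

The same `DataFrame` operations apply unchanged whether the records come from a scraper or from MongoDB.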
The Jupyter notebook contains the code for the following tasks:
- Web scraping of 'http://quotes.toscrape.com/'
- Imported Splinter, pymongo, pandas, and requests
- Utilized BeautifulSoup and sqlalchemy
- Created data-scraping functions for:
  - quote text, tags, author name, and author details (born, description)
- Sent the data to MongoDB
- Moved the data from MongoDB to Postgres
- Created 3 tables: Author info, Tags, Quotes
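The scraping functions above can be sketched as follows; the embedded HTML snippet mirrors the markup served by quotes.toscrape.com so the sketch runs without a network call:

```python
from bs4 import BeautifulSoup

# Static snippet mirroring the markup of http://quotes.toscrape.com/
HTML = """
<div class="quote">
  <span class="text">“Be yourself; everyone else is already taken.”</span>
  <small class="author">Oscar Wilde</small>
  <div class="tags">
    <a class="tag">be-yourself</a>
    <a class="tag">honesty</a>
  </div>
</div>
"""

def scrape_quotes(html):
    """Extract quote text, author name, and tags from one page of quotes."""
    soup = BeautifulSoup(html, "html.parser")
    quotes = []
    for block in soup.select("div.quote"):
        quotes.append({
            "text": block.select_one("span.text").get_text(strip=True),
            "author": block.select_one("small.author").get_text(strip=True),
            "tags": [a.get_text(strip=True) for a in block.select("a.tag")],
        })
    return quotes

print(scrape_quotes(HTML))
```

In the notebook, the same function would be fed the page source fetched with requests or Splinter, and each returned dict inserted into MongoDB.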
In app.py: created a Flask API with the following endpoints.
Navigate to these endpoints:
- **"/authors"**
- **"/quotes"**
Imported flask, sqlalchemy, and pandas.
Connected the engine to the SQL database on the AWS server and created multiple routes for endpoint testing:
- Welcome Page
- Quotes
- Authors
- Tags
- Top 10 Tags
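The routes above can be sketched with a minimal Flask app; the in-memory lists here stand in for the Postgres tables queried through the SQLAlchemy engine, and the sample rows are hypothetical:

```python
from flask import Flask, jsonify

app = Flask(__name__)

# In-memory stand-ins for the Postgres tables (illustrative data only)
QUOTES = [{"text": "Be yourself; everyone else is already taken.",
           "author": "Oscar Wilde"}]
AUTHORS = [{"name": "Oscar Wilde", "born": "October 16, 1854"}]
TAGS = ["be-yourself", "honesty"]

@app.route("/")
def welcome():
    # Welcome page listing the available routes
    return jsonify({"routes": ["/quotes", "/authors", "/tags"]})

@app.route("/quotes")
def quotes():
    return jsonify(QUOTES)

@app.route("/authors")
def authors():
    return jsonify(AUTHORS)

@app.route("/tags")
def tags():
    return jsonify(TAGS)
```

In the real app each route would run a query against the AWS-hosted database instead of returning a constant list.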