In this video, we'll walk you through an exciting project where we combine the power of PostgreSQL with Python to collect U.S. Census data. Follow along as we set up a cloud-based PostgreSQL server and a Jupyter notebook on AWS, then dive into data collection using the U.S. Census API.
Apply for a Census API key: https://api.census.gov/data/key_signup.html
GitHub Code: https://github.com/xbwei/Social-Data-Analytics-in-the-Cloud-with-AI
Steps Covered:
Set up a PostgreSQL instance on AWS RDS – Spin up your cloud-hosted PostgreSQL database.
Set up a Jupyter Notebook instance on Amazon SageMaker – Launch a hosted development environment for Python coding.
Apply for a Census API Key – Access U.S. Census data and gather valuable population and income data.
Store the API key and database credentials in AWS Secrets Manager – Keep sensitive values out of your notebook code.
Download and install pgAdmin – Manage your PostgreSQL instance with a user-friendly interface.
Connect pgAdmin to the PostgreSQL instance – Establish a connection and prepare your database for data insertion.
Start JupyterLab and clone the Python code – Get everything set up in your development environment.
Run the Python notebook to create tables and collect Census data – The notebook pulls state names, populations, and incomes into your database.
Check the collected data using pgAdmin – Verify that the data was inserted correctly into your PostgreSQL instance.
Stop the database and notebook instances – Shut down your AWS resources properly to avoid unwanted charges.
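For a sense of what the data collection step involves, here is a minimal Python sketch of building a Census API request and reshaping its response into rows ready for a database insert. It is not the notebook from the GitHub repository. The year (2020), the dataset (acs/acs5), the variable codes (B01003_001E for total population, B19013_001E for median household income), and the function names are all assumptions chosen for illustration. The sample payload uses made-up placeholder values, not real Census figures.

```python
from urllib.parse import urlencode

CENSUS_BASE_URL = "https://api.census.gov/data"

# Assumed variable codes: total population and median household income.
POPULATION_VAR = "B01003_001E"
INCOME_VAR = "B19013_001E"


def build_census_url(api_key, year=2020, dataset="acs/acs5"):
    """Build a request URL for state-level population and income (hypothetical helper)."""
    params = {
        "get": f"NAME,{POPULATION_VAR},{INCOME_VAR}",
        "for": "state:*",  # every state
        "key": api_key,
    }
    # Keep ',', ':' and '*' unescaped so the URL stays readable.
    return f"{CENSUS_BASE_URL}/{year}/{dataset}?{urlencode(params, safe=',:*')}"


def parse_census_rows(payload):
    """Turn the API's list-of-lists JSON (header row first) into dictionaries."""
    header, *rows = payload
    records = []
    for row in rows:
        item = dict(zip(header, row))
        records.append(
            {
                "state_fips": item["state"],
                "name": item["NAME"],
                # The API returns numbers as strings, so convert them.
                "population": int(item[POPULATION_VAR]),
                "median_income": int(item[INCOME_VAR]),
            }
        )
    return records


if __name__ == "__main__":
    # Placeholder payload in the shape the API returns; the values are invented.
    sample = [
        ["NAME", POPULATION_VAR, INCOME_VAR, "state"],
        ["Example State", "1000", "50000", "99"],
    ]
    print(build_census_url("YOUR_API_KEY"))
    print(parse_census_rows(sample))
```

In the actual workflow, the API key would come from AWS Secrets Manager rather than a literal string, and the parsed records would be written to PostgreSQL with a parameterized INSERT.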
By the end of this tutorial, you'll have a fully functioning PostgreSQL server filled with real census data that you can query, analyze, and visualize. Perfect for building your next data-driven project!