Database - w2

Objective: Using DaanMatch’s Data Model, create a schema and load DaanMatch data into a database.

Patrick Contact Info

OH: Tuesday, 6:30 - 8:00pm PST
guopatrick.comping@gmail.com
Mobile: 5107178380
GitHub: shpatrickguo

This Week’s Objective

Get familiar with DaanMatch’s data model, data, GitHub, AWS, and your team.

Meiyi (Emily) Ding

emilyding@berkeley.edu GitHub: EmilyDing201

Arthi Matrubutham

apm.butham8@berkeley.edu GitHub: artmatru4b

Apoorv Lawange

DaanMatch’s Data Model

Fig. 1 DaanMatch’s Data Model visualized using DrawSQL.

Git

DaanMatch uses GitHub for version control. Code is submitted through pull requests.

TODO: To enable effective collaboration, please download and review the following:

TODO: Clone Data Model

How to get our data

DaanMatch’s data files are stored on AWS S3.

TODO: Load data from AWS S3 into a Jupyter notebook.

Warning

Login information can be found in #team-shpg on Slack. Please keep it private.

  • [ ] Run the following in a Jupyter notebook and submit the notebook to the shpg-1 folder in Data Model.

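import io

import boto3
import pandas as pd

# boto3 looks for AWS credentials in environment variables or ~/.aws/credentials.
client = boto3.client('s3')

# Download the CSV from the DaanMatch bucket and read it into a DataFrame.
obj = client.get_object(Bucket='daanmatchdatafiles', Key='webscrape-fall2021/Final_IndiaNGO.csv')
df = pd.read_csv(io.BytesIO(obj['Body'].read()), low_memory=False)
df.head()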

File Assignment

  • Arthi: “giveIndia - giveIndia.csv”

  • Emily: “InvestIndia.csv”

  • Apoorv: “helpyourngo.json”
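
One of the assigned files is JSON, while the snippet above only reads a CSV, so a small helper that picks the reader from the file extension may be useful. The following is a minimal sketch, not DaanMatch's own code: `parse_table` and `read_s3_table` are hypothetical names, and it assumes the JSON file holds a list of records. The S3 keys for the assigned files are not listed here, so pass in the key you find in the bucket.

```python
import io
import json

import pandas as pd


def parse_table(raw: bytes, key: str) -> pd.DataFrame:
    """Parse raw file bytes into a DataFrame, choosing the reader from the key's extension."""
    lower = key.lower()
    if lower.endswith(".csv"):
        return pd.read_csv(io.BytesIO(raw), low_memory=False)
    if lower.endswith(".json"):
        # Assumption: the file is a JSON list of records.
        # json_normalize flattens nested objects into dotted column names.
        data = json.loads(raw.decode("utf-8"))
        return pd.json_normalize(data)
    raise ValueError(f"Unsupported file type: {key}")


def read_s3_table(bucket: str, key: str) -> pd.DataFrame:
    """Download one object from S3 and parse it (hypothetical helper)."""
    import boto3  # imported here so parse_table can be tried without AWS access

    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    return parse_table(obj["Body"].read(), key)


# Local check with made-up in-memory data; no AWS credentials needed.
sample_json = (
    b'[{"name": "NGO A", "address": {"state": "Goa"}},'
    b' {"name": "NGO B", "address": {"state": "Kerala"}}]'
)
sample_df = parse_table(sample_json, "example.json")
print(sample_df.columns.tolist())
```

With credentials set up, `read_s3_table('daanmatchdatafiles', 'webscrape-fall2021/Final_IndiaNGO.csv')` should give the same DataFrame as the snippet above.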

TODO: Prepare a presentation of under 2 minutes about your data.
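
For the presentation, a quick per-column summary is one possible starting point. The sketch below is only a suggestion: `profile` is a hypothetical helper, and the toy DataFrame and its column names are made up to stand in for your assigned file.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: dtype, missing-value count, and distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "distinct": df.nunique(),  # missing values are not counted as distinct
    })


# Toy frame standing in for an assigned file; the column names are made up.
toy = pd.DataFrame({"name": ["A", "B", "B"], "state": ["Goa", None, "Goa"]})
summary = profile(toy)
print(f"{len(toy)} rows, {toy.shape[1]} columns")
print(summary)
```

Running `profile(df)` on the DataFrame loaded from S3 gives the row count, column types, and gaps in one table, which covers most of what a short data overview needs.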