Table of Contents

Databases


Short introduction to databases

   In the states.dbf table, fields exist for the attributes representing
   state name, FIPS (Federal Information Processing Standard) zone, and
   state name abbreviation. A single record exists for each state.

   In the cities.dbf table, fields exist for the attributes representing
   city name, state, etc., and there is a single record for each city.

   If the values for a particular field in one table match the values
   in a particular field in the other table, a relationship can be made
   between the tables. Because both tables contain the field
   "STATE_NAME", it should be possible to relate or join the tables
   based on the common value of this field. In this way, it is possible
   to make selections on one table that also select related records in
   the other table, or to combine the tables into a single virtual
   table. We could select the single record representing the state of
   Washington in the states.dbf table and have an automatic selection
   of all records from the cities.dbf table representing cities within
   Washington state.

Database Uses

Database Software


Note on database assignments


Database Assignments 1 & 2

Note: Make sure that your name and the assignment number appears in any printouts that you are required to submit.

Due date: Wed, 10/22

Database Lab 1 - Create and Sort ( handout)

Database Lab 2 - Filters and Queries ( handout)


Database Assignments 3 - Calculated Fields

Due: Monday, 10/27

Download the following comma-delimited file which contains demographic data on all counties in the United States for 1990-1999:

http://rockhopper.monmouth.edu/~jchung/cs102/counties.txt

Import the text file into Excel and save as a DBase IV file (dbf).

Import the dbf file into Access.

Using Access queries, answer the following questions in a Word file called cs102dblab3.doc.

1) What is the loneliest county in the continental U.S. (in terms of population density = 1999 population divided by area (given in square miles))?

2) What is the most fertile county (kids, age 17 and under, as a percentage of the total 1999 population?)

3) What is the least fertile?

4) What percentage of single-parent households (1_M_HOUSEH and 1_F_HOUSEH) in the U.S. are headed by women (1_F_HOUSEH)?

5) Ranking the states, New Jersey ranks in what place in percentage of single-parent households headed by men?

6) Ranking the 100 largest counties (in terms of 1999 population), New York County (Manhattan) ranks in what place in percentage of single-parent households headed by women?

7) Where is divorce most prevalent, that is, what are the five counties in the U.S. with the highest proportion of divorcees [ divorced / (divorced + married + separated + widowed) ]?

8) Where is divorce least prevalent, that is, which five counties have the lowest proportion of divorcees?

9) Which county and which state have the greatest number of mobile homes?


Database Assignments 4 - Poverty in School Districts

Due: Wednesday, 10/29

Download the following Excel data file:

(Poverty in school districts, estimated 1995, from U.S. Census Bureau)

http://rockhopper.monmouth.edu/~jchung/cs102/poverty.xls

Open poverty.xls in Access. Generate and save the following reports as PDF files. In each report, include the states that the school districts are in, if applicable.

a) The 25 most populous school districts in terms of total number of people. (create dblab4a.pdf)

b) The 25 most populous school districts in terms of total number of kids. (create dblab4b.pdf)

c) All the states (including Washington D.C.) ranked by proportion of poor kids to all kids. (poor kids / kids) (dblab4c.pdf)

d) Top 25 school districts ranked by proportion of poor kids to all kids. (poor kids / kids, where kids > 0) (dblab4d.pdf)

e) Top 25 school districts in New Jersey ranked by proportion of poor kids to all kids. (dblab4e.pdf)

Make sure that your name is on the reports that you generate.


Database Assignment 5 - Toxic Release Inventory

Due: Wednesday, 11/5

Save all work to the Z:\cs102\db\lab5 folder.

1. Download this Excel file containing [USEPA Toxic Release Inventory] data for 2003 in New Jersey:

http://rockhopper.monmouth.edu/~jchung/cs102/tri2003_reduced.xls

2. Open the file in Access to generate a link to the data file.

3. Generate and save the following reports as PDF files (reports should include the amount in pounds of toxic chemicals released and the city and county names where applicable):

(Toxic chemical releases are in pounds.)

(Generate the reports as dblab5a.pdf - dblab5f.pdf)

a) The top 10 individual emitters of the chemical Benzene in the State of NJ in 2003.

b) The top 5 NJ counties with the greatest releases of Chromium & Chromium compounds in 2003.

c) The top 5 NJ cities with the greatest releases of the chemical Ethylene Oxide in 2003.

d) The top emitters of toxic chemicals in your home zip code. (If your home zip code is not in the database, use zip code 08818.)

e) The 10 most emitted toxic chemicals in Monmouth County. (All chemicals in the database are considered toxic.)

f) Total emissions of toxic chemicals from NJ's 11 northern counties: Bergen, Essex, Hudson, Hunterdon, Middlesex, Monmouth, Morris, Somerset, Sussex, Union, Warren. What percentage of NJ's total toxic chemical releases come from just the 11 northern counties?

Make sure that your name is on the reports that you generate.


Database Assignment 6 - Linking Tables I

Due: Wednesday, 11/16

Save ALL work to the Z:\cs102\db\lab6 folder.

 http://rockhopper.monmouth.edu/~jchung/cs102/fa05/access/form95.dbf 

(Look for the TOTREL field in the 1995 table.)

 http://rockhopper.monmouth.edu/~jchung/cs102/tri2003_reduced.xls

Hints: For numbers 1-4, you'll need to create intermediate queries for 1995 and 2003. For number 2, you may be thrown off by the fact that there were no emissions reported for Cape May County in 1995.

1. The total toxic releases for the entire state of NJ in 2003 and 1995 and the change in lbs. between 1995 and 2003.

2. The 21 NJ counties and their change in toxic releases between 1995 and 2003 as a percent increase or decrease ( (2003-1995)/1995 ).

3. Top 20 toxic chemicals in NJ in terms of percent reduction from 1995 to 2003 ( (2003totrel-1995totrel)/1995totrel is negative).

4. Top 20 toxic chemicals in NJ in terms of percent increase from 1995 to 2003 ( (2003totrel-1995totrel)/1995totrel is positive).

Download and save the tab-delimited 2006 NJ toxic releases data at http://rockhopper.monmouth.edu/~jchung/cs102/nj2006tri.raw.txt . Open the file in Excel. Follow instructions to reduce the data to only the fields we are interested in. Generate this final report:

5. The 21 NJ counties and their change in toxic releases between 1995 and 2006 as a percent increase or decrease ( (2006-1995)/1995 ).


Database Assignment 7 (optional) - Mapping

In this optional assignment, we're going to try mapping the individual toxic emitting facilities from the 2006 NJ TRI data. We will be using Google Maps/Google Earth for this exercise. This will require you to have a Google account, which you should have already created.

Steps:

1) Make sure you're logged into your Google account. We will use the Google Docs spreadsheet for this exercise.

2) Go to Google's Spreadsheet Mapper page . We will largely just follow the steps given there. From there, open the “starter spreadsheet,” which will open a Google Docs spreadsheet in your web browser.

3) Use Access to filter the NJ TRI 2006 data to show only the facility names, latitude, longitude and street address, in that order. This data will be pasted into the Google Docs spreadsheet that you have opened.

4) After our initial attempt at mapping the NJ TRI facilities, we will attempt to apply different color Google Maps placemarks to the facilities, depending on the level of toxic emissions from those facilities. We'll categorize the facilities into 4 groups (templates) based on the level of total releases:

total releases          group (template)          placemark color
--------------          ----------------          ---------------
1       - 9,999               1                       green
10,000  - 99,999              2                       yellow
100,000 - 999,999             3                       orange
> 1,000,000                   4                       red

You'll want to change your copy of the Google Spreadsheet Mapper and rename it to “NJ TRI 2006 Test”. In “NJ TRI 2006 Test,” follow my instructions for replacing the default placemarks for templates 1-4.