The purpose of this assignment is to provide you with some experience exploring and analyzing data without using an information visualization system. You will also gain some familiarity with using Processing to import and manipulate data. This assignment must be completed on your own. Each part is equally weighted. Please follow the submission guidelines on T-Square to get full credit.
Below is a data set (that can be imported into Excel) about cereals. You should explore and analyze this data using Excel or simply by hand (drawing pictures is fine), but do not use any visualization tools; that means no charts or visual representations of any kind. Your goal here is to perform an exploratory analysis of the data set, better understand the data set and its characteristics, and develop insights about the cereal data.
Your report should consist of three small sections on a single page.
First, list (as a bullet list) five "insights", chunks of knowledge, or deeper questions that you either encountered or gained while exploring the data. Even if you come up with more, list only your five best insights. An insight could be some understanding of the data and its characteristics that is not immediately obvious or intuitive; it is something that most people might not realize initially. Note that an insight or knowledge chunk may simply be a deeper question that arose in your mind while exploring the data, even if your analysis was not sufficient to answer it.
Second, write one paragraph about the process you used to do the exploration and analysis. Did you load the data into Excel, work manually, or do both? What did you do in Excel? Did you draw pictures? Just tell me (briefly) what you did.
Third, write one paragraph about challenges or problems that you encountered in doing the analysis this way. Did anything limit or frustrate you? If nothing did, perhaps there was something that was more difficult than you thought it should be. Nothing is perfect, so you should be able to list some potential issues here.
So, to sum up, your assignment should have a bullet list of five items followed by two paragraphs.
Grading: We will evaluate the quality of the insights you listed. We are looking for things that we find interesting or perhaps unexpected. This is subjective. For the second and third sections, we will evaluate if you did what the assignment asked. Please proofread your submission before submitting it and make sure it is free of spelling and grammar issues.
Download Processing and install it on your development machine. You will write a small sketch capable of importing the data and containing a small collection of functions that support basic queries. Your system must contain the following functions:
importCereal(): A function that takes a filename location as a String parameter, loads the CSV data into a Table object, and returns that object.
getBounds(): A function that can take a Table object and the name of a quantitative attribute column as parameters. It should return a PVector containing the minimum and maximum values of the column.
normalizeColumn(): A function that can take a Table object, the name of a quantitative attribute column, and a PVector as parameters and returns an ArrayList of the values of that column normalized to a scale from 0 to 1. The ArrayList should be of type Float.
remapColumn(): A function that can take a Table object, the name of a quantitative attribute column, and a PVector as parameters and returns an ArrayList. The PVector parameter should contain new min and max values for the attribute. The returned ArrayList should contain the values of the column mapped from the original min and max values to the new min and max values. The ArrayList should be of type Float.
getCategoryCount(): A function that can take a Table object and the name of a categorical attribute column as parameters. It should return a HashMap in which the keys are the possible categories for the attribute and the values are the number of rows in each category. The HashMap should be of type String, Integer.
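The arithmetic behind the bounds, normalization, remapping, and counting functions can be sketched in plain Java. This is only an illustration under stated assumptions, not the required implementation: the class name `CerealMathSketch` and its method names are hypothetical, a `float[]` or `String[]` stands in for a Table column, and a two-element `float[] {min, max}` stands in for the PVector. Your submitted functions must still use the Table and PVector signatures described above.

```java
import java.util.ArrayList;
import java.util.HashMap;

// Illustrative stand-in only: arrays replace Table columns and a
// two-element float[] {min, max} replaces the PVector.
class CerealMathSketch {

    // Returns {min, max} of a non-empty column.
    static float[] bounds(float[] column) {
        float min = column[0];
        float max = column[0];
        for (float v : column) {
            if (v < min) min = v;
            if (v > max) max = v;
        }
        return new float[] { min, max };
    }

    // Linearly maps each value from [from[0], from[1]] to [to[0], to[1]].
    static ArrayList<Float> remap(float[] column, float[] from, float[] to) {
        ArrayList<Float> out = new ArrayList<Float>();
        float span = from[1] - from[0];
        for (float v : column) {
            // A constant column has zero span; map it to the new minimum
            // instead of dividing by zero (a design choice of this sketch).
            float t = (span == 0f) ? 0f : (v - from[0]) / span;
            out.add(to[0] + t * (to[1] - to[0]));
        }
        return out;
    }

    // Normalizing is the special case of remapping onto [0, 1].
    static ArrayList<Float> normalize(float[] column, float[] from) {
        return remap(column, from, new float[] { 0f, 1f });
    }

    // Counts how many times each category value occurs in the column.
    static HashMap<String, Integer> categoryCount(String[] column) {
        HashMap<String, Integer> counts = new HashMap<String, Integer>();
        for (String category : column) {
            Integer previous = counts.get(category);
            counts.put(category, previous == null ? 1 : previous + 1);
        }
        return counts;
    }
}
```

For example, with the made-up column {50, 100, 150}, `bounds` yields {50, 150}, `normalize` yields 0, 0.5, and 1, and `remap` onto {0, 10} yields 0, 5, and 10. Writing normalization as a call to `remap` keeps the two functions consistent with each other.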
Most of the information required for this assignment can be found in the Processing reference. You should not require the use of external libraries, though you might want to begin to familiarize yourself with some of the more common libraries available for Processing in order to start thinking about your group project. While you may test out your functions in the main file of your sketch, please place all of the required functions into a separate file named "cerealFun.pde" that you submit. This file must not include draw() functions, though other helper functions outside the scope of the assignment are permitted.
Grading: Your assignment will be graded on whether it can perform the list of functions correctly. Each function will be worth one fifth of the grade for this part of the homework. Please use good naming conventions and appropriate comments to help us read your code.
The data set should be pretty self-explanatory. The Manufacturer is a one-letter code with the expected mapping (Q-Quaker Oats, P-Post, G-General Mills, K-Kelloggs, R-Ralston Purina, N-Nabisco) and Type is C (cold) or H (hot). Interpret other attributes to the best of your ability given the information available.
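If it helps while exploring, the code mappings listed above can be captured in a small lookup helper. This is a minimal sketch in plain Java; the class name `CerealCodes` and its methods are hypothetical, and the names are copied exactly as given in the mapping above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: decodes the one-letter Manufacturer and Type codes.
class CerealCodes {
    private static final Map<String, String> MANUFACTURERS = new HashMap<String, String>();
    static {
        MANUFACTURERS.put("Q", "Quaker Oats");
        MANUFACTURERS.put("P", "Post");
        MANUFACTURERS.put("G", "General Mills");
        MANUFACTURERS.put("K", "Kelloggs");
        MANUFACTURERS.put("R", "Ralston Purina");
        MANUFACTURERS.put("N", "Nabisco");
    }

    // Returns the manufacturer name, or null for a code not in the list.
    static String manufacturerName(String code) {
        return MANUFACTURERS.get(code);
    }

    // Returns "cold" for C, "hot" for H, or null for any other code.
    static String typeName(String code) {
        if ("C".equals(code)) return "cold";
        if ("H".equals(code)) return "hot";
        return null;
    }
}
```

Returning null for unknown codes, rather than guessing, makes unexpected values in the data easy to spot.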