Welcome to Stat 360: Syllabus and Schedule
Data is everywhere! We generate data every time we click a link, go to the doctor, buy a book, take a picture, etc.
There is also data out there that we want to measure: CO$_2$ emissions, energy consumption of electronic devices, people's reactions to a new idea, etc.
So we have data, we can make data, we can collect data...but then what do we do with data?
First, we need to make sure we did a good job collecting, organizing, and cleaning the data.
Then there are two types of analysis we can perform on our data: descriptive and inferential.
Descriptive statistics summarize your data with quantities such as the maximum, the average, and the standard deviation.
You have probably been doing this sort of data analysis for a long time. We can gain a lot of insight into a problem just by looking at a summary of our data. For instance, the average daily temperature for a given month and place gives an idea of the weather then and there.
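As a minimal sketch of such a summary, the snippet below computes the maximum, the average, and the standard deviation of a small list of temperatures. The values and the helper name `summarize` are made up for illustration; they are not course data.

```python
import statistics

def summarize(values):
    """Return a few descriptive statistics for a list of numbers."""
    return {
        "max": max(values),
        "mean": statistics.mean(values),
        # Sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(values),
    }

# Hypothetical daily temperatures in degrees Celsius (illustrative only).
temperatures = [18.5, 20.1, 19.4, 22.3, 21.0, 17.8, 19.9]

summary = summarize(temperatures)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that these numbers only describe the seven values in the list; by themselves they say nothing about days that were not measured.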
Inferential statistics requires us to understand our data more deeply in order to draw conclusions and make predictions about the larger set of data from which our small sample came.
For instance, testing the quality of all manufactured motherboards can be time-consuming and expensive. The manufacturing process is the same for every motherboard, so we test a sample of 100 and find that 10 of them have a defect. What can we say about all of the motherboards? How confident can we be in that statement?
Probability will help us here!
For a statistical problem: sample + inferential statistics (uses probability theory) = conclusion about the population
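One possible way to answer the motherboard question is a confidence interval for the defect rate. The sketch below uses the normal-approximation (Wald) interval, which is only one of several methods and is an assumption here, not necessarily the one used later in the course. The function name `proportion_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(defects, sample_size, confidence=0.95):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = defects / sample_size
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    margin = z * standard_error
    # Clamp to [0, 1] because a proportion cannot leave that range.
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# The sample from the text: 10 defective motherboards out of 100 tested.
low, high = proportion_confidence_interval(defects=10, sample_size=100)
print(f"95% interval for the defect rate: {low:.3f} to {high:.3f}")
```

Under this approximation, the sample of 100 suggests a population defect rate somewhere between roughly 4% and 16%, which shows how much uncertainty remains even after testing 100 boards.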
Probability allows us to draw conclusions about characteristics of hypothetical data taken from the population, based on known features of the population.
$\rightarrow$ deductive reasoning
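To illustrate the deductive direction, suppose the population defect rate were known. The sketch below assumes a rate of 10% (a hypothetical value chosen for illustration) and that motherboards are defective independently of one another, so the number of defects in a sample follows a binomial distribution. The function names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k defects among n independent items."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def binomial_cdf(k, n, p):
    """Probability of at most k defects among n independent items."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

# Hypothetical known population defect rate (an assumption, not course data).
defect_rate = 0.10
sample_size = 100

print(f"P(exactly 10 defects) = {binomial_pmf(10, sample_size, defect_rate):.4f}")
print(f"P(at most 5 defects)  = {binomial_cdf(5, sample_size, defect_rate):.4f}")
```

Here the reasoning runs from a known feature of the population to statements about samples we have not drawn yet, the reverse of the inferential problem above.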