# Week 1


## 1. Concepts and ideas

### 1.1. Replication of your results by different people

1.1.1. Some studies cannot be replicated due to:

1.1.1.1. No time

1.1.1.2. No money

1.1.1.3. Unique circumstances (the study cannot be repeated)

1.1.2. Make code available to everyone

### 1.3. Research pipeline

1.3.1. Article

1.3.1.1. Author goes left to right

1.3.1.2. Reader goes right to left

### 1.4. What is needed

1.4.1. Data should be available

1.4.2. Code should be available

1.4.3. Documentation of code and data

1.4.4. Standard ways of distribution

1.5.1. Author

### 1.6. Literate Programming

1.6.1. Article

1.6.1.1. Text

1.6.1.2. Code

1.6.2. Presentation code

1.6.3. General concept

1.6.3.1. Documentation language

1.6.3.2. Programming language

1.6.4. Types

1.6.4.1. Sweave

1.6.4.1.1. Uses LaTeX

1.6.4.1.2. Lacks features: caching, multiple plots

1.6.4.1.3. Not frequently updated

1.6.4.2. knitr

1.6.4.2.1. uses R
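
The general concept in 1.6.3 (a documentation language combined with a programming language) can be illustrated with a small sketch. It is written in Python rather than R, and the chunk format (a list of `("text", ...)` and `("code", ...)` pairs) and the `result` variable convention are assumptions made for illustration; this is not Sweave or knitr syntax. "Weaving" produces the human-readable article, "tangling" extracts the machine-readable code.

```python
import textwrap

# Hypothetical chunk format: ("text", str) or ("code", str).
# This only illustrates the idea; it is not how Sweave or knitr work internally.


def weave(chunks):
    """Build a human-readable report: text as-is, code shown with its result."""
    env = {}    # shared namespace so later chunks can use earlier variables
    lines = []
    for kind, body in chunks:
        if kind == "text":
            lines.append(body)
        elif kind == "code":
            exec(body, env)                           # run the chunk
            lines.append(textwrap.indent(body, "    "))  # show the code
            if "result" in env:                       # assumed convention
                lines.append("    #> " + repr(env.pop("result")))
        else:
            raise ValueError(f"unknown chunk kind: {kind}")
    return "\n".join(lines)


def tangle(chunks):
    """Extract only the code chunks as a plain script."""
    return "\n".join(body for kind, body in chunks if kind == "code")


document = [
    ("text", "We compute the mean of three measurements."),
    ("code", "values = [2, 4, 9]"),
    ("code", "result = sum(values) / len(values)"),
]

report = weave(document)
script = tangle(document)
print(report)
```

Keeping text and code in one source means the numbers in the article are always produced by the code that sits next to them.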

## 3. Structure of data analysis

### 3.1. Steps

3.1.1. Define a question

3.1.1.1. Narrow as much as possible

3.1.1.2. This helps remove the noise of other data

3.1.2. Define ideal data set

3.1.2.1. May depend on your goal

3.1.2.1.1. Descriptive

3.1.2.1.2. Exploratory

3.1.2.1.3. Inferential

3.1.2.1.4. Predictive

3.1.2.1.5. Causal

3.1.2.1.6. Mechanistic

3.1.3. Determine what data you can access

3.1.3.1. Free on the web

3.1.3.3. Might need to generate it

3.1.4. Obtain data

3.1.4.1. Try to get raw data

3.1.4.2. If obtained from the web, record the URL and the time accessed

3.1.5. Clean data

3.1.5.1. If it is already preprocessed, understand how

3.1.5.2. Understand the source of the data

3.1.5.3. Determine whether the data is good enough; if not:

3.1.5.3.1. Quit

3.1.5.3.2. Change data

3.1.6. Exploratory data analysis

3.1.7. Statistical prediction/modeling

3.1.7.1. Quantify the uncertainty

3.1.8. Interpret results

3.1.8.1. Use appropriate language

3.1.8.2. Give explanation

3.1.8.3. Interpret the results

3.1.9. Challenge results

3.1.9.1. All steps

3.1.9.2. Measures of uncertainty

3.1.9.3. Think of potential alternatives

3.1.10. Synthesize/write up results

3.1.10.2. Don't include every analysis; leave out what the story does not need

3.1.10.3. Include "pretty" figures

3.1.11. Create reproducible code
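
For the "Obtain data" step above (record the URL and the time accessed), one possible way to keep that record is a small JSON file stored next to the raw data. The sketch below is in Python; the function names, the JSON keys, and the example URL are all hypothetical and chosen only for illustration.

```python
import json
from datetime import datetime, timezone


def provenance_record(url, accessed=None):
    """Return a dict noting where the data came from and when it was fetched."""
    if accessed is None:
        accessed = datetime.now(timezone.utc)   # default: the current time in UTC
    return {
        "url": url,
        "accessed_utc": accessed.isoformat(timespec="seconds"),
    }


def save_provenance(record, path):
    """Write the record as JSON, e.g. next to the raw data file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2)


# Hypothetical URL and a fixed timestamp, used only for illustration.
record = provenance_record(
    "https://example.org/data/raw.csv",
    accessed=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)
print(record)
```

Storing the timestamp in UTC avoids ambiguity when collaborators work in different time zones.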

## 4. Organizing analysis

### 4.1. Data

4.1.1. Raw data

4.1.2. Processed data

4.1.2.1. Should be named so it is easy to understand which script generated the data
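
One way to follow the naming advice in 4.1.2.1 is to embed the generating script's name in the processed file's name. This is only a sketch: the `__from_` separator, the function name, and the example paths are assumptions, not a convention taken from the notes.

```python
from pathlib import Path


def processed_name(script, label, ext=".csv"):
    """Build a processed-data file name that embeds the generating script's name."""
    stem = Path(script).stem          # "clean_survey.R" -> "clean_survey"
    return f"{label}__from_{stem}{ext}"


# Hypothetical script path and data label, used only for illustration.
name = processed_name("R/final/clean_survey.R", "survey_2024")
print(name)
```

With a scheme like this, anyone who opens the processed-data folder can trace each file back to the script that produced it.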

### 4.2. Figures

4.2.1. Exploratory figures

4.2.2. Final figures

### 4.3. R code

4.3.1. Raw / unused scripts

4.3.2. Final scripts

4.3.3. R markdown files

### 4.4. Text

4.4.1.1. Should contain step-by-step instructions for analysis

4.4.2. Article

4.4.2.1. Title

4.4.2.2. Intro

4.4.2.3. Methods used

4.4.2.4. Results

4.4.2.5. Conclusions
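
The organization described in section 4 (data, figures, R code, text) could be set up as a folder skeleton before the analysis starts. The sketch below is in Python, and the folder names are assumptions that merely mirror the outline; the notes do not prescribe specific directory names.

```python
from pathlib import Path

# Folder names are hypothetical; they mirror the outline in section 4.
LAYOUT = [
    "data/raw",
    "data/processed",
    "figures/exploratory",
    "figures/final",
    "R/raw",
    "R/final",
    "R/rmarkdown",
    "text",
]


def create_skeleton(root):
    """Create the analysis folder layout under `root` and return the paths."""
    root = Path(root)
    created = []
    for rel in LAYOUT:
        path = root / rel
        path.mkdir(parents=True, exist_ok=True)   # safe to call repeatedly
        created.append(path)
    return created
```

A call such as `create_skeleton("my_analysis")` would create the folders under a hypothetical project directory named `my_analysis`.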