Main Page
Contents
- 1 Welcome to Sustainability Methods!
- 2 What are correlation and regressions
- 3 Causal vs non-causal relations
- 3.1 Day 5 - Correlation and regression
- 3.2 Day 6 - Designing studies Pt. 1
- 3.3 Day 7 - Designing studies Pt. 2
- 3.4 Day 8 - Types of experiments
- 3.5 Day 9 - Statistics from the Faculty
- 3.6 Day 10 - Statistics down the road
- 3.7 Day 11 - The big recap
- 3.8 Day 12 - Models
- 3.9 Day 13 - Ethics and norms of statistics
- 4 Admin Tools
Welcome to Sustainability Methods!
Day 1 - Intro
- Do models and statistics matter? Why does it pay to be literate in statistics and R?
- Getting concepts clear: Generalisation, Sample, and Bias
- - See also, misunderstood concepts
- History of statistics
Day 2 - Data formats based on R
- Continuous vs. categorical, and subsets
- Normal distribution
- Poisson, binomial, Pareto
Day 3 - Simple tests
Day 4 - Correlation and regression
What are correlation and regressions
Propelled by the general development of science during the Enlightenment, numbers started piling up. With more technological possibilities to measure more and more information, and also to store this information, people started wondering whether these numbers could lead to something. The growing numbers had diverse sources: some came from science, such as astronomy or other branches of natural science; others came from engineering, and still others from economics, such as double-entry bookkeeping. It was thanks to the efforts of both Adrien-Marie Legendre and Carl Friedrich Gauss that mathematics offered, with the method of least squares, the first approach to relate one line of data to another. How is one continuous variable related to another?
Pandora's box was opened, and questions started to emerge. Economists were the first to use regression analysis at a larger scale, relating all sorts of economic and social indicators to each other, and building an ever more complex controlling, management and maybe even understanding of statistical relations. The gross domestic product (GDP) became for quite some time a kind of pet variable for many economists, and growth in particular became a core goal of many analyses meant to inform policy. What people basically did was ask themselves how one variable is related to another variable. If the nutrition of people improves, do they live longer? (Yes.) If economies have a higher GDP, do they offer more social security? (No.) Does a higher income lead to more CO2 emissions at a country scale? (Yes.)
As these relations started coming in, the question of whether two continuous variables are causally related became a nagging thought. With more and more data being available, correlation became a staple of modern statistics. There are some core questions related to the application of correlations and regressions.
1) Are relations between two variables positive or negative?
Relations between two variables can be positive or negative. Being taller leads to a significant increase in body weight. Being smaller leads to an overall lower gross calorie demand. The strength of this relation, which statisticians call the estimate, is an important measure when evaluating correlations and regressions. Is a relation positive or negative, and how strong is the estimate of the relation?
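The sign and the estimate of a relation can be illustrated with a minimal sketch of the least-squares method mentioned above, written in plain Python. The height and weight values are hypothetical and chosen only for illustration, and the function names `least_squares` and `pearson_r` are assumptions of this sketch, not part of any course material.

```python
from math import sqrt

def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical illustration data (not real measurements)
height_cm = [150, 160, 170, 180, 190]
weight_kg = [52, 58, 69, 77, 84]

slope, intercept = least_squares(height_cm, weight_kg)
r = pearson_r(height_cm, weight_kg)

# A positive slope means a positive relation; its size is the estimate
print("estimate (kg per cm):", slope)
print("correlation coefficient:", r)
```

In this sketch the slope is the estimate: its sign tells whether the relation is positive or negative, and its size tells how much the second variable changes per unit of the first.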
2) Does the relation show a significantly strong effect, or is it rather weak? In other words, can the regression explain a lot of the variance in your data, or are the results rather weak regarding their explanatory power? Take EXAMPLE
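How much variance a regression explains is commonly summarised as the coefficient of determination, R squared. The following is a minimal sketch in plain Python, assuming a simple linear regression; both data sets are hypothetical and only illustrate a strong and a weak relation, and the function name `r_squared` is an assumption of this sketch.

```python
def r_squared(x, y):
    """Share of the variance in y explained by a least-squares line on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation the line does not explain
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    # Total sum of squares: all variation around the mean of y
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical illustration data (not real measurements)
x = [1, 2, 3, 4, 5]
y_strong = [2.1, 3.9, 6.2, 7.8, 10.0]  # points lie close to a line
y_weak = [5, 1, 8, 3, 7]               # points scatter widely

r2_strong = r_squared(x, y_strong)
r2_weak = r_squared(x, y_weak)

print("R squared, strong relation:", r2_strong)
print("R squared, weak relation:", r2_weak)
```

A value close to 1 means the line explains most of the variance, while a value close to 0 means the regression has little explanatory power.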
3) A relation can explain a lot of variance for some parts of the data, and less variance for other parts. Take the percentage of people working in agriculture within individual countries. At a low income (<5000 dollars/year) there is a high variance. Half of the population of Chad works in agriculture, while in Zimbabwe, with an even slightly lower income, it is 10 %. At an income above 15000 dollars/year, there is hardly any variance in the share of people who work in agriculture within a country: the proportion is very low. This has reasons; there are probably one or several variables that explain at least part of the high variance within different income segments. Finding such variables that explain part of the unexplained variance is a key effort in correlation analysis.
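The difference in variance between income segments can be checked with a few lines of Python. This is a minimal sketch using the standard library's `statistics.variance`; the income and agriculture values are hypothetical and do not describe real countries, and the thresholds of 5000 and 15000 are taken from the text above.

```python
from statistics import variance

# Hypothetical (income per year, % of people working in agriculture) pairs
countries = [
    (700, 10), (800, 50), (2000, 35), (3500, 20), (4500, 28),
    (8000, 12), (12000, 8),
    (20000, 3), (30000, 2), (40000, 4), (50000, 2),
]

# Split the agriculture shares by income segment
low = [share for income, share in countries if income < 5000]
high = [share for income, share in countries if income > 15000]

# Sample variance of the agriculture share within each segment
var_low = variance(low)
var_high = variance(high)

print("variance at low income:", var_low)
print("variance at high income:", var_high)
```

A much larger variance in the low-income segment suggests that income alone explains the agriculture share poorly there, and that additional variables are worth looking for.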
Causal vs non-causal relations
- - See also, misunderstood concepts
- Are all correlations causal?
- Is the world linear?
- Transformation
Day 5 - Correlation and regression
- P values vs. sample size
- Residuals
- Reading correlation plots
Day 6 - Designing studies Pt. 1
- How do I compare more than two groups?
- Designing experiments - degrees of freedom
- One way and two way
Day 7 - Designing studies Pt. 2
- Balanced vs. unbalanced - Welcome to the Jungle
- Block effects
- Interaction and reduction
Day 8 - Types of experiments
- Are all laboratory experiments really made in labs?
- Are all field experiments really made in fields?
- What are natural experiments?
Day 9 - Statistics from the Faculty
Day 10 - Statistics down the road
- Multivariate Statistics
- AIC
Day 11 - The big recap
- Distribution & simple test
- Correlation and regression
- - See also, misunderstood concepts
- Analysis of Variance
Day 12 - Models
- Are models wrong?
- Are models causal?
- Are models useful?
Day 13 - Ethics and norms of statistics
- What is informed consent?
- How does a board of ethics work?
- How long do you store data?
View All Pages.