…start using R, from scratch!

Some time ago, once I was able to use R by myself, I met some colleagues and other people who wanted to learn R as well. I pointed them to the help pages and to the CRAN repositories, but in some cases they said they didn’t know how to start using those resources. Obviously, the main self-perceived limitation for non-programmers is the use of “commands” (OK, many 80s kids will remember typing command lines to launch games such as PacMan or Frogger… :)).

At the same time, they also wanted to refresh some basic statistics and get a general feel for their data before asking a statistician for help. One idea to help them quickly was to write some scripts that guide them through basic commands, so they could see the results in real time and reuse the scripts for their own data.

If you have just started using R, maybe they can be useful for you. In any case, I recommend that you keep one or more “plain text” files open where you can paste your favorite commands, then copy and modify them to suit your needs. Remember to store the files where you can access them later!

  • Tip: you can change the extension of your mytext.txt file to mytext.R and tell Windows to keep opening it with Notepad. It will still be a plain text document, but some text editors will recognize it as an “R script” and highlight the content accordingly.
  • Apart from Notepad in Windows, there are plenty of other text/code editors that are more pleasant to use. See for example RStudio and Notepad++.

Copy the Gists below into your own text files, and begin playing with R!
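As a minimal sketch of the kind of starter script meant here (it is not one of the original Gists), the lines below assume R’s built-in mtcars data set in place of your own data; the object names mpg_mean and mpg_sd are just illustrations.

```r
# A first R script: paste it into a plain text file (e.g. first_steps.R)
# and run it line by line in the R console.

# Load a data set that ships with R (assumption: it stands in for your data)
data(mtcars)

# Look at the structure and the first rows
str(mtcars)
head(mtcars)

# Basic descriptive statistics for one variable
mpg_mean <- mean(mtcars$mpg)
mpg_sd   <- sd(mtcars$mpg)
summary(mtcars$mpg)

# A quick histogram and a scatterplot
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "mpg")
```

To reuse it with your own data, you would replace mtcars with a data frame you have read in yourself, for example with read.csv().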

Reinhart, Rogoff…. and R

EDIT: At the time I wrote this post, I didn’t know about this great one by Christopher Gandrud. Take a look!

On April 15th, an article was published that will change economic theories… Or at least, it will question and change the methods used to formulate those theories. As a doctor, I spend time reviewing evidence that can be applied to daily practice. Right now I should be reviewing papers, studying or going for a walk, but I came across some weird news last week: a paper by CM Reinhart and K Rogoff, published in January 2010 and defending austerity, was questioned by a student, Thomas Herndon, and two of his professors.

Among other statements, Reinhart and Rogoff basically collected data and wrote that, if a country’s debt exceeded 90% of its GDP, growth would become abruptly negative. It would be just another scientific paper if two things hadn’t happened:

  • Many politicians, institutions, countries and even the European Union quickly accepted these findings as fact, without revisiting or trying to reproduce them. This happens all the time, but it shouldn’t. It increases the chances that flawed studies reach the status of ‘science’, sometimes with dramatic consequences.
  • Scientific research should be reproducible: repeating the study should yield the same results. But when Herndon tried to replicate it, he found big errors (bad Excel coding, selectively collected data) and a dubious weighting method; once these were corrected, the results were completely different.

Herndon, Ash and Pollin kindly provided their data, so I quickly ran it in an R console; I also played with databases from the Reinhart and Rogoff website and from the Maddison-Project database:

Reinhart data Herndon

GAM_growth_debt_overplotted

It’s a scatterplot with the data modelled by a generalized additive model (GAM), and what caught my attention at first sight was that the correlation between the Debt/GDP ratio and growth is non-linear and surprisingly weak. Debt neither seems to increase growth nor seems to be strongly associated with recessions (note that I don’t know much about economics). What do you think about these data?
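Here is a minimal sketch of how such a plot can be built. It is not the exact code behind the figure: it assumes the mgcv package, and it uses simulated data with made-up variable names (debt_gdp, growth) instead of the Herndon, Ash and Pollin data.

```r
library(mgcv)  # recommended package, shipped with standard R installations

set.seed(1)

# Simulated stand-in data (NOT the Reinhart-Rogoff or Herndon data)
n <- 300
debt_gdp <- runif(n, 0, 200)                         # debt/GDP ratio, in %
growth   <- 4 - 0.01 * debt_gdp + rnorm(n, sd = 3)   # hypothetical weak relation
toy <- data.frame(debt_gdp, growth)

# GAM with a smooth term for the debt/GDP ratio
fit <- gam(growth ~ s(debt_gdp), data = toy)
summary(fit)

# Scatterplot with the fitted smooth overplotted
plot(toy$debt_gdp, toy$growth, pch = 16, col = "grey60",
     xlab = "Debt/GDP (%)", ylab = "Real GDP growth (%)")
grid_df <- data.frame(debt_gdp = seq(0, 200, length.out = 100))
pred <- predict(fit, newdata = grid_df, se.fit = TRUE)
lines(grid_df$debt_gdp, pred$fit, lwd = 2)                    # fitted curve
lines(grid_df$debt_gdp, pred$fit + 2 * pred$se.fit, lty = 2)  # approx. upper band
lines(grid_df$debt_gdp, pred$fit - 2 * pred$se.fit, lty = 2)  # approx. lower band
```

With real data, you would replace toy with your own data frame and keep the same formula structure.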

watercolor plots

R is widely recognised as one of the most powerful statistical tools for displaying graphs. In recent years, R’s awesomeness at depicting relationships between variables has exploded thanks to great packages such as ggplot2. One can simply browse a few R blogs and find something like this:

The ‘watercolor’ plot (à la Solomon Hsiang).

Once you click the links above, you will completely forget about my blog, so… wait! I have to show you a graph I made with the code provided there! Here it is, isn’t it beautiful?

The ‘aurora’ plot:
[Image: the ‘aurora’ plot]

This black background is a little tweak of the plain one… Here’s the line of code I modified from Felix Schönbrodt’s original:

# you can change this line:

gg.points <- geom_point(data=data, aes_string(x=IV, y=DV), size=1, shape=shape, fill="white", color="black")

# to this one (you can tweak more parameters in it, like the size of the points):

gg.points <- geom_point(data=data, aes_string(x=IV, y=DV), size=1, shape=shape, fill="white", color="white")

# then I run the function, adding my black background (surrounded by a white background):

p <- vwReg(......, shape = 21, ......)

# note: in ggplot2 >= 3.4.0, line widths are set with linewidth instead of size
p + theme(
  panel.background = element_rect(fill = "black", colour = NA),
  panel.grid.minor = element_line("gray28", size = 0.1),
  panel.grid.major = element_line("gray48", size = 0.1),
  plot.background = element_rect(fill = "white", colour = NA))
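If you want to try the black background without the vwReg function, here is a minimal self-contained sketch. It is an assumption-laden stand-in, not the watercolor plot itself: it uses the built-in mtcars data, an ordinary loess smoother, and ggplot2 >= 3.4.0 (for the linewidth argument); the names p and p_dark are just illustrations.

```r
library(ggplot2)  # assumes ggplot2 >= 3.4.0 (linewidth argument)

# A plain scatterplot with white points and a white smoother
# (stand-in for the plot returned by vwReg)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(shape = 21, fill = "white", colour = "white", size = 1) +
  geom_smooth(method = "loess", formula = y ~ x, colour = "white")

# Same theme tweak as above: black panel, dim grid lines, white surround
p_dark <- p + theme(
  panel.background = element_rect(fill = "black", colour = NA),
  panel.grid.minor = element_line(colour = "gray28", linewidth = 0.1),
  panel.grid.major = element_line(colour = "gray48", linewidth = 0.1),
  plot.background  = element_rect(fill = "white", colour = NA)
)

print(p_dark)
```

Because the theme is added on top of a finished plot object, the same theme(...) call can be reused on any other ggplot.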

And another one, on a white background (with spaghetti = TRUE):

spaghetti_scores