…start using R, from scratch!

Ever since I became able to use R by myself, I have met colleagues and other people who wanted to learn R as well. I pointed them to the help pages, to the CRAN repositories… but in some cases they said they didn’t know how to start using those resources. The main self-perceived limitation for non-programmers is having to type “commands” (OK, many 80s kids will remember typing a few command lines to launch games such as Pac-Man, Frogger… :).

At the same time, they also wanted to refresh some basic statistics and get a general feel for their data before asking a statistician for help. One idea to help them quickly was to write some scripts that guide them through basic commands, show results in real time, and can be recycled for their own data.

If you have just started using R, maybe they can be useful for you too. In any case, I recommend keeping one or more plain-text files open where you can paste your favorite commands and clone/modify them to suit your needs. Remember to store the files where you can find them later!

  • Tip: you can rename your mytext.txt file to mytext.R and tell Windows to keep opening it with Notepad. It is still a plain-text document, but some text editors will recognize it as an “R script” and highlight its content accordingly.
  • Apart from Notepad on Windows, there are plenty of other text/code editors that are more pleasant to use. See, for example, RStudio and Notepad++.
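
As a rough idea of what such a plain-text file of commands might contain, here is a minimal sketch. It is not one of the Gists: the file name is hypothetical, and it uses the `iris` dataset that ships with R so that it runs without any data of your own.

```r
# my_first_script.R  (hypothetical file name; any plain-text file will do)

data(iris)        # load a small example dataset that ships with R
str(iris)         # structure: variable names, types, first values
summary(iris)     # basic descriptive statistics for every column

# Mean sepal length for each species
mean_by_species <- aggregate(Sepal.Length ~ Species, data = iris, FUN = mean)
print(mean_by_species)

# A quick look at the distribution of one variable
hist(iris$Sepal.Length, main = "Sepal length", xlab = "cm")
```

Once the file is saved, you can paste the lines into the R console one by one, or run the whole file at once with `source("my_first_script.R")`. To recycle it, replace `iris` and the column names with your own data frame and variables.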

Copy the Gists below into your own text files, and begin playing with R!

Power and sample size calculator for mitochondrial DNA association studies (Shiny)

The functions detailed in the piece of code below (in a Gist) have been useful to me when I had to calculate statistical power and sample size for many possible scenarios. The formulae were taken from the article by Samuels et al., AJHG 2006, and the script also proved useful for making a variety of comparative plots.

This is intended for estimating power/sample size in association studies involving mitochondrial DNA haplogroups (categories whose frequencies depend on each other), based on a chi-square test. The problem with scripts is that they are sometimes less friendly to many people than GUIs are. There are many ways to solve this but, as I don’t have a programming background (apart from R), the most straightforward one for me was Shiny.
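
A chi-square power estimate of this kind can be sketched in a few lines of R. This is only an illustration based on the textbook noncentral chi-square approach, assuming equal numbers of cases and controls; it is not the code in the Gist, and it does not reproduce the formulae of Samuels et al. The function names and the haplogroup frequencies are hypothetical.

```r
# Hypothetical helper: power of a chi-square test comparing haplogroup
# frequencies between cases and controls (same number of subjects in each).
chisq_power <- function(p_controls, p_cases, n_per_group, alpha = 0.05) {
  stopifnot(length(p_controls) == length(p_cases),
            abs(sum(p_controls) - 1) < 1e-8,
            abs(sum(p_cases) - 1) < 1e-8)
  df   <- length(p_controls) - 1        # frequencies sum to 1, so k - 1 are free
  ncp  <- n_per_group *
    sum((p_cases - p_controls)^2 / (p_cases + p_controls))  # noncentrality
  crit <- qchisq(1 - alpha, df)         # rejection threshold under the null
  pchisq(crit, df, ncp = ncp, lower.tail = FALSE)
}

# Hypothetical helper: smallest number of subjects per group that reaches
# the target power (requires the two frequency vectors to differ).
n_for_power <- function(p_controls, p_cases, target = 0.8, alpha = 0.05) {
  f <- function(n) chisq_power(p_controls, p_cases, n, alpha) - target
  ceiling(uniroot(f, interval = c(1, 1e6), tol = 1e-9)$root)
}

# Made-up frequencies for three haplogroups, for illustration only
p0 <- c(0.45, 0.30, 0.25)   # controls
p1 <- c(0.35, 0.35, 0.30)   # cases

chisq_power(p0, p1, n_per_group = 300)
n_for_power(p0, p1, target = 0.8)
```

Calling `chisq_power()` over a range of `n_per_group` values gives the points needed for a power curve, which is the sort of comparative plot mentioned above.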

Shiny is a friendly interface that allows for great interactive features (see its Tutorial). It loads in the web browser, either from an open R console or just by clicking:

https://aurora.shinyapps.io/mtDNA_power_calc/

This Gist displays a simple graph using two power/number-of-cases values (getting the graph to show was hard for me; it works mostly thanks to Stack Overflow and to MadScone):

library(shiny)
shiny::runGist('5895082')

Where 5895082 is the ID of the Gist. Here is the source:

To work with files on your own computer, start R in the directory that contains the files ui.R and server.R, and launch the app with the command:

runApp()

If this doesn’t work, you can pass the complete path to the directory that contains the ui.R and server.R files:

runApp("path/to/directory")
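
If you have never seen what goes into those two files, the sketch below shows the general shape of a Shiny app in a single script. It is not the source of the Gist: the helper `power_curve`, the input name `w` and the output name `powerPlot` are made up for illustration, and the power formula is the generic noncentral chi-square one rather than the formulae of Samuels et al.

```r
library(shiny)

# Hypothetical helper: power of a chi-square test with `df` degrees of
# freedom, for `n_total` subjects and an effect size `w` (illustrative inputs).
power_curve <- function(n_total, w, df = 1, alpha = 0.05) {
  pchisq(qchisq(1 - alpha, df), df, ncp = n_total * w^2, lower.tail = FALSE)
}

# What would normally live in ui.R: the page layout, inputs and outputs.
ui <- fluidPage(
  titlePanel("Power sketch"),
  sliderInput("w", "Effect size (w):",
              min = 0.05, max = 0.5, value = 0.1, step = 0.05),
  plotOutput("powerPlot")
)

# What would normally live in server.R: code that reacts to the inputs.
server <- function(input, output) {
  output$powerPlot <- renderPlot({
    n <- seq(50, 2000, by = 50)
    plot(n, power_curve(n, input$w), type = "l", ylim = c(0, 1),
         xlab = "Total sample size", ylab = "Power")
  })
}

app <- shinyApp(ui = ui, server = server)
# runApp(app)   # uncomment to open the app in your browser
```

Moving the slider re-runs the code inside `renderPlot()`, which is what makes the graph interactive. Splitting the `ui` and `server` parts into ui.R and server.R gives the two-file layout that `runApp()` expects in a directory.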

Structure of the human mitochondrial genome. (Photo credit: Wikipedia)