R Reporting Part 1: Tools
This is the first of a series of articles on how to use R, RStudio and Texmaker to prepare presentations and batch jobs for automated reporting on a web server or Microsoft SharePoint server. The series is based upon the presentation that I gave at the February 27, 2016 Dallas R User Group Meetup. Because the presentation was primarily a demonstration, there really isn’t a slide deck to distribute; this series covers the topics from the demonstration instead. The series will eventually include the following articles as I complete them over the next couple of weeks:
- R Reporting Part 1: Tools
- R Reporting Part 2: Choosing the Right Markup for the Task
- R Reporting Part 3: Using Rhtml for Batch Web Reporting
- R Reporting Part 4: Using Markdown for Interactive Presentations
- R Reporting Part 5: Using LaTeX/Beamer for PDF Presentations
- R Reporting Part 6: Using LaTeX for PDF Articles
- R Reporting Part 7: Converting R Documents to E-books
- R Reporting Part 8: Using LaTeX to Create Posters
The series of articles describes the daily batch jobs that generate the Daily Econometric Graphs web page, which includes links to the same econometric charts in several formats, all generated through the same R code:
- Econometric graphs in PDF form for use as slides on a projector
- Econometric graphs in PDF form for printing
- Econometric graphs in EPUB format
- Econometric graphs in Kindle AZW3 format
- Econometric graphs in A0 poster format
All of the examples are based upon the knitr R package; you should reference the knitr documentation, as this article is not a replacement for it.
Example source is available in images/documents/econometric_source.zip.
Software to Install
To use R for presentations and batch reports, there are a number of software applications that you will need to install on your desktop and your web server. The sections that follow describe the installation of the various packages that you will need. All of the software applications in this section are available on Linux, Windows and OS X.
R and Related Packages
First you will need to install R from CRAN. Install files and instructions are available on the various CRAN mirrors.
RStudio is a popular integrated development environment (IDE) for R. Although it is not required for any of the presentations here, it makes a number of things very convenient. Other R IDEs include Emacs Speaks Statistics, which has the advantage of working with other statistical and programming languages, and StatET for Eclipse users. The examples for this series are done in RStudio, but would work in the other environments with minimal modification.
Once you have R and RStudio installed, you will need to install knitr for any presentation or reporting use (Sweave itself ships with R and needs no separate installation), and you will need the fImport and ggplot2 packages to run the examples in this series of articles. Use the following command to install the packages:
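A minimal sketch of the install step, assuming the packages meant are knitr, fImport and ggplot2 and that all three are available from your CRAN mirror. The helper names `missing_packages` and `install_missing` and the mirror URL are illustrative, not part of the original example source:

```r
# Package names taken from the text; knitr is assumed to be the reporting package meant.
pkgs <- c("knitr", "fImport", "ggplot2")

# Return the subset of 'pkgs' that cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install only what is missing (needs network access to a CRAN mirror).
install_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  todo <- missing_packages(pkgs)
  if (length(todo) > 0) install.packages(todo, repos = repos)
  invisible(todo)
}

# Run once on each machine (desktop and web server):
# install_missing(pkgs)
```

Checking first keeps a scheduled batch job from reinstalling packages on every run; a plain `install.packages(pkgs)` does the same job interactively.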
The HTML Editor of Your Choice
Although RStudio is a great IDE for R, it does not do a good job of HTML syntax highlighting and spell checking. Once you have the R portions of your code working well, you will want to use a dedicated HTML editor for the writing and HTML formatting in your Rhtml documents. You can use any editor that you want; Bluefish is available on Linux, Windows and OS X.
If you need heavy formatting, tables of contents, figure cross references and bibliography management in your presentations and reports, you will want to use LaTeX. It was developed primarily for academic writing in math and science and thus does a very good job of handling mathematical symbols and equations, automatic tables of contents, indexing, cross referencing, bibliographies, and citations. It is a markup language, like HTML.
On Linux, most package managers will allow the easy install of the TeXLive distribution, although not necessarily the most recent one. On Windows, MikTeX is the preferred way to install LaTeX, but you can also install it via Cygwin. On OS X, MacTeX and MacPorts are perhaps the most convenient ways to install the LaTeX distribution. You should install the Beamer package; it is not part of the default installation.
If you will be using LaTeX for presentations and articles, you will want to embed your R code in LaTeX documents. Although RStudio has great capabilities for the R portion of this workflow, once you start working on the writing tasks you will want to begin using a LaTeX IDE. Texmaker runs on Linux, Windows and OS X; it allows you to run R using knitr in the same way that RStudio does, but has spell checking features that make working with the LaTeX document easier.
By default, Texmaker uses sweave as shown in Figure 1. For most uses today, and particularly for the examples in this article, you will want to switch it to knitr by changing the Sweave command to
/usr/bin/Rscript -e "require('knitr'); knit('%.Rnw')"
as shown in Figure 2. The final configuration step is changing the Quick Build (F1) key to run Sweave/Knitr before running pdflatex as shown in Figure 3.
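The /usr/bin/Rscript path in that command is specific to a typical Linux install. As a hedged sketch, the following R snippet prints an equivalent command string for whatever R installation it runs under, so that it can be pasted into the Texmaker Sweave field. The variable names are illustrative, and it assumes Texmaker substitutes the document name (without extension) for the % placeholder:

```r
# Path to the Rscript executable that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# "%%" in the sprintf() format yields a literal "%", which Texmaker is
# assumed to replace with the document name at build time.
texmaker_cmd <- sprintf("%s -e \"require('knitr'); knit('%%.Rnw')\"", rscript)

cat(texmaker_cmd, "\n")
```

If the printed path contains spaces, as is common on Windows, it will probably need to be wrapped in double quotes before Texmaker can run it.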
Secure File Transfer Utilities–scp, rsync or Something Else
For batch reporting through cron or some other scheduler, you will almost invariably need some way to transfer files between systems. Secure copy, or scp, is probably the most universal way to do this. It is installed by default on most Linux and OS X systems. On Windows, scp is available as part of Cygwin. In a corporate Windows environment, you should talk to your IT group about what tools to use on your network; in Windows environments, shared drives are a common way to handle file copies. Another alternative is rsync, which is routinely available on Linux; on OS X it can be installed via MacPorts, and on Windows via Cygwin. With scp or rsync, you will want to use ssh-keygen to allow secure connections without using passwords, and potentially ssh-agent for additional security.
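As a rough sketch of how a batch script might push its output to the server, the function below wraps rsync in R. The function name `upload`, the host user@example.com and the remote path are hypothetical placeholders, and the sketch assumes that key-based ssh login has already been set up so that no password prompt appears under cron:

```r
# Copy a local directory to a remote host with rsync over ssh.
# With dry_run = TRUE the command is only returned as a string, not executed.
upload <- function(local_dir, remote, dry_run = TRUE) {
  # -a preserves times and permissions, -z compresses data in transit
  args <- c("-az", shQuote(local_dir), shQuote(remote))
  if (dry_run) {
    return(paste(c("rsync", args), collapse = " "))
  }
  if (!nzchar(Sys.which("rsync"))) stop("rsync not found on PATH")
  status <- system2("rsync", args)
  if (status != 0) stop("rsync failed with exit status ", status)
  invisible(status)
}

# Hypothetical host and path; inspect the command before running it for real.
cmd <- upload("images/figures/", "user@example.com:/var/www/html/images/figures/")
print(cmd)
```

Checking the exit status matters in a scheduled job, because a failed transfer would otherwise go unnoticed until someone looks at the stale web page.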
Optipng and Other Image Compression Tools
The PNG and other images that R generates are not compressed as fully as is possible. To speed up web pages, you will want to compress images before uploading them using optipng or some other compression optimization tool. Optipng is routinely available on Linux, Cygwin and MacPorts. To call it in R use
system("optipng images/figures/*.png")
where images/figures/*.png is the path to the image files that your R script created.
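A slightly more defensive variant can be sketched as follows. It expands the file pattern inside R with Sys.glob() rather than relying on the shell, and it skips the step when optipng is not installed. The function names `png_files` and `optimize_pngs` are illustrative:

```r
# List the PNG files in a directory (empty character vector if there are none).
png_files <- function(dir = "images/figures") {
  Sys.glob(file.path(dir, "*.png"))
}

# Run optipng on every PNG in 'dir'; does nothing if there are no files
# or if optipng is not on the PATH.
optimize_pngs <- function(dir = "images/figures") {
  files <- png_files(dir)
  if (length(files) == 0 || !nzchar(Sys.which("optipng"))) {
    return(invisible(files))
  }
  status <- system2("optipng", shQuote(files))
  if (status != 0) warning("optipng exited with status ", status)
  invisible(files)
}

optimize_pngs("images/figures")
```

Expanding the pattern in R avoids depending on a shell to interpret the `*`, which is not guaranteed on Windows.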
ImageMagick Image Resizing and Conversion Tools
For web applications, you will probably want additional image sizes for links that are specific to different social media sites. ImageMagick is the most convenient tool for converting and resizing images in a script. It is available for Linux, Windows (Cygwin) and OS X (MacPorts). To create a 450-pixel-wide image in R code for use on Facebook or some other social media site, you would use something like
system("convert images/figures/ncid_daily_plot-1.png -resize 450x images/figures/ncid_daily_plot-1_shrink.png")
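When several sizes are needed, the same call can be generated rather than typed out. The sketch below builds the command string for a given width; the helper name `resize_cmd` and the `_shrink` suffix convention are assumptions modeled on the line above:

```r
# Build an ImageMagick command that resizes 'src' to 'width' pixels wide,
# writing the result next to the original with a suffix before ".png".
resize_cmd <- function(src, width, suffix = "_shrink") {
  dest <- sub("\\.png$", paste0(suffix, ".png"), src)
  sprintf("convert %s -resize %dx %s",
          shQuote(src), as.integer(width), shQuote(dest))
}

src <- "images/figures/ncid_daily_plot-1.png"
cmd <- resize_cmd(src, 450)
print(cmd)

# Only run the command when the tool and the source image are both present.
if (nzchar(Sys.which("convert")) && file.exists(src)) {
  system(cmd)
}
```

On Windows outside Cygwin, `convert` may resolve to an unrelated system utility, so check which program is found on the PATH before relying on this in a batch job.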
Calibre and latex2html for E-book Tools
To create ebooks in EPUB for most e-readers and AZW3 for Kindle e-readers, you will need latex2html (or some other LaTeX to HTML converter) and Calibre. latex2html is not being actively maintained, so it is not a good choice for a production environment, but it is available for all platforms.
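The two-step conversion might be scripted roughly as below. This is only a sketch under several assumptions: that latex2html writes its output to a subdirectory named after the input file with an index.html inside it, that Calibre's ebook-convert command line tool is on the PATH, and that the file name econometric_charts.tex is a hypothetical stand-in for your own document:

```r
# Build the shell commands for LaTeX -> HTML -> e-book conversion.
# Returns one latex2html command followed by one ebook-convert command per format.
ebook_cmds <- function(tex_file, formats = c("epub", "azw3")) {
  stem <- tools::file_path_sans_ext(basename(tex_file))
  html <- file.path(stem, "index.html")  # assumed latex2html output location
  c(
    sprintf("latex2html %s", shQuote(tex_file)),
    # ebook-convert picks the output format from the file extension
    sprintf("ebook-convert %s %s",
            shQuote(html), shQuote(paste0(stem, ".", formats)))
  )
}

cmds <- ebook_cmds("econometric_charts.tex")
print(cmds)

# To run them in order once both tools are installed:
# for (cmd in cmds) system(cmd)
```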
Server Side Include Software on Web Server
If you are posting your R document to a web server running a content management system like WordPress, Joomla or Drupal, you will need an extension to enable server side includes. This will probably require higher administrative rights than are typically given to normal authors, so check with your CMS administrator before you start on a big project. On Joomla, Sourcerer is one of several extensions that allow server side includes. It uses the syntax
<?php include("images/interactive/econometric_charts_home_page.html"); ?>