Although not statistically complex, the ACLU's use of R for litigation support in the family separation case was big.

RStudio::conf 2019

What a difference 12 years makes. My first R-related conference was the 2007 UseR! at Iowa State University in Ames, Iowa. It drew about 300 people, almost all of them from academia. Fast forward to the 2019 RStudio::conf, with 1700 people, almost all of whom are from industry. The UseR! conference now draws about 1000, the RFinance conference draws hundreds, and the Bioconductor conference draws large crowds. The constant theme over this decade has been documentation and reproducibility of research. In 2007, a pharma company developed an MS Word version of Sweave and released it as open source because they needed it themselves and wanted to contribute. The 2011 conference in Warwick featured RStudio and other IDEs that made Sweave, and then knitr, much easier to use. The 2015 conference in Aalborg brought Docker and portability/archivability into the discussion.

Reproducible Research Continues to be a Theme

Reproducible research has been a theme in R for a decade, and the RStudio::conf 2019 continued the theme with workshops and conference sessions on Markdown and Bookdown among other reproducibility- and repeatability-related topics. I have continued to stick to LaTeX due to inertia and the early lack of citation capability in Markdown, but that has changed, and it is time to make the switch for new work. Many of the new books in R are being produced in Bookdown rather than LaTeX.

Garrett Grolemund’s talk R Markdown: the Bigger Picture pointed out the scope of the repeatability and reproducibility problems in research today and how using Markdown can help in documenting what you did and how you did it.

A large proportion of published articles describe experiments that cannot be repeated because the code, documentation, or data has been lost.
A large portion of current published research cannot be replicated. R Markdown can help with documenting data and procedures.

Using R for Family Reunification

Although many of the RStudio::conf presentations were oriented to the RStudio products and open source packages, many were not. One of the most compelling was Brooke Watson's presentation on using R to clean up the total garbage data provided by the U.S. Government on the locations of children and parents who were separated at the border. Regardless of politics, it is clear that the government was unprepared for this policy and did not (does not?) have procedures in place to keep track of kids and their parents who have been separated.

Brooke Watson's (@BrookLYNevery1) presentation on family reunification showed a jumble of files provided during discovery.
The Department of Homeland Security provided jumbled data to the ACLU for child separation lawsuits.
Brooke Watson's (@BrookLYNevery1) presentation on family reunification showed that the Department of Homeland Security does not have systems in place to track children separated from their parents.
The Department of Homeland Security provided bad data to the ACLU during discovery in the child separation lawsuit.

RStudio Package Stickers Continue to be a Hit

The RStudio package stickers were a major hit in Brussels, and continue to be a big hit at R conferences. This has to be one of the best marketing ideas I have ever seen. Whenever they put out a new batch of stickers, there was always a scrum as people searched for ones that they didn’t have.

The RStudio hexagonal package stickers are clearly one of the best marketing ideas I have ever seen.
The RStudio hexagonal marketing stickers continue to be a hit at conferences.

R and Data Science are No Longer PhD Things

The Education track on the last day was poorly attended, but I think it was in many ways the most important for understanding the future of both R and Data Science. Mary Rudis (@mrshrbrmstr) spoke on teaching R at the community college level and on improving data literacy in the general population. Her presentation covered certificate programs that do not require a prior bachelor’s degree. It was followed by Carl Howe’s (@cdhowe) presentation on Teaching the Next Million R Users. Carl is RStudio’s Director of Education or something like that, and it is clear that RStudio is looking not just at the professional Data Scientist, but at increasing data literacy in the general population.

Panel Discussion of Data Science as a Career

The conference concluded with a panel discussion that was effectively about how to manage a career in Data Science. It had a lot of useful information, but failed to talk about how to survive being the messenger in “shoot the messenger,” which all Data Scientists need to know how to do. How I would answer that question will be the subject of a future blog post.

Knitting is Making a Comeback

When I was a kid, my mom knitted all of the time–it was the major form of her gift giving, and I still have most of the sweaters from my late teen and college years that theoretically still fit. Mom was quite proficient and could do intricate patterns while carrying on a spirited conversation, though if the conversation was too spirited, she might have to rip out a row and re-do it. During the education track on the last day, the woman in front of me did a major portion of a child’s sock during the session. It was pretty cool.

Knitting is making a comeback. A child somewhere is going to get some cool socks for a birthday.
The woman in front of me made major progress on knitting a child's sock during the education track on Friday.