Survival Analysis in R video available
As promised earlier, the “special topic” material on Survival Analysis is now available on YouTube in lieu of in-person sessions. Take a look at the Survival Analysis in R Playlist.
Survival analysis deals with data in which some observations are incomplete, known as censored data. A typical example is studying the time until failure of a part in engineering, or failure of a part of the human body in medicine (colloquially known as “disease”). For subjects that fail during the study, we know the failure time accurately. Other subjects survive without failure until the end of the study, so we know only that they lasted at least that long, not how long they would have lasted until failure. The methods of survival analysis account for this partial uncertainty in the data. R can deal with almost all necessary aspects of survival analysis, but requires some mixing and matching of packages to get the best results, as shown in the videos.
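To make the idea of censoring concrete, here is a minimal sketch of the Kaplan-Meier estimator, the standard way to estimate a survival curve when some subjects are censored (in R, the survival package provides it through survfit()). It is written in plain Python rather than R purely to show the arithmetic, and it is not taken from the videos. The function name kaplan_meier and the eight-subject dataset are made up for illustration.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where a failure was observed.

    times  -- follow-up time for each subject
    events -- True if failure was observed at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)   # subjects still under observation
    survival = 1.0        # running estimate of S(t)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                failures += 1
            leaving += 1
            i += 1
        if failures:
            # Multiply by the fraction of at-risk subjects surviving past t
            survival *= 1 - failures / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without lowering the curve
        at_risk -= leaving
    return curve


# Hypothetical study that ends at time 10: three subjects are still
# running at the end (censored) and one dropped out at time 3.
times = [2, 3, 3, 5, 8, 10, 10, 10]
events = [True, True, False, True, True, False, False, False]

for t, s in kaplan_meier(times, events):
    print(f"t={t}: estimated survival {s:.3f}")
```

The point to notice is that censored subjects still count in the at-risk denominator up to the moment they leave, so the information that they lasted at least that long is used rather than thrown away.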
As always, my YouTube videos are fueled by music behind the scenes. Giving a throwback shoutout to Public Image Limited, some holiday Twice, plus the usual Mongolian suspects.
Statistical Software and Data Workshops Spring 2016
Rutgers University Libraries Data Services Workshop Series (New Brunswick)
January 2016
This Spring, Ryan Womack, Data Librarian, will repeat the series of workshops on statistical software, data visualization, and data management, as part of the Rutgers University Libraries Data Services. A detailed calendar and descriptions of each workshop are below. This semester each workshop topic will be offered twice, once at the Library of Science and Medicine on Busch Campus, and once at Alexander Library on College Ave. These sessions will be identical except for location. Sessions will run approximately 3 hours. Workshops with multiple parts will divide the time in thirds. For example, the first SPSS, Stata, and SAS workshop would start with SPSS at 12, Stata at 1, and SAS at 2. You are free to come only to those segments that interest you. There is no need to register, just come!
Logistics
Location: The Library of Science and Medicine (LSM on Busch) workshops will be held in the Conference Room on the 1st floor of LSM on Mondays from 12 to 3 pm. The Alexander Library (College Ave) workshops will be held in room 413 of the Scholarly Communication Center (4th floor of Alexander Library) on Tuesdays from 1:10 to 4:10 pm.
For both locations, you are encouraged to bring your own laptop to work in your native environment. Alternatively, at Alexander Library, you can use a library desktop computer instead of your own laptop. At LSM, we will have laptops available to borrow for the session if you don’t bring your own. Room capacity is 25 in both locations, first come, first served.
If you can’t make the workshops, or would like a preview or refresher, screencast versions of many of the presentations are already available at http://libguides.rutgers.edu/data. Additional screencasts are continually being added to this series.
Calendar of workshops
Monday (LSM), 12 noon – 3 pm | Workshop | Tuesday (Alexander), 1:10 pm – 4:10 pm
January 25 | Introduction to SPSS, Stata, and SAS | January 26
February 1 | Introduction to R | February 2
February 8 | Data Visualization in R | February 9
February 15 | Special Topics: Time Series in R, Survival Analysis in R, Big Data in Brief | February 16
Description of Workshops:
§ Introduction to SPSS, Stata, and SAS (January 25 or January 26) provides overviews of these three popular commercial statistical software programs, covering the basics of navigation, loading data, graphics, and elementary descriptive statistics and regression using a sample dataset. If you are already using these packages with some degree of success, you may find these sessions too basic for you.
• SPSS is widely used statistical software with strengths in survey analysis and other social science disciplines. Copies of the workshop materials, a screencast, and additional SPSS resources can be found here: http://libguides.rutgers.edu/content.php?pid=115296&sid=1208425. SPSS is made available by OIRT at a discounted academic rate, currently $100/academic year. Find it at software.rutgers.edu. SPSS is also available in campus computer labs and via the Apps server (see below).
• Stata is flexible and allows relatively easy access to programming features. It is popular in economics among other areas. Copies of the workshop materials, a screencast, and additional Stata resources can be found here: http://libguides.rutgers.edu/content.php?pid=115296&sid=1208427. Stata is made available by OIRT via campus license with no additional charge to install for Rutgers users. Find it at software.rutgers.edu.
• SAS is a powerful and longstanding system that handles large data sets well, and is popular in the pharmaceutical industry, among other applications. Copies of the workshop materials, a screencast, and additional SAS resources can be found here: http://libguides.rutgers.edu/content.php?pid=115296&sid=1208423. SAS is made available by OIRT at a discounted academic rate, currently $100/academic year. Find it at software.rutgers.edu. SAS is also available in campus computer labs, online via the SAS University Edition cloud service, and via the Apps server (see below).
Note: Accessing software via apps.rutgers.edu
SPSS, SAS, Stata, and R are available for remote access on apps.rutgers.edu. apps.rutgers.edu does not require any software installation, but you must activate the service first at netid.rutgers.edu.
§ Introduction to R (February 1 or February 2) – This session provides a three-part orientation to the R programming environment. R is freely available, open source statistical software that has been widely adopted in the research community. Due to its open nature, thousands of additional packages have been created by contributors to implement the latest statistical techniques, making R a very powerful tool. No prior knowledge is assumed. The three parts cover:
• Statistical Techniques: getting around in R, descriptive statistics, regression, significance tests, working with packages
• Graphics: comparison of graphing techniques in base R, lattice, and ggplot2 packages
• Data Manipulation: data import and transformation, additional methods for working with large data sets
Additional R resources, including handouts, scripts, and screencast versions of the workshops, can be found here: http://libguides.rutgers.edu/data_R
R is freely downloadable from http://r-project.org
§ Data Visualization in R (February 8 or February 9) discusses principles for effective data visualization, and demonstrates techniques for implementing these using R. Some prior familiarity with R is assumed (packages, structure, syntax), but the presentation can be followed without this background. The three parts are:
• Principles & Use in lattice and ggplot2: discusses classic principles of data visualization (Tufte, Cleveland) and illustrates them with the use of the lattice and ggplot2 packages. Some of the material here overlaps with Intro to R, pt 2, but at a higher level.
• Miscellany of Methods: illustrates a wide range of specific graphics for different contexts
• 3D, Interactive and Big Data: presentation of 3D data, interactive exploration of data, and techniques for large datasets
Additional R resources can be found here: http://libguides.rutgers.edu/data_R
R is freely downloadable from http://r-project.org
§ Special Topics (February 15 or February 16) covers a few different specialized areas. The three parts presented during the afternoon workshop are not related.
• Time Series in R: review of commands and techniques for basic time series analysis in R. Scripts at http://libguides.rutgers.edu/data_R
• Survival Analysis in R: review of commands and techniques for basic survival analysis in R. Scripts at http://libguides.rutgers.edu/data_R
• Big Data in Brief: an introduction to some of the techniques and software environments used to work with big data, with pointers to resources for further learning at http://libguides.rutgers.edu/bigdata
Of related interest: There is also a Digital Humanities Workshop Series this spring, covering topics including text analysis, network analysis, and digital mapping. See https://dh.rutgers.edu/spring2016workshops/ for information on the topics and schedule.
Data Workshops Full, Registration Closed
The response this Fall was much higher than expected, so all data and statistical software workshop sessions are now full and registration is closed. Please consult the screencasts, scripts, and handouts at libguides.rutgers.edu/data for a self-guided version of the same material.
The same sessions will run again live in the Spring.
Data Viz and other techniques in R
Registration is now open for the remainder of the continuing Fall Data Workshop series, presented by Ryan Womack, Data Librarian.
To go directly to the registration page, click here. A detailed calendar and descriptions of each workshop are below.
Logistics
All workshops for Fall 2014 will be held in Room 413 on the 4th floor of Alexander Library (169 College Avenue). Workshops are held on Tuesdays from 1:10 to 2:30 pm according to the schedule below. Room capacity is limited to 25.
Room 413 has R installed on its workstations. You are also welcome to bring your laptop if you want to follow along with the exercises, but this is not required.
If you can’t make the workshops, or would like a preview or refresher, screencast versions of many of the presentations are already available at http://libguides.rutgers.edu/data. Additional screencasts will be added for the newer workshops in the series.
Description of Workshops:
§ Data Visualization in R (Sept 30, Oct 7, Oct 14) discusses principles for effective data visualization, and demonstrates techniques for implementing these using R. Some prior familiarity with R is assumed (packages, structure, syntax), but the presentation can be followed without this background. The three parts are:
(Sept 30) Part I – Principles & Use in lattice and ggplot2: discusses classic principles of data visualization (Tufte, Cleveland) and illustrates them with the use of the lattice and ggplot2 packages. Some of the material here overlaps with Intro to R, pt 2, but at a higher level.
(Oct 7) Part II – Miscellany of Methods: illustrates a wide range of specific graphics for different contexts
(Oct 14) Part III – 3D, Interactive and Big Data: presentation of 3D data, interactive exploration of data, and techniques for large datasets
Additional R resources can be found here: http://libguides.rutgers.edu/data_R
R is freely downloadable from http://r-project.org
§ Time Series in R (Oct 21)
Review of commands and techniques for basic time series analysis in R
§ Survival Analysis in R (Oct 28)
Review of commands and techniques for basic survival analysis in R
