Tag Archives: Time Series

Statistical Software and Data Workshops Spring 2016

Rutgers University Libraries Data Services Workshop Series (New Brunswick)

January 2016

This Spring, Ryan Womack, Data Librarian, will repeat the series of workshops on statistical software, data visualization, and data management, as part of the Rutgers University Libraries Data Services.  A detailed calendar and descriptions of each workshop are below.  This semester each workshop topic will be offered twice, once at the Library of Science and Medicine on Busch Campus, and once at Alexander Library on College Ave.  These sessions will be identical except for location.  Sessions will run approximately 3 hours.  Workshops with multiple parts will divide the time into thirds.  For example, the first SPSS, Stata, and SAS workshop at LSM would start with SPSS at 12, Stata at 1, and SAS at 2.  You are free to come only to those segments that interest you.  There is no need to register; just come!

Logistics

Location: The Library of Science and Medicine (LSM on Busch) workshops will be held in the Conference Room on the 1st floor of LSM on Mondays from 12 to 3 pm.  The Alexander Library (College Ave) workshops will be held in room 413 of the Scholarly Communication Center (4th floor of Alexander Library) on Tuesdays from 1:10 to 4:10 pm.

For both locations, you are encouraged to bring your own laptop to work in your native environment.  Alternatively, at Alexander Library, you can use a library desktop computer instead of your own laptop.  At LSM, we will have laptops available to borrow for the session if you don’t bring your own.  Room capacity is 25 in both locations, first come, first served.

If you can’t make the workshops, or would like a preview or refresher, screencast versions of many of the presentations are already available at http://libguides.rutgers.edu/data.  Additional screencasts are continually being added to this series.

Calendar of workshops

Monday sessions (LSM) run from 12 noon to 3 pm; Tuesday sessions (Alexander) run from 1:10 pm to 4:10 pm.

  • January 25 (LSM) or January 26 (Alexander): Introduction to SPSS, Stata, and SAS
  • February 1 (LSM) or February 2 (Alexander): Introduction to R
  • February 8 (LSM) or February 9 (Alexander): Data Visualization in R
  • February 15 (LSM) or February 16 (Alexander): Special Topics: Time Series in R, Survival Analysis in R, Big Data in Brief

Description of Workshops:

§ Introduction to SPSS, Stata, and SAS (January 25 or January 26) provides overviews of these three popular commercial statistical software programs, covering the basics of navigation, loading data, graphics, and elementary descriptive statistics and regression using a sample dataset.  If you are already using these packages with some degree of success, you may find these sessions too basic for you.

  • SPSS is widely used statistical software with strengths in survey analysis and other social science disciplines.  Copies of the workshop materials, a screencast, and additional SPSS resources can be found here: http://libguides.rutgers.edu/content.php?pid=115296&sid=1208425. SPSS is made available by OIRT at a discounted academic rate, currently $100/academic year.  Find it at software.rutgers.edu.  SPSS is also available in campus computer labs and via the Apps server (see below).
  • Stata is flexible and allows relatively easy access to programming features.  It is popular in economics among other areas.  Copies of the workshop materials, a screencast, and additional Stata resources can be found here: http://libguides.rutgers.edu/content.php?pid=115296&sid=1208427. Stata is made available by OIRT via campus license with no additional charge to install for Rutgers users.  Find it at software.rutgers.edu.
  • SAS is a powerful and long-standing system that handles large data sets well, and is popular in the pharmaceutical industry, among other applications. Copies of the workshop materials, a screencast, and additional SAS resources can be found here: http://libguides.rutgers.edu/content.php?pid=115296&sid=1208423. SAS is made available by OIRT at a discounted academic rate, currently $100/academic year.  Find it at software.rutgers.edu.  SAS is also available in campus computer labs, online via the SAS University Edition cloud service, and via the Apps server (see below).

Note: Accessing software via apps.rutgers.edu

SPSS, SAS, Stata, and R are available for remote access on apps.rutgers.edu.  apps.rutgers.edu does not require any software installation, but you must activate the service first at netid.rutgers.edu.

 

§ Introduction to R (February 1 or February 2) – This session provides a three-part orientation to the R programming environment.  R is freely available, open source statistical software that has been widely adopted in the research community.  Due to its open nature, thousands of additional packages have been created by contributors to implement the latest statistical techniques, making R a very powerful tool.  No prior knowledge is assumed. The three parts cover:

  • Statistical Techniques: getting around in R, descriptive statistics, regression, significance tests, working with packages
  • Graphics:  comparison of graphing techniques in base R, lattice, and ggplot2 packages
  • Data Manipulation:  data import and transformation, additional methods for working with large data sets

Additional R resources, including handouts, scripts, and screencast versions of the workshops, can be found here: http://libguides.rutgers.edu/data_R

R is freely downloadable from http://r-project.org

 

§ Data Visualization in R  (February 8 or February 9) discusses principles for effective data visualization, and demonstrates techniques for implementing these using R.  Some prior familiarity with R is assumed (packages, structure, syntax), but the presentation can be followed without this background.  The three parts are:

  • Principles & Use in lattice and ggplot2: discusses classic principles of data visualization (Tufte, Cleveland) and illustrates them with the use of the lattice and ggplot2 packages.  Some of the material here overlaps with Intro to R, pt 2, but at a higher level.
  • Miscellany of Methods: illustrates a wide range of specific graphics for different contexts
  • 3-D, Interactive and Big Data: presentation of 3-D data, interactive exploration of data, and techniques for large datasets

Additional R resources can be found here: http://libguides.rutgers.edu/data_R

R is freely downloadable from http://r-project.org

 

§ Special Topics (February 15 or February 16) covers a few different specialized areas.  The three parts presented during the afternoon workshop are not related.

Of related interest:  There is also a Digital Humanities Workshop Series this spring, covering topics including text analysis, network analysis, and digital mapping. See https://dh.rutgers.edu/spring-2016-workshops/ for information on the topics and schedule.


Data Workshops Full, Registration Closed

Somehow the response this Fall was much higher than expected, so all data and statistical software workshop sessions are now full and registration is closed.  Please consult the screencasts, scripts, and handouts at libguides.rutgers.edu/data for a self-guided version of the same material.

The same sessions will run again live in the Spring.

Data Viz and other techniques in R

Registration is now open for the remainder of the continuing Fall Data Workshop series, presented by Ryan Womack, Data Librarian.

To go directly to the registration page, click here.  A detailed calendar and descriptions of each workshop are below.

Logistics

All workshops for Fall 2014 will be held in Room 413 on the 4th floor of Alexander Library (169 College Avenue).  Workshops are held on Tuesdays from 1:10-2:30 pm according to the schedule below.  Room capacity is limited to 25.

Room 413 has R installed on its workstations.  You are also welcome to bring your laptop if you want to follow along with the exercises, but this is not required.

If you can’t make the workshops, or would like a preview or refresher, screencast versions of many of the presentations are already available at http://libguides.rutgers.edu/data.  Additional screencasts will be added for the newer workshops in the series.

Description of Workshops:

§ Data Visualization in R (Sept 30, Oct 7, Oct 14) discusses principles for effective data visualization, and demonstrates techniques for implementing these using R.  Some prior familiarity with R is assumed (packages, structure, syntax), but the presentation can be followed without this background.  The three parts are:

(Sept 30) Part I – Principles & Use in lattice and ggplot2: discusses classic principles of data visualization (Tufte, Cleveland) and illustrates them with the use of the lattice and ggplot2 packages.  Some of the material here overlaps with Intro to R, pt 2, but at a higher level.

(Oct 7) Part II – Miscellany of Methods: illustrates a wide range of specific graphics for different contexts

(Oct 14) Part III – 3-D, Interactive and Big Data: presentation of 3-D data, interactive exploration of data, and techniques for large datasets

Additional R resources can be found here: http://libguides.rutgers.edu/data_R

R is freely downloadable from http://r-project.org

§ Time Series in R (Oct 21)

Review of commands and techniques for basic time series analysis in R

§ Survival Analysis in R (Oct 28)

Review of commands and techniques for basic survival analysis in R

Register for the workshops here

Budget Balances (Deficits and Surpluses) for 54 Middle Income Countries From 1976-2007

Budget Balances (Deficits and Surpluses) for 54 Middle Income Countries From 1976-2007 is exactly what it claims to be.

Dutch Parliamentary Election Study Cumulative Dataset, 1971-2006

Dutch Parliamentary Election Study Cumulative Dataset, 1971-2006 is a compilation of common core variables included in the Dutch Parliamentary Election Studies of 1971, 1972, 1977, 1981, 1982, 1986, 1989, 1994, 1998, 2002, 2003, and 2006. The major areas of study focus on national problems, political efficacy, perceived stand of the main political parties on important political issues, view of religion in society, satisfaction with government, social participation, voting behavior in recent elections, left-right self-rating, left-right rating of political parties, sense of civic competence, civic political participation, legitimacy of social protest and government reaction, political distrust, and political cynicism. Respondents’ views on other salient political and social issues, such as abortion, nuclear energy, differences in income, and nuclear armaments, were also elicited.

United Nations Surveys of Crime Trends and Operations of Criminal Justice Systems, 1970-2006

The major goal of the United Nations Surveys on Crime Trends and the Operations of Criminal Justice Systems was to collect data on the incidence of reported crime and the operations of criminal justice systems with a view to improving the analysis and dissemination of that information globally. Surveys were distributed to officials in every member country of the United Nations. Designated officials completed the surveys to the best of their abilities given the country’s available data. Crime variables include counts of recorded crime for homicide, assault, rape, robbery, theft, burglary, fraud, embezzlement, drug trafficking, drug possession, bribery, and corruption. There are also counts of suspects, persons prosecuted, persons convicted, and prison admissions by crime, gender, and adult or juvenile status. Other variables include the population of the country and largest city, budgets and salaries for police, courts, and prisons, and types of sanctions, including imprisonment, corporal punishment, deprivation of liberty, control of freedom, warning, fine, and community sentence. The countries participating in the survey and the variables available vary across the ten waves.

Prime Ministerial Power in 22 Countries, 1980-2000

Prime Ministerial Power offers a measure of prime ministerial power to set government policy in 22 countries with established parliamentary democracies. The collection comprises variables relating to the power of prime ministers including an index of prime ministerial power, which consists of a quantitative score of the power of individually named prime ministers in their different terms based on an expert survey conducted in 2001-2003.

Housing Affordability

The Housing Affordability Data System (HADS) is available from ICPSR covering the time period 1985 to 2004.  This data is drawn from the American Housing Survey and measures the affordability of housing units and the housing cost burdens of households, relative to area median incomes, poverty level incomes, and Fair Market Rents. The purpose of these datasets is to provide housing analysts with consistent measures of affordability and burdens over a long period.

ICT Diffusion and Distribution Dataset, 1990-2007

ICT [Information and Communication Technology] Diffusion and Distribution Dataset, 1990-2007 is an international comparison of ICT indicators, such as mobile phones, internet access, and computers, based on 204 surveys across 47 countries.

Weekly US Automobile Production: 1972-1983 and 1990-2001

Weekly Production Scheduling at Assembly Plants in the United States Automobile Industry: 1972-1983 and 1990-2001 reports weekly production at assembly plants owned by Chrysler, Ford, and GM.

“The data consist of information on weekly operations at United States and Canadian automobile assembly plants owned by the Detroit Three automakers (Chrysler, Ford Motor Company, and General Motors). The dataset was constructed from industry trade publications that report production schedules at these assembly plants on a weekly basis over the two time periods: 1972-1983, and 1990-2001. The period 1984 to 1989 was excluded only because the authors did not have access to key publications at the time the data were collected.”