Authors: Kari L. Jordan, Ben Marwick, Belinda Weaver, Naupaka Zimmerman, Jason Williams, Tracy Teal, Erin Becker, Jonah Duckles, Beth Duckles, Elizabeth Wickes
October 2017
Software Carpentry is a worldwide volunteer organization whose mission is to make scientists more productive, and their findings more reliable, by teaching them foundational computing skills. Established in 1998, it runs short, intensive workshops that cover task automation using the Unix shell, structured programming in Python and R, and version control using tools such as Git. Its sibling organization, Data Carpentry, teaches foundational data science skills. To date, the majority of Software and Data Carpentry workshops have been run in the United States, Canada and the UK. However, there is growing interest elsewhere, and there are active ‘Carpentries’ communities in Australia, New Zealand, South Africa, the Netherlands, Norway, and other countries in Africa and in Central and South America.
While most workshops are favorably assessed by learners at the time of delivery, no systematic, long-term follow-up study has previously been done on the efficacy of the training delivered, nor on the short- or longer-term impact that such training might have had on learners’ work practices, further skills acquisition, or subsequent career paths. There has also been no useful demographic profiling of learners.
Why should that matter?
We are outcome-driven organizations interested in continually improving the workshop experience for both our learners and our instructors. Additionally, we are largely volunteers. In order to continue our valuable work of teaching skills to researchers, we need supportive funding either from grant-making bodies or from member institutions, or - ideally - from both. Our case for funding is strengthened if we can provide impartial evidence that our workshops have the outcomes we claim. While it is wonderful to have amassed what must now amount to container-sized loads of positive ‘sticky note’ workshop feedback over the years, funders generally require more solid evidence of achievement before they are willing to cut a check. Establishing value is not just important for funding, though: it is important for our growing community of trainers, instructors, lesson maintainers and helpers, all of whom volunteer their time because they believe in the importance of what we do. We owe it to them to show that their time is not wasted - that we are genuinely furthering the cause of efficiently organized, reproducible science. Lastly, we need to demonstrate to our learners that the precious time they carve out to master computational and data science skills will pay off many times over in time saved further down the track. Data Carpentry’s post-workshop survey results tell us that respondents are enthusiastically involved in our workshops, that they learn a great deal of practical knowledge, and that they can immediately apply what they learned. Our interest is in establishing, long-term, what impact workshops are having on learners’ confidence in the skills they were taught.
Therefore, through the generosity of the Moore Data Driven Discovery Initiative, assessment work has been undertaken across both Carpentries to help build an evidence base to complement the large body of existing anecdotal evidence that Carpentry-style training is both useful and effective in improving researchers’ work practices.
To gather the required evidence, Data and Software Carpentry launched a long-term assessment survey in March 2017. We first engaged in a community consultation to determine how best to design and word this survey. The responses from that consultation then guided the development of our long-term assessment strategy. You can read more about the consultation process in this blog post.
The main goal of the resulting survey was to ask our learners to describe concrete changes they had made to their research practices as a result of completing a Carpentries workshop. We also asked whether they now had greater confidence in the tools they had been taught, and whether they had progressed in their careers as a result. The inclusion of multiple-choice questions about programming in R or Python helped make the evidence of training efficacy more concrete, comparable and measurable, taking it out of the realm of ‘opinion’ or ‘feeling’, which, while interesting, is not as robust or reliable a marker of success as demonstrable evidence. For instance, ‘I can write a for loop to rename and move a batch of files’ is a much more reliable metric of achievement than ‘I can use the shell’.
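To make that example concrete, here is a minimal Python sketch of the surveyed skill - a for loop that renames and moves a batch of files. The directory names and the filename prefix are illustrative assumptions, not taken from the survey itself.

```python
# A for loop that renames and moves a batch of files:
# every .csv file in raw/ gets a "2017-" prefix and is moved to clean/.
# The directory names and the prefix are hypothetical examples.
from pathlib import Path

src = Path("raw")      # hypothetical source directory
dst = Path("clean")    # hypothetical destination directory
dst.mkdir(exist_ok=True)

for f in src.glob("*.csv"):
    f.rename(dst / f"2017-{f.name}")  # rename with a prefix and move
```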
All of the data collected in this survey was self-reported. It should be noted that there are disadvantages to self-reported surveys. For one, respondents may exaggerate their achievements. Additionally, a respondent’s state of mind while taking the survey may affect her/his answers. Survey results can also be biased because those feeling most positive are the most likely to respond, while learners whose experience was less positive - or even negative - may not bother to answer. To account for this, we compared the long-term survey results with those of Data Carpentry’s and Software Carpentry’s post-workshop surveys. We found consistent patterns of increased confidence and self-efficacy in our learners.
The survey questions and the data used in this analysis are located in the carpentries/assessment repo on GitHub. We have already received several pull requests from community members interested in this data. Feel free to use the data and tell us about your findings.
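If you would like to explore the data yourself, a minimal sketch along these lines is one way to start. The CSV filename below is a hypothetical placeholder - check the carpentries/assessment repo for the actual file paths.

```python
# A minimal sketch of loading the survey data for your own analysis,
# assuming a CSV export. The filename is a hypothetical placeholder;
# see the carpentries/assessment repo on GitHub for the real paths.
import pandas as pd

df = pd.read_csv("long-term-survey-data.csv")  # hypothetical filename
print(df.shape)          # (number of respondents, number of questions)
print(df.columns[:10])   # peek at the first few question columns
```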
This analysis includes 476 observations. Not all respondents answered every one of the 26 questions.
The long-term survey assessed confidence, motivation, and other outcomes more than six months after respondents attended a Carpentry workshop. Provided below are a few highlights from the data.
Carpentry learners represent a wide range of disciplines, from the sciences to engineering. Respondents were asked to indicate their field of research, work, or study by checking all that apply from a list of disciplines, so the percentages below sum to more than 100%. A breakdown of their responses is provided in the table below, followed by a short sketch of how such a multi-response breakdown can be computed. More than a third of respondents work in the Life Sciences.
| Field | n | % of respondents |
|---|---|---|
| Life Sciences | 166 | 36.2 |
| Biomedical/Health Sciences | 95 | 20.7 |
| Agricultural or Environmental Sciences | 74 | 16.1 |
| Physical Sciences | 57 | 12.4 |
| Earth Sciences | 48 | 10.5 |
| Engineering | 41 | 8.9 |
| Mathematics or Statistics | 41 | 8.9 |
| Computer Science | 38 | 8.3 |
| Social Sciences | 24 | 5.2 |
| Library Sciences | 20 | 4.4 |
| Humanities | 12 | 2.6 |
| Business | 6 | 1.3 |
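As noted above, respondents could select more than one field, so the counts sum to more than the number of respondents. Here is a hedged sketch of how such a check-all-that-apply breakdown could be computed; the column name "field", the semicolon delimiter, and the filename are assumptions about the encoding, not the survey’s actual format.

```python
# Compute a multi-response ("check all that apply") breakdown.
# The filename, the "field" column name, and the ";" delimiter are
# all hypothetical assumptions about how the data are encoded.
import pandas as pd

df = pd.read_csv("long-term-survey-data.csv")        # hypothetical file
answered = df["field"].dropna()                      # respondents who answered
selections = answered.str.split(";").explode().str.strip()
counts = selections.value_counts()
pct = (counts / len(answered) * 100).round(1)        # % of those who answered
print(pd.DataFrame({"n": counts, "%": pct}))
```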
Carpentries workshops are open to individuals from all backgrounds and fields. Attendees range from undergraduate and graduate students to faculty, staff, and people working in industry. Graduate students made up 35% of our respondents.
Below is a breakdown of our respondents by the country in which they attended a Carpentries workshop.
A large portion of the Carpentry learners responding to the survey attended a workshop in the United States (48.5%), followed by Canada (12.2%), the UK (7.8%), and Australia (4.8%).
The Carpentries constantly strive to improve our workshop content and operations, which means a workshop run today might differ in some ways from one run six months or a year ago. If we know how many workshops respondents attended, and how long it has been since they completed that workshop or workshops, we can take those changes into account when we assess learner responses. Locating workshops in time also allows us to account for spikes in data trends that may result from changes in our workshop operations.
74% of respondents participated in a Carpentry workshop more than one year ago, and 84% of respondents have attended only one Carpentry workshop.
The response rate from learners who attended a workshop more than a year ago speaks to their level of involvement with the Carpentries. 120 of the survey respondents are subscribed to the Carpentries newsletter, indicating their ongoing interest in our work. In survey research, it can be difficult to collect responses from participants a full year after their involvement in an event. It is great to see that learners feel positively enough about their experience to, firstly, be receptive to our email communication and, secondly, to have taken the time to complete the survey in full. As mentioned in the introduction, this is a potential limitation of self-reported data: those who had negative experiences may be less likely to have completed the survey.