The Carpentries Workbench

Last updated on 2024-05-22 | Edit this page

Overview

Questions

  • How is a lesson site set up and published with GitHub?
  • What are the essential configuration steps for a new lesson?
  • How do you create and modify the pages in a lesson?

Objectives

After completing this episode, participants should be able to…

  • Identify the key tools used in The Carpentries lesson infrastructure.
  • Complete the fundamental setup steps for a new lesson repository.
  • Edit Markdown files using the GitHub web interface.

At this stage in the training, you should have a clear idea of who the people are that you want to teach this lesson to, and what exactly the skills are that you want them to learn while they are following it. It is time to begin creating a website that will present that lesson to the world!

GitHub Pages


The source of all The Carpentries lessons is made publicly-available in repositories on GitHub. By making our repositories public like this, we encourage others to help us maintain and improve our lessons, and make it as easy as possible for them to re-use and modify our lessons for their own purposes.

GitHub provides a hosting service to open source projects such as these, allowing users to present their projects to the wider world. The repository includes a complete history of the changes made between versions of the individual files in the project, and provides many features that facilitate collaboration on projects. We will learn more about some of those collaborative features later in this training. For now, we will focus on one other important feature that GitHub provides: website hosting.

Via a system called GitHub Pages, users can build and host websites from the files present in any repository on GitHub. For many years, this has been how The Carpentries presents its lesson websites to the world.

Using The Carpentries Workbench


Carpentries lesson websites are built with The Carpentries Workbench, a toolkit that converts Markdown and RMarkdown files into the HTML that can be served by GitHub Pages. We will use it now to initialise a new lesson.

A Brief History of The Carpentries Lesson Infrastructure

In the past, our lesson sites were generated by software called Jekyll, a tool built into GitHub Pages that allows the webpage content, written in the text files of the repository, to be combined with descriptions of settings, structure, and styling, to create a website. The template used by Jekyll to structure and style lessons was initially developed in 2013/2014 by Abby Cabunoc Mayes, Greg Wilson, Jon Pipitone, and Michael Hansen for Software Carpentry. It was expanded and maintained by many members of the community over almost a decade, to also support Data Carpentry, Library Carpentry, and many other lessons.

In 2022, we adopted a new infrastructure for our lesson sites: The Carpentries Workbench. Lesson sites built on the Workbench are still hosted with GitHub Pages, but no longer use Jekyll. Instead, the lessons are built using a programming language, R, and pandoc, a software designed for converting content between file formats. The Workbench combines three R packages:

  • sandpaper: converts a collection of Markdown or RMarkdown files into the structure of a lesson website.
  • varnish: provides the files and folders that add styling and additional functionality to a lesson website.
  • pegboard: a programmatic interface to the lesson, enabling various automated validation tasks.

For lesson developers, the Workbench makes The Carpentries lesson repositories much simpler to navigate and work with.

Creating a Lesson Repository

To get started, we first need to create a new repository for our lesson. We will use a template to do this, so that the new repository contains the basic files and folders that the Workbench needs in order to build a lesson site. There are currently two templates to choose between:

  1. A Markdown template
  2. An RMarkdown template, best suited to lessons you expect to include R source code that will generate output.

One member of each participating lesson team should choose one of these templates, following the link above and completing the configuration as follows:

  • click on the green “Use this template” button near the top right of the window
  • add a name for the repository
    • The name should be descriptive but fairly brief, with hyphens (-) to separate words
    • the name can always be changed later, via the repository settings
  • make sure the “Include all branches” box is checked to copy all branches from the template repository
    • this is important as lessons are developed in branch main but rendered into websites from branch gh-pages
  • in the “Description” field, write the title of your lesson
  • choose “Public” visibility

After pressing the Create repository button, you should be presented with a brand new lesson repository, like in the picture below.

Directory structure of a new lesson repository created from a lesson template
Directory structure of new lesson repository created from a lesson template. Note that new repositories created from the R Markdown lesson template will include an additional renv/ directory.

Adding collaborators

To be able to add content to the lesson, your collaborators on this project will need access to the repository. To add collaborators to the repository, navigate to Settings, then choose Collaborators from the left sidebar. Now repeat the following steps for every collaborator working with us on the project.

  • Click Add people and enter the GitHub username of one of our collaborators.
  • Click Add USERNAME to this repository.

Your collaborator should receive an email inviting them to join the repository. After they have accepted this invitation, they should be able to edit the repository, adding new files and modifying existing ones. Only the person who created the repository will be able to adjust the repository settings.

Repository Files

The repository contains a number of files and folders. Most of these are source files for the content of our new lesson, but a few are accompanying files primarily intended for the repository itself rather than the lesson website. These are:

  • CITATION.cff
  • CODE_OF_CONDUCT.md
  • CONTRIBUTING.md
  • LICENSE.md
  • README.md
  • .gitignore
  • and the .github/ folder.

We will address all of the files later in the training. For now, we will move on to complete the basic setup of the lesson.

Configuring a Lesson Repository

Since your lesson was created from a lesson template which contains gh-pages branch, GitHub Pages should already be generating a rendered version of your lesson from this branch. If it isn’t, the instructor should work with groups to activate GitHub Pages.

Which Branch and GitHub Actions

Although we configure GitHub Pages to serve the lesson website from the gh-pages branch, the default working branch for a lesson will be main. For the rest of this training, you should add and edit files on main, and in future, when you open Pull Requests to update the lesson content, these should also be made to main. The gh-pages branch should never be modified by anyone other than the automated actions-user account.

To see the rendering of the gh-pages branch by the automated actions-user, you can click the “Actions” tab from the main repo page. This tab shows the actions triggered by each commit and if they are in-progress or completed (sucessfully or with an error). You can also see the details of actions in more depth by clicking on an individual commit in this page.

On the repository home page (e.g. by clicking the name of the project near the top left of the window), click the gear wheel icon near the top right, to edit the About box. Check the box that says “Use your GitHub Pages website” to add the address of your lesson site to the About box.

After following these steps, when you navigate to the pages URL, you should see a lesson website with The Carpentries logo and “Lesson Title” in the top-left corner. You may need to wait a few minutes for the website to be generated.

config.yaml

The lesson title can be adjusted by modifying the config.yaml file in the repository. The config.yaml file contains several global parameters for a lesson - to determine some of the page styling, contact details for the lesson, etc. config.yaml is written in YAML, a structured file format of key-value pairs in the form key: value. For example, a YAML file of personal data might include lines such as name: 'Mei', height_m: 1.84, and birthdate: 1899-01-12. As well as the title of the lesson, you can and should adjust some of the other values in config.yaml, but you should not need to add new values or learn a lot about YAML to be able to configure your lesson. You should also only have to add or modify single-line string values and list entries, and not deal with multi-line strings or other data types (e.g. numbers, booleans, dictionaries) that YAML supports. The template config.yaml aims to guide users through examples and annotations on how to format the values they provide e.g. if you are modifying a value wrapped in quotation marks in the template file, it is safest to replace it with another value within quotation marks.

Practice withconfig.yaml(5 minutes)

Complete the configuration of your lesson by adjusting the following fields in config.yaml. Remember that FIXME values with quotes should be replaced with another value still in quotes and FIXMEs without quotes should be replaced by values without quotes.

  • contact: add an email address people can contact with questions about the lesson/project.
  • created: the date the lesson was created (today’s date) in YYYY-MM-DD format.
  • keywords: a (short) comma-separated list of keywords for the lesson, which can help people find your lesson when searching for resources online. At a minimum, include ‘lesson’, ‘pre-alpha’, and your lesson’s (human) language. For example, ‘lesson, pre-alpha, español’.
  • source: change this to the URL for your lesson repository.

We will revisit the life_cycle and carpentry fields in config.yaml later in this training.

.Rproj file

The repository also contains a file called FIXME.Rproj, a project file that RStudio (a commonly-used editor for working on lessons) can use to track some options and preferences for working on your lesson on your local machine. This file should be renamed. You can call this file whatever you like, but the convention is for its name to match that of the lesson repository.

README.md

The README.md file is the “front page” of your lesson repository on GitHub, and is written in Markdown. You should use it to provide basic information about the project, for anyone who finds their way to the source files for your lesson.

Improving theREADME.md(5 minutes)

Take a few minutes to update it with some basic information about the project:

  • the lesson title
  • a short description of the lesson
  • a list of the names of the authors, optionally linked to their GitHub profile

We will revisit README.md later in the training with more details on what to include in this file.

Adding Lesson Content


Now that we have completed the basic configuration of a lesson site, it is time to move on and look at where the actual lesson content will be written.

Lesson Home Page

The “home page” of a lesson is created from the index.md file: this file should contain a brief introduction to the lesson, designed to give visitors to the page a first impression of the lesson and whether it is appropriate for them. The file begins with a short header, written in the same key-value pair syntax we encountered in config.yaml.

MARKDOWN

---
site: sandpaper::sandpaper_site
---

This header configures how the site will be built by sandpaper, one of the components of The Carpentries Workbench, and should be left untouched. The page content begins after the blank line that follows this header.

Markdown Rendering on GitHub

You might have noticed by now that GitHub provides a preview of content in Markdown files, and displays an interpreted (“rendered”) version of Markdown file content by default in its web interface. You might also notice that the content of index.md is displayed differently when viewed on GitHub from how it appears on the homepage of your lesson website: the header mentioned above appears in a tabular layout on GitHub, but is not visible on the lesson homepage.

This is because the lesson infrastructure accepts an expanded set of elements in the source Markdown files: syntax for components of a lesson website that are not part of the standard set of Markdown features, e.g. page metadata such as the YAML header above, or blocks to make exercises and solutions visually distinctive. In these cases, you may see GitHub render the content differently from how it appears on your website, or not at all.

We will encounter more examples of this throughout the lesson, but the important thing to remember for now is that how GitHub displays source (R) Markdown files is not a reliable indicator of how content will be displayed in your website.

To get started on this home page, replace the template text with the same short description you added to your README.md.

Below those, you can add the prerequisite skills you determined earlier for your lesson, as a bullet point list to index.md:

MARKDOWN

- prerequisite 1
- prerequisite 2
- ...

Exercise: practice editing Markdown in GitHub (optional)

Add the objectives you defined for your lesson as a bullet list in the index.md file of your lesson repository.

Episodes


The main body of the lesson is written in episodes: the individual chunks or sections that the lesson is separated into. The episode pages of the lesson site will be constructed from Markdown files in the episodes folder of the lesson repository.

The episodes folder of the new repository contains a single file, introduction.md. The content of this file includes examples of how to write Markdown files for The Carpentries Workbench.

There are two important things to note:

  1. Like index.md, the episode file begins with a short header. The fields in this header describe the episode title and the estimated time (in minutes) required to teach it and complete the exercises.
  2. The example content includes several lines that start with strings of colons (:::::::). These mark the beginnings and ends of structural elements within the page, called fenced divs. Each fenced div starts and ends with a string of colons. A word at the end of the starting colons indicates what kind of block it is. We will explore them in more detail later in the training.

Creating a new episode

Let’s create a new episode file, for one of the episodes you have just identified. First, open the “raw” view of the introduction.md example episode, and copy the first 19 lines, down to the blank line under the closing string of the objectives div.

MARKDOWN

---
title: "Using Markdown"
teaching: 10
exercises: 2
---

:::::::::::::::::::::::::::::::::::::: questions

- How do you write a lesson using Markdown and `{sandpaper}`?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Explain how to use markdown with The Carpentries Workbench
- Demonstrate how to include pieces of code, figures, and nested challenge blocks

::::::::::::::::::::::::::::::::::::::::::::::::

Now create a new file in the episodes folder. Based on the episodes you planned out in Defining lesson objectives, choose a name for it that concisely describes the intended content, e.g. data-visualisation.md (or data-visualisation.Rmd for a lesson based on R Markdown). It is vital to include the file extension when naming this file: only files with the .md or .Rmd extensions will be built into webpages by the lesson infrastructure.

For page content, paste those first 19 lines of the introduction.md file and:

  1. replace the title
  2. set the teaching and exercises fields to zero for now
  3. replace the contents of the questions and objectives divs with “- TODO”

MARKDOWN

---
title: "Episode Title"
teaching: 0
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- TODO

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- TODO

::::::::::::::::::::::::::::::::::::::::::::::::

Adding a new episode to the lesson navigation

This new episode will not yet appear in the navigation of your lesson site. To enable this, we need to specify where the episode should appear in the order of the lesson. That episode order is defined in the episodes field of the config.yaml file:

YAML

# Order of episodes in your lesson
episodes:
- introduction.md

Add the name of the new episode file you created to this list in config.yaml and commit this change, for example:

YAML

# Order of episodes in your lesson
episodes:
- introduction.md
- data-visualisation.md

After the lesson site has been rebuilt on GitHub, you should see the episode title appear under Chapters in the left sidebar navigation of your lesson site after refreshing the webpage. Clicking on that title will take you to the episode page built from the file you created. At the top of the page body, you will find the episode title and an Overview box containing a list of the questions and objectives defined for the episode. Later, we will add more content to your chosen episodes.

TODO add a labelled screenshot of a new episode page

Exercise: practice creating episodes (10 minutes)

Repeat the steps you just saw, to create another new episode file and add it to the lesson site navigation. If you know what another episode in your lesson will be about, create the page for that. Otherwise, feel free to use any values you like for the file name and episode title.

Using this approach, we can build up our lesson one episode at a time.

We have now learned everything we need to be able to use The Carpentries Workbench, and our focus can return to the process of developing a new lesson.

Sometimes, formatting errors and typos in the files of your repository can cause the process that builds your lesson website to fail.

You will likely first notice the failure to build the lesson if the lesson website is not showing the changes you made. You may also receive an email from GitHub about the build failure if that is your preferred way of receiving notifications (but it may take you some time to realise this). The good thing is that GitHub keeps your previous lesson version online until the error is fixed and a new build is completed successfully. If you are not yet familiar with the GitHub interface, it can be difficult to pinpoint the problem.

Here are some points of advice that you can follow to help find and fix the problem when you notice the build process fail.

  • When you first notice that there is a problem, stop editing your lesson: any subsequent changes that you make will not be included in your lesson site until the problem is fixed, and might introduce additional issues, making it more difficult to find the original cause.
  • Look at the history of commits (the link with a reversing stopwatch icon and “NN commits” at the top of the listing of files and folders in the repository homepage). Is there a place where the build has a red cross instead of green circle? If so, click on that commit to look at the “diff” (where it shows which lines have been modified). The error is likely to have been introduced by these changes.
  • Ask for help: you can open an issue on your repository and tag a member of The Carpentries team to ask for help finding the problem. Alternatively, you can also ask for help in The Carpentries Slack workspace. (More about communication channels for discussing lesson development coming later in this training.)
  • Waiting for the GitHub Actions process that builds the website to run can be tedious, and slow down the process of troubleshooting: it can take a few minutes for the process to reach the point where it encounters the problem. It can be helpful to download the files for your lesson and try to build a version of the website on your local computer, where the build process will be much faster. The Workbench documentation provides instructions for installing the infrastructure and building a local preview of the lesson website.

Key Points

  • Lesson sites are built from source repositories with GitHub Pages.
  • A new lesson repository can be created from a template maintained by The Carpentries, and configured by adjusting the config.yaml file.
  • The main pages of a lesson website are created from Markdown or RMarkdown files in the episodes folder.