Avida/Software Carpentry Hybrid Workshop

Posted October 20, 2015 by Emily Dolson in Information

This weekend, Josh, Cliff, and I ran a Software Carpentry workshop targeted at people interested in learning to use Avida for their research. After taking the instructor training course in May, we realized that the core skills covered in Software Carpentry (shell scripting, Git, and a programming language) align very well with the skills needed to use Avida. The only thing left to do was add a lesson on Avida as the fourth session! Since there is an ever-growing group of biologists interested in learning to use Avida, this seemed like a good idea, so we gave it a try.

Our overall plan was to start with a brief intro to Avida and then get into the computational skills so that people could work with it hands-on. There’s a lot going on in Avida, so learning about all of its facets at once can be more overwhelming than effective. Instead, after the high-level overview, we dived straight into the shell section. Because it usually works best to run Avida on some sort of remote computer, this lesson also included a section on connecting to such resources. All of the learners had accounts on MSU’s high-performance computing cluster, which was convenient: it meant we could do most of the lesson with everyone in a common environment. I organized the lesson as follows:

  • I did a brief demonstration of using the shell to navigate everyone’s local computers and then did the rest of the lesson on the cluster.
  • From there, the Software Carpentry shell lesson flowed pretty naturally.
  • I used the section on creating files and directories to talk about laying out a directory structure for an experiment, took a brief break to demonstrate running Avida and playing around with the configuration settings, and then demonstrated writing shell scripts to explore the data generated by Avida.
  • This fed neatly into talking about the shell scripts used to submit jobs to the cluster queue.
  • The lesson ended with everyone writing a script to submit a mini version of a classic Avida experiment. The lesson plan for the Avida-ified version of this section is still in progress, but here’s the draft I used (once I finish the TODOs, it will probably double as a nice step-by-step tutorial).
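
To give a flavor of that last step, here is a minimal sketch (not the script from the lesson) that generates one submission script per replicate. It assumes a Torque/PBS-style scheduler, an `avida` executable in the submission directory, and that `-s` sets the random seed while `-set DATA_DIR` redirects output; check these against your cluster and your Avida version. The function names, file names, and walltime are made up for illustration.

```python
from pathlib import Path


def make_job_script(seed, walltime="00:30:00"):
    """Build the text of a Torque/PBS-style job script for one Avida replicate."""
    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N avida_seed_{seed}",    # job name shown in the queue
        f"#PBS -l walltime={walltime}",  # requested run time (HH:MM:SS)
        "cd $PBS_O_WORKDIR",             # start where the job was submitted
        # Assumed Avida invocation: -s sets the random seed and
        # -set DATA_DIR gives each replicate its own output directory.
        f"./avida -s {seed} -set DATA_DIR data_{seed}",
        "",
    ])


def write_job_scripts(seeds, out_dir="jobs"):
    """Write one script per seed and return the paths in seed order."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for seed in seeds:
        path = out / f"run_{seed}.qsub"  # hypothetical naming scheme
        path.write_text(make_job_script(seed))
        paths.append(path)
    return paths
```

Calling `write_job_scripts(range(101, 111))` would produce ten scripts, each of which could then be handed to the scheduler's submit command (`qsub` on a Torque/PBS system).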

Josh then did a more standard Git lesson, using the Avida repository as an example of the kind of large-scale collaboration that Git enables. He also demonstrated using FileZilla to pull data down from the cluster. The next day, we opened with a lesson on Python by Cliff. Since our learners had all at least seen Python before (we did a pre-workshop lesson the day before for the two who hadn’t), this lesson was focused around getting data out of Avida output files and plotting it with matplotlib. For the last section of the day, Charles stopped by and talked more about advanced features in Avida and the variety of research that’s been done with it. We closed with a round-table discussion of research projects that people were interested in using Avida for and how they would go about actually implementing them.
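
As a rough illustration of the kind of code the Python lesson builds toward (this is a sketch, not the canned functions we handed out), the reader below pulls selected columns out of an Avida `.dat` file. It assumes the usual layout of such files: whitespace-separated numeric columns, with header lines starting with `#`. The function name is hypothetical, and which column holds which statistic should be read from the header of your own file.

```python
def read_avida_dat(path, columns):
    """Read selected 1-based columns from a whitespace-delimited .dat file.

    Blank lines and lines starting with '#' (the header) are skipped.
    Returns a dict mapping each requested column number to a list of floats.
    """
    data = {col: [] for col in columns}
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip header comments and blank lines
            fields = line.split()
            for col in columns:
                # The file's header numbers columns from 1, so shift by one.
                data[col].append(float(fields[col - 1]))
    return data
```

If, for example, the header says column 1 is the update and column 4 is average fitness, then `data = read_avida_dat("data/average.dat", [1, 4])` followed by matplotlib's `plt.plot(data[1], data[4])` would draw fitness over time.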

On the whole, I think the workshop went pretty well for the first time we’d tried out this format. The feedback that we got was pretty positive (although there were some great suggestions on how to improve as well). Having Avida as a consistent motivating example throughout seemed to be effective, and people came away with a much clearer idea of how to use it going forward. Since it went so well, it’s likely that we’ll do something similar again. If you’d be interested in attending a future workshop, fill out this form, and we’ll keep you posted!

There are, as always, some things that we could have done better. Here’s what we’re planning on changing for next time:

  • Use a single Avida experiment as a motivating example throughout the workshop. If we’re planning on this from the beginning, then it will be easier to make the Python section more hands-on by having learners write the code to pull out their results and graph them. For this workshop, we were worried about having enough time to cover everything, so we tried to provide some canned functions, with mixed success. Focusing on one set of graphs should fix this (although a corollary is that we should really just provide some canned functions with Avida to handle this). Similarly, instead of making toy repositories in the Git section, we can have people make a repository for their experiment, complete with config files. Then, in the Python section, they can commit their scripts as they work on them. This way, the whole reproducible science workflow will be more apparent.
  • Potentially use scp instead of FileZilla. A few people ended up with strange errors when they accidentally rearranged their remote directory structure by clicking in the wrong place. The goal of using FileZilla was to be more accessible, but we’re not sure it achieved that goal.
  • Potentially rethink which programming language we use. Most of our learners were more familiar with R than Python, and if our primary motivation is graphing and stats, R might be a better fit. This is particularly true if we do bundle a better data parsing script with Avida that people can just run out of the box. That said, if you’re working with Avida, you’re going to get into situations where you need to automate things that are not already automated. Knowing a scripting language is really useful for that. Moreover, since Avidians are evolved code, I would argue that there is value in understanding the basics of programming if you’re working with Avida. Since Python is better for teaching those concepts and as a scripting language, my personal inclination is to stick with Python, but maybe restructure its role in the workshop.
  • I tried to cram so much into the shell section that there wasn’t as much time for challenge problems as there probably should have been. Pruning some of the material and finding more places to add challenge problems would probably be good.

Emily Dolson

I’m a doctoral student in the Ofria Lab at Michigan State University, the BEACON Center for Evolution in Action, and the departments of Computer Science and Ecology, Evolutionary Biology, & Behavior. My interests include studying eco-evolutionary dynamics via digital evolution and using evolutionary computation techniques to interpret time series data. I also have a cross-cutting interest in diversity in both biological and computational systems. In my spare time, I enjoy playing board games and the tin whistle.

