RSE Sheffield Blog

The importance of waiting for your children (within HPC batch jobs)

Will Furnass
14 August 2019 18:00

Today I helped a researcher with an interesting problem. He had a workflow that generated the desired number of output files when run interactively on our ShARC HPC system but returned fewer output files when run as a (Grid Engine) batch job.
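The rest of that post is not reproduced on this page, so the following is only a minimal sketch of the general issue its title points at. It assumes (this is an assumption, not something stated above) that the missing files come from background child processes that are still running when their parent exits. The function name `run_and_wait` and the `out_*.txt` file names are hypothetical and chosen purely for illustration.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_and_wait(n_children, out_dir):
    """Start n_children background processes that each write one file,
    then wait for all of them before returning."""
    # Hypothetical child workload: pause briefly, then write one output file.
    child_code = (
        "import sys, time, pathlib; "
        "time.sleep(0.2); "
        "pathlib.Path(sys.argv[1]).write_text('done')"
    )
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", child_code, str(Path(out_dir) / f"out_{i}.txt")]
        )
        for i in range(n_children)
    ]
    # Block until every child has exited. Skipping this step is what would
    # allow the parent to finish before all of the files have been written.
    exit_codes = [p.wait() for p in procs]
    files = sorted(p.name for p in Path(out_dir).glob("out_*.txt"))
    return exit_codes, files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        codes, files = run_and_wait(3, tmp)
        print(codes, files)
```

In a shell job script the equivalent step is the `wait` builtin, which pauses the script until its background jobs have finished.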


Dept of Computer Science invests in GPUs for new HPC cluster

Will Furnass
17 July 2019 10:09

Here in the Dept of Computer Science (DCS) at the University of Sheffield the need for access to high-performance GPU hardware has increased considerably in the last couple of years. Within the dept, current uses include machine/deep learning and agent-based modelling (e.g. using FLAME GPU).


Two Sheffield RSEs in the 2019 SSI Fellowships cohort

Anna Krystalli, Mozhgan Kabiri Chimeh, Becky Arnold
3 June 2019 14:56

2019 is another good year for representation of the University of Sheffield and RSE Sheffield in the Software Sustainability Institute’s (SSI) Fellowship program, with Dr Mozhgan Kabiri Chimeh and Dr Anna Krystalli, both members of the RSE team, joining a cohort of 17 Fellows from around the country and across disciplines. This follows another successful year in 2018, when the University of Sheffield was represented by three Fellows: Dr Tania Allard, a member of the RSE team at the time; Dr Adam Tomkins, a neuroinformatics researcher on the Digital Fruit Fly Brain project; and Becky Arnold, a PhD student in Astrophysics. Becky went on to spend 4 months with the RSE team, working with Anna on the Turing Way project, an open source how-to guide for reproducible data science.


Wanted: a second new member of the RSE Sheffield team

Will Furnass
3 June 2019 10:58

Following on from my previous post about us recruiting for a new member of the RSE Sheffield team to work on a variety of projects, we’re also recruiting for another new RSE role for a specific project, PRIMAGE.


Wanted: new member of the RSE Sheffield team

Will Furnass
31 May 2019 12:19

We are looking for a new member of the growing Research Software Engineering (RSE) team here at the University of Sheffield. We’d love to hear from you if:


10th International Women in HPC Workshop: call for posters

Will Furnass
8 March 2019 15:00

The 10th International Women in HPC workshop will discuss methods to improve diversity and provide early career women with the opportunity to develop their professional skills and profile. It is to be held as part of the ISC (International Supercomputing) 2019 conference on Thursday June 20th 2019 in Frankfurt, Germany.


The Turing Way: An open source resource promoting best practice for reproducible research

Becky Arnold
7 March 2019 09:00

Additional material by Rosie Higman, University of Manchester

The Turing Way is a project funded as part of UKRI’s Strategic Priorities fund. It aims to help researchers and RSEs improve the reproducibility of their research. It has three main components:

  • An open source textbook hosted on GitHub providing guidance on best practice for reproducible research
  • Case studies of reproducible research
  • Workshops teaching researchers and RSEs to use tools to make their work more reproducible

If you want to be kept up to date with the project’s progress, you can sign up to the project’s newsletter. A lot of the team and I will also be at the Collaborations Workshop 2019, so come and find us if you’re there and want to chat about the project.

A Curated, Open Textbook

There are a lot of fantastic open source resources available providing tutorials and discussions related to practically every facet of reproducible research. However, these materials are not centralised or integrated, but scattered across the web. The goal of the textbook is not to reinvent the wheel by writing more, but to synthesise materials that already exist into an open-source textbook covering topics such as

  • Why research reproducibility matters and how to define it (ready for community feedback)
  • Version control (ready for community feedback)
  • Open research (ready for community feedback)
  • Sustainable research data management (in progress)
  • Testing (in progress)
  • Reproducible computational environments (in progress)
  • Continuous integration (in progress)
  • Continuous analysis (in progress)
  • Reproducibility with interactive notebooks (community help wanted)
  • Good coding practice (community help wanted)
  • Reproducibility with Deep Learning (community help wanted)

Fundamentally this is an open project, so everyone is encouraged to get involved! Whether it’s making improvements to already complete chapters, writing new ones, reviewing new work, raising issues or anything else, every contribution is welcome and appreciated. We want this to be a living, evolving resource that serves and is supported by the wider community.

Case Studies of Reproducible Research in Practice

We’re working with several researchers from the Alan Turing Institute (the Turing reproducible research champions) to reproduce their work, share their experiences and provide case studies of reproducible research in practice. We’re interested in why reproducibility is important to them, how they went about trying to work in a reproducible way, and crucially what advice they would give to their previous selves.

Once complete, these case studies will be interwoven throughout the textbook to demonstrate how theory translates to practice. They will allow readers to follow along with the work of the Turing reproducible research champions as we collaborate with them to overcome challenges including reproducible analysis when using HPC resources, managing and packaging fast-growing projects, and sharing workflows built for sensitive data.

If you have tips and tricks, or “gotchas” that caught you out as you tried to make your work reproducible, please share them through this Google form.

Workshops for Researchers and Research Software Engineers

We’re running some workshops as part of the project. Two of them (one in Manchester on March 1st and one in London on March 12th) focus on helping people to boost their research reproducibility with Binder. Binder is a tool for helping people share code without others having to worry about the computational environment it runs in or install a long list of requirements.

During these free workshops we discuss reproducible computing environments, show examples of others’ projects in Binder and help you learn how to prepare a Binder-ready project. At the end of the workshop you will be able to take some of your own content (in an R or Jupyter notebook, or scripts that can be run in the terminal) and prepare it so that it can be used by others on myBinder.org.

This workshop is geared towards people who are:

  • Interested in reproducibility, containers, Docker or continuous integration;
  • Already familiar with R Markdown or Jupyter notebooks;
  • Looking to communicate their research more effectively.

We’ll also be running a workshop on how to build a BinderHub in Sheffield on March 18th. Hosting your own BinderHub locally allows you to control who has access to code and data and to provide greater computational power. This is not the case with the public Binder instance, myBinder.org, which requires all code and data to be fully open and offers limited computational power and data storage.

During this free workshop we will demonstrate how to build your own BinderHub on Microsoft Azure cloud computing resources. We will help you get started with building a BinderHub on your institution’s computing platform and discuss the challenges of maintaining a BinderHub. At the end of the workshop you will know why this would be a useful resource for your team, and will know where to look for help and support building your institution’s BinderHub.

This workshop is geared towards Research Software Engineers and IT staff who are:

  • Interested in reproducibility, containers, Docker or continuous integration;
  • Already familiar with Binder and R Markdown or Python for data science;
  • Interested in setting up their own local BinderHub.

See you soon

Working on this project has been fascinating, and it’s been a great excuse to set aside time to learn more about a whole host of topics. It’s also given me the opportunity to work with a lot of fantastic people. I hope to see you in the future on GitHub if you have the time to contribute to the project, or even just saying hello in our Gitter chat room.

Useful links:

  • GitHub repository: https://github.com/alan-turing-institute/the-turing-way
  • Gitter chat room: https://gitter.im/alan-turing-institute/the-turing-way
  • Newsletter: https://tinyletter.com/TuringWay
  • Manchester event (1 March): https://www.eventbrite.co.uk/e/boost-your-research-reproducibility-with-binder-manchester-registration-55331997494
  • London event (12 March): https://www.eventbrite.co.uk/e/boost-your-research-reproducibility-with-binder-london-registration-55337162944
  • Sheffield event (18 March): https://www.eventbrite.co.uk/e/build-a-binderhub-registration-55336756729

xarray for Earth science

Joseph Cook (github.com/jmcook1186, @tothepoles)
8 February 2019 11:48

Why xarray


SSH Forwarding for easier HPC interaction

Phil Tooley
31 January 2019 12:00

What is SSH?


Christopher Woods talk about How To Design And Engineer Good Code For Research

Becky Arnold
30 January 2019 15:30

*Image courtesy of Nic Trott*
