May 2014

May 22 (all day): Scalable Bioinformatics Boot Camp


Event Details

This is a free introductory event with limited space available. Registration is required. Lunch and refreshments will be provided.

Registration (two more seats available)

Location: The Synthesis Center, E-B143, is located just off the lobby of SDSC’s east entrance off Hopkins Drive. Directions to SDSC:

In the Big Data era, scalability is becoming a prerequisite for any bioinformatics application that must process large-scale datasets efficiently. This boot camp will explain how you can turn your bioinformatics applications into scalable workflows by analyzing the available options, techniques, and tools.

  • Learn about distributed platforms and systems
  • Learn about Cloud and Big Data
  • Learn about scalable workflow tools
  • Learn how to make your science reproducible
  • Gain hands-on experience with bioKepler tools to build scalable bioinformatics workflows

About the day: The day will start with a crash course on workflow technology and a hands-on session using the locally developed, open-source Kepler workflow system. We will then explore common computing platforms, including Sun Grid Engine, NSF XSEDE high-performance computing resources, the Amazon Cloud, and Hadoop. We will explain how workflow systems can help with rapid development of distributed and parallel applications on top of any of these platforms. We will then discuss how to track data flow and process executions within these workflows (i.e., provenance tracking), including intermediate results, as a way to make workflow results reproducible. We will end with a session on using bioKepler to build and share scalable bioinformatics workflows in Kepler. Each section of the course will end with a lab session that applies the concepts covered to real application case studies.


All Day (Thursday)
