Next-generation sequencing platforms are capable of generating gigabytes of data in a single sequencing run, leading to terabytes of data in a single experiment. Thus, data storage, transfer, and management will unquestionably be the rate-limiting steps in turning this new sequencing data into knowledge. Cambridge Healthtech Institute’s Sequencing Data Storage and Management convenes hardware and software engineers, database architects, storage managers, systems integrators and analysts, as well as biological researchers and bioinformaticists. Each specialty provides a unique perspective, and all must be integrated into a cohesive, comprehensive team to efficiently manage the sequencing data deluge.
8:30 am Short Course Registration
9:00 Short Course Sessions*
*Separate registration required.
12:00 pm Close of Short Courses
11:30 am-2:00 pm Conference Registration
2:00 Chairperson’s Remarks
Kevin Davies, Ph.D., Editor-in-Chief, Bio-IT World
2:05 High Performance Cyberinfrastructure Enables Data-Driven Science in the Globally Networked World
Larry Smarr, Harry E. Gruber Professor, Department of Computer Science and Engineering, University of California, San Diego; Director, California Institute for Telecommunications and Information Technology
High performance cyberinfrastructure (10 Gbps dedicated optical channels end-to-end) enables new levels of discovery for data-intensive research projects, such as next-generation sequencing. In addition to international and national optical fiber infrastructure, we need local campus high performance research cyberinfrastructure (HPCI) to provide “on-ramps,” as well as scalable visualization walls and compute and storage clouds, to augment the emerging remote commercial clouds. I will review how UCSD has built out just such an HPCI and is in the process of connecting it to a variety of high-throughput biomedical devices. I will show how high performance collaboration technologies allow distributed interdisciplinary teams to analyze these large data sets in real time.
2:50 Complex Data Analysis through Semantic Workflows
Yolanda Gil, Ph.D., Associate Director of Research, Intelligent Systems Division, Research Professor, Computer Science, University of Southern California
In the coming decades, computational experimentation will push the boundaries of current science infrastructure not only in terms of scale but also inter-disciplinary scope and integrative models of the phenomena under study. A key emerging concept is computational workflows, which can be seen as software instruments that allow a scientist to analyze raw experimental or simulation data to derive new scientific results. I will present our work to date on semantic workflows to enable new capabilities for automated workflow generation, reuse, validation, and experiment design that have the potential to increase scientific productivity by orders of magnitude. I will describe our applications in genomics and biomedical image analysis for neuroscience and cancer research.
3:25 Refreshment Break
4:00 Shock -- A Next Generation Data Management Infrastructure for a World of Democratized Sequencing
Narayan Desai, Principal Experimental Systems Engineer, Mathematics and Computer Science; Fellow, Computation Institute, Argonne National Laboratory
4:35 How to Read 1,000,000 Manga Pages: Visualizing Patterns in Art, Games, Comics, Photography, Cinema, Animation, and Print Media
Lev Manovich, Ph.D., Director, Software Studies Initiative; Professor, Visual Arts, University of California, San Diego
The explosive growth of cultural content on the web, including social media, and the digitization work by museums, libraries, and companies make possible a fundamentally new paradigm for the study of culture and media. We can use computer-based techniques for data analysis and interactive visualization already employed in the sciences to analyze patterns and trends in massive cultural data sets. I will show examples of our visualizations of patterns in art, film, animation, video games, magazines, literature and comics created in our lab, situated at the University of California, San Diego (UCSD) and the California Institute for Telecommunications and Information Technology (Calit2). I will also discuss our current work on the analysis and visualization of trends in 1 million Manga pages using the 215-megapixel HIPerSpace display developed at Calit2 and the supercomputers at the National Energy Research Scientific Computing Center (NERSC).
5:10 Ultra High-Speed Transport of Life Sciences Data over Global Networks
Michelle Munson, President & Co-Founder, Aspera, Inc.
Collaborative research teams need to efficiently exchange, process and analyze the gigabytes of data generated in a sequence run. Traditional data transport methods are unable to manage this volume of data. This session focuses on now-generation transport technologies used in genomic research that achieve up to 1000x the throughput of standard file transfer protocols. A case study of global researchers participating in the 1000 Genomes Project showcases how they have been able to exchange sequencing data at 1 Gbps.
5:45 Reception in the Exhibit Hall
7:15 Close of Day