Eichler Lab

Department of Genome Sciences,
University of Washington

Computational Facilities:


The Eichler Lab computational facilities consist of two main components: a high-performance cluster and network-available storage. In addition to the Eichler Lab's dedicated systems, the department maintains a shared infrastructure and an IT team.


The Eichler Lab's Linux-based high-performance cluster comprises 166 nodes with an aggregate 2,516 CPU cores. The cluster is particularly well suited to parallel workloads such as large assemblies, variant callers, and annotation pipelines. In addition, 1,840 terabytes (TB) of usable network storage are available to compute nodes and desktop systems.



The Genome Sciences (GS) Department has a fully dedicated IT staff of eight professionals: five System Engineers, two System Administrators, and an IT Director. Shared departmental computing resources include 42 TB of online storage, a high-performance cluster with an aggregate 588 CPU cores, and a Globus/Aspera data transfer system with 10 GB Internet2 connectivity.

The IT group has significant bioinformatics and scientific computing experience. It manages a collection of lab-specific scientific computing systems that includes high-performance clusters with 9,000+ total CPU cores, 9 petabytes (PB) of network-available storage, and 600+ server systems. GS-IT also provides cloud computing architecture, security, and support services. Three IT team members are Amazon-certified AWS Solutions Architect Associates.

On-premises systems are housed in the department's dedicated data center, which has redundant cooling, UPS/generator-backed power, key-card-controlled access, environmental monitoring, seismically braced racks, and gas fire suppression. Departmental and lab-specific systems are backed up to tapes, which are regularly shipped offsite for third-party vaulted storage. In addition to close monitoring, weekly restore tests of randomly selected file systems confirm that the backups are functional. Tape backups are performed using two Oracle SL3000 tape robots with 36 tape drives and an 18 PB tape slot capacity.


Databases

Software

Segmental Duplication Assembler
SMRT-SV v2
PARASIGHT
Multiple Alignment Manipulator (MaM)
DupMasker
mrFAST: micro-read Fast Alignment Search Tool
mrsFAST: micro-read substitution-only Fast Alignment Search Tool
drFAST: dibase-read Fast Alignment Search Tool
VariationHunter/CommonLAW: Tool for Structural Variation Detection using Next-Gen Sequencing
NovelSeq: Tool for Novel Sequence Detection using Next-Gen Sequencing
SPLITREAD: Split read-based INDEL/SV caller for detecting structural variants and indels from genome and exome sequencing data
CoNIFER: Copy Number Inference from Exome Reads