Bridging Boundaries in Molecular and Cellular Visualization

Published July 2014

I had the pleasure of speaking at the 2012 SciViz conference, in the historic surgical theatre of the Medical Museion in Copenhagen. The conference had the subtitle: "pushing the boundaries of scientific visualization," so I took the opportunity to look at some of the boundaries that I face when creating visual representations of molecular and cellular subjects, and some of the solutions that I've developed to address them.

Bridging Boundaries Between User Communities

When planning a visualization effort, it is important to identify the community that will use the imagery, so that the visual method meets the needs of its intended users. In the field of molecular visualization, there is a strong divide between two types of imagery that have been developed for two different user communities: graphics for research, and illustration for education and outreach.

Decades of effort in the development of molecular graphics for research have led to a robust set of tools. In the 1970s and 1980s, computer graphics was the realm of specialists, and scientists often enlisted the aid of computer graphics experts to explore their data and create images for publication. Today, high-speed computer graphics is ubiquitous and most scientists have access to hardware and software for creating their own visualizations. A wide variety of turnkey software is available for molecular graphics, and decades of work by clever scientists have produced a toolbox of representations for exploring everything from enzyme action to protein folding.

The graphics used for education and outreach have advanced along with research graphics. The goal for these images is often quite different: they tend to be simpler and more schematic, and artists are often called on to illustrate subjects with larger scope, and perhaps less hard data, than purely scientific imagery. The process of simplification often includes a loss of information; in extreme cases, the subjects may be schematized entirely as a series of circles and boxes.

In my work at the RCSB Protein Data Bank (www.pdb.org), I have developed a visual method that bridges these two user communities, creating images that are comprehensible to communities without research training but still retain enough scientific rigor to be useful to the scientific community (Figure 1). The core idea is to reduce the scientific jargon underlying much of the graphics used for research. The powerful methods used for research rely on a number of visual metaphors, each capturing a different aspect of the molecular structure. However, if the user is not familiar with the conventions, the image will be confusing or misleading.

Figure 1. Three images from the "Molecule of the Month" at the RCSB Protein Data Bank, a monthly feature designed to introduce new users to the PDB archive. A. Enhanceosome shows interaction of transcription factors (blue and purple) with DNA (yellow and red). B. Adenovirus, composed of a symmetrical arrangement of protein subunits. C. Complex I, a respiratory enzyme complex, using selective transparency to reveal a few of the cofactors in the upper portion. Images were created with entries 1t2k, 2pi0, 2o6g, 2o61, 1vsz, 1qiu, 3m9s and 3rko.

I have chosen a spacefilling representation for the molecules. This representation was developed by Linus Pauling, and is composed of a set of spheres that enclose the volume occupied by the atoms. To my mind, this best represents what a molecule "looks like," presenting the overall shape and size and giving insight into how it might interact with other molecules. For rendering, I have chosen a non-photorealistic style. This style has several advantages. Cartoons, with outlines and flat colors, are a common way to represent physical objects in the macroscopic world, so we're familiar with the graphical conventions. When applied to molecules, the outlines highlight the overall shape, form and subunit organization, and the flat shading model minimizes distracting details. The images also retain the atomicity of the structures, so viewers can easily see the size of the component atoms in relation to the whole molecule.
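
As a rough illustration of the spacefilling idea described above, the following Python sketch turns PDB-format ATOM records into a list of spheres, one per atom, each with an approximate van der Waals radius. The radii table (Bondi values), the fallback radius, the function names, and the two sample records are all assumptions made for illustration; this is not the software used to create the images shown here.

```python
from typing import NamedTuple

# Approximate van der Waals radii in angstroms (Bondi values). This table is
# an assumption for illustration, not the radii used for the published images.
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52, "P": 1.80, "S": 1.80}
DEFAULT_RADIUS = 1.70  # hypothetical fallback for elements not in the table


class Sphere(NamedTuple):
    x: float
    y: float
    z: float
    radius: float
    element: str


def spheres_from_pdb(lines):
    """Return one Sphere per ATOM/HETATM record in PDB-format text lines."""
    spheres = []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # PDB is a fixed-column format: x, y, z occupy columns 31-54 and the
        # element symbol occupies columns 77-78 (1-based column numbers).
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        element = line[76:78].strip().upper()
        radius = VDW_RADII.get(element, DEFAULT_RADIUS)
        spheres.append(Sphere(x, y, z, radius, element))
    return spheres


def bounding_box(spheres):
    """Axis-aligned box enclosing every sphere, as (min_xyz, max_xyz)."""
    lo = tuple(min(s[i] - s.radius for s in spheres) for i in range(3))
    hi = tuple(max(s[i] + s.radius for s in spheres) for i in range(3))
    return lo, hi


# Two made-up records in PDB column layout (not taken from a real entry).
SAMPLE = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
]

spheres = spheres_from_pdb(SAMPLE)
box_min, box_max = bounding_box(spheres)
```

The resulting spheres could be handed to any renderer; drawing them with outlines and flat colors, as described above, is a separate step that this sketch does not attempt.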

Experimental Boundaries and Integrative Structural Biology

Boundaries may also be imposed by the (current) limitations of experimental methods. In molecular and cell biology, there is a region of scale that is largely invisible to experiment, termed the mesoscale. At larger scales, microscopy is able to probe the ultrastructure of cells, but the resolution is not sufficient to resolve individual molecules. At smaller scales, techniques such as X-ray crystallography can determine the atomic structure of individual molecules, but only when they have been purified and separated from their cellular context. Methods are continually improving: advanced microscopy provides better and better resolution, and atomic structures are being determined for larger (and more biologically relevant) molecular complexes. Even so, the mesoscale is still largely inaccessible to experiment.

For the past two decades, I have been synthesizing images of the mesoscale, integrating information from diverse experimental sources to create images of the molecular structure of cellular environments. This effort has been greatly aided by the growing availability of digital sources of information. Three key resources have transformed the way that I approach the research for these images, by making a large body of information available. For information at the nanoscale, the Protein Data Bank is a comprehensive resource for the study of structures of individual molecules. The database recently passed the milestone of 100,000 structures, including most of the important biomolecules in cells. When atomic structures are not available, UniProt is a comprehensive sequence database, with extensive annotations on domain structure, cellular location, subunit structure, and interactions. Information on the mesoscale (concentrations, interactions, cellular locations, etc.) is more scattered, but PubMed allows instant access to the entire body of biomedical research publications. This invaluable resource is the central means of finding reports on all aspects of the simulations.

When I began creating these images, I started with two systems with abundant data: Escherichia coli and a red blood cell. At the time, the environments were far too complex to be modeled using the available computer hardware, so I created illustrations by hand, based on computational renderings of the molecular components. Computational hardware and software are catching up now, and there is currently an active effort to simulate 3D models of the cellular mesoscale.

Cross-Discipline Collaboration

Much of the current progress in science and education occurs through collaborations across the boundaries of different disciplines. I have been lucky to collaborate on a variety of projects, applying the techniques of visualization to different subjects with experts from research, education and outreach.

In a long-standing collaboration with Dan Klionsky at the University of Michigan, I have created a series of illustrations of autophagy, a process where cells engulf portions of themselves to recycle obsolete molecules. It has important connections to disease and is a topic of intensive current research. Klionsky and I have collaborated on illustrations of the entire process, filling in the molecular details as they become known from his research (Figure 2). He has used the illustrations as a way to integrate many lines of research into a coherent view of the process.

Figure 2. Planning for an illustration of autophagy included research on the molecular complexes that regulate and direct the process. A. Schematic from Klionsky showing the interactions between proteins that had been identified in his research. B. Sketch from Goodsell incorporating molecular size and domain information from annotations of each protein in UniProt. C. Detail from the final illustration; the complex of proteins is shown at lower right.

With Tim Herman at the Milwaukee School of Engineering, I have worked as part of a team to design new materials for education. This project pairs a group of students with a researcher, and together they develop a variety of educational tools for presenting the subject. For instance, with Dan Sem and students at University of Minnesota, Madison, we developed a painting of cell signaling through VEGF (Figure 3). This painting provides an overall view of the process to complement other materials that highlight the structure and function of the individual molecules.

Figure 3. Students developed a molecular story to be shown in an illustration of cell signaling by VEGF. A. Students created a schematic of the network of molecules involved in the signaling pathway, and their cellular locations. Additionally, they searched UniProt and the RCSB PDB for information on the structures of each molecule. B. The team found electron micrographs to identify a region of the cell that would show cell adhesion, signaling at the membrane, and the effect of signaling in the nucleus. This is an illustration based on the micrographs. C. The final illustration, showing the junction between two cells (left), signaling at the cell membrane (upper left), and the nucleus (right).

I have worked on two collaborative projects with the Medical Museion in Copenhagen. In both cases, an image was needed to complement an exhibit, with the goal of highlighting some of the research being done at the University. We chose two aspects of hormone signaling and glucose transport, showing the effect of insulin on cells in one painting (Figure 4), and the release of glucagon in the other painting.

Figure 4. Detail from an illustration of insulin signaling. (1) The insulin receptor is binding to insulin (small molecules in white) outside the cell, and activating (2) signaling proteins inside the cell. Some of the effects are to transport (3) glucose transporters to the surface of the cell and stimulate (4) enzymes that store glucose in glycogen.

 

Acknowledgements

Contract Grant Sponsor: RCSB PDB (NSF DBI 0829586, NIGMS, DOE, NLM, NCI, NINDS, NIDDK). This is manuscript 27095 from the Scripps Research Institute.