A01: Data- and diversity-driven development of a Shotgun crystallization screen using the Protein Data Bank

Gabriel Abrahams1, Janet Newman2

 

1University of Oxford, Oxford, U.K.

2CSIRO Biomedical Program, Parkville, Australia and School of Biotechnology and Biomolecular Sciences, University of New South Wales, Kensington, Australia

gabriel.abrahams@dtc.ox.ac.uk

 

Protein crystallization has for decades been a critical and restrictive step in macromolecular structure determination via X-ray diffraction. Crystallization typically involves a multi-stage exploration of the available chemical space, beginning with an initial sampling (screening) followed by iterative refinement (optimization). Effective screening reduces the number of optimization rounds required, and with it the cost and time needed to determine a structure. Here, we propose an initial screen (Shotgun II) derived from analysis of the up-to-date Protein Data Bank (PDB) and compare it with the previously derived (2014) Shotgun I screen. In an update to that analysis, we clarify that the Shotgun approach entails finding the crystallization conditions that cover the most diverse space of proteins, by sequence, found in the PDB, a task that can be mapped to the well-known Maximum Coverage problem in computer science. With this realization we are able to apply a more effective algorithm for selecting conditions. In-house data show that the Shotgun I screen, compared with alternatives, has been remarkably successful over the seven years it has been in use, suggesting that Shotgun II is also likely to be a highly effective screen.

Discussion trigger: Ideally, protein crystallization screens would be designed specifically for the protein under investigation. Existing tools such as XtalPred can analyze the protein sequence to predict crystallizability. However, they give little guidance on how to crystallize a given protein, i.e. on which crystallization conditions are most likely to succeed. Recently, significant progress has been made in structural prediction and in surface characterisation (e.g. crystal vs. biological contacts). Could the crystallization data contained in the PDB, together with current approaches to analysing as yet un-crystallized proteins (such as structural prediction), be used to develop a model that can usefully inform crystallization screening?
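
The mapping of condition selection to Maximum Coverage can be illustrated with a short sketch. The abstract does not state which algorithm was used for Shotgun II, so the code below shows only the standard greedy approximation for Maximum Coverage, not the authors' method. The function name, the condition labels and the integer sequence-cluster IDs are all hypothetical, and the toy data are invented purely for illustration.

```python
from typing import Dict, Hashable, List, Set


def greedy_max_coverage(
    condition_to_clusters: Dict[str, Set[Hashable]], k: int
) -> List[str]:
    """Pick up to k conditions, each time taking the one that adds the
    most not-yet-covered sequence clusters (standard greedy heuristic)."""
    covered: Set[Hashable] = set()
    chosen: List[str] = []
    remaining = dict(condition_to_clusters)  # shallow copy; input is not mutated
    for _ in range(k):
        if not remaining:
            break
        # Sorting the keys makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=lambda c: len(remaining[c] - covered))
        if not remaining[best] - covered:
            break  # no remaining condition covers anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Hypothetical toy data: condition label -> IDs of the sequence clusters
# that have a crystal structure reported under that condition.
toy = {
    "cond_A": {1, 2, 3},
    "cond_B": {3, 4},
    "cond_C": {4, 5, 6, 7},
    "cond_D": {1, 2},
}

picked = greedy_max_coverage(toy, k=2)
print(picked)
```

In this toy case the sketch selects cond_C first (four new clusters) and then cond_A (three new clusters), so two conditions cover all seven clusters. The greedy heuristic is a common baseline for this problem because it is simple and has a known approximation guarantee, but it is not guaranteed to find the optimal set.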

 

Keywords: shotgun; crystallisation; screening