
Circular dichroism (CD) spectroscopy is a widely used method for studying protein structures in solution. Modern synchrotron radiation CD (SRCD) instruments have considerably higher photon fluxes than conventional lab-based CD instruments, and hence can routinely measure CD data to much lower wavelengths. Recently a new reference dataset of SRCD spectra of proteins of known structure, designed to cover secondary structure and fold space, has been produced which includes low wavelength (vacuum ultraviolet, VUV) data. However, the existing algorithms used to calculate protein secondary structures from CD data have not been designed to take optimal advantage of the additional information in these low wavelength data.

In this study, we have optimised secondary structure calculation methods based on the low wavelength CD data by examining existing algorithms and secondary structure assignment schemes, and then developing new methods which have produced clear improvements in prediction accuracy, especially for β-sheet components. We have further shown that if precise measurements of protein concentrations, and therefore spectral magnitudes, are not available, the inclusion of the low wavelength data will significantly improve the analyses. However, we have also demonstrated that the new reference dataset, methods, and assignments can improve the analyses of conventional circular dichroism data, even if the low wavelength data are not available.

VUV CD data include important information on protein structure which can be exploited with the algorithms and methodologies described.

Circular dichroism (CD) spectroscopy measures the differential absorbance of left- and right-handed circularly polarised light as it passes through a sample of chiral molecules. In the far ultraviolet (UV) region of the electromagnetic spectrum, the electronic transitions of amide backbone groups dominate the CD spectra of proteins, with different types of secondary structures producing characteristic spectra. Hence, the far UV CD data have been used for empirical determinations of protein secondary structure contents by employing the different reference dataset/algorithm combinations currently available.

A reference dataset consists of the CD spectra of a group of proteins, along with their corresponding secondary structure assignments derived from crystal structures. There are many methods of assigning protein secondary structures from crystallographic data, including those based on Cα coordinates or hydrogen bonding patterns only, or in combination with phi and psi angles. In addition, the Xtlsstr algorithm (based on various dihedral angles) was developed with the aim of being more relevant to spectroscopic measurements. Currently, however, there is no consensus as to which of these secondary structure assignment methods correlates best with CD spectroscopic data.

The accuracy of an empirical analysis depends on the reference dataset containing representations of the types of structures present in the unknown protein. Whilst existing methods tend to produce excellent results for the helical content, they are generally not very accurate in defining β-sheet and β-turn structures, and for the most part do not break down the secondary structural types into several of the components that are now seen to be functionally important, namely polyproline II (PP-II) helices, 3₁₀ helices and different types of turns. Synchrotron radiation circular dichroism (SRCD) beamlines, which provide very bright light sources, can routinely enable the measurement of CD data to much lower wavelengths than can be achieved in conventional lab-based CD instruments. Recently a new larger and broader-based reference dataset containing the SRCD spectra of proteins of known structure has been produced. This contains the spectra of more than 70 proteins and has been designed for extensive coverage of both secondary structure and fold space.
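The empirical approach described in this section, in which the spectrum of an unknown protein is analysed against a reference dataset of spectra with known secondary structure fractions, can be illustrated with a minimal least-squares sketch. This is not one of the algorithms evaluated in this study; it is a generic linear model that assumes each spectrum is a weighted sum of one "basis" spectrum per secondary structure class. The function names (`fit_basis_spectra`, `estimate_fractions`) are hypothetical, and the reference data are synthetic, noise-free random numbers generated only to show the mechanics.

```python
import numpy as np


def fit_basis_spectra(ref_fractions, ref_spectra):
    """Estimate one basis spectrum per secondary structure class.

    Solves ref_fractions @ basis ~= ref_spectra in the least-squares sense.
    ref_fractions: (n_proteins, n_classes) fractions from crystal structures.
    ref_spectra:   (n_proteins, n_wavelengths) measured CD spectra.
    Returns basis with shape (n_classes, n_wavelengths).
    """
    basis, *_ = np.linalg.lstsq(ref_fractions, ref_spectra, rcond=None)
    return basis


def estimate_fractions(basis, spectrum):
    """Estimate class fractions for one spectrum of shape (n_wavelengths,).

    Solves basis.T @ fractions ~= spectrum in the least-squares sense.
    No non-negativity or sum-to-one constraint is imposed in this sketch.
    """
    fractions, *_ = np.linalg.lstsq(basis.T, spectrum, rcond=None)
    return fractions


# Synthetic illustration only: 3 classes, 10 reference proteins, 40 wavelengths.
rng = np.random.default_rng(0)
true_basis = rng.normal(size=(3, 40))                 # made-up basis spectra
ref_fractions = rng.dirichlet(np.ones(3), size=10)    # each row sums to 1
ref_spectra = ref_fractions @ true_basis              # noise-free spectra

unknown_true = np.array([0.5, 0.3, 0.2])              # made-up composition
unknown_spectrum = unknown_true @ true_basis

basis = fit_basis_spectra(ref_fractions, ref_spectra)
estimated = estimate_fractions(basis, unknown_spectrum)
print(np.round(estimated, 3))
```

Because the synthetic spectra are exact linear combinations, the estimate recovers the made-up composition. With real data the result depends on how well the reference proteins represent the structures in the unknown protein, and on the wavelength range included in the fit, since extending the spectra to lower wavelengths adds columns to the system being solved.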
