DNA Fragment Calculator
Copyright (C) 1999 David L. Tabb, University of Washington
July 12, 1999
Updated June 11, 2002

SYNOPSIS:
-=-=-=-=-
DNA Fragment Calculator (DFCalc) produces lists of ions expected to
appear in tandem mass spectra of known DNA sequences. The software
allows the user to select which classes of ions are included in the
prediction (selecting among A, B, C, D, W, X, Y, and Z ions) as well as
which variants of the 5' ions are included (base losses or water
losses). Ions resulting from double fragmentations can also be
predicted using this software.

DFCalc was written in Java and can be run using any Java Virtual
Machine that supports applications written with version 1.1 of that
language. It sports a graphical user interface for entry of the
sequence by typing or via the clipboard. The GUI also allows the user
to modify the types of ions to be included in the analysis. The
resulting list of ions is directed to a file named by the user and is
formatted as a tab-delimited text file with one line per ion.

INSTALLATION AND EXECUTION:
-=-=-=-=-=-=-=-=-=-=-=-=-=-
To install DFCalc, download the DFCalc.zip file and extract its
contents to a directory on your computer. The home website for DFCalc
is:

    http://fields.scripps.edu/dfcalc

To run DFCalc, a Java virtual machine capable of running Java
applications must be installed on the computer system. The easiest way
to ensure that one is installed is simply to install Microsoft Internet
Explorer 4.0 (or later) or to install the Microsoft Java Virtual
Machine independently. If the Microsoft Virtual Machine is to be used,
the command line to make the program run looks like this:

    jview DFCalc

A DOS batch file is included in the DFCalc directory that allows the
software to be run by simply typing "go" when in that directory.

USE OF DFCALC:
-=-=-=-=-=-=-=-
The upper portion of the GUI is for sequence entry.
Simply type the DNA sequence to be analyzed into the text entry area at
the top, or copy the sequence to the clipboard in another application
and click the "Paste" button to paste it in.

The "Include" section allows the user to select the ions to be included
in the analysis from the eight types of backbone fragments possible in
DNA. Checking the box beside an ion type (a-d, w-z) will include that
class of ions in the analysis. To include a-base, b-base, c-base, and
d-base ions, check the appropriate box. To include a-water, b-water,
c-water, and d-water ions, check the "Water Loss" box.

Double fragments occur when both 5' and 3' ion fragmentations take
place. If a 10-mer forms a-7 and then cleaves as a w-7 as well, a
4-base double fragment (positions 4 through 7) is produced. These ions
can be included among those generated by DFCalc. They can either be
integrated into the list of single fragments or listed separately. The
appropriate checkboxes appear on the GUI.

Two ranges allow the user to limit the number of ions listed by DFCalc.
The charge states considered for each ion can be limited by entering
the two limiting charges in the first two text fields (the order in
which they are entered does not matter). The minimum and maximum M/Z of
ions to be reported can be set in the following two fields (again, in
either order).

DFCalc can create the file listing the ions anywhere the user prefers.
It lists its present intended target file in the line above the
buttons. The user can change the output file by clicking the "Save As"
button. A standard file dialog will appear to allow the selection of a
location for saving.

The "Quit" button will exit the program. The "Compute Ions" button will
start generating ions according to the parameters set in the GUI. This
process can take a long time, so the progress of the ion listing is
updated in the command prompt area.
The progress will slow down and speed up depending on where in the
sequence ions are being generated; when 5' ions are being generated
near the 3' end of the sequence, many more double fragments are
possible. When the program finishes, the command prompt area will be
updated with the time taken for ion enumeration and the file listing
the ions will be written to disk. The file can then be opened in a
spreadsheet or other application.

A WORD ABOUT RUN TIME:
-=-=-=-=-=-=-=-=-=-=-=-
The number of possible double fragments goes up dramatically with the
length of the sequence. Where the length of the sequence = N,
(N-1)(N-2) possible double fragments exist. This number is multiplied
by the number of charge states and by the number of 5' ion types, the
number of variations on 5' ion types (plain, -base, and -water), and
the number of 3' ion types. In short, one can easily swamp one's
computer resources generating these lists of ions. DFCalc has been used
to generate double fragments for sequences up to 140 bases in length
(although this process took 20 seconds on a Pentium II-300).

CHANGING THE DEFAULTS:
-=-=-=-=-=-=-=-=-=-=-=-
Frequent users of DFCalc may find it helpful to change the default
options of the program. This can be accomplished by editing the
DFCalc.ini file. The file must always end with a 0 in the final line.
Every other line must fit one of the following templates (comments need
not be included):

    I ABCDWXYZ            //Which ions should be included in the analysis?
    B true                //Should base loss ions be included?
    W true                //Should water loss ions be included?
    C -5 -1               //Which charge states should be included?
    M 100.0 3000.0        //What mass range should be applied?
    D true                //Should double fragments be generated?
    L true                //Should double fragments be separated from singles?
    F C:\Temp\Ions.txt    //What file should be written of the ions?

REFERENCES:
-=-=-=-=-=-
Information about the fragmentation of DNA:
McLuckey SA, Habibi-Goudarzi S.
"Decompositions of Multiply Charged Oligonucleotide Anions"
J. Am. Chem. Soc. (93) 115:12085-12095.

Comparison of positive and negative charging of DNA:
Wang P, Bartlett MG, Martin LB.
"Electrospray Collision-Induced Dissociation Mass Spectra of Positively
Charged Oligonucleotides"
Rapid Comm. Mass Spec. (97) 11:846-856.

KNOWN PROBLEMS:
-=-=-=-=-=-=-=-
DFCalc's predicted M/Z values for double fragments may be inaccurate.
If someone can help me determine the error by supplying correct M/Zs
for a particular sequence, I'd be glad to correct this.

SUPPORT OR COMMENTS:
-=-=-=-=-=-=-=-=-=-=-
If you would like more information about DFCalc or have suggestions for
its improvement, please contact David Tabb at dtabb@u.washington.edu. I
would be glad to hear from you!

ABOUT THE AUTHOR:
-=-=-=-=-=-=-=-=-
David Tabb is currently a graduate student in John Yates' lab at The
Scripps Research Institute. His interests include computers
(obviously), piano performance, cookie baking, desktop video, theology,
reading, and practically anything else you could name. Someday he may
graduate.

ACKNOWLEDGEMENTS:
-=-=-=-=-=-=-=-=-
The author would like to thank Mark Krahmer for the suggestion that led
to the creation of this software. Mark patiently tested this software
during development and suggested articles to explain the fragmentation
of DNA.

LEGAL INFORMATION:
-=-=-=-=-=-=-=-=-=-
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or any later
version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

The GNU General Public License is included with this distribution in a
file entitled gpl.txt.

CHANGELOG:
-=-=-=-=-=-
990712  DFCalc 1.0; initial public release
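
As a rough illustration of the estimate given under A WORD ABOUT RUN
TIME, the following Java sketch multiplies out the factors named there.
It is not taken from the DFCalc source: the class and method names are
hypothetical, the (N-1)(N-2) count is used exactly as quoted in this
document, and the example settings (five charge states from the
"C -5 -1" template line, four 5' ion types, three 5' variants, four 3'
ion types) are assumptions chosen only to show the arithmetic. The
result is an upper bound before any M/Z range filtering.

```java
// Hypothetical helper, not part of DFCalc: estimates how many
// double-fragment ions a run might enumerate.
class DoubleFragmentEstimate {

    // Double fragments for a sequence of length n, using the
    // (N-1)(N-2) figure quoted in this README.
    static long fragmentPairs(int n) {
        if (n < 2) {
            return 0; // too short to cleave at all
        }
        return (long) (n - 1) * (n - 2);
    }

    // Multiplies the fragment count by the factors listed in the
    // README: charge states, 5' ion types, 5' variants
    // (plain, -base, -water), and 3' ion types.
    static long ionCount(int n, int chargeStates, int fivePrimeTypes,
                         int fivePrimeVariants, int threePrimeTypes) {
        return fragmentPairs(n) * chargeStates * fivePrimeTypes
                * fivePrimeVariants * threePrimeTypes;
    }

    public static void main(String[] args) {
        // Assumed settings: 140-mer, charges -5 to -1, a-d, all three
        // 5' variants, w-z.
        long ions = ionCount(140, 5, 4, 3, 4);
        System.out.println("Estimated double-fragment ions: " + ions);
    }
}
```

With these assumed settings a 140-base sequence gives 139 x 138 = 19182
double fragments and 19182 x 5 x 4 x 3 x 4 = 4603680 candidate ions,
which shows how quickly the list grows when every option is enabled.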