
About
Agilent Feature Extraction (FE) is widely used software for extracting data from CGH arrays (for a more specific description, look here). Although FE is well suited to extracting microarray data, many scientists prefer CGH Explorer (CGHe) for analysing the results. Until now there has been no way to import FE files into CGH Explorer directly, which forced users to merge 200 or more files exported from FE by hand before the data could be imported into CGHe. This was time-consuming and tedious. To solve this problem, I (Ivan Potapenko) have created a small (but hopefully useful) program that automates the process entirely.
This program processes a given directory which contains any number of Agilent Feature Extraction raw data files. These files are processed in a number of ways, which include:
- Removing unnecessary columns
- Splitting columns (chromosome number, start and stop)
- Removing flagged values
- Removing controls
- Converting log ratio values from log_10 to log_2
- Sorting data by two criteria: chromosome number and start position
- Merging duplicate entries
- Checking file consistency (cross-matching ProbeUID-s and number of lines)
After processing, the results are written to a master file, which should contain all the data CGH Explorer needs and can be imported right away.
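Some of the steps listed above (splitting the location column, converting log_10 to log_2, sorting by chromosome and start position, and merging duplicates) can be illustrated with a minimal C++ sketch. This is not the program's actual source: the `Probe` type, the function names, the `chrN:start-stop` location format, the X = 23 / Y = 24 encoding and the choice to merge duplicates by averaging their log ratios are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record type; field names are illustrative only.
struct Probe {
    int chromosome;   // 1-22, with X = 23 and Y = 24 (assumed encoding)
    long start;
    long stop;
    double logRatio;  // stored as log_2 after conversion
};

// Split a location string of the assumed form "chr7:55019017-55211628"
// and convert the accompanying log_10 ratio to log_2.
Probe parseLocation(const std::string& location, double log10Ratio) {
    const std::size_t colon = location.find(':');
    const std::size_t dash = location.find('-', colon);
    if (location.compare(0, 3, "chr") != 0 ||
        colon == std::string::npos || dash == std::string::npos) {
        throw std::invalid_argument("unexpected location: " + location);
    }
    const std::string chrom = location.substr(3, colon - 3);
    Probe p;
    if (chrom == "X") {
        p.chromosome = 23;
    } else if (chrom == "Y") {
        p.chromosome = 24;
    } else {
        p.chromosome = std::stoi(chrom);
    }
    p.start = std::stol(location.substr(colon + 1, dash - colon - 1));
    p.stop = std::stol(location.substr(dash + 1));
    // log_2(x) = log_10(x) * log_2(10)
    p.logRatio = log10Ratio * std::log2(10.0);
    return p;
}

// Ordering by two criteria: chromosome number first, then start position.
bool byPosition(const Probe& a, const Probe& b) {
    return std::tie(a.chromosome, a.start) < std::tie(b.chromosome, b.start);
}

// Sort the probes, then collapse entries that share chromosome and start
// into one entry carrying the mean log ratio (averaging is an assumption).
std::vector<Probe> sortAndMerge(std::vector<Probe> probes) {
    std::sort(probes.begin(), probes.end(), byPosition);
    std::vector<Probe> merged;
    std::size_t i = 0;
    while (i < probes.size()) {
        std::size_t j = i;
        double sum = 0.0;
        while (j < probes.size() &&
               probes[j].chromosome == probes[i].chromosome &&
               probes[j].start == probes[i].start) {
            sum += probes[j].logRatio;
            ++j;
        }
        Probe p = probes[i];
        p.logRatio = sum / static_cast<double>(j - i);
        merged.push_back(p);
        i = j;
    }
    // Sanity check: the merged output is still in sorted order.
    assert(std::is_sorted(merged.begin(), merged.end(), byPosition));
    return merged;
}
```

In this sketch a caller would build one `Probe` per data row with `parseLocation` and pass the collected vector to `sortAndMerge`. Sorting first keeps duplicates adjacent, so they can be merged in a single pass over the data.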
The current version supports 44k and 244k Agilent CGH arrays processed by most versions of FE.
However, if you encounter any problems with the version you are using, please contact me and I will be happy to help.
This program is distributed under the GNU General Public License (GPL) version 3. You can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
Downloads
Latest version is: 13.0.1 BETA (last modified: 28.03.10)
Source code: Compatible with all operating systems (source only). A Makefile for Unix-like systems is included.
Windows binary: For Windows only (precompiled 32-bit binaries)
Note: The manual is included in all of the archives listed above, but can also be viewed separately here
NB! Probe classification is frequently updated by Agilent. To avoid incompatibilities between your files it is highly recommended to use the same Feature Extraction version to export all files you want to merge.
Version History
- v.13.0.1: The error messages have been revised and should now be slightly more helpful. Updated the manual to reflect changes in v.13.0.0.
- v.13.0.0: A major upgrade. Instead of trying to keep up with every version of the FE file format, columns are now loaded dynamically based on column headers. This dramatically expands the number of supported versions. Some of the helper functions have also been rewritten to make better use of various C++ features.
- v.12.0.0: Brings three main features: 1) Both 44k and 244k CGH array data may now be merged; 2) Fixed another severe bug (SIGSEGV on duplicate value check); 3) Implemented file format retrieval from header (finally) - so we are no longer dependent on properly named files. Also some minor internal cleanup and user input modifications. Manual updated accordingly.
- v.11.0.2: Fixed a severe bug which crashed the program on unmapped values in files formatted according to 4.10.apr08.
- v.11.0.1: Code cleanup and commenting. Updated manual (contents and design).
- v.11.0.0: Support for FE-format version 4.10.apr08.
- v.10.1.2: From now on all FE-values (log_10) are converted to log_2 internally. Small changes to Makefile.
- v.10.1.1: Fixed a small bug introduced in 10.1.0 which prevented the Windows version from running at all; timer enabled for Windows; Linux script rewritten; manual changed accordingly
- v.10.1.0: Implemented a workaround for a compiler bug in Windows (calling dtor deep within vector array before init on throw); startup script in Linux should now be interactive; manual changed accordingly; minor file open bug fixed; some code cleaning; a few minor archive changes; minor Makefile changes
- v.10.0.0: Ugly bug in dupe remover fixed; manual cache definition for Windows users; huge code cleanup; some annoying user input bugs removed
- v.9.5.1: A different algorithm to merge duplicates has been implemented, should now be 400% or so faster.
- v.9.0.0: Duplicate removal has been implemented
- v.8.0.0: From now on, we are (almost) 100% cross-platform. After a few minor changes, the abnormal termination previously seen with some STL and compiler versions no longer occurs.
- v.7.5.4: Fixed a bug that occurred when reusing I/O streams for file reading and writing
- v.7.5.3: Cleaned up the code, corrected some outdated comments, error throwing is now up to date
- v.6.1.0: Caching implemented and working
- v.5.0.0: Program should now be compatible with Windows paths.
- v.4.0.0: Multiple files can now be processed at the same time.
- v.3.1.0: Many bug fixes, everything should now be working more or less as expected.
- v.2.0.0: Some speed enhancements.
- v.1.0.0: Initial release, processes one single file
Contact
If you have any questions about using this program, want to report a bug, or have some feedback, please email me.
Links