Anduril (workflow engine) explained
Anduril |
Latest Release Version: | 2.0.0 |
Latest Release Date: | 2016-07-01 [1] |
Operating System: | Linux, Microsoft Windows, Mac OS X |
Programming Language: | Java |
Genre: | Workflow engine |
License: | GPL (v.1.x), BSD (v.2.x) |
Anduril is an open source component-based workflow framework for scientific data analysis[2] developed at the Systems Biology Laboratory, University of Helsinki.
Anduril is designed to enable systematic, flexible and efficient data analysis, particularly in the field of high-throughput experiments in biomedical research. The workflow system currently provides components for several types of analysis such as sequencing, gene expression, SNP, ChIP-on-chip, comparative genomic hybridization and exon microarray analysis as well as cytometry and cell imaging analysis.
Architecture and features
A workflow is a series of processing steps connected so that the output of one step is used as the input of another. Processing steps implement data analysis tasks such as data import, statistical tests and report generation. In Anduril, processing steps are implemented as components: reusable pieces of executable code that can be written in any programming language. Components are wired together into a workflow, or component network, which is executed by the Anduril workflow engine. Workflows are configured in a dedicated scripting language, AndurilScript. Configuration and execution can be done from the Eclipse integrated development environment or from the command line.
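The idea of a component network can be illustrated with a minimal Python sketch. It is not Anduril's engine or API: the function `run_workflow`, the step names and the data are all hypothetical, and the only point is that each step runs after the steps whose output it consumes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, dependencies):
    """Run each step once all the steps it depends on have finished.

    steps: maps a step name to a function taking a dict of upstream outputs.
    dependencies: maps a step name to the names of the steps feeding it.
    """
    outputs = {}
    # static_order() yields every step after all of its predecessors.
    for name in TopologicalSorter(dependencies).static_order():
        upstream = {dep: outputs[dep] for dep in dependencies.get(name, ())}
        outputs[name] = steps[name](upstream)
    return outputs


# Hypothetical three-step analysis: import -> statistical summary -> report.
steps = {
    "import_data": lambda up: [2.0, 4.0, 6.0],
    "mean_test": lambda up: sum(up["import_data"]) / len(up["import_data"]),
    "report": lambda up: f"mean = {up['mean_test']:.1f}",
}
dependencies = {
    "import_data": [],
    "mean_test": ["import_data"],
    "report": ["mean_test"],
}

results = run_workflow(steps, dependencies)
print(results["report"])  # mean = 4.0
```

A real engine would additionally pass files between components written in different languages; the sketch keeps everything in memory for brevity.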
The core Anduril engine is written in Java and components are written in a variety of programming languages, including Java, R, MATLAB, Lua, Perl and Python. Components may also have dependencies on third-party libraries, such as Bioconductor. Components for cell imaging and microarray analysis are provided but additional components can be implemented by users. The Anduril core has been tested on Linux and Windows.
Anduril 1.0: AndurilScript language
Hello world in AndurilScript is simply:
    std.echo("Hello world!")
Commenting follows the syntax of Java:
    // A simple comment
    /* Another simple comment */
    /** A description that will be included in component description */
Components are called by assigning their calls to named component instances. Names cannot be reused within a single workflow. Special input components bring external files into the script. Supported atomic types are integer, float, boolean and string, and types are assigned implicitly.
    in1 = INPUT(path="myFile.csv")
    constant1 = 1
    componentInstance1 = MyComponent(inputPort1 = in1, inputParam1 = constant1)
Workflows are constructed by assigning outputs of component instances to inputs of subsequent components:
    componentInstance2 = AnotherComponent(inputPort1 = componentInstance1.outputPort1)
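The two rules described above, unique instance names and wiring an output port to a downstream input port, can be mimicked in a short Python sketch. The `Workflow` class and both component functions are hypothetical stand-ins, not part of Anduril, and the sketch evaluates components immediately rather than building a network first.

```python
class Workflow:
    """Toy registry of named component instances (not Anduril's real API)."""

    def __init__(self):
        self.instances = {}

    def add(self, name, component, **inputs):
        # Mirror the rule that a name cannot be reused within one workflow.
        if name in self.instances:
            raise ValueError(f"instance name already used: {name}")
        self.instances[name] = component(**inputs)
        return self.instances[name]


def my_component(input_port1, input_param1):
    # Hypothetical component: scales each input value by the parameter.
    return {"output_port1": [v * input_param1 for v in input_port1]}


def another_component(input_port1):
    # Hypothetical downstream component: sums whatever it receives.
    return {"output_port1": sum(input_port1)}


wf = Workflow()
first = wf.add("componentInstance1", my_component,
               input_port1=[1, 2, 3], input_param1=2)
# The output port of the first instance becomes the input of the second.
second = wf.add("componentInstance2", another_component,
                input_port1=first["output_port1"])
print(second["output_port1"])  # 12
```

Calling `wf.add("componentInstance1", ...)` a second time raises `ValueError`, which is the sketch's analogue of a name clash in a workflow script.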
Component instances can also be wrapped as functions:
    function MyFunction(InType1 in1, ..., optional InTypeM inM, ParType1 param1, ..., ParTypeP paramP=defaultP) -> (OutType1 out1, ..., OutTypeN outN)
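As a rough analogue of that signature shape, the Python sketch below has one required input, one optional input, a parameter with a default value, and several named outputs. All names are hypothetical and chosen only to echo the AndurilScript template; the body is an arbitrary placeholder computation.

```python
from typing import NamedTuple, Optional, Sequence


class Outputs(NamedTuple):
    """Named outputs, playing the role of (OutType1 out1, ..., OutTypeN outN)."""
    out1: list
    out2: int


def my_function(in1: Sequence[float],
                in2: Optional[Sequence[float]] = None,
                param1: float = 1.0) -> Outputs:
    # in2 is optional and param1 has a default, as in the template above.
    values = list(in1) + (list(in2) if in2 is not None else [])
    scaled = [v * param1 for v in values]
    return Outputs(out1=scaled, out2=len(scaled))


result = my_function([1.0, 2.0], param1=3.0)
print(result.out1, result.out2)  # [3.0, 6.0] 2
```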
In addition to standard if-else and switch-case statements, AndurilScript also includes for-loops.
    // Iterates over 1, 2, ..., 10
    array = record
    for i: std.range(1, 10)
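The loop fragment above does not show a body, so the following Python sketch only illustrates the general pattern of creating one component instance per iteration and collecting the instances in a record-like mapping. The component `my_component` and the key format are assumptions made for the example.

```python
def my_component(k):
    # Hypothetical component parameterised by the loop variable.
    return {"output": k * k}


# Inclusive 1..10, matching the comment in the AndurilScript fragment;
# Python's range excludes its upper bound, hence 11.
array = {}
for i in range(1, 11):
    array[f"instance{i}"] = my_component(k=i)

print(len(array), array["instance10"]["output"])  # 10 100
```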
Extensibility
Anduril can be extended on multiple levels. Users can add new components to existing component bundles. However, if the new component or components carry out tasks that are not related to existing bundles, users can also create new bundles.
Moksiskaan
Moksiskaan is a data integration framework for cancer research and molecular biology.[3] The framework provides a relational database that represents a graph of biological entities such as genes, proteins, drugs, pathways, diseases, biological processes, cellular components, and molecular functions. In addition, a wide set of analysis and accession tools is built on top of this data. The great majority of these tools are implemented as Anduril components and functions.
Moksiskaan is used mainly to interpret lists of candidate genes obtained from genomic studies. Its tools can be used to generate graphs of biological entities related to the input genes. The exact form of these graphs may vary from drug target predictions to time series of signalling cascades. Some of the goals of these tools are closely related to those of IPA (Ingenuity Pathway Analysis).
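Generating a graph of entities related to a list of input genes can be pictured as a bounded graph traversal. The Python sketch below is an illustration only: the edge list, the entity names and the functions `build_adjacency` and `neighbourhood` are hypothetical placeholders, and it does not reflect Moksiskaan's actual schema or tools.

```python
from collections import deque

# Hypothetical typed edges: (source, relation, target). The names are
# placeholders, not real database content.
EDGES = [
    ("geneA", "encodes", "proteinA"),
    ("drugX", "targets", "proteinA"),
    ("proteinA", "member_of", "pathway1"),
    ("geneB", "encodes", "proteinB"),
    ("proteinB", "member_of", "pathway1"),
    ("geneC", "encodes", "proteinC"),
]


def build_adjacency(edges):
    adjacency = {}
    for source, _relation, target in edges:
        # Treat links as undirected so a gene can reach a drug via its protein.
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def neighbourhood(adjacency, seeds, max_steps):
    """Breadth-first search: entities within max_steps links of any seed."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand beyond the step limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


related = neighbourhood(build_adjacency(EDGES), ["geneA"], max_steps=2)
print(sorted(related))  # ['drugX', 'geneA', 'pathway1', 'proteinA']
```

Raising `max_steps` widens the neighbourhood; with three steps the search would also reach `proteinB` through the shared pathway.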
Further reading
- Scientists develop new database that provides comprehensive view of Glioblastoma Multiforme genome in the Cancer Genome Atlas Research Briefs, March 2011, by Catherine Evans.
- Almeida, J. S. (2010). Computational ecosystems for data-driven medical genomics. Genome Medicine. 2 (9): 67. doi:10.1186/gm188. PMID 20854645. PMC 3092118.
- Sahu, B.; Laakso, M.; Ovaska, K.; Mirtti, T.; Lundin, J.; Rannikko, A.; Sankila, A.; Turunen, J. P.; Lundin, M.; Konsti, J.; Vesterinen, T.; Nordling, S.; Kallioniemi, O.; Hautaniemi, S.; Jänne, O. A. (2011). Dual role of FoxA1 in androgen receptor binding to chromatin, androgen signalling and prostate cancer. The EMBO Journal. 30 (19): 3962–3976. doi:10.1038/emboj.2011.328. PMID 21915096. PMC 3209787.
- Pihlajamaa, P.; Zhang, F.-P.; Saarinen, L.; Mikkonen, L.; Hautaniemi, S.; Jänne, O. A. (2011). The Phytoestrogen Genistein is a Tissue-Specific Androgen Receptor Modulator. Endocrinology. 152 (11): 4395–4405. doi:10.1210/en.2011-0221. PMID 21878517.
- Blom, H.; Rönnlund, D.; Scott, L.; Spicarova, Z.; Rantanen, V.; Widengren, J.; Aperia, A.; Brismar, H. (2011). Nearest neighbor analysis of dopamine D1 receptors and Na+-K+-ATPases in dendritic spines dissected by STED microscopy. Microscopy Research and Technique. 75 (2): 220–228. doi:10.1002/jemt.21046. PMID 21809413. S2CID 206067902.
- Ehlers, P. I.; Kivimäki, A. S.; Turpeinen, A. M.; Korpela, R.; Vapaatalo, H. (2011). High blood pressure-lowering and vasoprotective effects of milk products in experimental hypertension. British Journal of Nutrition. 106 (9): 1353–1363. doi:10.1017/S0007114511001723. PMID 21736845.
- Maliniemi, P.; Carlsson, E.; Kaukola, A.; Ovaska, K.; Niiranen, K.; Saksela, O.; Jeskanen, L.; Hautaniemi, S.; Ranki, A. (2011). NAV3 copy number changes and target genes in basal and squamous cell cancers. Experimental Dermatology. 20 (11): 926–931. doi:10.1111/j.1600-0625.2011.01358.x. PMID 21995814. S2CID 26219786.
- Chen, P.; Lepikhova, T.; Hu, Y.; Monni, O.; Hautaniemi, S. (2011). Comprehensive exon array data processing method for quantitative analysis of alternative spliced variants. Nucleic Acids Research. 39 (18): e123. doi:10.1093/nar/gkr513. PMID 21745820. PMC 3185423.
- Karinen, S.; Heikkinen, T.; et al. (2011). Data Integration Workflow for Search of Disease Driving Genes and Genetic Variants. PLOS ONE. 6 (4): e18636. doi:10.1371/journal.pone.0018636. PMID 21533266. PMC 3075259.
- Heinonen, M.; Hemmes, A.; et al. (2011). Role of RNA binding protein HuR in ductal carcinoma in situ of the breast. The Journal of Pathology. 224 (4): 529–539. doi:10.1002/path.2889. PMID 21480233. PMC 3504799.
- Louhimo, R.; Hautaniemi, S. (2011). CNAmet: an R package for integrating copy number, methylation and expression data. Bioinformatics. 27 (6): 887–888. doi:10.1093/bioinformatics/btr019. PMID 21228048.
Notes and References
- [1] anduril-dev / anduril / doc / ChangeLog.txt. Bitbucket. bitbucket.org. 2021-03-25.
- [2] Ovaska, K.; Laakso, M.; Haapa-Paananen, S.; Louhimo, R.; Chen, P.; Aittomäki, V.; Valo, E.; Núñez-Fontarnau, J.; Rantanen, V.; Karinen, S.; Nousiainen, K.; Lahesmaa-Korpinen, A. M.; Miettinen, M.; Saarinen, L.; Kohonen, P.; Wu, J.; Westermarck, J.; Hautaniemi, S. (2010). Large-scale data integration framework provides a comprehensive view on glioblastoma multiforme. Genome Medicine. 2 (9): 65. doi:10.1186/gm186. PMID 20822536. PMC 3092116.
- [3] Laakso, M.; Hautaniemi, S. (2010). Integrative platform to translate gene sets to networks. Bioinformatics. 26 (14): 1802–1803. doi:10.1093/bioinformatics/btq277. PMID 20507894.