Biositemap Explained

A Biositemap is a way for a biomedical research institution or organisation to show how biological information is distributed throughout its information technology systems and networks. This information may be shared with other organisations and researchers.

The Biositemap enables web browsers, crawlers and robots to easily access and process the information for use in other systems, media and computational formats. The Biositemaps protocol provides clues for Biositemap web harvesters, allowing them to find resources and content across all the interlinked sites of the Biositemap system. This means that human or machine users can access relevant information on any topic across all participating organisations and bring it into their own systems for assimilation or analysis.

File framework

The information is normally stored in a biositemap.rdf or biositemap.xml file, which lists the data, software, tools, material and services provided or held by that organisation. Information is presented in metafields and can be created online through sites such as the Biositemaps online editor.[1]
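As a rough illustration of how a consumer might read such a file, the following Python sketch parses a small RDF/XML document into a list of metafield records. The namespace `http://example.org/biositemap-sketch#`, the field `resource_name`, the sample URLs and the function `read_resources` are all assumptions made for this example and are not taken from the Biositemaps specification; only the `resource_type` field name appears in the description of the format.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical namespace for this sketch; not the real Biositemaps namespace.
EX_NS = "http://example.org/biositemap-sketch#"

# A made-up biositemap-like document with two resource descriptions.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <rdf:Description rdf:about="http://example.org/resources/aligner">
    <ex:resource_name>Example Aligner</ex:resource_name>
    <ex:resource_type>Software</ex:resource_type>
  </rdf:Description>
  <rdf:Description rdf:about="http://example.org/resources/atlas">
    <ex:resource_name>Example Atlas</ex:resource_name>
    <ex:resource_type>Data Resource</ex:resource_type>
  </rdf:Description>
</rdf:RDF>
"""


def read_resources(xml_text):
    """Return one dict of metafields per rdf:Description element."""
    root = ET.fromstring(xml_text)
    resources = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # The rdf:about attribute identifies the described resource.
        record = {"about": desc.get(f"{{{RDF_NS}}}about")}
        for child in desc:
            # Strip the "{namespace}" prefix to get the bare field name.
            field = child.tag.split("}", 1)[-1]
            record[field] = (child.text or "").strip()
        resources.append(record)
    return resources


if __name__ == "__main__":
    for resource in read_resources(SAMPLE):
        print(resource)
```

A real harvester would more likely use a full RDF library than plain XML parsing, since RDF/XML allows several equivalent serialisations of the same statements.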

The format is a blend of sitemaps and RSS feeds and is created using the Information Model (IM) and the Biomedical Resource Ontology (BRO). The IM defines the data held in the metafields, and the BRO controls the terminology used in the resource_type field. The BRO is critical in enabling other organisations and third parties to search the resources and to refine those searches.
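The division of labour between the IM and the BRO can be sketched as a simple validation step: the record layout stands in for the IM, and a controlled vocabulary stands in for the BRO. In the Python sketch below, the term set `ALLOWED_TYPES`, the sample records and the function `check_resource_types` are hypothetical placeholders; the actual BRO is a much larger ontology and its terms are not reproduced here.

```python
# Hypothetical stand-in for BRO terms; the real ontology is far larger.
ALLOWED_TYPES = frozenset({"Software", "Data Resource", "Service"})


def check_resource_types(records, allowed=ALLOWED_TYPES):
    """Split records into (valid, rejected) by their resource_type value."""
    valid, rejected = [], []
    for record in records:
        if record.get("resource_type") in allowed:
            valid.append(record)
        else:
            # Missing or uncontrolled terms would make searches unreliable.
            rejected.append(record)
    return valid, rejected


if __name__ == "__main__":
    sample_records = [
        {"name": "Example Aligner", "resource_type": "Software"},
        {"name": "Example Atlas", "resource_type": "Data Resource"},
        {"name": "Mystery Tool", "resource_type": "misc"},
        {"name": "Untyped Tool"},
    ]
    ok, bad = check_resource_types(sample_records)
    print(len(ok), "valid;", len(bad), "rejected")
```

Restricting resource_type to a shared vocabulary is what lets a search for one term match resources published by different organisations.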

Data formats

The Biositemaps Protocol[2] allows scientists, engineers, centers and institutions engaged in modeling, software tool development and analysis of biomedical and informatics data to broadcast and disseminate information about their latest computational biology resources (data, software tools and web services). The biositemap concept is based on ideas from Efficient, Automated Web Resource Harvesting[3] and Crawler-friendly Web Servers,[4] and it integrates the features of sitemaps and RSS feeds into a decentralized mechanism for computational biologists and bioinformaticians to openly broadcast and retrieve metadata about biomedical resources.

These site-, institution- or investigator-specific biositemap descriptions are published online in RDF format and are searched, parsed, monitored and interpreted by web search engines, web applications specific to biositemaps and ontologies, and other applications interested in discovering updated or novel resources for bioinformatics and biomedical research investigations. The biositemap mechanism separates the providers of biomedical resources (investigators or institutions) from the consumers of resource content (researchers, clinicians, news media, funding agencies, educational and research initiatives).
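One way a consumer application might monitor published descriptions for updated or novel resources is to compare two successive harvests. The Python sketch below is only an illustration under stated assumptions: each harvest is assumed to be a dictionary mapping a resource URL to its metafield record, and the functions `fingerprint` and `diff_harvests`, the field `version` and the sample URLs are hypothetical names that are not part of the Biositemaps protocol.

```python
import hashlib
import json


def fingerprint(record):
    """Stable hash of a record, independent of key order."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def diff_harvests(previous, current):
    """Compare two harvests (dicts of resource URL -> record)."""
    prev_urls, curr_urls = set(previous), set(current)
    updated = sorted(
        url
        for url in prev_urls & curr_urls
        if fingerprint(previous[url]) != fingerprint(current[url])
    )
    return {
        "added": sorted(curr_urls - prev_urls),
        "removed": sorted(prev_urls - curr_urls),
        "updated": updated,
    }


if __name__ == "__main__":
    last_week = {
        "http://example.org/resources/aligner": {"resource_type": "Software", "version": "1.0"},
        "http://example.org/resources/atlas": {"resource_type": "Data Resource"},
    }
    today = {
        "http://example.org/resources/aligner": {"resource_type": "Software", "version": "1.1"},
        "http://example.org/resources/viewer": {"resource_type": "Software"},
    }
    print(diff_harvests(last_week, today))
```

Because providers and consumers are decoupled, a comparison like this can run entirely on the consumer side without any notification from the publishing site.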

A Biositemap is an RDF file that lists the biomedical and bioinformatics resources of a specific research group or consortium. It allows developers of biomedical resources to describe the functionality and usability of each of their software tools, databases or web services.[2][5]

Biositemaps supplement, and do not replace, the existing frameworks for dissemination of data, tools and services. Using a biositemap does not guarantee that resources will be included in search indexes, nor does it influence the way that tools are ranked or perceived by the community. What the Biositemaps protocol does is provide clues, information and directives to all Biositemap web harvesters, pointing them to the existence and content of biomedical resources at different sites.

Biositemap Information Model

The Biositemap protocol relies on an extensible information model that includes specific properties[6] that are commonly used and necessary for characterizing biomedical resources.

Up-to-date documentation on the information model is available at the Biositemaps website.

Notes and References

  1. Biositemaps online editor. http://biositemaps.ncbcs.org/editor/
  2. Dinov ID, Rubin D, Lorensen W, et al. (2008). iTools: A Framework for Classification, Categorization and Integration of Computational Biology Resources. PLOS ONE. 3 (5): e2265. doi:10.1371/journal.pone.0002265. PMID 18509477. PMC 2386255. Bibcode:2008PLoSO...3.2265D.
  3. Nelson ML, Smith JA, del Campo, Van de Sompel H, Liu X (2006). Efficient, Automated Web Resource Harvesting. WIDM'06.
  4. Brandman O, Cho J, Garcia-Molina H, Shivakumar N (2000). Crawler-friendly Web Servers. ACM SIGMETRICS Performance Evaluation Review. 28 (2): 9–14. doi:10.1145/362883.362894. CiteSeerX 10.1.1.34.7957. S2CID 5732912.
  5. Cannata N, Merelli E, Altman RB (December 2005). Time to organize the bioinformatics resourceome. PLOS Comput. Biol. 1 (7): e76. doi:10.1371/journal.pcbi.0010076. PMID 16738704. PMC 1323464. Bibcode:2005PLSCB...1...76C.
  6. Chen YB, Chattopadhyay A, Bergen P, Gadd C, Tannery N (January 2007). The Online Bioinformatics Resources Collection at the University of Pittsburgh Health Sciences Library System—a one-stop gateway to online bioinformatics databases and software tools. Nucleic Acids Res. 35 (Database issue): D780–5. doi:10.1093/nar/gkl781. PMID 17108360. PMC 1669712.