An initiative reported in the Public Library of Science is attempting
to harness new technology to create an efficient process for archiving
information about the human genetic sequence. This work is reported in
an article released on July 7, 2008 in the open access journal PLoS Biology.
In sequencing DNA, scientists determine a long string of information
encoded in a four-letter alphabet, with each letter corresponding to a
specific chemical in the DNA molecule. These sequences can be very long
and complicated: the human genome, for example, consists of three
billion of these letters. Once a sequence is determined, scientific
advances have made it possible to interpret which biochemical products
result from it, usually by applying concepts from biochemistry and by
comparing the genetic sequences of similar organisms. However, this
work is painstaking and slow, and it requires a high level of expertise.
The vital information about a given gene is quite extensive. Each
identified gene in the genome has a name, a sequence, a position on a
specific chromosome, an associated protein, interaction partners, and
many more characteristics that could influence its function and
structure. Even once this information has been obtained, the options
for accessing the work scientists have done on a gene can be limited.
While the existing libraries of information, sometimes called gene
portals, are considered extremely reliable -- even definitive -- they
are usually filled with information from only a few major contributors,
which must be reviewed and updated by specific experts. Since
information about these genes actually comes from many different
researchers working independently, resources that collect this
information for efficient access are important.
In this project, researchers in San Diego, California, and St. Louis,
Missouri, are attempting to use Wikipedia to collect information,
including citations, about specific genes in the human genome and their
associated proteins. Wikipedia is a web-based information system that
relies on the contributions and audits of its users to accumulate and
edit information. The team has set out to establish a "Gene Wiki" in
which a network of articles is created by a computer program and then
enhanced by user contributions. Cumulatively, this information would
work towards describing the functions of all human genes and the
relationships between them. The researchers hope that this will allow
a more flexible accumulation of scientific information, as all readers
will also be able to edit and add to the Gene Wiki pages.
To stimulate this project, a system has been developed to automatically
post information from the existing gene portals as 'stubs' on
Wikipedia. This program downloads the information from a gene portal,
converts it to Wikipedia's format, and then posts it on Wikipedia as
necessary. The authors are confident that these stubs will prompt
scientists who find them on Wikipedia to contribute more detailed
information. Since the start of their efforts, the absolute number of
edits to pages on genes in the mammalian genome has doubled. They
encourage users to seek gene information on Wikipedia to observe this
new phenomenon.
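The formatting step described above can be sketched roughly as follows.
This is only an illustration, not the authors' program: the GeneRecord
fields, the format_stub function, the placeholder gene names, and the
exact markup layout are all assumptions made for this example, and the
download and posting steps are left out entirely.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneRecord:
    """Hypothetical record of the fields a gene portal might supply."""
    symbol: str
    name: str
    chromosome: str
    protein: str
    interaction_partners: List[str] = field(default_factory=list)


def format_stub(gene: GeneRecord) -> str:
    """Render a gene record as a short block of wiki-style markup."""
    lines = [
        # Triple apostrophes mark bold text in wiki markup.
        f"'''{gene.name}''' ('''{gene.symbol}''') is a human gene "
        f"located on chromosome {gene.chromosome}.",
        f"It encodes the protein {gene.protein}.",
    ]
    if gene.interaction_partners:
        # Double square brackets create links to other pages.
        links = ", ".join(f"[[{p}]]" for p in gene.interaction_partners)
        lines.append(f"Known interaction partners: {links}.")
    # Illustrative marker flagging the page as a stub awaiting expansion.
    lines.append("{{stub}}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only; not a real gene.
    example = GeneRecord(
        symbol="GENE1",
        name="Example gene 1",
        chromosome="1",
        protein="Example protein 1",
        interaction_partners=["GENE2", "GENE3"],
    )
    print(format_stub(example))
```

Keeping the record separate from the formatting function means the same
data could be rendered again if the page layout changes, without
touching whatever code fetches it from the portal.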
About PLoS Biology
All works published in PLoS Biology are open access. Everything is
immediately available -- to read, download, redistribute, include in
databases, and otherwise use -- without cost to anyone, anywhere,
subject only to the condition that the original authorship and source
are properly attributed. Copyright is retained by the authors. The
Public Library of Science uses the Creative Commons Attribution License.
Huss JW III, Orozco C, Goodale J, Wu C, Batalov S, et al. A gene wiki
for community annotation of gene function. PLoS Biol 6(7):e175.
doi:10.1371/journal.pbio.0060175
Anna Sophia McKenney