To search a different database, click on the "change" link (found below the Quick Search box, and at the bottom of the Search and Tools menus). In the dialog that pops up, you can either search for the organism of interest in the scrollable list, or you can start typing in its name.
When a large number of databases is available, the alphabetical index to the left of the database list provides a convenient shortcut for scrolling to a desired part of the alphabet. If you start typing an organism name, the full list of databases will be replaced by a list of databases matching the string you typed -- you can use the mouse or the up/down arrows on your keyboard to select the desired database. Lists of your recently used databases and the site's most popular databases provide shortcuts for selecting those databases.
If the site supports user accounts, and you are logged in, you may select one database as your preferred database. This database will be your default selection when starting a new web session.
Once you have selected the desired database, click OK to exit the
dialog. The page will reload, and the text under the Quick Search box
should now indicate the newly selected database. Note that if you are
looking at a page that contains data from a particular organism,
selecting a new database will not affect the contents of the current
page -- the new selection will apply only to your future searches.
Some examples of what can be entered into the Quick Search box include:
A few additional rules govern searches:
The result of every object search is a table containing
the names of all objects that satisfy the search, with hyperlinks to their
corresponding data pages, along with any additional columns
relevant to the particular search. The table will initially be sorted
alphabetically by name, but small triangles in the column headers
allow the user to sort by any column, in either ascending or
descending order.
The sections below describe the different search criteria
that are available for each object type.
Quick Search
The Quick Search box in the upper right-hand corner of every page is
useful if you know the name (or part of the name) or database
identifier of the object you are searching for. You may use this box
to search for genes, proteins, compounds, RNAs, reactions, pathways,
operons, and GO terms. If the query string matches a single object, the
page for that object will be displayed immediately. If there are
multiple matches, the full list of matches will be shown, organized by
the type of object (e.g. gene, protein, etc.).
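Examples (full names): pyruvate, trpA
Examples (name fragments): kinase, pyr
Examples (full or partial EC numbers): 1.2.3.3, 1.3.99
Examples (internal database identifiers): CPLX0-3661, HEMN-RXN
Examples (identifiers from other databases): P00561, NP_414543, C00047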
This search will be limited to exact matches. In the example given,
assuming the current organism is E. coli K-12,
without the qualifier there will be several matches, including genes,
proteins, and transcription units. With the qualifier, you will be taken
directly to the trpA gene page.
This search will be limited to the specified type. In the example given,
assuming the current organism is E. coli K-12,
without the qualifier a large number of results of various types will be
returned. With the qualifier, just the seven compounds with ATP in the name
will be returned.
Allowable type-qualifiers include pathway, gene, enzyme, rna,
go-terms, compound, reaction, operon, and organism.
Search Menu: Object Searches
The Search menu contains links to specialized search pages for
Compounds, Genes/Proteins/RNAs, Reactions and Pathways. Each such
page contains options for searching using a number of different
criteria, either individually or in combination. When the page is
initially loaded, only the name searches are active, but by clicking
on the different search bars, you can enable or disable additional
search criteria. If multiple search criteria are specified for a
given search, then unless otherwise specified the results must satisfy
all of them (that is, an AND connector is used to combine the different
criteria).
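The AND rule described above can be modeled as a list of independent tests that a result must all pass. The following Python sketch is only an illustration: the `Compound` record, the helper names, and the sample weights (approximate values) are assumptions made for this example and are not part of the Pathway Tools software.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Compound:
    # Hypothetical record; the real database schema is not shown in this guide.
    name: str
    molecular_weight: float


Criterion = Callable[[Compound], bool]


def name_contains(fragment: str) -> Criterion:
    """Substring test on the name (case-insensitivity is an assumption)."""
    needle = fragment.lower()
    return lambda compound: needle in compound.name.lower()


def weight_between(minimum: Optional[float], maximum: Optional[float]) -> Criterion:
    """Range test; None stands for a blank field and leaves that side open."""
    def test(compound: Compound) -> bool:
        if minimum is not None and compound.molecular_weight < minimum:
            return False
        if maximum is not None and compound.molecular_weight > maximum:
            return False
        return True
    return test


def search(items: Iterable[Compound], criteria: List[Criterion]) -> List[Compound]:
    """Keep only the items that satisfy every active criterion (AND)."""
    return [item for item in items if all(test(item) for test in criteria)]


# Illustrative data only; the weights are approximate.
COMPOUNDS = [
    Compound("ATP", 507.18),
    Compound("ADP", 427.20),
    Compound("pyruvate", 88.06),
]

hits = search(COMPOUNDS, [name_contains("p"), weight_between(400.0, None)])
print([compound.name for compound in hits])
```

Enabling an additional search bar corresponds to appending one more test to the list; with no active criteria, every item passes.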
Search Menu → Compounds
Enter a compound
name, name fragment, or identifier (either the internal Pathway/Genome Database
identifier, or an identifier from some other database such as PubChem
or LIGAND). The software will attempt to do auto-completion on the
string you have entered based on the contents of the database. If you
select one of the auto-complete options, then when you submit the form
you will be taken directly to the data page for the selected compound,
regardless of other search criteria you may have
specified (i.e., other search criteria will be ignored). If you do not
select one of the auto-complete options, then the string you typed
will be the target of a substring search, which may be combined with
other search criteria.
This option allows you to browse the compound ontology. Each compound
class includes in parentheses after its name the number of
instance-level compound objects that are members of that class.
Clicking a + icon shows the classes and compounds that belong
to a particular class. The ontology may be used in one of two ways.
By selectively clicking on + icons, you can browse to find a
compound or compound class of interest, and click directly on its name
to visit the data page for that compound. Alternatively, you can
check the checkbox next to one or more class names to limit your
search (which may also include other search criteria) so as
to only include compounds that belong to one of the checked classes.
This option can be used to specify either a minimum molecular weight
value, a maximum molecular weight value, or both. If either the
minimum or maximum field is left blank, then the molecular weight is
unconstrained in that direction.
If
one or more element symbols are entered without a number, then the
result will include any compound containing those elements (and
possibly some others). If an element symbol is followed by a number,
then only compounds with exactly that number of atoms of that element in their
chemical formulas will be included in the result. For example, the
query string C12N will retrieve all compounds with exactly 12
carbons, one or more nitrogens, and possibly some other elements. The
search is case-insensitive unless case is needed to
disambiguate. For example, either co or CO will retrieve
all compounds containing both carbon and oxygen, but Co will
instead retrieve all compounds containing cobalt.
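The matching rule for formula queries can be sketched in Python as follows. This is a minimal illustration, not the site's implementation: the function names are hypothetical, and the sketch handles only conventionally capitalized queries (an uppercase letter optionally followed by one lowercase letter), so it does not reproduce the case-insensitive handling of a query such as co.

```python
import re
from typing import Dict, Optional

# One element symbol (e.g. C, Co) followed by an optional count.
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula_query(query: str) -> Dict[str, Optional[int]]:
    """Map each element symbol to an exact count, or None for 'one or more'."""
    constraints: Dict[str, Optional[int]] = {}
    position = 0
    while position < len(query):
        match = TOKEN.match(query, position)
        if match is None:
            raise ValueError(f"cannot parse query at position {position}: {query!r}")
        symbol, count = match.groups()
        constraints[symbol] = int(count) if count else None
        position = match.end()
    return constraints


def formula_matches(formula: Dict[str, int], constraints: Dict[str, Optional[int]]) -> bool:
    """Elements not named in the query are allowed in the formula."""
    for symbol, count in constraints.items():
        present = formula.get(symbol, 0)
        if count is None:
            if present < 1:
                return False
        elif present != count:
            return False
    return True


query = parse_formula_query("C12N")
print(query)
print(formula_matches({"C": 12, "H": 22, "N": 2, "O": 1}, query))
print(formula_matches({"C": 12, "H": 22, "O": 11}, query))
```

In this sketch, Co parses as the single symbol for cobalt, while CO parses as carbon plus oxygen, which mirrors the disambiguation described above.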
InChI is short for International Chemical Identifier, and offers a way
to search for a molecule by its chemical structure. We support only
exact string matching for InChI strings.
Search Menu → Genes/Proteins/RNAs
Enter a gene
name, name fragment, or identifier (either the internal Pathway/Genome Database
identifier, or an identifier from some other database). The software will attempt to do auto-completion on the
string you have entered based on the contents of the database. If you
select one of the auto-complete options, then when you submit the form
you will be taken directly to the data page for the selected gene, regardless of any other search criteria you may have
specified (i.e., other search criteria are ignored). If you do not
select one of the auto-complete options, then the string you typed
will be the target of a substring search, which may be combined with
other search criteria.
Enter a protein or RNA name, name fragment, identifier (either the
internal Pathway/Genome Database identifier or an identifier from some other database,
such as UniProt), or a fully specified EC number. The software will
attempt to do auto-completion, as for the gene name field.
Enter a minimum and/or maximum sequence length, and specify whether
the units referred to are nucleotides or amino acids. If either the
minimum or maximum field is left blank, then the sequence length is
unconstrained in that direction.
Enter a minimum and/or maximum gene map position, where the units are
the number of base pairs from the start of the replicon. The results
will include any gene that overlaps any portion of the specified
region. If either the minimum or maximum field is left blank, then the
map position is unconstrained in that direction. If the selected
organism has multiple replicons, then this search option will include a
checkable list of replicons -- you may select one or more replicons
either instead of or in conjunction with the map position in order to
constrain the search to genes on a particular replicon.
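The overlap rule for map positions amounts to an interval test in which a blank field leaves one side of the region open. The Python sketch below assumes inclusive base-pair coordinates with the gene start no greater than the gene end; the function name and parameters are hypothetical.

```python
from typing import Optional


def overlaps_region(gene_start: int, gene_end: int,
                    region_min: Optional[int], region_max: Optional[int]) -> bool:
    """True if the gene shares at least one base pair with the region.

    None stands for a blank minimum or maximum field, which leaves the
    region unconstrained in that direction.
    """
    if region_min is not None and gene_end < region_min:
        return False  # gene lies entirely before the region
    if region_max is not None and gene_start > region_max:
        return False  # gene lies entirely after the region
    return True


# A gene that straddles the lower bound is still reported.
print(overlaps_region(900, 1100, 1000, 2000))
# A gene that ends before the region starts is not.
print(overlaps_region(100, 500, 1000, 2000))
```

Restricting the search to particular replicons would be an additional, separate test on the replicon to which each gene belongs.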
Enter a minimum and/or maximum molecular weight for the gene product
in kilodaltons. If either the
minimum or maximum field is left blank, then the molecular weight is
unconstrained in that direction.
Enter a minimum and/or maximum pI (isoelectric point) for the gene
product. (Typically little information
about pI is available for databases other
than EcoCyc or MetaCyc.)
This search option is for retrieving all proteins
affected by a specified small molecule in any of several ways. An
example might be to search for all enzymes inhibited by ADP, or all
enzymes that use Mg2+ as a cofactor. Enter the name of a
small molecule. We recommend taking advantage of the auto-complete
facility to select the correct small molecule, as only an exact match
to a compound name can be accepted here. Check all roles that you are
interested in for this compound. Note that we consider cofactors to
include only compounds that are not modified in any way during the
reaction. Molecules such as NAD, which are modified, are considered
to be substrates, not cofactors. (Relatively little information
about activators, inhibitors, etc. is typically available for databases other
than EcoCyc or MetaCyc.)
The evidence ontology
appears here in browseable form. Each evidence code includes in
parentheses after its name the number of gene products that have their
function annotated with that code. Selecting one or more codes to
filter on allows you to restrict your search, for example, to all
proteins whose function has been established experimentally. The
Pathway Tools evidence codes and ontology are described here.
The cell component
ontology appears here in browseable form, along with the numbers of
gene products associated with each cell component. Selecting one or more
components allows you to restrict your search to proteins known to
be present in those cellular locations. (Note that relatively little
information about cellular locations of gene products is available for
databases other than EcoCyc or MetaCyc.) The
Pathway Tools cell component ontology is described here.
If the selected database
has been annotated using Gene
Ontology, then you will see a browseable ontology here. Only
terms that have one or more gene products annotated to them or their
children will be present, and the number in parentheses after each term
name indicates the number of gene products annotated to that term or
one of its children. You may browse this ontology to a particular
term to see all gene products annotated with that term. Clicking on a
gene product will then take you directly to the data page for that
gene product, just as clicking on a term name will take you to the
data page for that term. Alternatively, you can use the checkboxes to
indicate that your search should be restricted to include only gene
products annotated with the checked terms or their children. If you
wish to filter by only a single term, and you know the name or ID for
that term, you also have the option of typing it in the text box
(using auto-completion to ensure you select the correct term).
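The counts shown in the browseable ontology can be thought of as the number of distinct gene products reachable from a term through its children. The Python sketch below illustrates that idea under stated assumptions: "children" is taken to mean all descendants, gene products are counted once each, and the term names, gene product names, and function names are all hypothetical.

```python
from typing import Dict, List, Set


def products_for_term(term: str,
                      children: Dict[str, List[str]],
                      annotations: Dict[str, Set[str]]) -> Set[str]:
    """Gene products annotated to the term or to any of its descendants."""
    visited: Set[str] = set()
    products: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # a term may be reachable through more than one parent
        visited.add(current)
        products.update(annotations.get(current, set()))
        stack.extend(children.get(current, []))
    return products


def browseable_counts(children: Dict[str, List[str]],
                      annotations: Dict[str, Set[str]]) -> Dict[str, int]:
    """Counts per term; terms with no annotated products are left out."""
    terms = set(children) | set(annotations)
    for child_list in children.values():
        terms.update(child_list)
    counts: Dict[str, int] = {}
    for term in terms:
        total = len(products_for_term(term, children, annotations))
        if total > 0:
            counts[term] = total
    return counts


# Hypothetical miniature ontology and annotations.
CHILDREN = {"process": ["metabolism", "transport"], "metabolism": ["glycolysis"]}
ANNOTATIONS = {"glycolysis": {"gp1", "gp2"}, "metabolism": {"gp2", "gp3"}}

print(sorted(browseable_counts(CHILDREN, ANNOTATIONS).items()))
```

In this example the term "transport" has no annotated gene products, so it does not appear in the result, just as an unannotated term is absent from the browseable ontology.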
If the selected database
has been annotated using the MultiFun ontology,
then you will see a browseable ontology here. Only
terms that have one or more genes annotated to them or their
children will be present, and the number in parentheses after each term
name indicates the number of genes annotated to that term or
one of its children. You may browse this ontology to a particular
term to see all genes annotated with that term. Clicking on a
gene will then take you directly to the data page for that
gene, just as clicking on a term name will take you to the
data page for that term. Alternatively, you can use the checkboxes to
indicate that your search should be restricted to include only genes
annotated with the checked terms or their children.
This search option will be available only if the selected
database is a multi-organism database (such as MetaCyc), and allows you to browse directly for proteins from a
particular organism, or to restrict your search to one or more
taxonomic groups.
This search option is
useful for retrieving a list of all genes or gene products that
cite a given publication or author. Enter either the PubMed
ID, the author surname, or part or all of an article title.
Search Menu → Reactions
Enter a
reaction EC number or name (typically an enzyme name).
EC numbers can be either full or partial. The software will
attempt to do auto-completion on the name or EC number. If you
select one of the auto-complete options, then when you submit the form
you will be taken directly to the data page for the selected reaction
or reaction class, regardless of any other search criteria you may have
specified (i.e., other search criteria will be ignored). If you do not
select one of the auto-complete options, then the string you typed
will be the target of a substring search, which may be combined with
other search criteria.
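One plausible reading of a partial EC number is that it names a class in the EC hierarchy, so it should match on whole dot-separated components rather than on raw characters. The Python sketch below illustrates that reading; it is an assumption about the behavior, and the function name is hypothetical.

```python
def ec_matches(query: str, ec_number: str) -> bool:
    """True if the query's components are a leading part of the EC number."""
    query_parts = query.strip().rstrip(".").split(".")
    ec_parts = ec_number.strip().split(".")
    if len(query_parts) > len(ec_parts):
        return False
    return ec_parts[:len(query_parts)] == query_parts


# The partial number 1.3.99 covers 1.3.99.1 ...
print(ec_matches("1.3.99", "1.3.99.1"))
# ... but 1.3.9 does not, even though it is a character prefix.
print(ec_matches("1.3.9", "1.3.99.1"))
```

Comparing components avoids the false hit that a plain string-prefix test would produce for queries such as 1.3.9.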
Enter a
compound name to retrieve all reactions in which that compound
participates either as a substrate or product. If you enter more than
one compound, then the reaction must involve all specified compounds
in order to be included in the results. We recommend taking advantage
of the auto-complete facility to select the correct compound, as
only an exact match to a compound name can be accepted here.
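The requirement that a reaction involve every specified compound is a subset test on the reaction's participants. The Python sketch below illustrates it with a hypothetical in-memory table; the reaction identifiers and the function name are invented for this example, and compound names are compared exactly, as the text above requires.

```python
from typing import Dict, Iterable, List, Set


def reactions_with_all(reactions: Dict[str, Set[str]],
                       compounds: Iterable[str]) -> List[str]:
    """Identifiers of reactions whose substrates and products include
    every requested compound (exact name match)."""
    wanted = set(compounds)
    return sorted(reaction_id
                  for reaction_id, participants in reactions.items()
                  if wanted <= participants)


# Hypothetical identifiers; each set lists substrates and products together.
REACTIONS = {
    "rxn-1": {"glucose", "ATP", "glucose 6-phosphate", "ADP"},
    "rxn-2": {"phosphoenolpyruvate", "ADP", "pyruvate", "ATP"},
    "rxn-3": {"pyruvate", "NADH", "lactate", "NAD+"},
}

print(reactions_with_all(REACTIONS, ["ATP"]))
print(reactions_with_all(REACTIONS, ["ATP", "pyruvate"]))
```

Adding a compound to the query can only shrink the result, because each additional name is one more condition that a reaction must satisfy.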
This option allows you to
browse the Pathway Tools reaction ontology. Each reaction class includes in
parentheses after its name the number of reactions that are members of
that class. The ontology may be used in one of two ways. By
selectively clicking on + icons, you can browse to find a
reaction of interest, and click directly on its name
to visit the data page for that reaction. Alternatively, you can
check the checkbox next to one or more class names to limit your
search (which may also include other search criteria) so
as to only include reactions that belong to one of the checked
classes. Note that there are two parallel reaction classification
systems, one in which reactions are classified by conversion type
(this includes the entire EC hierarchy), and another in which the
reactions are classified by substrate. Most reactions in the database
have parents in both classification systems.
Search Menu → Pathways
Enter a pathway name, name
fragment, or internal Pathway/Genome Database identifier. The software will attempt to
do auto-completion on the string you have entered based on the
contents of the database. If you select one of the auto-complete
options, then when you submit the form you will be taken directly to
the data page for the selected pathway. This is true regardless of
any other search criteria you may have specified (i.e. other search
criteria will be ignored). If you do not select one of the
auto-complete options, then the string you typed will be the target of
a substring search, which may be combined with other search criteria.
This option allows you to
browse the Pathway Tools pathway ontology. Each pathway class includes in
parentheses after its name the number of pathways that are members of
that class. The ontology may be used in one of two ways. By
selectively clicking on + icons, you can browse to find a
pathway of interest, and click directly on its name
to visit the data page for that pathway. Alternatively, you can
check the checkbox next to one or more class names to limit your
search (which may also include other search criteria) so
as to only include pathways that belong to one of the checked
classes.
Enter a minimum and/or maximum number of desired reactions in the pathway. If either the
minimum or maximum field is left blank, then the number of reactions is
unconstrained in that direction.
Enter one or more
compound names to retrieve all pathways in which those compounds
participate as a reactant, a product, or an intermediate. If you
enter more than one compound, then the pathway must involve all
specified compounds in order to be included in the results. We
recommend taking advantage of the auto-complete facility to select the
correct compound, as only an exact match to a compound name can be
accepted here.
The Pathway Tools evidence ontology
appears here in browseable form. Each evidence code includes in
parentheses after its name the number of pathways that are
annotated with that code. Selecting one or more codes to
filter on allows you to restrict your search, for example, to all
pathways whose presence has been established experimentally. The
Pathway Tools evidence codes and ontology are described here.
This search option will be
available only if a multi-organism database (such as MetaCyc) is the
selected database, and allows you to browse for pathways that are
curated as occurring in a particular organism based on experimental
information. The fact that a pathway is not stated to be present in a
given organism does not mean that the organism does not have the
pathway -- pathways are curated for only a small subset of the
organisms in which they appear.
This search option will be
available only if a multi-organism database (such as MetaCyc) is the selected database. Each pathway in
MetaCyc has been annotated with its expected taxonomic range. This
search option allows you to restrict your search to include only those
pathways you could reasonably expect to see for a given taxonomic
grouping, for example, to restrict your search to pathways seen in
plants.
This search option is
useful for retrieving a list of all pathways that
cite (either directly or through one of the pathway's enzymes, genes,
subpathways or substrates) a given publication or author. Enter either the PubMed
ID, the author surname, or part or all of an article title.
Search Menu → Advanced Search
The Advanced Search tool facilitates generation of queries that are more
complex than those supported by the object search tools described
above. Using the Advanced Search tool, you can write queries that
combine data from multiple organisms or multiple types of objects, and
you can search fields that are not supported by the individual object
search pages. Detailed instructions for using the Advanced Search
tool to construct complex queries are available here.
Ontology Searches
An ontology is a carefully constructed vocabulary of terms, often
called a controlled vocabulary. The terms are organized into a
classification hierarchy (also called a taxonomy). Ontologies can be
used to browse and search for objects by drilling down from more
general categories to more specific ones. Each Pathway/Genome Database
contains several ontologies. Those that can be searched are available
from the Ontologies sub-menu in the Search menu. These ontologies can
also be accessed from the object search page for their particular
object type. The browseable ontologies are:
Not all databases contain Gene
Ontology (GO) annotations, but for those that do, GO can be
browsed to see which gene products are assigned to which GO terms.
Each database only contains those terms to which one or more gene
products are actually assigned, so a term may be missing from the
browseable ontology even though it is a valid GO term. GO can also be
browsed from the Search Menu → Genes/Proteins/RNAs page.
Not all databases contain MultiFun
annotations, but for those that do (currently only EcoCyc and MetaCyc), MultiFun can be
browsed to see which genes are assigned to which terms.
Each database only contains those terms to which one or more genes
are actually assigned, so a term may be missing from the
browseable ontology even though it is a valid MultiFun term. MultiFun
can also be browsed from the Search Menu → Genes/Proteins/RNAs page.
The Pathway Tools pathway ontology classifies pathways into groups based on their
biological functions, and based on the classes of metabolites that
they produce and/or consume. It is also accessible from the Search Menu → Pathways page.
Enzyme Commission
numbers (EC numbers) form a classification scheme for enzymes,
based on the chemical reactions they catalyze. Pathway/Genome
Databases use EC numbers to organize enzyme-catalyzed reactions (rather
than the enzymes themselves) based on type of transformation and class
of substrates. The EC ontology can also be browsed from the Search
Menu → Reactions page (as a child of Chemical-Reactions). Both Search
Menu → Reactions and Search Menu → Genes/Proteins/RNAs pages allow
searching by EC number.
The Pathway Tools compound ontology describes small molecules, that is, chemical
compounds that are not macromolecules. It is also accessible from the
Search Menu → Compounds page.
Search Menu → Google This Site
The Search Menu → Google This Site command uses Google to
perform a full text search over this entire Web site. Searches
will not be restricted to the selected database, and can locate text
strings found in page comments, help pages, and other page content not
queryable by other means. Submitting this form will direct the user
outside this Web site to a page generated by Google. A Google
full text search is also offered as an option when a Quick Search
fails to return any result (or does not return the desired result).
Search Menu → BLAST
This facility (not available for MetaCyc) allows you to perform
sequence-similarity searches using the
BLAST program to compare your
protein or nucleic acid sequence against the complete genome of the
selected organism database.
Search Menu → Search Full-text Articles
Textpresso is a package for indexing and searching a corpus of
biological literature. Textpresso searches of a large Escherichia
coli literature corpus are available only at the BioCyc Web site, and only
when EcoCyc is the selected database.