Module:ImportProtein/doc explained
Purpose
This module lets you paste a plaintext protein file in as a template parameter to get an annotated figure and legend. An example is in trial use at Src (gene).
Usually, the figure should be placed on a separate page and transcluded. Until people advise to the contrary, the space is being used for this (note that the Wikipedia article space does not have subpages enabled, as far as I know).
Parameters
The parameters are:
- The file is an unnamed parameter. It should be a standard GenPept format file cut and pasted from a site like https://www.ncbi.nlm.nih.gov/protein/P12931 or https://www.uniprot.org/uniprot/P12931.
- tableoutput -- determines whether and how the long table of protein features is displayed. "collapsed", the default value, starts with the table closed. "collapsible" starts with the table open. "no" means no table output. Any other value produces a non-collapsible table.
- height (default 50) -- the height of the protein bar itself in pixels, not counting any surrounding content
- width (default 500) -- the width of the protein bar and all other table and legend material in pixels
- background (default #333333) -- the color of the unadorned protein bar (such as in disordered N- and C-terminal regions)
- vtext (default 25) -- the number of characters of annotation written vertically beneath each site
- vwidth (default 4) -- determines how close the vertical labels can be to one another. Labels that don't fit are not shown, but the numeric positions of the missing labels are listed further down. Typically the user should look them up in the table and use the customization parameters described below to rework them until they fit.
- largeonlyregion (default 20) -- the number of vertical pixels set aside only for the large domains that are labelled horizontally in bold within themselves.
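The defaults and the tableoutput values listed above can be summarized in a short Python sketch. It is illustrative only and is not the module's own code (Wikipedia modules are written in Lua). The names DEFAULTS, resolve_params and table_mode are hypothetical, and treating an empty value as "not supplied" is an assumption.

```python
# Documented defaults for the display parameters.
DEFAULTS = {
    "tableoutput": "collapsed",
    "height": 50,
    "width": 500,
    "background": "#333333",
    "vtext": 25,
    "vwidth": 4,
    "largeonlyregion": 20,
}


def resolve_params(args):
    """Overlay user-supplied values on the documented defaults."""
    params = dict(DEFAULTS)
    for key, value in args.items():
        # Assumption: unknown keys and empty values are ignored.
        if key in DEFAULTS and value != "":
            # Template arguments arrive as text; numeric defaults imply numbers.
            if isinstance(DEFAULTS[key], int):
                params[key] = int(value)
            else:
                params[key] = value
    return params


def table_mode(tableoutput):
    """Return (show_table, collapsible, starts_closed) for a tableoutput value."""
    if tableoutput == "no":
        return (False, False, False)
    if tableoutput == "collapsed":
        return (True, True, True)
    if tableoutput == "collapsible":
        return (True, True, False)
    # Anything else leaves a non-collapsible table.
    return (True, False, False)
```

For example, resolve_params({"height": "80"}) keeps width at 500 while raising the bar height to 80, and table_mode("no") reports that no table is shown.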
Many parameters exist to make it easier to customize the output. When vertically marked features overlap, some of them are arbitrarily left out, so these parameters may be needed to describe all the known sites adequately. Others improve the labels used.
- usenotes -- a list of "region_name" or "site_type" names that aren't very informative. For features with these names, the entry under "notes" is used as the feature name in all other operations.
- include -- defaults to all, in which case labels to be excluded must be mentioned individually with exclude. If another value is given, it should be a list of names in quotes; only those names will be shown in the figure or in the table.
- exclude -- a list of features by name, in quotes, that are not to be shown. These are also omitted from the table. Omit this parameter to exclude nothing (the default).
- substitute -- a list of feature names and replacements, each in quotes, separated by a colon. "binding":"ATP binding site" will convert a feature labelled "binding" to display the second name instead. This affects all annotations. Note that substituting a name with "" will suppress the text, but not the marking on the protein diagram.
- replaceregion -- a list of numeric ranges xx..yy, each followed by a colon and a quoted phrase. Use this when a cluster of different features requires a single summary. So 77..85:"PS/PT/PY" can annotate a group of several phosphorylated amino acids with this short text. Note that the text is placed at the center of the region you select; this parameter can therefore also be used to add extra markings if you like. replaceregion does not affect the table of motifs.
- toprow -- a list of motifs (for example secondary structure - helices, sheets, and turns) that are to be displayed as colored markings at the top of the protein graphics box. No vertical annotation is given. A legend beneath the vertical annotation will list them by color.
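The quoted-list, substitute and replaceregion syntaxes described above can be illustrated with a small Python sketch. It is not the module's own parser; the function names are hypothetical. It makes three assumptions: list entries may be separated by any characters, the region center is found by integer division, and exclude takes precedence over include.

```python
import re

QUOTED = re.compile(r'"([^"]*)"')
PAIR = re.compile(r'"([^"]*)"\s*:\s*"([^"]*)"')
REGION = re.compile(r'(\d+)\.\.(\d+)\s*:\s*"([^"]*)"')


def parse_names(value):
    """Return the quoted names in an include- or exclude-style list."""
    return QUOTED.findall(value)


def parse_substitute(value):
    """Map each quoted feature name to its quoted replacement."""
    return dict(PAIR.findall(value))


def parse_replaceregion(value):
    """Return (start, end, center, label) for each xx..yy:"label" entry."""
    regions = []
    for start, end, label in REGION.findall(value):
        start, end = int(start), int(end)
        # Assumption: the label sits at the integer midpoint of the range.
        regions.append((start, end, (start + end) // 2, label))
    return regions


def is_shown(name, include="all", exclude=""):
    """Apply include/exclude; assumes exclude wins if a name is in both."""
    if name in parse_names(exclude):
        return False
    return include == "all" or name in parse_names(include)


def display_name(name, substitutions):
    """Return the label to draw; an empty string suppresses only the text."""
    return substitutions.get(name, name)
```

With these definitions, parse_replaceregion('77..85:"PS/PT/PY"') places the label at residue 81, and display_name("binding", parse_substitute('"binding":"ATP binding site"')) gives "ATP binding site".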
It is also, of course, possible to edit the original protein sequence annotations directly to remedy any problems, though the module's flexibility is provided in the hope that, for many proteins, an undamaged copy of the original can be kept for versatility.
Usage
This module requires a very large text input that would usually overwhelm a page if #invoked in it directly. The output is generally a large mass of styled HTML that would likewise overwhelm the page if substed. So it is usually best to #invoke it from a subpage and transclude the result. (Another option for compactness might be to use a screencap.) The recommended policy in this area hasn't been queried yet.