Using MuGeN

Table 3 details all command-line options supported by MuGeN. The following sections describe how to use the tool and are based on example files found in the MuGeN examples archive. To run the commands given in these sections, make the Examples directory your current directory, and make sure the mugenv and mugenb commands are located in a directory accessible through your $PATH.

Interactive Annotated Genome Navigation

Navigation through annotated genome maps is performed with the mugenv command.

Basic operations

Load the annotated genome of Bacillus subtilis by typing mugenv -d Bsub.gbk. After some time (depending on the speed of your computer) three windows should appear.

The Map List Window

This window (Figure 2) displays all loaded maps and computer analysis results. It also allows the manipulation of these maps and results with the button row located below the map list. To load a new map, select a data source from the available sources in the popup menu, then click on the Add button. Depending on the data source, some additional information will be requested (typically a filename or an accession number). The order in which the maps are displayed can be modified with the two arrow buttons, which shift the currently selected map up or down. A map can be hidden and redisplayed with the Hide/Show button. To generate a new copy of a given map, select it in the map list and click on the Duplicate button. Working with several copies of the same map can be useful, for instance to compare different portions of the same genome. Any map can be "flipped" with the Flip button, meaning that the base positions decrease from left to right instead of increasing, and that the strands of the features are switched: features on the forward strand move to the reverse strand and vice versa. This is useful to compare genome portions which are conserved but whose directions are opposite. Finally, a map can be removed using the Remove button. Note that if there is only one map in the list, it cannot be removed.
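
The effect of the Flip button can be pictured with a short Python sketch. It is not MuGeN's implementation: the Feature type, the 1-based inclusive coordinates and the +1/-1 strand encoding are assumptions made for illustration. The mirrored interval gives the feature's place measured from the left edge of a flipped display.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """Hypothetical feature record, not a MuGeN data structure."""
    name: str
    start: int   # assumed 1-based, inclusive
    end: int     # assumed 1-based, inclusive
    strand: int  # assumed encoding: +1 forward, -1 reverse


def flip_feature(feature: Feature, map_length: int) -> Feature:
    """Mirror a feature about the map centre and swap its strand."""
    return Feature(
        name=feature.name,
        start=map_length - feature.end + 1,
        end=map_length - feature.start + 1,
        strand=-feature.strand,
    )
```

Flipping twice returns the original feature, which matches the behaviour one would expect from pressing Flip a second time.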

Below the map operations panel, an Anchor textfield can be found. Each map can have its own anchor, which "fixes" its relative position. An anchor is either an integer value, representing a base position, or a gene name. In the latter case, the start position of this gene (if it exists in the selected map) will be used as the anchor. Moreover, the map will be flipped if the gene is on the reverse strand. Anchors are useful to simultaneously display distant portions of genome maps. For example, after loading two genome maps of closely related organisms, the context of a gene bearing the same name in the two organisms can be examined by selecting each map in turn and entering the common gene name in the anchor textfield.
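
The anchor rule described above (integer or gene name, with a flip for reverse-strand genes) can be sketched as follows. This is only an illustration under assumptions: the Gene type, its fields and the resolve_anchor function are hypothetical names, and the gene's start is taken as stored in the record.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record, not a MuGeN data structure."""
    name: str
    start: int
    strand: int  # assumed encoding: +1 forward, -1 reverse


def resolve_anchor(anchor: str,
                   genes: Sequence[Gene]) -> Optional[Tuple[int, bool]]:
    """Return (anchor position, flip the map?) or None if no gene matches."""
    text = anchor.strip()
    try:
        # An integer anchor is used directly as a base position.
        return int(text), False
    except ValueError:
        pass
    for gene in genes:
        if gene.name == text:
            # A gene on the reverse strand causes the map to be flipped.
            return gene.start, gene.strand < 0
    return None
```

With a hypothetical gene list containing cad on the reverse strand, resolve_anchor("cad", genes) would return that gene's start together with True, mirroring the situation shown in Figure 2.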

The remaining part of the map list window contains a list of computer analysis results loaded for the currently selected genome map. Such results can be added (respectively removed) through the Add Comp. Res. (resp. Remove Comp. Res.) button.

Figure 2. The Map List Window

The map list window with two genome maps (Bacillus subtilis and Bacillus halodurans). The selected map (B. subtilis) is anchored on the cad gene, which has caused the map to be flipped. A computer analysis result, GETORF output, has been added to the B. subtilis map.

The Map Drawing Window

This window gives a graphical display of the annotated genome maps along with the computer analysis results. The main area is divided into "strips" or lines. Each strip represents either a portion of an annotated genome or a portion of a computer analysis result. When several annotated maps are loaded, their strips are interleaved one above the other (i.e. the first strip of the first map, followed by the first strip of the second map, followed by the second strip of the first map, etc.). In that case, each map will have a different background color, ranging from white to light grey. When computer analysis results exist for a given map, they are displayed either on the map itself, or they are allocated separate strips immediately below the map they belong to.
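
The interleaving of strips from several maps can be written down in a few lines of Python. The function name and the (map name, strip number) tuples are assumptions for illustration; separate strips for computer analysis results are left out.

```python
from typing import List, Tuple


def strip_order(map_names: List[str],
                strips_per_map: int) -> List[Tuple[str, int]]:
    """List strips top to bottom: strip 1 of every map, then strip 2, ..."""
    return [
        (name, strip)
        for strip in range(1, strips_per_map + 1)
        for name in map_names
    ]
```

For two maps A and B with two strips each, this yields A1, B1, A2, B2, which is the order described in the paragraph above.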

By default, six lines per strip are used to draw CDSs, one for each reading frame of each strand. Other features are drawn either on the axis, if they are positional features (promoters, terminators, RBSs), or on a separate line below the CDS lines if they extend over more than a dozen bp (the various RNAs, miscellaneous features and others). Also by default, CDSs are colored according to the strand they are located on; they are filled if they have a known function and empty otherwise.

Two other view modes are also available:

  • a bird's eye view: this view is suited to displaying large portions of genome maps. It is automatically activated when more than 50 kb per line are shown. In bird's eye view mode, all features are drawn as simple boxes or little sticks and are no longer reactive.

  • a sequence view: this is the view mode for lines shorter than 100 bases. It shows the nucleotide sequence as well as its translation in the six reading frames.
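
The choice between the three view modes can be summarised by a small Python sketch. The function and mode names are hypothetical; the default thresholds are the ones quoted above (more than 50 kb per line, fewer than 100 bases per line), and they are exposed as parameters because the thresholds are adjustable in the Map Drawing Window.

```python
def view_mode(bases_per_line: int,
              birdseye_above: int = 50_000,
              sequence_below: int = 100) -> str:
    """Pick a view mode from the number of bases shown per line."""
    if bases_per_line > birdseye_above:
        return "birdseye"   # simple boxes or sticks, not reactive
    if bases_per_line < sequence_below:
        return "sequence"   # nucleotides plus six-frame translation
    return "default"        # six CDS lines per strip
```

Whether MuGeN treats the boundary values themselves (exactly 50 kb or exactly 100 bases) as inside or outside each mode is not stated, so the strict comparisons here are an assumption.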

The majority of display settings can be modified with the user controls at the bottom of the Map Drawing Window or with the menu entries it offers. The topmost row of user controls contains arrow buttons to move forward or backward along the maps. The row below allows the maps to be zoomed in or out. Precise starting points, the number of lines and the number of bases per line can be set with the text fields below the zoom buttons. Finally, the thresholds for switching between the different view modes can be set with the sliders at the bottom of the window. The rightmost slider defines the minimum relative size for features whose names will be displayed. The Preferences menu offers several items influencing the map display:

  • Expand Strands: When checked, features belonging to different strands will be displayed on separate lines. Otherwise they will be displayed on the same line.

  • Show Frames: When checked, CDSs are displayed on different lines according to their reading frame.

  • Visible Features: This submenu offers one entry per feature type. Only the checked features are displayed on the map.

  • Map Area Width: The width in pixels of the area on which the maps are drawn can be selected in this submenu.

  • Save Preferences: The current settings of the Preferences menu are saved in the default preferences file ($HOME/.mugenrc).

Figure 3. The Map Drawing Window

The Information Window

Figure 4. The Information Window

Generating Annotated Genome Images

The MuGeN preferences file

Computer Analysis Result formats

MuGeN Option List

Table 3. Options common to mugenb and mugenv

Option | Multi [a] | Functionality
-d source:id | Yes | Specifies a resource from which to load annotated genome maps. Each resource consists of two parts, a source and an id. The source can be one of file, genbank, embl, xembl or micado. When no source is specified, file is taken as default. The id points to the specific map in the source. When the source is a file, the id is simply the filename (in GenBank, EMBL, BSML or fasta format). When the source is a database (genbank, embl, xembl, micado), the id is the accession number of the database entry. Maps will be displayed from top to bottom in the order they are entered on the command line. If the id starts with a "!", the map will be flipped.
-f firstbase | No | Specifies the starting point of the image to build. In the absence of any reference points, this is the first base of the map that will be located in the upper left corner of the image. If a reference point is given, the upper left corner will be the reference point offset by the amount specified by this option.
-l lastbase | No | Specifies the ending point of the image to build. In the absence of any reference points, this is the last base of the map that will be located in the lower right corner of the image. If a reference point is given, the lower right corner will be the reference point offset by the amount specified by this option.
-s step | No | Specifies the number of bases per display line.
-r refpos | Yes | Specifies a reference position or anchor for a genome map. If the reference position is an integer, the start of the displayed image will be computed by adding the value of the -f option to the integer. If the reference position is a string, MuGeN will look for a CDS feature having a gene qualifier whose value equals the given string. If such a CDS is found, its start base will be used to compute the start of the displayed image as explained above. Moreover, if the gene is on the reverse strand, the map will be flipped. The genome map for which the reference position is defined is determined by the index of the -r option relative to the -d options (i.e. the first -r option will be applied to the map defined by the first -d option, the second -r applies to the second -d and so on).
-c filename[,index] | Yes | Specifies a computational analysis results file to display with a genome map. If a comma and an index are appended to the filename, the result will be applied to the genome map of the corresponding index. Index 1 is the genome map loaded by the first -d option, index 2 the map corresponding to the second -d and so on.
-e filename | No | Specifies a file containing a color scheme to apply to displayed features.
-w n | No | Specifies the width in pixels of the drawing area.
-p filename | No | Specifies the preferences file to load. If no -p option is given, the preferences file will be set to ${HOME}/.mugenrc.

Notes:
a. Multi options are options that can be used several times on the command line.
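
To make the argument syntax of the -d, -r and -c options concrete, here is a minimal Python sketch of how such arguments could be interpreted. It is not taken from MuGeN's source: all function and type names are hypothetical, and the handling of cases the table does not cover (an unknown source prefix, a missing -r, an omitted index) is an assumption.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

# Sources listed in Table 3 for the -d option.
KNOWN_SOURCES = ("file", "genbank", "embl", "xembl", "micado")


class Resource(NamedTuple):
    source: str
    ident: str
    flipped: bool


def parse_resource(arg: str) -> Resource:
    """Split a -d argument into source, id and flip flag."""
    source, sep, ident = arg.partition(":")
    if not sep or source not in KNOWN_SOURCES:
        # No recognised source prefix: assume the whole argument is a file.
        source, ident = "file", arg
    flipped = ident.startswith("!")  # a leading "!" requests a flipped map
    return Resource(source, ident[1:] if flipped else ident, flipped)


def pair_references(resources: Sequence[Resource],
                    refs: Sequence[str]
                    ) -> List[Tuple[Resource, Optional[str]]]:
    """Apply the i-th -r to the i-th -d; maps without a -r get None."""
    padded = list(refs) + [None] * (len(resources) - len(refs))
    return list(zip(resources, padded))


def parse_result(arg: str) -> Tuple[str, Optional[int]]:
    """Split a -c argument into filename and optional 1-based map index."""
    filename, sep, index = arg.rpartition(",")
    if sep and index.isdigit():
        return filename, int(index)
    return arg, None  # no index given; the target map is left undecided
```

For example, parse_resource("Bsub.gbk") gives a file resource that is not flipped, while parse_resource("genbank:!XYZ123") (a made-up accession number) gives a flipped GenBank resource.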

Table 4. Options specific to mugenb

Option | Multi | Functionality
-o format | No | Specifies the output format of the image file to be generated. Valid formats are: PNG, IMAP, PS, EPS, XFIG.
-m mediatype | No | Specifies the media type for PS or EPS output files. Valid types are: a7, a6, a5, a4, a3, a2, a1, a0, b7, b6, b5, b4, b3, b2, b1, b0, letter, legal, executive, ledger.
-u urlprefix | No | Specifies the root URL for client-side image maps in IMAP format. Parameters relative to displayed features will be appended to this root URL. For instance, given a root URL of http://www.somewhere.org/cgi-bin/myscript.pl?myid=xyz&, and an image containing a CDS feature whose name is abcX, positioned from base 1234 to base 5678, the URL generated for its clickable area will be http://www.somewhere.org/cgi-bin/myscript.pl?myid=xyz&tag=CDS&name=abcX&start=1234&end=5678.
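
The URL construction described for the -u option can be reproduced with the following Python sketch. The function name is hypothetical, and the use of percent-encoding for parameter values is an assumption; the manual only shows an example in which no escaping is needed.

```python
from urllib.parse import urlencode


def feature_url(url_prefix: str, tag: str, name: str,
                start: int, end: int) -> str:
    """Append feature parameters to the root URL given with -u."""
    # urlencode keeps the insertion order of the dictionary keys.
    params = urlencode({"tag": tag, "name": name, "start": start, "end": end})
    return url_prefix + params
```

Called with the root URL, tag CDS, name abcX and positions 1234 and 5678 from the example above, the function returns the same URL as the one shown in Table 4. Note that the root URL must already end with "?" or "&" for the result to be well formed.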