Package org.biojava.bio.alignment
Class SubstitutionMatrix
- java.lang.Object
-
- org.biojava.bio.alignment.SubstitutionMatrix
-
public class SubstitutionMatrix extends java.lang.Object
This object reads a substitution matrix file and constructs a short matrix in memory. Every element of the matrix can be accessed through the method getValueAt, whose parameters are two BioJava symbols, so it is not necessary to access the matrix directly. If there is no value for the two specified Symbols, an Exception is thrown. Substitution matrix files are available at the NCBI FTP directory.
- Author:
- Andreas Dräger
-
-
Field Summary
Fields
Modifier and Type Field Description
protected FiniteAlphabet alphabet
protected java.util.Map<Symbol,java.lang.Integer> colSymbols
protected java.lang.String description
protected short[][] matrix
protected short max
protected short min
protected java.lang.String name
protected java.util.Map<Symbol,java.lang.Integer> rowSymbols
-
Constructor Summary
Constructors
Constructor Description
SubstitutionMatrix(java.io.File file)
This constructor can be used to guess the alphabet of this substitution matrix.
SubstitutionMatrix(FiniteAlphabet alpha, short match, short replace)
Constructs a SubstitutionMatrix in which every match and every replacement has the same score, given by the parameters.
SubstitutionMatrix(FiniteAlphabet alpha, java.io.File matrixFile)
Constructs a SubstitutionMatrix object that contains two Map data structures, which have BioJava symbols as keys and, as values, the index in the matrix that holds the substitution score.
SubstitutionMatrix(FiniteAlphabet alpha, java.lang.String matrixString, java.lang.String name)
With this constructor it is possible to construct a SubstitutionMatrix object from a String holding the contents of a substitution matrix file.
-
Method Summary
Modifier and Type Method Description
FiniteAlphabet getAlphabet()
Gives the alphabet used by this matrix.
java.lang.String getDescription()
This gives you the description of this matrix if there is one.
short getMax()
The maximum score in this matrix.
short getMin()
The minimum score of this matrix.
java.lang.String getName()
Every substitution matrix has a name like "BLOSUM30" or "PAM160".
static SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader reader)
This static method can be used to guess the alphabet of the substitution matrix.
short getValueAt(Symbol row, Symbol col)
There are some substitution matrices containing more columns than lines.
SubstitutionMatrix normalizeMatrix()
With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten.
void printMatrix()
Intended for testing only.
void setDescription(java.lang.String desc)
Sets the description to the given value.
java.lang.String stringnifyDescription()
Converts the description of the matrix to a String.
java.lang.String stringnifyMatrix()
Creates a String representation of this matrix.
java.lang.String toString()
Overrides the inherited method.
-
-
-
Field Detail
-
rowSymbols
protected java.util.Map<Symbol,java.lang.Integer> rowSymbols
-
colSymbols
protected java.util.Map<Symbol,java.lang.Integer> colSymbols
-
matrix
protected short[][] matrix
-
min
protected short min
-
max
protected short max
-
alphabet
protected FiniteAlphabet alphabet
-
description
protected java.lang.String description
-
name
protected java.lang.String name
-
-
Constructor Detail
-
SubstitutionMatrix
public SubstitutionMatrix(FiniteAlphabet alpha, java.io.File matrixFile) throws BioException, java.lang.NumberFormatException, java.io.IOException
Constructs a SubstitutionMatrix object that contains two Map data structures, which have BioJava symbols as keys and, as values, the index in the matrix that holds the substitution score.
- Parameters:
alpha - the alphabet of the matrix (e.g., DNA, RNA, PROTEIN, or PROTEIN-TERM)
matrixFile - the file containing the substitution matrix. Lines starting with '#' are comments. The line starting with white space is the table head. Every other line has to start with the one-letter representation of the Symbol, followed by the values for the exchange.
- Throws:
java.io.IOException
BioException
java.lang.NumberFormatException
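The file layout described above can be illustrated with a small standalone parser. This is only a sketch in plain Java, not BioJava's implementation: the class name MatrixTextSketch, its fields, and the use of char keys instead of BioJava Symbol objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT part of BioJava: parses the text layout
// described above ('#' comments, a table head starting with white space,
// then one row per symbol).
class MatrixTextSketch {
    final Map<Character, Integer> rowIndex = new LinkedHashMap<>();
    final Map<Character, Integer> colIndex = new LinkedHashMap<>();
    final StringBuilder description = new StringBuilder();
    short[][] scores;

    static MatrixTextSketch parse(String text) {
        MatrixTextSketch m = new MatrixTextSketch();
        List<short[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue;                                   // skip blank lines
            }
            if (line.startsWith("#")) {                     // comment line
                m.description.append(line.substring(1).trim()).append(' ');
            } else if (Character.isWhitespace(line.charAt(0))) {
                String[] cols = line.trim().split("\\s+");  // table head
                for (int i = 0; i < cols.length; i++) {
                    m.colIndex.put(cols[i].charAt(0), i);
                }
            } else {                                        // data row
                String[] parts = line.trim().split("\\s+");
                m.rowIndex.put(parts[0].charAt(0), rows.size());
                short[] row = new short[parts.length - 1];
                for (int i = 1; i < parts.length; i++) {
                    row[i - 1] = Short.parseShort(parts[i]);
                }
                rows.add(row);
            }
        }
        m.scores = rows.toArray(new short[0][]);
        return m;
    }

    // Looks up a score through the two index maps; fails if a symbol is unknown.
    short valueAt(char row, char col) {
        Integer r = rowIndex.get(row);
        Integer c = colIndex.get(col);
        if (r == null || c == null) {
            throw new IllegalArgumentException("No value for " + row + "/" + col);
        }
        return scores[r][c];
    }

    public static void main(String[] args) {
        // Toy two-symbol matrix, invented for this example.
        String text = "# Toy matrix\n   A  C\nA  5 -4\nC -4  5\n";
        MatrixTextSketch m = parse(text);
        System.out.println(m.valueAt('A', 'C'));
    }
}
```

The two maps play the role that rowSymbols and colSymbols are documented to play in this class: they translate a symbol into an index, so callers never touch the array directly.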
-
SubstitutionMatrix
public SubstitutionMatrix(FiniteAlphabet alpha, java.lang.String matrixString, java.lang.String name) throws BioException, java.lang.NumberFormatException, java.io.IOException
With this constructor it is possible to construct a SubstitutionMatrix object from a String holding the contents of a substitution matrix file. The given String contains a number of lines separated by System.getProperty("line.separator"). Everything else is the same as for the constructor above.
- Parameters:
alpha - the FiniteAlphabet to use
matrixString - the String containing the substitution matrix
name - the name of the matrix.
- Throws:
BioException
java.io.IOException
java.lang.NumberFormatException
-
SubstitutionMatrix
public SubstitutionMatrix(FiniteAlphabet alpha, short match, short replace)
Constructs a SubstitutionMatrix in which every match and every replacement has the same score, given by the parameters. Ambiguous symbols are not considered because there might be too many of them (for proteins).
- Parameters:
alpha - the alphabet of the matrix
match - the score assigned to every match
replace - the score assigned to every replacement
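The idea behind a uniform match/replace matrix can be sketched without BioJava. The class and method below are hypothetical and only show the resulting shape of the score table, assuming symbols are identified by their index: the diagonal holds the match value and every other cell holds the replace value.

```java
// Hypothetical helper, NOT part of BioJava: builds a square score table
// with one value for matches and one value for replacements.
class UniformMatrixSketch {
    static short[][] uniform(int alphabetSize, short match, short replace) {
        short[][] m = new short[alphabetSize][alphabetSize];
        for (int i = 0; i < alphabetSize; i++) {
            for (int j = 0; j < alphabetSize; j++) {
                // Same symbol on row and column: match; otherwise: replacement.
                m[i][j] = (i == j) ? match : replace;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Four unambiguous symbols, as in a DNA alphabet.
        short[][] dna = uniform(4, (short) 1, (short) -1);
        for (short[] row : dna) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```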
-
SubstitutionMatrix
public SubstitutionMatrix(java.io.File file) throws java.lang.NumberFormatException, java.util.NoSuchElementException, BioException, java.io.IOException
This constructor can be used to guess the alphabet of this substitution matrix. However, it is recommended to use another constructor if the alphabet is known.
- Parameters:
file - a file containing a substitution matrix.
- Throws:
java.lang.NumberFormatException
java.util.NoSuchElementException
BioException
java.io.IOException
-
-
Method Detail
-
getSubstitutionMatrix
public static SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader reader) throws java.lang.NumberFormatException, BioException, java.io.IOException
This static method can be used to guess the alphabet of the substitution matrix. However, it is recommended to use a constructor that takes the alphabet if the alphabet is known.
- Parameters:
reader - a reader over the contents of a substitution matrix
- Throws:
java.lang.NumberFormatException
BioException
java.io.IOException
-
getValueAt
public short getValueAt(Symbol row, Symbol col) throws BioException
There are some substitution matrices containing more columns than lines. This has to do with the ambiguous symbols. Lines are always complete; columns might not contain the whole information. The matrix is supposed to be symmetric anyway, so you can always pass the ambiguous symbol as the first argument.
- Parameters:
row - Symbol of the line
col - Symbol of the column
- Returns:
- the score for the exchange of symbol row and symbol col.
- Throws:
BioException
-
getDescription
public java.lang.String getDescription()
This gives you the description of this matrix if there is one. Normally substitution matrix files like BLOSUM contain some lines of description.
- Returns:
- the comment of the matrix
-
getName
public java.lang.String getName()
Every substitution matrix has a name like "BLOSUM30" or "PAM160". This will be returned by this method.
- Returns:
- the name of the matrix.
-
getMin
public short getMin()
The minimum score of this matrix.
- Returns:
- minimum of the matrix.
-
getMax
public short getMax()
The maximum score in this matrix.
- Returns:
- maximum of the matrix.
-
setDescription
public void setDescription(java.lang.String desc)
Sets the description to the given value.
- Parameters:
desc - a description. This doesn't have to start with '#'.
-
getAlphabet
public FiniteAlphabet getAlphabet()
Gives the alphabet used by this matrix.
- Returns:
- the alphabet of this matrix.
-
stringnifyMatrix
public java.lang.String stringnifyMatrix()
Creates a String representation of this matrix.
- Returns:
- a string representation of this matrix without the description.
-
stringnifyDescription
public java.lang.String stringnifyDescription()
Converts the description of the matrix to a String.
- Returns:
- a description with approximately 60 letters on every line, separated by System.getProperty("line.separator"). Every line starts with #.
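The documented output format (lines of roughly 60 letters, each starting with '#', joined by the platform line separator) can be approximated with a simple word wrap. The sketch below is an assumption about how such a wrap might look, not BioJava's code; the class name, the width parameter, and the space after '#' are illustrative choices.

```java
// Hypothetical helper, NOT part of BioJava: wraps a description into
// comment lines that each start with '#'.
class DescriptionWrapSketch {
    static String wrap(String description, int width) {
        String nl = System.lineSeparator();
        StringBuilder out = new StringBuilder();
        StringBuilder line = new StringBuilder("#");
        for (String word : description.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;                       // empty description
            }
            // Start a new comment line when the next word would exceed the width.
            if (line.length() > 1 && line.length() + 1 + word.length() > width) {
                out.append(line).append(nl);
                line = new StringBuilder("#");
            }
            line.append(' ').append(word);
        }
        if (line.length() > 1) {
            out.append(line).append(nl);        // flush the last line
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "Matrix made by matblas from blosum62.iij, "
                + "column uses minimum score, BLOSUM Clustered Scoring Matrix";
        System.out.print(wrap(text, 60));
    }
}
```

A single word longer than the width is kept on its own line rather than split, which is one reason the documented width is only approximate in a scheme like this.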
-
toString
public java.lang.String toString()
Overrides the inherited method.
- Overrides:
toString in class java.lang.Object
- Returns:
- a string representation of the SubstitutionMatrix. This is valid input for the constructor that takes a matrix string. This String also contains the description of the matrix if there is one.
-
printMatrix
public void printMatrix()
Intended for testing only. It prints the matrix on the screen.
-
normalizeMatrix
public SubstitutionMatrix normalizeMatrix() throws BioException, java.lang.NumberFormatException, java.io.IOException
With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten. If you need values between zero and one, you have to divide every value returned by getValueAt by ten.
- Returns:
- a new, normalized SubstitutionMatrix object derived from this substitution matrix. Because this uses a short matrix, all values are scaled by 10.
- Throws:
BioException
java.io.IOException
java.lang.NumberFormatException
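The scaling by ten can be made concrete with a small standalone sketch. It assumes a min-max rescaling with integer arithmetic, (value - min) * 10 / (max - min); the documentation above does not state the exact formula, so treat that formula, the class name, and the method name as assumptions rather than a description of BioJava's code.

```java
// Hypothetical helper, NOT part of BioJava: rescales scores to the
// range 0..10 so they still fit into a short matrix.
class NormalizeSketch {
    static short[][] normalizeTimesTen(short[][] m) {
        short min = Short.MAX_VALUE;
        short max = Short.MIN_VALUE;
        for (short[] row : m) {
            for (short v : row) {
                if (v < min) min = v;
                if (v > max) max = v;
            }
        }
        int range = max - min;
        short[][] out = new short[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = new short[m[i].length];
            for (int j = 0; j < m[i].length; j++) {
                // Assumed formula; a constant matrix maps to all zeros.
                out[i][j] = (range == 0) ? 0 : (short) ((m[i][j] - min) * 10 / range);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        short[][] scaled = normalizeTimesTen(new short[][] {{5, -4}, {-4, 5}});
        // Divide by ten to obtain a value between zero and one.
        double unit = scaled[0][0] / 10.0;
        System.out.println(scaled[0][0] + " -> " + unit);
    }
}
```

Storing the tenfold value keeps the result in integer form; the caller recovers the fraction with a single division, as the description above advises for getValueAt.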
-
-