Package org.biojava.nbio.survival.data
Class WorkSheet
- java.lang.Object
-
- org.biojava.nbio.survival.data.WorkSheet
-
public class WorkSheet extends java.lang.Object
Need to handle very large spreadsheets of expression data, so keep the memory footprint low.
- Author:
- Scooter Willis
-
-
Constructor Summary
Constructors
Constructor - Description
WorkSheet()
WorkSheet(java.lang.String[][] values)
WorkSheet(java.util.Collection<java.lang.String> rows, java.util.Collection<java.lang.String> columns)
WorkSheet(CompactCharSequence[][] values)
-
Method Summary
Modifier and Type, Method - Description
void addCell(java.lang.String row, java.lang.String col, java.lang.String value) - Add data to a cell
void addColumn(java.lang.String column, java.lang.String defaultValue)
void addColumns(java.util.ArrayList<java.lang.String> columns, java.lang.String defaultValue) - Add columns to worksheet and set default value
void addRow(java.lang.String row, java.lang.String defaultValue)
void addRows(java.util.ArrayList<java.lang.String> rows, java.lang.String defaultValue) - Add rows to the worksheet and fill in default value
void appendWorkSheetColumns(WorkSheet worksheet) - Add columns from a second worksheet to be joined by common row.
void appendWorkSheetRows(WorkSheet worksheet) - Add rows from a second worksheet to be joined by common column.
void applyColumnFilter(java.lang.String column, ChangeValue changeValue) - Apply filter to a column to change values from, say, numeric to nominal based on some range
void changeColumnHeader(java.lang.String col, java.lang.String newCol)
void changeColumnHeader(ChangeValue changeValue)
void changeColumnsHeaders(java.util.LinkedHashMap<java.lang.String,java.lang.String> newColumnValues) - Rename each column named by a map key to the name given by its value
void changeRowHeader(java.lang.String row, java.lang.String newRow)
void changeRowHeader(ChangeValue changeValue)
void clear() - See if we can free up memory
java.util.ArrayList<java.lang.String> getAllColumns() - Get the list of column names including those that may be hidden
java.util.ArrayList<java.lang.String> getAllRows() - Get all rows including those that may be hidden
java.lang.String getCell(java.lang.String row, java.lang.String col) - Get cell value
java.lang.Double getCellDouble(java.lang.String row, java.lang.String col)
java.lang.Integer getColumnIndex(java.lang.String column)
java.util.LinkedHashMap<java.lang.String,HeaderInfo> getColumnLookup()
java.util.ArrayList<java.lang.String> getColumns() - Get the list of column names.
static WorkSheet getCopyWorkSheet(WorkSheet copyWorkSheet) - Create a copy of a worksheet.
static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, java.util.ArrayList<java.lang.String> rows) - Create a copy of a worksheet.
java.util.ArrayList<java.lang.String> getDataColumns()
java.util.ArrayList<java.lang.String> getDataRows() - Get the list of row names
java.util.ArrayList<java.lang.String> getDiscreteColumnValues(java.lang.String column) - Get back a list of unique values in the column
java.util.ArrayList<java.lang.String> getDiscreteRowValues(java.lang.String row) - Get back a list of unique values in the row
java.lang.String getIndexColumnName()
WorkSheet getLogScale(double base) - Get the log scale of this worksheet where a zero value will be set to 0.1, as log(0) is undefined
WorkSheet getLogScale(double base, double zeroValue) - Get the log scale of this worksheet
java.util.ArrayList<java.lang.String> getMetaDataColumns()
java.util.LinkedHashMap<java.lang.String,java.lang.String> getMetaDataColumnsHashMap()
java.util.ArrayList<java.lang.String> getMetaDataRows()
java.util.LinkedHashMap<java.lang.String,java.lang.String> getMetaDataRowsHashMap()
java.util.ArrayList<java.lang.String> getRandomDataColumns(int number)
java.util.ArrayList<java.lang.String> getRandomDataColumns(int number, java.util.ArrayList<java.lang.String> columns)
java.lang.String getRowHeader()
java.lang.Integer getRowIndex(java.lang.String row)
java.util.LinkedHashMap<java.lang.String,HeaderInfo> getRowLookup()
java.util.ArrayList<java.lang.String> getRows() - Get the list of row names.
void hideColumn(java.lang.String column, boolean hide)
void hideEmptyColumns()
void hideEmptyRows()
void hideMetaDataColumns(boolean value)
void hideMetaDataRows(boolean value)
void hideRow(java.lang.String row, boolean hide)
boolean isMetaDataColumn(java.lang.String column)
boolean isMetaDataRow(java.lang.String row)
boolean isValidColumn(java.lang.String col)
boolean isValidRow(java.lang.String row)
void markMetaDataColumn(java.lang.String column)
void markMetaDataColumns(java.util.ArrayList<java.lang.String> metaDataColumns) - Marks columns as containing meta data
void markMetaDataRow(java.lang.String row)
void randomlyDivideSave(double percentage, java.lang.String fileName1, java.lang.String fileName2) - Split a worksheet randomly.
static WorkSheet readCSV(java.io.File f, char delimiter)
static WorkSheet readCSV(java.io.InputStream is, char delimiter) - Read a CSV/Tab delimited file where you pass in the delimiter
static WorkSheet readCSV(java.lang.String fileName, char delimiter) - Read a CSV/Tab delimited file where you pass in the delimiter
void replaceColumnValues(java.lang.String column, java.util.HashMap<java.lang.String,java.lang.String> values) - Change values in a column where 0 = something and 1 = something different
void save(java.io.OutputStream outputStream, char delimitter, boolean quoteit)
void saveCSV(java.lang.String fileName) - Save the worksheet as a csv file
void saveTXT(java.lang.String fileName)
void setCacheDoubleValues(boolean value)
void setIndexColumnName(java.lang.String indexColumnName)
void setMetaDataColumns(java.util.ArrayList<java.lang.String> metaDataColumns) - Clears existing meta data columns and sets new ones
void setMetaDataColumnsAfterColumn()
void setMetaDataColumnsAfterColumn(java.lang.String column)
void setMetaDataRows(java.util.ArrayList<java.lang.String> metaDataRows)
void setMetaDataRowsAfterRow()
void setMetaDataRowsAfterRow(java.lang.String row)
void setRowHeader(java.lang.String value)
void shuffleColumnsAndThenRows(java.util.ArrayList<java.lang.String> columns, java.util.ArrayList<java.lang.String> rows) - Randomly shuffle the columns and rows.
void shuffleColumnValues(java.util.ArrayList<java.lang.String> columns) - Need to shuffle column values to allow for randomized testing.
void shuffleRowValues(java.util.ArrayList<java.lang.String> rows) - Need to shuffle row values to allow for randomized testing.
WorkSheet swapRowAndColumns() - Swap the row and columns returning a new worksheet
java.lang.String toString()
static WorkSheet unionWorkSheetsRowJoin(java.lang.String w1FileName, java.lang.String w2FileName, char delimitter, boolean secondSheetMetaData) - Combine two work sheets where you join based on rows.
static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData) - Combine two work sheets where you join based on rows.
-
-
-
Constructor Detail
-
WorkSheet
public WorkSheet()
-
WorkSheet
public WorkSheet(java.util.Collection<java.lang.String> rows, java.util.Collection<java.lang.String> columns) throws java.lang.Exception
- Parameters:
rows-
columns-
- Throws:
java.lang.Exception
-
WorkSheet
public WorkSheet(java.lang.String[][] values)
- Parameters:
values-
-
WorkSheet
public WorkSheet(CompactCharSequence[][] values)
- Parameters:
values-
-
-
Method Detail
-
clear
public void clear()
See if we can free up memory
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
randomlyDivideSave
public void randomlyDivideSave(double percentage, java.lang.String fileName1, java.lang.String fileName2) throws java.lang.Exception
Split a worksheet randomly. Used for creating a discovery/validation data set. The first file receives the given percentage of the worksheet and the second file receives the remainder.
- Parameters:
percentage-
fileName1-
fileName2-
- Throws:
java.lang.Exception
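
The random split described above can be pictured with a small standalone sketch. This is not the library's implementation: the class `RandomSplitSketch`, its `split` method, the explicit seed, and the treatment of the percentage as a fraction between 0 and 1 are all assumptions made for illustration, since the documentation does not state the unit of `percentage`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of a discovery/validation split; not part of BioJava.
final class RandomSplitSketch {

    /** Holds the two parts of a split. */
    static final class Split {
        final List<String> first;
        final List<String> second;

        Split(List<String> first, List<String> second) {
            this.first = first;
            this.second = second;
        }
    }

    /**
     * Shuffles a copy of the row names and cuts it at the given fraction.
     * ASSUMPTION: fraction is in [0, 1]; the real method's unit is not documented.
     */
    static Split split(List<String> rows, double fraction, long seed) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be between 0 and 1");
        }
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * fraction);
        return new Split(
                new ArrayList<>(shuffled.subList(0, cut)),
                new ArrayList<>(shuffled.subList(cut, shuffled.size())));
    }
}
```

In this sketch every row lands in exactly one of the two parts, which is the property a discovery/validation split needs.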
-
getCopyWorkSheetSelectedRows
public static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, java.util.ArrayList<java.lang.String> rows) throws java.lang.Exception
Create a copy of a worksheet. Useful when shuffling columns or rows for testing, as a way to duplicate the original worksheet.
- Parameters:
copyWorkSheet-
rows-
- Returns:
- Throws:
java.lang.Exception
-
getCopyWorkSheet
public static WorkSheet getCopyWorkSheet(WorkSheet copyWorkSheet) throws java.lang.Exception
Create a copy of a worksheet. Useful when shuffling columns or rows for testing, as a way to duplicate the original worksheet.
- Parameters:
copyWorkSheet-
- Returns:
- Throws:
java.lang.Exception
-
getMetaDataColumns
public java.util.ArrayList<java.lang.String> getMetaDataColumns()
- Returns:
-
getMetaDataRows
public java.util.ArrayList<java.lang.String> getMetaDataRows()
- Returns:
-
getDataColumns
public java.util.ArrayList<java.lang.String> getDataColumns()
- Returns:
-
shuffleColumnsAndThenRows
public void shuffleColumnsAndThenRows(java.util.ArrayList<java.lang.String> columns, java.util.ArrayList<java.lang.String> rows) throws java.lang.Exception
Randomly shuffle the columns and rows. The shuffled values should be constrained to the same data type; otherwise the result probably doesn't make sense.
- Parameters:
columns-
rows-
- Throws:
java.lang.Exception
-
shuffleColumnValues
public void shuffleColumnValues(java.util.ArrayList<java.lang.String> columns) throws java.lang.Exception
Need to shuffle column values to allow for randomized testing. The columns in the list will be shuffled together.
- Parameters:
columns-
- Throws:
java.lang.Exception
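
One plausible reading of "shuffled together" is that a single random permutation is applied to every listed column, so values that shared a row before the shuffle still share a row afterwards. The sketch below illustrates that reading only; it is an assumption about the behavior, and the class `ShuffleTogetherSketch` and its column-map representation are hypothetical, not BioJava code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration; assumes "together" means "with one shared permutation".
final class ShuffleTogetherSketch {

    /**
     * Reorders the listed columns with the same random permutation.
     * The table maps a column name to its values in row order.
     */
    static void shuffleTogether(Map<String, List<String>> table,
                                List<String> columns, Random rnd) {
        if (columns.isEmpty()) {
            return;
        }
        int n = table.get(columns.get(0)).size();
        // Build one permutation of the row positions.
        List<Integer> order = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, rnd);
        // Apply that same permutation to each listed column.
        for (String column : columns) {
            List<String> old = table.get(column);
            List<String> shuffled = new ArrayList<>(n);
            for (int i : order) {
                shuffled.add(old.get(i));
            }
            table.put(column, shuffled);
        }
    }
}
```

Columns that are not listed keep their original order, so the link between the shuffled columns and the rest of the table is broken, which is the point of a randomization test.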
-
shuffleRowValues
public void shuffleRowValues(java.util.ArrayList<java.lang.String> rows) throws java.lang.Exception
Need to shuffle row values to allow for randomized testing. The rows in the list will be shuffled together.
- Parameters:
rows-
- Throws:
java.lang.Exception
-
hideMetaDataColumns
public void hideMetaDataColumns(boolean value)
- Parameters:
value-
-
hideMetaDataRows
public void hideMetaDataRows(boolean value)
- Parameters:
value-
-
setMetaDataRowsAfterRow
public void setMetaDataRowsAfterRow()
-
setMetaDataColumnsAfterColumn
public void setMetaDataColumnsAfterColumn()
-
setMetaDataRowsAfterRow
public void setMetaDataRowsAfterRow(java.lang.String row)
- Parameters:
row-
-
setMetaDataColumnsAfterColumn
public void setMetaDataColumnsAfterColumn(java.lang.String column)
- Parameters:
column-
-
setMetaDataColumns
public void setMetaDataColumns(java.util.ArrayList<java.lang.String> metaDataColumns)
Clears existing meta data columns and sets new ones
- Parameters:
metaDataColumns-
-
markMetaDataColumns
public void markMetaDataColumns(java.util.ArrayList<java.lang.String> metaDataColumns)
Marks columns as containing meta data
- Parameters:
metaDataColumns-
-
markMetaDataColumn
public void markMetaDataColumn(java.lang.String column)
- Parameters:
column-
-
isMetaDataColumn
public boolean isMetaDataColumn(java.lang.String column)
- Parameters:
column-
- Returns:
-
isMetaDataRow
public boolean isMetaDataRow(java.lang.String row)
- Parameters:
row-
- Returns:
-
markMetaDataRow
public void markMetaDataRow(java.lang.String row)
- Parameters:
row-
-
setMetaDataRows
public void setMetaDataRows(java.util.ArrayList<java.lang.String> metaDataRows)
- Parameters:
metaDataRows-
-
hideEmptyRows
public void hideEmptyRows() throws java.lang.Exception
- Throws:
java.lang.Exception
-
hideEmptyColumns
public void hideEmptyColumns() throws java.lang.Exception
- Throws:
java.lang.Exception
-
hideRow
public void hideRow(java.lang.String row, boolean hide)
- Parameters:
row-
hide-
-
hideColumn
public void hideColumn(java.lang.String column, boolean hide)
- Parameters:
column-
hide-
-
replaceColumnValues
public void replaceColumnValues(java.lang.String column, java.util.HashMap<java.lang.String,java.lang.String> values) throws java.lang.Exception
Change values in a column, for example where 0 = something and 1 = something different.
- Parameters:
column-
values-
- Throws:
java.lang.Exception
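
As a rough picture of what a value replacement like this does, the sketch below recodes a list of cell values through a lookup map. It is a standalone illustration, not the library's code: the class `ReplaceValuesSketch` is hypothetical, and leaving unmapped values unchanged is an assumption, since the documentation does not say how values missing from the map are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of recoding a column through a lookup map.
final class ReplaceValuesSketch {

    /**
     * Returns a recoded copy of the column.
     * ASSUMPTION: values with no entry in the map are kept as they are.
     */
    static List<String> replace(List<String> column, Map<String, String> values) {
        List<String> out = new ArrayList<>(column.size());
        for (String cell : column) {
            out.add(values.getOrDefault(cell, cell));
        }
        return out;
    }
}
```

For example, mapping "0" to "censored" and "1" to "event" turns a 0/1 status column into readable labels.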
-
applyColumnFilter
public void applyColumnFilter(java.lang.String column, ChangeValue changeValue) throws java.lang.Exception
Apply a filter to a column to change values from, say, numeric to nominal based on some range.
- Parameters:
column-
changeValue-
- Throws:
java.lang.Exception
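
A numeric-to-nominal filter of the kind described above might look like the following sketch. Because the shape of `ChangeValue` is not shown on this page, the sketch uses `java.util.function.UnaryOperator<String>` as a stand-in; the class `ColumnFilterSketch`, the `threshold` helper, and the labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration; UnaryOperator<String> stands in for ChangeValue.
final class ColumnFilterSketch {

    /** Builds a filter that maps numeric text to one of two nominal labels. */
    static UnaryOperator<String> threshold(double cutoff, String low, String high) {
        return v -> Double.parseDouble(v) < cutoff ? low : high;
    }

    /** Applies the filter to every value of a column and returns the result. */
    static List<String> apply(List<String> column, UnaryOperator<String> change) {
        List<String> out = new ArrayList<>(column.size());
        for (String cell : column) {
            out.add(change.apply(cell));
        }
        return out;
    }
}
```

This sketch does not handle empty or non-numeric cells; `Double.parseDouble` would throw on them, so a real filter would need to decide what to do in that case.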
-
addColumn
public void addColumn(java.lang.String column, java.lang.String defaultValue)
- Parameters:
column-
defaultValue-
-
addColumns
public void addColumns(java.util.ArrayList<java.lang.String> columns, java.lang.String defaultValue)
Add columns to worksheet and set default value
- Parameters:
columns-
defaultValue-
-
addRow
public void addRow(java.lang.String row, java.lang.String defaultValue)
- Parameters:
row-
defaultValue-
-
addRows
public void addRows(java.util.ArrayList<java.lang.String> rows, java.lang.String defaultValue)
Add rows to the worksheet and fill in default value
- Parameters:
rows-
defaultValue-
-
addCell
public void addCell(java.lang.String row, java.lang.String col, java.lang.String value) throws java.lang.Exception
Add data to a cell
- Parameters:
row-
col-
value-
- Throws:
java.lang.Exception
-
isValidRow
public boolean isValidRow(java.lang.String row)
- Parameters:
row-
- Returns:
-
isValidColumn
public boolean isValidColumn(java.lang.String col)
- Parameters:
col-
- Returns:
-
setCacheDoubleValues
public void setCacheDoubleValues(boolean value)
- Parameters:
value-
-
getCellDouble
public java.lang.Double getCellDouble(java.lang.String row, java.lang.String col) throws java.lang.Exception
- Parameters:
row-
col-
- Returns:
- Throws:
java.lang.Exception
-
getCell
public java.lang.String getCell(java.lang.String row, java.lang.String col) throws java.lang.Exception
Get cell value
- Parameters:
row-
col-
- Returns:
- Throws:
java.lang.Exception
-
changeRowHeader
public void changeRowHeader(ChangeValue changeValue)
- Parameters:
changeValue-
-
changeColumnHeader
public void changeColumnHeader(ChangeValue changeValue)
- Parameters:
changeValue-
-
changeRowHeader
public void changeRowHeader(java.lang.String row, java.lang.String newRow) throws java.lang.Exception
- Parameters:
row-
newRow-
- Throws:
java.lang.Exception
-
changeColumnsHeaders
public void changeColumnsHeaders(java.util.LinkedHashMap<java.lang.String,java.lang.String> newColumnValues) throws java.lang.Exception
Rename each column named by a map key to the name given by its value
- Parameters:
newColumnValues-
- Throws:
java.lang.Exception
-
changeColumnHeader
public void changeColumnHeader(java.lang.String col, java.lang.String newCol) throws java.lang.Exception
- Parameters:
col-
newCol-
- Throws:
java.lang.Exception
-
getColumnIndex
public java.lang.Integer getColumnIndex(java.lang.String column) throws java.lang.Exception
- Parameters:
column-
- Returns:
- Throws:
java.lang.Exception
-
getRowIndex
public java.lang.Integer getRowIndex(java.lang.String row) throws java.lang.Exception
- Parameters:
row-
- Returns:
- Throws:
java.lang.Exception
-
getRandomDataColumns
public java.util.ArrayList<java.lang.String> getRandomDataColumns(int number)
- Parameters:
number-
- Returns:
-
getRandomDataColumns
public java.util.ArrayList<java.lang.String> getRandomDataColumns(int number, java.util.ArrayList<java.lang.String> columns)
- Parameters:
number-
columns-
- Returns:
-
getAllColumns
public java.util.ArrayList<java.lang.String> getAllColumns()
Get the list of column names including those that may be hidden
- Returns:
-
getColumns
public java.util.ArrayList<java.lang.String> getColumns()
Get the list of column names. Does not include hidden columns.
- Returns:
-
getDiscreteColumnValues
public java.util.ArrayList<java.lang.String> getDiscreteColumnValues(java.lang.String column) throws java.lang.Exception
Get back a list of unique values in the column
- Parameters:
column-
- Returns:
- Throws:
java.lang.Exception
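
Collecting the unique values of a column can be sketched in a few lines. This is an illustration only: the class `DiscreteValuesSketch` is hypothetical, and returning values in order of first appearance is an assumption, since the documentation does not state the ordering of the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration of extracting the distinct values of a column.
final class DiscreteValuesSketch {

    /**
     * Returns each distinct value once.
     * ASSUMPTION: values are returned in order of first appearance.
     */
    static ArrayList<String> discreteValues(List<String> column) {
        return new ArrayList<>(new LinkedHashSet<>(column));
    }
}
```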
-
getDiscreteRowValues
public java.util.ArrayList<java.lang.String> getDiscreteRowValues(java.lang.String row) throws java.lang.Exception
Get back a list of unique values in the row
- Parameters:
row-
- Returns:
- Throws:
java.lang.Exception
-
getAllRows
public java.util.ArrayList<java.lang.String> getAllRows()
Get all rows including those that may be hidden
- Returns:
-
getRows
public java.util.ArrayList<java.lang.String> getRows()
Get the list of row names. Will exclude hidden values.
- Returns:
-
getDataRows
public java.util.ArrayList<java.lang.String> getDataRows()
Get the list of row names
- Returns:
-
getLogScale
public WorkSheet getLogScale(double base) throws java.lang.Exception
Get the log scale of this worksheet. A zero value is set to 0.1, since log(0) is undefined.
- Parameters:
base-
- Returns:
- Throws:
java.lang.Exception
-
getLogScale
public WorkSheet getLogScale(double base, double zeroValue) throws java.lang.Exception
Get the log scale of this worksheet
- Parameters:
base-
zeroValue-
- Returns:
- Throws:
java.lang.Exception
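
The per-cell arithmetic behind a log-scale transform can be sketched with the change-of-base formula, log_b(x) = ln(x) / ln(b). The code below is a standalone illustration rather than the library's implementation; the class `LogScaleSketch` is hypothetical, and it assumes `zeroValue` is the number substituted for a cell that is exactly zero before the logarithm is taken.

```java
// Hypothetical illustration of the per-cell log transform; not BioJava code.
final class LogScaleSketch {

    /**
     * Log of value in the given base.
     * ASSUMPTION: a value of exactly zero is replaced by zeroValue first.
     */
    static double logScale(double value, double base, double zeroValue) {
        double v = (value == 0.0) ? zeroValue : value;
        return Math.log(v) / Math.log(base); // change-of-base formula
    }

    /** Same transform with the documented default replacement of 0.1. */
    static double logScale(double value, double base) {
        return logScale(value, base, 0.1);
    }
}
```

With base 10 and the default replacement, a zero cell maps to approximately -1, since log10(0.1) = -1. Negative inputs are not handled here and would yield NaN.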
-
swapRowAndColumns
public WorkSheet swapRowAndColumns() throws java.lang.Exception
Swap the row and columns returning a new worksheet
- Returns:
- Throws:
java.lang.Exception
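
Swapping rows and columns is a matrix transpose. The sketch below shows the idea on a plain `String[][]`; it is not the library's code, the class `TransposeSketch` is hypothetical, and it assumes a rectangular array in which every row has the same length.

```java
// Hypothetical illustration of a transpose on a rectangular String grid.
final class TransposeSketch {

    /** Returns a new array in which out[c][r] equals values[r][c]. */
    static String[][] transpose(String[][] values) {
        int rows = values.length;
        int cols = (rows == 0) ? 0 : values[0].length;
        String[][] out = new String[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                out[c][r] = values[r][c];
            }
        }
        return out;
    }
}
```

Like the documented method, the sketch returns a new structure and leaves its input untouched.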
-
unionWorkSheetsRowJoin
public static WorkSheet unionWorkSheetsRowJoin(java.lang.String w1FileName, java.lang.String w2FileName, char delimitter, boolean secondSheetMetaData) throws java.lang.Exception
Combine two work sheets where you join based on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns.
- Parameters:
w1FileName-
w2FileName-
delimitter-
secondSheetMetaData-
- Returns:
- Throws:
java.lang.Exception
-
unionWorkSheetsRowJoin
public static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData) throws java.lang.Exception
Combine two work sheets where you join based on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns.
- Parameters:
w1-
w2-
secondSheetMetaData-
- Returns:
- Throws:
java.lang.Exception
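
The row join described above, where rows found in only one sheet are removed, behaves like an inner join on the row name. The following sketch shows that behavior on plain maps from row name to cell values. It is not the library's implementation: the class `RowJoinSketch` is hypothetical, and the meta data column mentioned in the description is not modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an inner join on row names.
final class RowJoinSketch {

    /**
     * Keeps only rows present in both sheets and concatenates their cells,
     * first sheet's cells followed by the second sheet's cells.
     */
    static Map<String, List<String>> innerJoin(Map<String, List<String>> w1,
                                               Map<String, List<String>> w2) {
        Map<String, List<String>> joined = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : w1.entrySet()) {
            List<String> other = w2.get(e.getKey());
            if (other == null) {
                continue; // row missing from the second sheet: dropped
            }
            List<String> cells = new ArrayList<>(e.getValue());
            cells.addAll(other);
            joined.put(e.getKey(), cells);
        }
        return joined;
    }
}
```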
-
readCSV
public static WorkSheet readCSV(java.lang.String fileName, char delimiter) throws java.lang.Exception
Read a CSV/Tab delimited file where you pass in the delimiter
- Parameters:
fileName-
delimiter-
- Returns:
- Throws:
java.lang.Exception
-
readCSV
public static WorkSheet readCSV(java.io.File f, char delimiter) throws java.lang.Exception
- Throws:
java.lang.Exception
-
readCSV
public static WorkSheet readCSV(java.io.InputStream is, char delimiter) throws java.lang.Exception
Read a CSV/Tab delimited file where you pass in the delimiter
- Parameters:
is-
delimiter-
- Returns:
- Throws:
java.lang.Exception
-
saveCSV
public void saveCSV(java.lang.String fileName) throws java.lang.Exception
Save the worksheet as a csv file
- Parameters:
fileName-
- Throws:
java.lang.Exception
-
saveTXT
public void saveTXT(java.lang.String fileName) throws java.lang.Exception
- Parameters:
fileName-
- Throws:
java.lang.Exception
-
setRowHeader
public void setRowHeader(java.lang.String value)
- Parameters:
value-
-
appendWorkSheetColumns
public void appendWorkSheetColumns(WorkSheet worksheet) throws java.lang.Exception
Add columns from a second worksheet to be joined by common row. If the appended worksheet doesn't contain a row in the master worksheet then default value of "" is used. Rows in the appended worksheet not found in the master worksheet are not added.
- Parameters:
worksheet-
- Throws:
java.lang.Exception
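
The append behavior described above keeps every master row, fills missing cells with "", and ignores rows that exist only in the appended sheet, which matches a left join on the row name. The sketch below illustrates this on plain maps; it is not the library's code. The class `AppendColumnsSketch` and the `appendedWidth` parameter are hypothetical, and unlike the documented `void` method the sketch returns a new map instead of modifying the master in place.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a left join on row names with "" as the default.
final class AppendColumnsSketch {

    /**
     * Appends the second sheet's cells to each master row.
     * appendedWidth is the number of columns in the appended sheet.
     */
    static Map<String, List<String>> appendColumns(Map<String, List<String>> master,
                                                   Map<String, List<String>> appended,
                                                   int appendedWidth) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : master.entrySet()) {
            List<String> cells = new ArrayList<>(e.getValue());
            List<String> extra = appended.get(e.getKey());
            if (extra != null) {
                cells.addAll(extra);
            } else {
                // Row missing from the appended sheet: pad with empty strings.
                cells.addAll(Collections.nCopies(appendedWidth, ""));
            }
            out.put(e.getKey(), cells);
        }
        return out; // rows found only in the appended sheet are never added
    }
}
```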
-
appendWorkSheetRows
public void appendWorkSheetRows(WorkSheet worksheet) throws java.lang.Exception
Add rows from a second worksheet to be joined by common column. If the appended worksheet doesn't contain a column in the master worksheet then default value of "" is used. Columns in the appended worksheet not found in the master worksheet are not added.
- Parameters:
worksheet-
- Throws:
java.lang.Exception
-
save
public void save(java.io.OutputStream outputStream, char delimitter, boolean quoteit) throws java.lang.Exception
- Parameters:
outputStream-
delimitter-
quoteit-
- Throws:
java.lang.Exception
-
getIndexColumnName
public java.lang.String getIndexColumnName()
- Returns:
- the indexColumnName
-
setIndexColumnName
public void setIndexColumnName(java.lang.String indexColumnName)
- Parameters:
indexColumnName- the indexColumnName to set
-
getColumnLookup
public java.util.LinkedHashMap<java.lang.String,HeaderInfo> getColumnLookup()
- Returns:
- the columnLookup
-
getRowLookup
public java.util.LinkedHashMap<java.lang.String,HeaderInfo> getRowLookup()
- Returns:
- the rowLookup
-
getMetaDataColumnsHashMap
public java.util.LinkedHashMap<java.lang.String,java.lang.String> getMetaDataColumnsHashMap()
- Returns:
- the metaDataColumnsHashMap
-
getMetaDataRowsHashMap
public java.util.LinkedHashMap<java.lang.String,java.lang.String> getMetaDataRowsHashMap()
- Returns:
- the metaDataRowsHashMap
-
getRowHeader
public java.lang.String getRowHeader()
- Returns:
- the rowHeader
-
-