Class DataFrame
java.lang.Object
org.snpsift.annotate.mem.dataFrame.DataFrame
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
DataFrameDel, DataFrameIns, DataFrameMixed, DataFrameMnp, DataFrameOther, DataFrameSnp
A set of DataColumns, indexed by position.
This class is used to store data for a chromosome.
The DataFrame class manages a collection of data columns, each represented by a DataFrameColumn object.
It provides methods to add rows, retrieve rows, and perform various operations on the data.
The data is indexed by position using a PosIndex object, and can optionally include reference and alternative alleles.
The class also supports creating columns based on VCF header information and resizing the data for memory optimization.
The main components of the DataFrame class are:
- VariantTypeCounter variantTypeCounter: Keeps track of variant types and their counts.
- VariantCategory variantCategory: Represents the category of variants.
- int currentIdx: The current index for adding new rows.
- PosIndex posIndex: Indexes the data by chromosome position.
- StringArray refs: Stores reference alleles.
- StringArray alts: Stores alternative alleles.
- Map<String, DataFrameColumn<?>> columns: A map of column names to DataFrameColumn objects.
- Fields fields: Represents the fields to create or annotate.
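The components listed above describe a columnar layout: one index of positions, optional allele arrays, and a map of named columns that all share the same row index. The following is a minimal sketch of that idea only. MiniFrame and its members are hypothetical names invented for illustration, plain arrays stand in for PosIndex, StringArray and DataFrameColumn, and nothing here is taken from the SnpSift implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a position-indexed column store.
// It is NOT the SnpSift DataFrame; names and layout are assumptions.
class MiniFrame {
    final int[] positions;  // plays the role of posIndex
    final String[] refs;    // reference alleles, one per row
    final String[] alts;    // alternative alleles, one per row
    final Map<String, Object[]> columns = new LinkedHashMap<>(); // column name -> values
    int currentIdx = 0;     // next free row

    MiniFrame(int capacity, String... columnNames) {
        positions = new int[capacity];
        refs = new String[capacity];
        alts = new String[capacity];
        for (String name : columnNames) {
            columns.put(name, new Object[capacity]);
        }
    }

    // Append one row: the key (pos, ref, alt) plus one value per column, in column order.
    void add(int pos, String ref, String alt, Object... values) {
        if (values.length != columns.size()) {
            throw new IllegalArgumentException("Expected one value per column");
        }
        positions[currentIdx] = pos;
        refs[currentIdx] = ref;
        alts[currentIdx] = alt;
        int i = 0;
        for (Object[] column : columns.values()) {
            column[currentIdx] = values[i++];
        }
        currentIdx++;
    }

    // Read a single cell by column name and row index.
    Object get(String columnName, int idx) {
        return columns.get(columnName)[idx];
    }
}

class MiniFrameDemo {
    public static void main(String[] args) {
        MiniFrame df = new MiniFrame(4, "AF", "DP");
        df.add(100, "A", "T", 0.12, 35);
        df.add(250, "G", "C", 0.50, 18);
        System.out.println(df.get("AF", 1)); // cell of the second row
        System.out.println(df.currentIdx);   // number of rows added so far
    }
}
```

Every row occupies the same index in each array, so adding a row means writing one slot per column and advancing currentIdx.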
The class provides the following key methods:
- add(String name, DataFrameColumn<?> column): Adds a column to the DataFrame.
- add(DataFrameRow row): Adds a row to the DataFrame.
- check(): Checks the integrity of the data.
- columnNames(): Returns an iterable of column names.
- createColumn(VcfHeaderInfo vcfHeaderInfo): Creates a column based on VCF header information.
- createColumns(): Creates columns based on the fields.
- eq(int idx, int pos, String ref, String alt): Checks if the entry at the given index matches the specified position, reference, and alternative alleles.
- get(String columnName, int idx): Retrieves data from a column by index.
- getColumn(String name): Retrieves a column by name.
- getRow(int pos, String ref, String alt): Retrieves a row based on position, reference, and alternative alleles.
- find(int pos, String ref, String alt): Finds the index of a row based on position, reference, and alternative alleles.
- hasEntry(int pos, String ref, String alt): Checks if an entry exists for the specified position, reference, and alternative alleles.
- resize(): Resizes and optimizes the memory usage of the data.
- set(String columnName, int idx, Object value): Sets data in a column.
- sizeBytes(): Returns the memory size of the DataFrame.
- stringArrayMemSize(VariantCategory variantCategory, String field): Calculates the memory size for a string array.
- toString(): Returns a string representation of the DataFrame.
Field Summary
Fields
- static final int MAX_ROWS_TO_SHOW
Constructor Summary
Constructors
- DataFrame(VariantTypeCounter variantTypeCounter, VariantCategory variantCategory, boolean hasRefs, boolean hasAlts)
Method Summary
- void add(DataFrameRow row): Add a row to the data frame
- void check()
- protected DataFrameColumn<?> createColumn(org.snpeff.vcf.VcfHeaderInfo vcfHeaderInfo): Create a column of a given type
- protected void createColumns(): Create columns based on fields
- protected boolean eq(int idx, int pos, String ref, String alt): Does the entry at position 'idx' match the given (pos, ref, alt) values?
- protected int find(int pos, String ref, String alt): Find the index of a row by searching by position, reference and alternative alleles.
- protected Object get: Get data from a column by searching by position, reference and alternative alleles.
- getColumn(String name): Get a column
- getRow(int pos, String ref, String alt): Get a 'row' from the data frame.
- boolean hasEntry(int pos, String ref, String alt): Check whether an entry exists for the given position, reference and alternative alleles.
- void resize(): Resize and memory optimize the data
- protected void set(String columnName, int idx, Object value): Set data in a column
- long sizeBytes(): Memory size of this object
- toString()
-
Field Details
-
MAX_ROWS_TO_SHOW
public static final int MAX_ROWS_TO_SHOW
-
Constructor Details
-
DataFrame
public DataFrame(VariantTypeCounter variantTypeCounter, VariantCategory variantCategory, boolean hasRefs, boolean hasAlts)
-
-
Method Details
-
add
Add a row to the data frame -
check
public void check() -
columnNames
-
createColumn
Create a column of a given type -
createColumns
protected void createColumns()
Create columns based on fields -
eq
Does the entry at position 'idx' match the given (pos, ref, alt) values? -
get
Get data from a column by searching by position, reference and alternative alleles. Note: The value can be null -
getColumn
Get a column -
getRow
Get a 'row' from the data frame.
- Parameters:
pos - Position
ref - Reference allele
alt - Alternative allele
- Returns:
- A data frame row if found, or null if not found
-
find
Find the index of a row by searching by position, reference and alternative alleles.
- Returns:
- The index of the row in the data frame, or -1 if not found
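The lookup contract described here (a row index on success, -1 otherwise) can be illustrated with a small sketch. It assumes positions are kept sorted and that several rows may share one position, so it runs a binary search on position and then scans the matching run for the ref and alt alleles. The source does not say how PosIndex searches, so the search strategy, the PosLookup class and its fields are all assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch of a (pos, ref, alt) lookup; not the SnpSift implementation.
class PosLookup {
    final int[] positions; // assumed sorted ascending; duplicates allowed
    final String[] refs;
    final String[] alts;

    PosLookup(int[] positions, String[] refs, String[] alts) {
        this.positions = positions;
        this.refs = refs;
        this.alts = alts;
    }

    // Does the entry at 'idx' match the given (pos, ref, alt) values?
    boolean eq(int idx, int pos, String ref, String alt) {
        return positions[idx] == pos && refs[idx].equals(ref) && alts[idx].equals(alt);
    }

    // Index of the matching row, or -1 if not found.
    int find(int pos, String ref, String alt) {
        int idx = Arrays.binarySearch(positions, pos);
        if (idx < 0) return -1; // position is absent
        // binarySearch may land anywhere inside a run of equal positions: rewind to its start
        while (idx > 0 && positions[idx - 1] == pos) idx--;
        // Scan the run for the requested alleles
        for (; idx < positions.length && positions[idx] == pos; idx++) {
            if (eq(idx, pos, ref, alt)) return idx;
        }
        return -1;
    }

    // An entry exists exactly when find() returns a valid index.
    boolean hasEntry(int pos, String ref, String alt) {
        return find(pos, ref, alt) >= 0;
    }

    public static void main(String[] args) {
        PosLookup lookup = new PosLookup(
                new int[] {100, 250, 250, 300},
                new String[] {"A", "G", "G", "T"},
                new String[] {"T", "C", "A", "G"});
        System.out.println(lookup.find(250, "G", "A"));
        System.out.println(lookup.hasEntry(250, "G", "T"));
    }
}
```

Callers should test the returned index against -1 before using it to read a column, in the same way that the result of getRow should be checked for null.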
-
hasEntry
Check whether an entry exists for the given position, reference and alternative alleles. -
resize
public void resize()
Resize and memory optimize the data -
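The documentation does not say what resize() does internally. One common way to "resize and memory optimize" array-backed data is to trim spare capacity once all rows have been added, and the sketch below shows that technique under this assumption. GrowableInts, its fields and its growth policy are hypothetical and are not part of the SnpSift API.

```java
import java.util.Arrays;

// Hypothetical array-backed column that over-allocates while loading
// and is trimmed afterwards; an assumption, not the SnpSift code.
class GrowableInts {
    int[] data = new int[16]; // spare capacity while rows are being added
    int size = 0;             // number of values actually stored

    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2); // grow geometrically
        }
        data[size++] = value;
    }

    // Shrink the backing array so that it holds exactly 'size' values.
    void resize() {
        if (size < data.length) {
            data = Arrays.copyOf(data, size);
        }
    }

    // Payload size only (4 bytes per int); ignores JVM object and array headers.
    long sizeBytes() {
        return 4L * data.length;
    }

    public static void main(String[] args) {
        GrowableInts column = new GrowableInts();
        for (int i = 0; i < 3; i++) column.add(i * 10);
        long before = column.sizeBytes();
        column.resize();
        System.out.println(before + " -> " + column.sizeBytes());
    }
}
```

Trimming only pays off once loading has finished, because any later add() has to grow the array again.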
set
Set data in a column -
sizeBytes
public long sizeBytes()
Memory size of this object -
toString
-