Loading a csv file¶
We load a tab-separated data file using the load_table() function. The format is inferred from the filename suffix. Note that, in this case, the file is not actually a csv file.
Note
The known filename suffixes for reading are .csv, .tsv and .pkl or .pickle (Python’s pickle format).
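As a rough illustration of suffix-based format inference, the sketch below maps the suffixes listed above to a format label. It uses only the standard library and is not how load_table() is implemented; the names SUFFIX_TO_FORMAT and guess_format are hypothetical.

```python
from pathlib import Path

# Hypothetical lookup table; only the suffixes themselves come from the note above.
SUFFIX_TO_FORMAT = {
    ".csv": "comma-delimited",
    ".tsv": "tab-delimited",
    ".pkl": "pickle",
    ".pickle": "pickle",
}


def guess_format(filename):
    """Return the format implied by the filename suffix, or None if unknown."""
    return SUFFIX_TO_FORMAT.get(Path(filename).suffix)


print(guess_format("data/stats.tsv"))  # tab-delimited
```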
Note
If you invoke the static column types argument, i.e. load_table(…, static_column_types=True), and the column data are not static, those columns will be left as a string type.
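To make the idea of a "static" column concrete, here is a minimal standard-library sketch. It assumes that a column counts as static when every value converts to the same numeric type, and it leaves any other column as strings. This is an assumption about the behaviour described above, not cogent3's actual code, and cast_static_columns is a hypothetical helper.

```python
def cast_static_columns(rows):
    """Return columns cast to int or float when every value converts, else left as str."""
    result = []
    for column in zip(*rows):
        for cast in (int, float):
            try:
                result.append([cast(value) for value in column])
                break  # the whole column converted, so keep this type
            except ValueError:
                continue  # try the next candidate type
        else:
            # no candidate type fitted every value: the column stays as strings
            result.append(list(column))
    return result


# Made-up rows: the last column mixes a number with a non-numeric string.
rows = [["a", "1", "2.5"], ["b", "2", "x"]]
columns = cast_static_columns(rows)
```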
Loading delimited data specifying the format¶
Although unnecessary in this case, it’s possible to override the suffix by specifying the delimiter using the sep argument.
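The effect of naming the delimiter explicitly can be sketched with the standard library's csv module. This is an analogy for the sep argument rather than cogent3 code, and the data below are made up.

```python
import csv
import io

# Made-up tab-separated content, standing in for a file whose suffix gives no hint.
text = "Locus\tRatio\nA1\t2.5\nB2\t11.0\n"

# The delimiter is stated explicitly, so nothing needs to be inferred from a filename.
parsed = list(csv.reader(io.StringIO(text), delimiter="\t"))
header, data = parsed[0], parsed[1:]
```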
Loading delimited data without a header line¶
To create a table from the following examples, you specify your header and use make_table().
Using load_delimited()¶
This is a standard parsing function that does no filtering and does not convert elements to non-string types.
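The sketch below shows what plain parsing of headerless delimited data looks like with the standard library: every element comes back as a string, and the header has to be supplied separately. It is not load_delimited() itself; the data and the header names are invented for illustration.

```python
import csv
import io

# Made-up tab-separated content with no header line.
text = "chr1\t100\t200\nchr2\t150\t300\n"

rows = list(csv.reader(io.StringIO(text), delimiter="\t"))

# Hypothetical header supplied by the user, since the data carry none.
header = ["chrom", "start", "end"]
records = [dict(zip(header, row)) for row in rows]
```

Note that "100" and "200" remain strings; any conversion to numbers is a separate step.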
Using FilteringParser¶
Selectively loading parts of a big file¶
Loading a set number of lines from a file¶
The limit argument specifies the number of lines to read.
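A minimal standard-library sketch of the same idea follows. Here the header is read separately and limit counts data rows only; whether cogent3 counts the header line towards the limit is not stated here, so treat that detail as an assumption. read_limited is a hypothetical helper.

```python
import csv
import io
from itertools import islice


def read_limited(handle, limit, sep="\t"):
    """Read the header, then at most `limit` data rows, without parsing the rest."""
    reader = csv.reader(handle, delimiter=sep)
    header = next(reader)
    return header, list(islice(reader, limit))


# Made-up data with three rows; only the first two are read.
text = "Locus\tRatio\nL1\t1.5\nL2\t12.0\nL3\t10.0\n"
header, rows = read_limited(io.StringIO(text), limit=2)
```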
Loading only some rows¶
If you only want a subset of the contents of a file, use the FilteringParser. This allows skipping certain lines by using a callback function. We illustrate this with stats.tsv, skipping any rows with "Ratio" > 10.
You can also negate a condition, which is useful if the condition is complex. In this example, it means keep the rows for which Ratio > 10.
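The callback-plus-negate pattern can be sketched with the standard library as below. This mirrors the behaviour described above but is not the FilteringParser implementation; filter_rows and ratio_at_most_10 are hypothetical names, and the rows are made up rather than taken from stats.tsv.

```python
import csv
import io


def filter_rows(handle, row_condition, negate=False, sep="\t"):
    """Keep rows where row_condition(row) is true, or false when negate=True."""
    reader = csv.reader(handle, delimiter=sep)
    header = next(reader)
    kept = [row for row in reader if bool(row_condition(row)) != negate]
    return header, kept


def ratio_at_most_10(row):
    # Ratio is the third column in this made-up data; elements arrive as strings.
    return float(row[2]) <= 10


text = "Locus\tRegion\tRatio\nL1\tNonCon\t1.5\nL2\tCon\t12.0\nL3\tNonCon\t10.0\n"

# Skips rows with Ratio > 10.
header, low = filter_rows(io.StringIO(text), ratio_at_most_10)

# Same condition, negated: keeps only rows with Ratio > 10.
header, high = filter_rows(io.StringIO(text), ratio_at_most_10, negate=True)
```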
Loading only some columns¶
Specify the columns by their names.
Or, by their index.
Note
The negate argument does not affect the columns evaluated.
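Selecting columns by name or by index amounts to the same lookup, as this standard-library sketch shows. It is illustrative only; select_columns is a hypothetical helper and the data are made up.

```python
def select_columns(header, rows, columns):
    """Return the header and rows restricted to `columns` (names or integer indices)."""
    indices = [header.index(c) if isinstance(c, str) else c for c in columns]
    new_header = [header[i] for i in indices]
    new_rows = [[row[i] for i in indices] for row in rows]
    return new_header, new_rows


header = ["Locus", "Region", "Ratio"]
rows = [["L1", "NonCon", "1.5"], ["L2", "Con", "12.0"]]

by_name = select_columns(header, rows, ["Locus", "Ratio"])
by_index = select_columns(header, rows, [0, 2])
```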
Load raw data as a list of lists of strings¶
We use FilteringParser for this.
Only the first two lines are displayed.
Note
The individual elements are all str.
Make a table from header and rows¶
Make a table from a dict¶
For a dict with keys as column headers.
Specify the column order when creating from a dict¶
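To show what column order means for dict input, the sketch below turns a dict of columns into a header and rows, optionally in a caller-specified order. It is a standard-library illustration, not make_table() itself; dict_to_rows is a hypothetical helper.

```python
def dict_to_rows(data, header=None):
    """Convert a dict of equal-length columns to (header, rows) in the given column order."""
    # Without an explicit header, fall back to the dict's own key order.
    header = list(data) if header is None else list(header)
    rows = [list(values) for values in zip(*(data[name] for name in header))]
    return header, rows


data = {"a": [0, 3], "b": ["a", "c"]}

default_order = dict_to_rows(data)
custom_order = dict_to_rows(data, header=["b", "a"])
```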
Create the table with an index¶
A Table can be indexed like a dict if you designate a column as the index (and that column has a unique value for every row).
Note
The index_name argument also applies when using make_table().
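The uniqueness requirement for an index column can be illustrated with a plain dict. This is a standard-library sketch of the concept, not how Table implements indexing; index_rows is a hypothetical helper and the data are made up.

```python
def index_rows(header, rows, index_name):
    """Map each value in the index column to its row, rejecting duplicate values."""
    position = header.index(index_name)
    indexed = {}
    for row in rows:
        key = row[position]
        if key in indexed:
            raise ValueError(f"duplicate value {key!r} in index column {index_name!r}")
        indexed[key] = dict(zip(header, row))
    return indexed


header = ["Locus", "Region", "Ratio"]
rows = [["L1", "NonCon", 1.5], ["L2", "Con", 12.0]]

by_locus = index_rows(header, rows, "Locus")
print(by_locus["L2"]["Ratio"])  # 12.0
```

A column with repeated values, such as Region in a larger data set, could not serve as the index in this sketch because a key would map to more than one row.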
Create a table from a pandas.DataFrame¶
Create a table from header and rows¶
Create a table from dict¶
make_table() is the utility function for creating Table objects from standard Python objects.