Dataitem kinds
At the moment, we support the following kinds:
table
: represents a table
For each different kind, the Dataitem object has its own subclass with different spec and status attributes.
Table
The table kind indicates that the dataitem is a generic table. It is useful if you intend to manipulate the dataitem as a dataframe; in fact, it provides methods to do so. The default dataframe framework used to represent a table as a dataframe is pandas.
Table spec parameters
Parameter | Type | Description | Default |
---|---|---|---|
path | str | Path of the dataitem, can be a local path or a remote path, a single filepath or a directory/partition. | required |
schema | TableSchema | Frictionless table schema | None |
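As a rough illustration of what the schema parameter describes, a Frictionless Table Schema descriptor lists the table's fields, each with a name and a type. The sketch below uses a plain Python dict with hypothetical column names; how the library's TableSchema object is built from such a descriptor is an assumption left out here.

```python
# Hypothetical columns, laid out as a Frictionless Table Schema descriptor:
# a "fields" list whose entries carry a "name" and a "type".
schema_descriptor = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "value", "type": "number"},
        {"name": "label", "type": "string"},
    ]
}

# Collect the declared column names, e.g. to compare against a dataframe.
field_names = [field["name"] for field in schema_descriptor["fields"]]
```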
Table methods
The table kind has the following additional methods:
as_df
Read the dataitem file (csv or parquet) from spec.path as a DataFrame. If the dataitem is not local, it is downloaded to a temporary folder named tmp_dir in the project context folder. If clean_tmp_path is True, the temporary folder is deleted after the method is executed. Additional keyword arguments can be passed to this method; they are forwarded to the DataFrame reader function, such as pandas's read_csv or read_parquet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_format | str | Format of the file. Supported formats: csv and parquet. | None |
engine | str | Dataframe framework, by default pandas. | None |
clean_tmp_path | bool | If True, the temporary folder will be deleted. | True |
**kwargs | dict | Keyword arguments passed to the read_df function. | {} |
Returns:
Type | Description |
---|---|
Any | DataFrame. |
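The forwarding of keyword arguments to the pandas reader can be sketched as follows. This is a minimal illustration and not the library's implementation: read_table is a hypothetical helper, and inferring the format from the file extension when file_format is None is an assumption. It only handles local files and skips the download and temporary-folder steps.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical mapping from file format to pandas reader (not library code).
_READERS = {"csv": pd.read_csv, "parquet": pd.read_parquet}


def read_table(path, file_format=None, **kwargs):
    """Read a local csv or parquet file into a pandas DataFrame."""
    # Assumption: with no explicit format, fall back to the file extension.
    fmt = file_format or Path(path).suffix.lstrip(".")
    if fmt not in _READERS:
        raise ValueError(f"Unsupported format: {fmt!r}")
    # Extra keyword arguments go straight to the pandas reader.
    return _READERS[fmt](path, **kwargs)


with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "data.csv"
    csv_path.write_text("id;value\n1;10\n2;20\n")
    # sep is forwarded unchanged to pandas.read_csv.
    df = read_table(str(csv_path), sep=";")
```

In this sketch, any option accepted by pandas.read_csv or pandas.read_parquet (for example sep or columns) can be supplied in the same way.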
write_df
Write a DataFrame as parquet/csv/table into the dataitem spec.path. Keyword arguments are passed to the DataFrame writer function, such as pandas's to_csv or to_parquet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | Any | DataFrame to write. | required |
extension | str | Extension of the file. | None |
**kwargs | dict | Keyword arguments passed to the write_df function. | {} |
Returns:
Type | Description |
---|---|
str | Path to the written dataframe. |
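The write side can be sketched in the same spirit. Again, this is an illustration under stated assumptions rather than the library's code: write_table is a hypothetical helper, the fallback to the file extension when extension is None is assumed, and only local paths are handled.

```python
import tempfile
from pathlib import Path

import pandas as pd


def write_table(df, path, extension=None, **kwargs):
    """Write a DataFrame to a local path as csv or parquet; return the path."""
    # Assumption: with no explicit extension, use the one in the target path.
    ext = extension or Path(path).suffix.lstrip(".")
    if ext == "csv":
        df.to_csv(path, **kwargs)
    elif ext == "parquet":
        df.to_parquet(path, **kwargs)
    else:
        raise ValueError(f"Unsupported extension: {ext!r}")
    return str(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "out.csv"
    frame = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
    # index=False is forwarded unchanged to DataFrame.to_csv.
    written = write_table(frame, target, index=False)
    content = target.read_text()
```

Returning the path mirrors the documented return value, so the caller can pass it on without rebuilding it.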