Entity and methods
Dataitem
Bases: MaterialEntity
A class representing a dataitem.
Source code in digitalhub_data/entities/dataitem/entity/_base.py
DataitemTable
Bases: Dataitem
Table dataitem.
Source code in digitalhub_data/entities/dataitem/entity/table.py
as_df(file_format=None, engine=None, clean_tmp_path=True, **kwargs)
Read the dataitem file (csv or parquet) at spec.path as a DataFrame. If the dataitem is not local, it is downloaded to a temporary folder named tmp_dir in the project context folder. If clean_tmp_path is True, the temporary folder is deleted after the method returns. Additional keyword arguments are passed to the DataFrame reader function, such as pandas's read_csv or read_parquet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file_format | str | Format of the file (supported: csv and parquet). | None |
engine | str | Dataframe framework, by default pandas. | None |
clean_tmp_path | bool | If True, the temporary folder will be deleted. | True |
**kwargs | dict | Keyword arguments passed to the read_df function. | {} |
Returns:
Type | Description |
---|---|
Any | DataFrame. |
Source code in digitalhub_data/entities/dataitem/entity/table.py
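The temporary-folder lifecycle and keyword forwarding described for as_df can be illustrated with a standard-library sketch. This is not the library's implementation: read_rows and fake_download are hypothetical names, csv.DictReader stands in for a DataFrame reader such as read_csv, and the download step is simulated.

```python
import csv
import shutil
import tempfile
from pathlib import Path


def read_rows(fetch, clean_tmp_path=True, **kwargs):
    """Hypothetical stand-in for the as_df flow (assumption, not library code).

    fetch: callable that places the file in a folder and returns its path.
    kwargs: forwarded unchanged to the reader (csv.DictReader here).
    """
    tmp_dir = Path(tempfile.mkdtemp(prefix="tmp_dir_"))
    try:
        local_path = fetch(tmp_dir)
        with open(local_path, newline="") as handle:
            rows = list(csv.DictReader(handle, **kwargs))
    finally:
        # Mirrors clean_tmp_path: remove the temporary folder when requested.
        if clean_tmp_path:
            shutil.rmtree(tmp_dir, ignore_errors=True)
    return rows, tmp_dir


def fake_download(folder):
    # Simulates downloading a remote csv file into the temporary folder.
    target = folder / "data.csv"
    target.write_text("id;name\n1;alpha\n2;beta\n")
    return target


# The delimiter keyword travels through **kwargs to the reader.
rows, used_dir = read_rows(fake_download, delimiter=";")
```

Passing clean_tmp_path=False in this sketch leaves the folder in place, which is useful when the downloaded file needs to be inspected afterwards.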
write_df(df, extension=None, **kwargs)
Write a DataFrame as parquet/csv/table to the dataitem's spec.path. Keyword arguments are passed to the DataFrame writer function, such as pandas's to_csv or to_parquet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | Any | DataFrame to write. | required |
extension | str | Extension of the file. | None |
**kwargs | dict | Keyword arguments passed to the write_df function. | {} |
Returns:
Type | Description |
---|---|
str | Path to the written dataframe. |
Source code in digitalhub_data/entities/dataitem/entity/table.py
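The pattern described for write_df (choose a writer, forward keyword arguments, return the written path) can be sketched with the standard library. This is an illustration under assumptions, not the library's code: write_rows and WRITERS are hypothetical names, the sketch assumes extension selects the output format and falls back to the path suffix, and json stands in for parquet because parquet needs a third-party package.

```python
import csv
import json
import tempfile
from pathlib import Path


def _write_csv(rows, path, **kwargs):
    # kwargs reach csv.DictWriter, much as they would reach to_csv.
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]), **kwargs)
        writer.writeheader()
        writer.writerows(rows)


def _write_json(rows, path, **kwargs):
    # Stand-in for a second format; kwargs reach json.dump.
    with open(path, "w") as handle:
        json.dump(rows, handle, **kwargs)


# Hypothetical dispatch table keyed by file extension.
WRITERS = {"csv": _write_csv, "json": _write_json}


def write_rows(rows, path, extension=None, **kwargs):
    """Hypothetical stand-in for the write_df flow (assumption)."""
    path = Path(path)
    ext = extension or path.suffix.lstrip(".")
    if ext not in WRITERS:
        raise ValueError(f"Unsupported extension: {ext!r}")
    WRITERS[ext](rows, path, **kwargs)
    return str(path)


out_dir = Path(tempfile.mkdtemp())
written = write_rows([{"id": 1, "name": "alpha"}], out_dir / "data.csv", delimiter=";")
lines = Path(written).read_text().splitlines()
```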
dataitem_from_dict(obj)
Create a new object from dictionary.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
obj | dict | Dictionary to create object from. | required |
Returns:
Type | Description |
---|---|
Dataitem | Object instance. |
Source code in digitalhub_data/entities/dataitem/builder.py
dataitem_from_parameters(project, name, kind, uuid=None, description=None, labels=None, embedded=True, path=None, **kwargs)
Create a new object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
project | str | Project name. | required |
name | str | Object name. | required |
kind | str | Kind of the object. | required |
uuid | str | ID of the object (UUID4, e.g. 40f25c4b-d26b-4221-b048-9527aff291e2). | None |
description | str | Description of the object (human readable). | None |
labels | list[str] | List of labels. | None |
embedded | bool | Flag to determine if object spec must be embedded in project spec. | True |
path | str | Object path on local file system or remote storage. It is also the destination path of upload() method. | None |
**kwargs | dict | Spec keyword arguments. | {} |
Returns:
Type | Description |
---|---|
Dataitem | Object instance. |
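Since uuid is expected to be a UUID4 string, it can help to check a candidate ID before passing it to dataitem_from_parameters. The following is a minimal sketch using only the standard library; is_uuid4 is a hypothetical helper and not part of digitalhub_data.

```python
from uuid import UUID, uuid4


def is_uuid4(value):
    """Return True if value parses as a version 4 UUID (hypothetical helper)."""
    try:
        parsed = UUID(value)
    except (ValueError, AttributeError, TypeError):
        # Not a string, or not parseable as a UUID at all.
        return False
    return parsed.version == 4


example_id = "40f25c4b-d26b-4221-b048-9527aff291e2"  # ID from the table above
fresh_id = str(uuid4())  # a newly generated UUID4 string

example_ok = is_uuid4(example_id)
fresh_ok = is_uuid4(fresh_id)
garbage_ok = is_uuid4("not-a-uuid")
```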