shapelets.DataSet
- class shapelets.DataSet(rel: Relation, _DataSet__map_fn: Callable[[Relation], type], isColAttr: bool = False)
- Attributes:
- alias
- columns
- schema
Methods
add_column(colname, *genExpr): Adds a new column (colname) to the DataSet and returns the new DataSet.
count(): Returns the number of rows in this DataSet.
distinct([cols]): Returns the distinct row values found in this DataSet.
drop_columns([cols, pattern, full_match, flags]): Drops columns from a DataSet.
filter(func): Returns the DataSet filtered according to the conditions set by a lambda function (func).
head([n]): Returns the first n rows.
limit([n, offset]): Returns a new DataSet of n rows.
rename_columns(new_names): Renames the columns in a DataSet.
rewrite_col(idx, col): Takes idx (index) and col (colname) as parameters and returns a tuple (idx, new_col).
select_columns([cols, pattern, full_match, ...]): Selects or reorganises columns in a DataSet.
shape(): Returns the shape of this DataSet as a tuple containing the row count and the column count.
sort_by(cols[, ascending]): Sets the sorting criteria.
split_by_column(colname): Returns a dictionary of DataSets, one for each distinct entry of a specific column (colname).
tail([n]): Returns the last n rows.
to_arrow_record_batch_reader(blocks): Returns an object that can be iterated to consume data in blocks.
to_arrow_table(blocks): Returns the full result as a table made of chunks of approximately rowsInBatch rows.
to_csv(file[, delimiter, escape, ...]): Materializes a relation and exports the results to a CSV file.
cross_product
describe
intersect
minus
printSchema
sample
to_numpy
to_numpy_batch
to_pandas
to_pandas_batch
to_parquet
union
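Example
The summaries for shape, limit and split_by_column describe their results only briefly, so the sketch below mirrors that described behaviour on a plain list of dictionaries. It does not call shapelets at all: the functions are hypothetical stand-ins written for illustration, and the way offset is applied in limit is an assumption rather than documented behaviour.

```python
from collections import defaultdict

# Plain-Python stand-in for a table: one dict per row.
# This is NOT the shapelets API; it only mirrors the behaviour
# described in the method summaries.
rows = [
    {"city": "Madrid", "temp": 31},
    {"city": "Oslo", "temp": 14},
    {"city": "Madrid", "temp": 29},
    {"city": "Oslo", "temp": 12},
    {"city": "Lima", "temp": 22},
]


def shape(rows):
    """Return (row count, column count), as described for shape()."""
    ncols = len(rows[0]) if rows else 0
    return (len(rows), ncols)


def limit(rows, n, offset=0):
    """Return n rows starting at offset (offset semantics are assumed)."""
    return rows[offset:offset + n]


def split_by_column(rows, colname):
    """Return one group of rows per distinct value found in colname."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[colname]].append(row)
    return dict(groups)


parts = split_by_column(rows, "city")
for key, group in parts.items():
    print(key, shape(group))
```

In this stand-in, the dictionary returned by split_by_column is keyed by the distinct values of the chosen column, which is one reading of "a dictionary of DataSets, each corresponding to a different entry"; the key type used by the real method is not stated in the summary.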