shapelets.DataSet

class shapelets.DataSet(rel: Relation, _DataSet__map_fn: Callable[[Relation], type], isColAttr: bool = False)

Attributes:

Methods

`add_column`(colname, *genExpr)	Adds a new column (colname) to the DataSet and returns the new DataSet.
`count`()	Returns the number of rows in this DataSet.
`distinct`([cols])	Returns the distinct row values found in this DataSet.
`drop_columns`([cols, pattern, full_match, flags])	Drops columns from a DataSet.
`filter`(func)	Returns the DataSet filtered by the conditions set in a lambda function (func).
`head`([n])	Returns the first n rows.
`limit`([n, offset])	Returns a new DataSet of n rows.
`rename_columns`(new_names)	Renames the columns in a DataSet.
`rewrite_col`(idx, col)	Takes idx (index) and col (column name) and returns a tuple (idx, new_col) where:
`select_columns`([cols, pattern, full_match, ...])	Selects or reorganises columns in a DataSet.
`shape`()	Returns the shape of this DataSet, as a tuple containing the number of rows and the column count.
`sort_by`(cols[, ascending])	Sets a sorting criterion.
`split_by_column`(colname)	Returns a dictionary of DataSets, each corresponding to one of the distinct entries of the given column (colname).
`tail`([n])	Returns the last n rows.
`to_arrow_record_batch_reader`(blocks)	Returns an object that can be iterated to consume data in blocks.
`to_arrow_table`(blocks)	Returns the full result as a table made of chunks of approximately rowsInBatch rows.
`to_csv`(file[, delimiter, escape, ...])	Materializes a relation and exports the results to a CSV file.
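
The row-oriented methods listed above (`shape`, `limit`, `split_by_column`) can be pictured with a small stand-in written in plain Python. This is only a sketch of the documented behaviour, not the shapelets implementation: a list of dicts plays the role of a DataSet, the function names simply mirror the method names, and two details are assumptions, namely that `offset` skips that many leading rows and that the dictionary returned by `split_by_column` is keyed by the distinct column values.

```python
from collections import defaultdict

# Stand-in data: a list of dicts plays the role of a DataSet.
# Nothing here calls shapelets; it only mirrors the summaries above.
rows = [
    {"city": "Madrid", "temp": 31},
    {"city": "Oslo", "temp": 12},
    {"city": "Madrid", "temp": 29},
    {"city": "Oslo", "temp": 14},
    {"city": "Lima", "temp": 22},
]


def shape(rows):
    """Return (row count, column count), as described for shape()."""
    return (len(rows), len(rows[0]) if rows else 0)


def limit(rows, n, offset=0):
    """Return n rows, assuming offset skips that many leading rows."""
    return rows[offset:offset + n]


def split_by_column(rows, colname):
    """Return one group of rows per distinct value of colname.

    Keying the result by the column value is an assumption.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[colname]].append(row)
    return dict(groups)


print(shape(rows))
print(limit(rows, 2, offset=1))
print(sorted(split_by_column(rows, "city")))
```

In the real class each of these returns a new DataSet (or a dictionary of them) rather than a Python list, so calls can be chained before the result is materialized with `to_arrow_table` or `to_csv`.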