Read, query and write CSV files

Import the sandbox object

from shapelets.data import sandbox

Convert parquet to CSV

sandbox().from_parquet(
    rel_name="taxis",
    paths=["../Benchmarks/nyc-taxi/2009/01/*.parquet"]
).to_csv('sample.csv')
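
Before loading the exported file back, it can help to confirm that the CSV has the columns you expect. The sketch below is not part of the Shapelets API. It uses only the Python standard library, and `peek_csv` is a hypothetical helper name introduced here for illustration.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5):
    """Return the header and the first n_rows data rows of a CSV file.

    Hypothetical helper for a quick sanity check; it assumes the first
    line of the file is a header row.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])          # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # read at most n_rows rows
    return header, rows
```

Calling `peek_csv("sample.csv")` after the conversion step should return the column names, which the query further down assumes include `passenger_count` and `dropoff_at`.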

Create a sandbox

playground = sandbox()

Load data into sandbox

playground.from_csv(
    rel_name="taxis",
    paths=["sample.csv"]
)

Execute query

result = playground.from_sql("""
    SELECT
        AVG(passenger_count)
    FROM
        taxis
    GROUP BY
        EXTRACT('day' FROM dropoff_at), EXTRACT('hour' FROM dropoff_at)
""").execute()
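
To make the grouping concrete, here is a minimal standard-library sketch of what the query computes: the average `passenger_count` for each (day of month, hour) bucket of `dropoff_at`. It does not use Shapelets. The function name and the three sample rows are made up for illustration, and timestamps are assumed to be ISO-formatted strings. Unlike the SQL above, which selects only the averages, the sketch keeps the bucket keys so each average can be matched to its day and hour.

```python
from collections import defaultdict
from datetime import datetime


def avg_passengers_by_day_hour(rows):
    """Average passenger count per (day of month, hour) of drop-off.

    rows: iterable of (dropoff_at, passenger_count) pairs, where
    dropoff_at is an ISO-formatted timestamp string (an assumption
    made for this sketch).
    """
    totals = defaultdict(lambda: [0, 0])  # (day, hour) -> [sum, count]
    for dropoff_at, passenger_count in rows:
        ts = datetime.fromisoformat(dropoff_at)
        bucket = totals[(ts.day, ts.hour)]
        bucket[0] += passenger_count
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Illustrative rows only, not taken from the taxi dataset.
sample = [
    ("2009-01-01 00:15:00", 1),
    ("2009-01-01 00:45:00", 3),
    ("2009-01-02 13:05:00", 2),
]
averages = avg_passengers_by_day_hour(sample)
```

Because the key is the day of the month rather than the full date, rows from different months that share a day number fall into the same bucket. That does not matter for a single month of data, as in the January 2009 files used here.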

Visualize / Export data

result.to_pandas()