Dataset

Execution environment: Wolfram Kernel
Dataset[list_List | a_Association]

represents a structured dataset based on a hierarchy of lists and associations.

Warning: these docs are not complete. See the original page.

Optimizations

For better performance, keep every value in a column of a single type: numeric, DateObject, or Boolean.
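
As a minimal sketch of a uniformly typed table, the following builds a small dataset in which each column holds a single kind of value. The variable name typed and the column names are hypothetical and chosen only for illustration:

(* hypothetical example: integers, reals, Booleans and DateObject values, one type per column *)
typed = Dataset[Table[<|"id" -> i, "value" -> N[i^2], "even" -> EvenQ[i], "date" -> DateObject[{2024, 1, i}]|>, {i, 5}]]

Mixing types inside one column (for instance, numbers and strings in "value") would presumably forfeit the optimization described above.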

Supported output forms

Examples

Basic examples

ds = Dataset[ Table[<|"a" -> 3 i, "b" -> 3 i + 2, "c" -> 3 i + 5|>, {i, 3}]]

Get the second row:

ds[2]

Get the second column:

ds[All, "b"]

Compute the Total of each column:

ds[Total]
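
The same ds can be queried in a few other common ways. The sketch below assumes the ds defined above and uses only standard Dataset operators; Normal converts a result back to ordinary lists and associations so that it can be inspected:

(* keep only the rows where column "a" is greater than 3 *)
ds[Select[#a > 3 &]]

(* pick two columns *)
ds[All, {"a", "c"}]

(* the second column as a plain list *)
Normal[ds[All, "b"]]   (* {5, 8, 11} *)

(* column totals as an association *)
Normal[ds[Total]]   (* <|"a" -> 18, "b" -> 24, "c" -> 33|> *)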

Load a dataset of Titanic passengers:

titanic = ExampleData[{"Dataset", "Titanic"}]

Get a random sample of passengers:

RandomSample[titanic, 5]

Count the number of passengers in 1st, 2nd and 3rd class:

titanic[Counts, "class"]

Get a histogram of passenger ages, first overall and then for each class:

titanic[Histogram, "age"]
titanic[GroupBy["class"], Histogram[#, {0, 80, 4}] &, "age"]

Calculate the survival ratio for each sex and class:

ratio[list_] := list // Boole // Mean // N;
titanic[GroupBy["sex"], GroupBy["class"], ratio, "survived"]

Get a dataset for training neural nets:

ro = ResourceObject["Audio Cats and Dogs"];
data = ResourceData[ro];
classes = Normal[data[Union, "Label"]];
RandomSample[data, 2]

Notes

Dataset supports lazy loading for large datasets with many rows. If the data exceeds 0.5 MB, part of it is kept on the Kernel.

The data is preserved when exporting to Static HTML.