Dataset
Wolfram Kernel
Execution environment
Dataset[list_List | a_Association]
represents a structured dataset based on a hierarchy of lists and associations.
warning
These docs are incomplete. See the original page.
Optimizations
Use numerical data, DateObject, or Boolean values for an entire column to get better performance.
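As a rough illustration of the tip above, the sketch below builds one dataset whose columns each hold a single type and one whose column mixes types. The column names and values are hypothetical and chosen only for this example; the performance difference itself is the claim made above, not something this snippet measures.

```wolfram
(* Hypothetical data: every column holds exactly one type *)
uniform = Dataset[
  Table[
    <|"id" -> i, "ok" -> EvenQ[i], "when" -> DateObject[{2024, 1, i}]|>,
    {i, 3}
  ]
];

(* Hypothetical counter-example: "id" mixes an integer and a string,
   so the column is no longer of one type *)
mixed = Dataset[{<|"id" -> 1|>, <|"id" -> "two"|>}];
```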
Supported output forms
Examples
Basic examples
ds = Dataset[ Table[<|"a" -> 3 i, "b" -> 3 i + 2, "c" -> 3 i + 5|>, {i, 3}]]
Get the second row:
ds[2]
Get the second column:
ds[All, "b"]
Compute the Total of each column:
ds[Total]
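To check the three queries above against plain values, one option is to wrap each result in Normal, which strips the Dataset wrapper and returns the underlying association or list. The expected results in the comments follow from the table definition (i runs from 1 to 3).

```wolfram
(* Same dataset as in the basic examples above *)
ds = Dataset[Table[<|"a" -> 3 i, "b" -> 3 i + 2, "c" -> 3 i + 5|>, {i, 3}]];

Normal[ds[2]]         (* <|"a" -> 6, "b" -> 8, "c" -> 11|> *)
Normal[ds[All, "b"]]  (* {5, 8, 11} *)
Normal[ds[Total]]     (* <|"a" -> 18, "b" -> 24, "c" -> 33|> *)
```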
Load the Titanic passenger dataset:
titanic = ExampleData[{"Dataset", "Titanic"}]
Get a random sample of passengers:
RandomSample[titanic, 5]
Count the number of passengers in 1st, 2nd and 3rd class:
titanic[Counts, "class"]
Get a histogram of passenger ages:
titanic[Histogram, "age"]
Get a histogram of passenger ages for each class:
titanic[GroupBy["class"], Histogram[#, {0, 80, 4}] &, "age"]
Calculate the survival ratio, grouped by sex and class:
ratio[list_] := list // Boole // Mean // N;
titanic[GroupBy["sex"], GroupBy["class"], ratio, "survived"]
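The ratio helper above can be tried on its own. A minimal sketch, using a hypothetical list of Boolean values in place of the "survived" column: Boole maps True and False to 1 and 0, Mean averages them, and N converts the exact fraction to a machine number.

```wolfram
(* Same helper as above *)
ratio[list_] := list // Boole // Mean // N;

(* Hypothetical input: three of four values are True *)
ratio[{True, False, True, True}]  (* 0.75 *)
```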
Get a dataset for training neural nets:
ro = ResourceObject["Audio Cats and Dogs"];
data = ResourceData[ro];
classes = Normal[data[Union, "Label"]];
RandomSample[data, 2]
Notes
Large datasets with many rows are loaded lazily. If the data exceeds 0.5 MB, part of it is kept on the kernel.
The data is saved when exporting to Static HTML.