pretty

Pretty print a fragment of the dataset in an aligned tabular format to standard output.

When the table is large only the first and last few rows and columns of data are shown.

The size of the dataset in number of rows and columns is also displayed at the end.

This is useful for getting a quick overview of the dataset.

Usage

gurita pretty [-h] [-c [COLUMN ...]] [--maxrows NUM] [--maxcols NUM]

Arguments

Argument

Description

Reference

  • -h

  • --help

display help for this command

help

  • -c [COLUMN ...]

  • --col [COLUMN ...]

select columns for display

columns

--maxrows NUM

display at most this many rows of data (default 10)

maximum rows

--maxcols NUM

display at most this many columns of data (default 10)

maximum columns

See also

The describe command prints summary statistics of the columns in a dataset to standard output (stdout).

Simple example

Make a pretty display of the data in the titanic.csv file:

gurita pretty < titanic.csv

The output of the above command is as follows:

 survived  pclass    sex  age  sibsp  ...  adult_male  deck embark_town alive alone
        0       3   male 22.0      1  ...        True   NaN Southampton    no False
        1       1 female 38.0      1  ...       False     C   Cherbourg   yes False
        1       3 female 26.0      0  ...       False   NaN Southampton   yes  True
        1       1 female 35.0      1  ...       False     C Southampton   yes False
        0       3   male 35.0      0  ...        True   NaN Southampton    no  True
      ...     ...    ...  ...    ...  ...         ...   ...         ...   ...   ...
        0       2   male 27.0      0  ...        True   NaN Southampton    no  True
        1       1 female 19.0      0  ...       False     B Southampton   yes  True
        0       3 female  NaN      1  ...       False   NaN Southampton    no False
        1       1   male 26.0      0  ...        True     C   Cherbourg   yes  True
        0       3   male 32.0      0  ...        True   NaN  Queenstown    no  True

[891 rows x 15 columns]

Getting help

The full set of command line arguments for pretty can be obtained with the -h or --help arguments:

gurita pretty -h

Select specific columns to display

-c [COLUMN ...], --col [COLUMN ...]

By default pretty shows information about all columns in a dataset up to the limit imposed by the --maxcols parameter, which defaults to 10.

Alternatively, a subset of the columns can be selected using the -c/--col argument.

As an example, The following commmand only shows summary information for the age and class columns in the file titanic.csv:

gurita pretty --col age class < titanic.csv

The output of the above command is as follows:

 age  class
22.0  Third
38.0  First
26.0  Third
35.0  First
35.0  Third
 ...    ...
27.0 Second
19.0  First
 NaN  Third
26.0  First
32.0  Third

[891 rows x 15 columns]

Note that only the age and class columns are shown. However, the size of the dataset reported at the bottom of the display reflects the entire dataset, not just the columns that are shown.

Limiting the maximum number of rows and columns to display

--maxrows NUM
--maxcols NUM

By default the pretty command will display a maximum of 10 rows and columns. If the dataset is larger than this size then the first 5 and last 5 rows and columns will be shown, and the intervening columns will be elided and shown with ellipses.

These defaults can be overridden by --maxrows and --maxcols respectively.

For example, the following command sets the maximum number of rows to 20 and maximum number of columns to 6:

gurita pretty --maxrows 20 --maxcols 6 < titanic.csv

The output of the above command is as follows:

 survived  pclass    sex  ...  embark_town  alive  alone
        0       3   male  ...  Southampton     no  False
        1       1 female  ...    Cherbourg    yes  False
        1       3 female  ...  Southampton    yes   True
        1       1 female  ...  Southampton    yes  False
        0       3   male  ...  Southampton     no   True
        0       3   male  ...   Queenstown     no   True
        0       1   male  ...  Southampton     no   True
        0       3   male  ...  Southampton     no  False
        1       3 female  ...  Southampton    yes  False
        1       2 female  ...    Cherbourg    yes  False
      ...     ...    ...  ...          ...    ...    ...
        0       3   male  ...  Southampton     no   True
        0       3 female  ...  Southampton     no   True
        0       2   male  ...  Southampton     no   True
        0       3   male  ...  Southampton     no   True
        0       3 female  ...   Queenstown     no  False
        0       2   male  ...  Southampton     no   True
        1       1 female  ...  Southampton    yes   True
        0       3 female  ...  Southampton     no  False
        1       1   male  ...    Cherbourg    yes   True
        0       3   male  ...   Queenstown     no   True

[891 rows x 15 columns]

Usage in a command chain

When used in a command chain the pretty command passes on its input data to the rest of the chain unchanged.

For example, the following command shows pretty followed by a box plot:

gurita pretty + box -x sex -y age < titanic.csv

This command will first run pretty to display the data on the output, and then it will run box to generate a plot on the same input data.

Because pretty just passes the data along from left to right the box command receives the same data as its input that was read from the file.

It is also worth noting that pretty can be used after other transformations have been applied to the data. For example, the data can be filtered first, and then the result of filtering can be fed into pretty:

gurita filter 'age >= 30' + pretty < titanic.csv

The output of the above command is as follows:

 survived  pclass    sex  age  sibsp  ...  adult_male  deck embark_town alive alone
        1       1 female 38.0      1  ...       False     C   Cherbourg   yes False
        1       1 female 35.0      1  ...       False     C Southampton   yes False
        0       3   male 35.0      0  ...        True   NaN Southampton    no  True
        0       1   male 54.0      0  ...        True     E Southampton    no  True
        1       1 female 58.0      0  ...       False     C Southampton   yes  True
      ...     ...    ...  ...    ...  ...         ...   ...         ...   ...   ...
        0       3   male 47.0      0  ...        True   NaN Southampton    no  True
        1       1 female 56.0      0  ...       False     C   Cherbourg   yes False
        0       3   male 33.0      0  ...        True   NaN Southampton    no  True
        0       3 female 39.0      0  ...       False   NaN  Queenstown    no False
        0       3   male 32.0      0  ...        True   NaN  Queenstown    no  True

[330 rows x 15 columns]