Class Definition: categorical

datatypes: categorical

Array representing categorical data.

A categorical array represents an array of values that correspond to a finite set of discrete categories, which can be either ordinal (having a mathematical ordering) or nominal. It is an efficient way to define groups of rows in a table or groups of elements in other types of variables.

Each categorical array stores the list of categories as a cell array of character vectors and a numeric array of uint16 type as indices to the categories. The categorical array may also store elements of undefined categorical values, which represent the absence of a given category and correspond to the NaN value for numeric arrays or, in general, to the missing value for other data types.

categorical arrays do not have any public properties that can be indexed with dot notation in the way structure fields can. However, there are several methods which can be used to modify their categories once they are constructed.

Source Code: categorical

Methods

categorical: C = categorical (A)
categorical: C = categorical (A, valueset)
categorical: C = categorical (A, valueset, catnames)
categorical: C = categorical (…, Name, Value)

C = categorical (A) creates a categorical array C from the input array A, which can be numeric, logical, datetime, duration, string, or cell array of character vectors. Input A can also be another categorical array. The categories in C are the sorted unique values from the input array A. When the input array is a string array or a cell array of character vectors, any leading or trailing white spaces are removed. Missing values in the input array correspond to <undefined> elements in the created categorical array. By default, there is no category for undefined values in the output array.

C = categorical (A, valueset) creates a categorical array from input A with the categories specified in valueset, which must be a vector of unique values. The data type of input array A and valueset must be the same, unless they are string or cell arrays of character vectors, in which case they can be used interchangeably. As with input array A, any leading or trailing white spaces are removed if valueset is a string array or a cell array of character vectors.

C = categorical (A, valueset, catnames) creates a categorical array from input A with the categories specified in valueset and named after the corresponding values in catnames, which must be specified either as a string array or a cell array of character vectors. If catnames is omitted, categorical uses the cellstring representation of valueset to name the categories. catnames must not contain any missing values, it may have duplicate names, and it must have the same number of elements as valueset.

C = categorical (…, Name, Value) further specifies additional parameters for creating categorical array C.

  • "Ordinal" must be a logical scalar specifying whether the categories in C have a mathematical ordering. By default, it is false and categorical creates a non-ordinal array. The elements of unordered categorical arrays can only be compared for equality; no other relational operator can be used. Setting "Ordinal" to true results in a categorical array with mathematically ordered categories. The ordering goes from smallest to largest according to the order in valueset or, when valueset is not specified, the order of appearance in input array A, in which case the unique values in A are not sorted in order to set the categories. Ordinal categorical arrays allow for relational operators such as >=, >, <=, <, as well as statistical operations such as min, max, and median.
  • "Protected" must be a logical scalar specifying whether the categories in C are protected. By default, it is false for unordered categorical arrays and it is always true for ordinal categorical arrays. Setting "Protected" to true prevents assigning new values that do not correspond to existing categories. When false, assigning new values to the array automatically updates the categories. Hence, categorical arrays with different sets of categories can be combined/merged into a new array with set operations.
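
As a rough illustration of the constructor syntaxes above, the following sketch builds an unordered and an ordinal array. It assumes that the datatypes package is installed and can be loaded with pkg load; the variable names and the sample data are made up for the example.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = {'medium', 'small', 'large', 'small'};

% Without a valueset, the categories are the sorted unique values of A.
C = categorical (A);
names = categories (C);

% An explicit valueset fixes the category order, and "Ordinal" enables
% relational operators such as < and >=.
sizes = categorical (A, {'small', 'medium', 'large'}, 'Ordinal', true);
below_large = sizes < 'large';
```

Passing valueset explicitly for the ordinal array avoids relying on the order of appearance in A to define which category is smallest.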

See also: categories, discretize, iscategorical

categorical: cstr = dispstrings (C)

cstr = dispstrings (C) returns a cellstr array of character vectors, cstr, which has the same size as the input categorical C.

categorical: cstr = cellstr (C)

cstr = cellstr (C) returns a cellstr array of character vectors, cstr, which has the same size as the input categorical C.

categorical: CM = char (C)

CM = char (C) returns a character matrix CM, which contains numel (C) rows and each row contains the category name for the corresponding element of C(:).

categorical: out = double (C)

out = double (C) returns a double array indexing the categories in C. Categorical elements of undefined category are returned as NaN.

categorical: out = single (C)

out = single (C) returns a single array indexing the categories in C. Categorical elements of undefined category are returned as NaN.

categorical: out = int64 (C)

out = int64 (C) returns an int64 array indexing the categories in C. Categorical elements of undefined category are returned as 0.

categorical: out = int32 (C)

out = int32 (C) returns an int32 array indexing the categories in C. Categorical elements of undefined category are returned as 0.

categorical: out = int16 (C)

out = int16 (C) returns an int16 array indexing the categories in C. Categorical elements of undefined category are returned as 0. Note that the returned category indices saturate to intmax ('int16'), which is 32767.

categorical: out = int8 (C)

out = int8 (C) returns an int8 array indexing the categories in C. Categorical elements of undefined category are returned as 0. Note that the returned category indices saturate to intmax ('int8'), which is 127.

categorical: out = uint64 (C)

out = uint64 (C) returns a uint64 array indexing the categories in C. Categorical elements of undefined category are returned as 0.

categorical: out = uint32 (C)

out = uint32 (C) returns a uint32 array indexing the categories in C. Categorical elements of undefined category are returned as 0.

categorical: out = uint16 (C)

out = uint16 (C) returns a uint16 array indexing the categories in C. Categorical elements of undefined category are returned as 0.

categorical: out = uint8 (C)

out = uint8 (C) returns a uint8 array indexing the categories in C. Categorical elements of undefined category are returned as 0. Note that the returned category indices saturate to intmax ('uint8'), which is 255.
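
The difference between the floating point and the integer conversions described above can be sketched as follows. The example assumes that the datatypes package is installed; the input values are arbitrary and were chosen so that the category indices differ from the data.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

% NaN is a missing value, so the third element becomes <undefined>.
C = categorical ([20, 10, NaN, 20]);

d = double (C);   % category indices, NaN for the undefined element
i = int8 (C);     % category indices, 0 for the undefined element
u = isundefined (C);
```

Here the categories are the sorted unique values 10 and 20, so the returned numbers are positions in the category list rather than the original data.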

categorical: summary (C)

summary (C) displays the number of elements in the categorical array C that are equal to each category of C. Any undefined elements in C are summed together and displayed as <undefined>.

categorical: cstr = categories (C)

cstr = categories (C) returns a cell array of character vectors with the names of the categories in C.

categorical: C = countcats (A)
categorical: C = countcats (A, dim)

C = countcats (A) returns the number of elements for each category in A. If A is a vector, C is also a vector with one element for each category in A. If A is a matrix, C is a matrix with each column containing the category counts from each column of A. For multidimensional arrays, countcats operates along the first non-singleton dimension.

C = countcats (A, dim) operates along the dimension dim.
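
A minimal sketch of counting elements per category, assuming the datatypes package is installed; the data are invented for the example.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = categorical ({'a', 'b', 'a', 'c', 'a'});

% One count per category, in the order returned by categories (A).
n = countcats (A);

% summary prints the same counts next to the category names.
summary (A);
```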

categorical: N = length (C)

N = length (C) returns the size of the longest dimension of the categorical array C, unless any of its dimensions has zero length, in which case length (C) returns 0.

categorical: sz = size (C)
categorical: dim_sz = size (C, dim)
categorical: dim_sz = size (C, d1, d2, …)
categorical: [rows, columns, …, dim_n_sz] = size (…)

sz = size (C) returns a row vector with the size (number of elements) of each dimension for the categorical array C.

dim_sz = size (C, dim) returns the size of the corresponding dimension specified in dim. If dim is a vector, then dim_sz is a vector of the same length and with each element corresponding to a specified dimension. Multiple dimensions may also be specified as separate arguments.

With a single output argument, size returns a row vector. When called with multiple output arguments, size returns the size of dimension N in the Nth argument.

categorical: out = ndims (C)

out = ndims (C) returns the number of dimensions of the categorical array C.

categorical: out = numel (C)

For compatibility reasons with Octave’s OOP interface and subsasgn behavior, categorical’s numel is defined to always return 1.

categorical: h = keyHash (C)
categorical: h = keyHash (C, base)

h = keyHash (C) generates a uint64 scalar that represents the input array C. keyHash utilizes the 64-bit FNV-1a variant of the Fowler-Noll-Vo non-cryptographic hash function.

h = keyHash (C, base) also generates a 64-bit hash code using base as the offset basis for the FNV-1a hash algorithm. base must be a uint64 integer type scalar. Use this syntax to cascade keyHash on multiple objects for which a single hash code is required.

Note that unlike MATLAB, this implementation does not use any random seed. As a result, keyHash will always generate the exact same hash key for any particular input across different workers and Octave sessions.
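
The cascading use of base can be sketched as below, assuming the datatypes package is installed. The two arrays are placeholders; the point is only that the hash of the first array is fed in as the offset basis for the second.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

C1 = categorical ({'x', 'y'});
C2 = categorical ({'y', 'z'});

h1 = keyHash (C1);       % uint64 hash of the first array
h  = keyHash (C2, h1);   % combined hash: h1 is used as the offset basis

% No random seed is involved, so repeated calls give the same key.
same = isequal (h1, keyHash (C1));
```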

categorical: TF = iscategory (C, catnames)

TF = iscategory (C, catnames) returns a logical array TF of the same size as catnames containing true for each corresponding element of catnames that is a category in categorical array C and false otherwise.

categorical: TF = iscolumn (C)

TF = iscolumn (C) returns a logical scalar TF, which is true, if the categorical array C is a column vector, and false otherwise. A column vector is a 2-D array for which size (X) returns [N, 1] with non-negative N.

categorical: TF = isempty (C)

TF = isempty (C) returns a logical scalar TF, which is true, if the categorical array C is empty, and false otherwise.

categorical: TF = isequal (C1, C2)
categorical: TF = isequal (C1, C2, …)

TF = isequal (C1, C2) returns a logical scalar TF, which is true, if the categorical arrays C1 and C2 contain the same values, and false otherwise. Either C1 or C2 may also be a string array, a missing object array, a character vector, or a cell array of character vectors, which will be promoted to a categorical array prior to comparison.

If categorical arrays C1 and C2 are ordinal, they must have the same set and ordering of categories. If neither are ordinal, the category names of each pair of elements are compared. Hence, they do not need to have the same set of categories.

TF = isequal (C1, C2, …) returns a logical scalar TF, which is true, if all input arguments are equal, and false otherwise.

categorical: TF = isequaln (C1, C2, …)

TF = isequaln (C1, C2) returns a logical scalar TF, which is true, if the categorical arrays C1 and C2 contain the same values or corresponding undefined elements, and false otherwise. Either C1 or C2 may also be a string array, a missing object array, a character vector, or a cell array of character vectors, which will be promoted to a categorical array prior to comparison.

If categorical arrays C1 and C2 are ordinal, they must have the same set and ordering of categories. If neither are ordinal, the category names of each pair of elements are compared. Hence, they do not need to have the same set of categories.

TF = isequaln (C1, C2, …) returns a logical scalar TF, which is true, if all input arguments are equal, and false otherwise.

categorical: TF = ismatrix (C)

TF = ismatrix (C) returns a logical scalar TF, which is true, if the categorical array C is a matrix, and false otherwise. A matrix is an array of any type where ndims (X) == 2 and for which size (X) returns [H, W] with non-negative H and W.

categorical: TF = ismember (A, B)
categorical: TF = ismember (A, B, 'rows')
categorical: [TF, index] = ismember (…)

TF = ismember (A, B) returns a logical array TF of the same size as A containing true for each corresponding element of A that is in B and false otherwise. If A and B are both ordinal, they must both have the same ordered set of categories. If neither A nor B are ordinal, then this restriction is relaxed and comparison is performed using the category names. Comparison between an ordinal and an unordered categorical array is not allowed. A or B may also be a string array or a cell array of character vectors containing one or multiple category names to compare against.

TF = ismember (A, B, 'rows') only applies to categorical matrices with the same number of columns, in which case the logical vector TF contains true for each row of A that is also a row in B. TF has the same number of rows as A.

[TF, index] = ismember (A, B) also returns an index array of the same size as A containing the lowest index in B for each element of A that is a member of B and 0 otherwise. If the 'rows' optional argument is used, then the returning index is a column vector with the same rows as A and it contains the lowest index in B for each row of A that is a member of B and 0 otherwise.
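
A short sketch of membership testing against a list of category names, assuming the datatypes package is installed; the data are arbitrary.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = categorical ({'a', 'b', 'c', 'a'});

% B is given as a cell array of category names rather than a categorical.
tf = ismember (A, {'a', 'c'});
```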

categorical: out = ismissing (C)

TF = ismissing (C) returns a logical array TF of the same size as C containing true for each corresponding element of C that does not have a value from one of the categories in C and false otherwise.

categorical: TF = isordinal (C)

TF = isordinal (C) returns a logical scalar TF, which is true, if the categorical array C is ordinal, and false otherwise.

categorical: TF = isprotected (C)

TF = isprotected (C) returns a logical scalar TF, which is true, if the categorical array C is protected, and false otherwise.

categorical: TF = isrow (C)

TF = isrow (C) returns a logical scalar TF, which is true, if the categorical array C is a row vector, and false otherwise. A row vector is a 2-D array for which size (X) returns [1, N] with non-negative N.

categorical: TF = isscalar (C)

TF = isscalar (C) returns a logical scalar TF, which is true, if the categorical array C is also a scalar, and false otherwise. A scalar is a single element object for which size (X) returns [1, 1].

categorical: TF = issorted (C)
categorical: TF = issorted (C, dim)
categorical: TF = issorted (C, direction)
categorical: TF = issorted (C, dim, direction)
categorical: TF = issorted (…, 'MissingPlacement', MP)

TF = issorted (C) returns a logical scalar TF, which is true, if the categorical array C is sorted in ascending order, and false otherwise.

TF = issorted (C, dim) returns a logical scalar TF, which is true, if the categorical array C is sorted in ascending order along the dimension dim, and false otherwise.

TF = issorted (C, direction) returns a logical scalar TF, which is true, if the categorical array C is sorted in the direction specified by direction, and false otherwise. direction can be any of the following options:

  • 'ascend', which is the default, checks if elements are in ascending order.
  • 'descend' checks if elements are in descending order.
  • 'monotonic' checks if elements are either in ascending or descending order.
  • 'strictascend' checks if elements are in ascending order and there are no duplicate or undefined elements.
  • 'strictdescend' checks if elements are in descending order and there are no duplicate or undefined elements.
  • 'strictmonotonic' checks if elements are either in ascending or descending order and there are no duplicate or undefined elements.

TF = issorted (…, 'MissingPlacement', MP) specifies where missing elements (<undefined>) are placed with any of the following options specified in MP:

  • 'auto', which is the default, places missing elements last for ascending sort and first for descending sort.
  • 'first' places missing elements first.
  • 'last' places missing elements last.

categorical: TF = issortedrows (C)
categorical: TF = issortedrows (C, col)
categorical: TF = issortedrows (C, direction)
categorical: TF = issortedrows (C, col, direction)
categorical: TF = issortedrows (…, 'MissingPlacement', MP)

TF = issortedrows (C) returns a logical scalar TF, which is true, if the rows in the 2-D categorical array C are sorted in ascending order, and false otherwise.

TF = issortedrows (C, col) returns a logical scalar TF, which is true, if the categorical array C is sorted according to the columns specified by the vector col, and false otherwise. col must explicitly contain non-zero integers whose absolute values index existing columns in C. Positive elements check the corresponding columns for ascending order, while negative elements check the corresponding columns for descending order.

TF = issortedrows (C, direction) checks if the rows in C are sorted according to the specified direction, which can be any of the following options:

  • 'ascend', which is the default, checks if elements are in ascending order.
  • 'descend' checks if elements are in descending order.
  • 'monotonic' checks if elements are either in ascending or descending order.
  • 'strictascend' checks if elements are in ascending order and there are no duplicate or undefined elements.
  • 'strictdescend' checks if elements are in descending order and there are no duplicate or undefined elements.
  • 'strictmonotonic' checks if elements are either in ascending or descending order and there are no duplicate or undefined elements.

Alternatively, direction can be a cell array of character vectors specifying the sorting direction for each individual column of C, in which case the number of elements in direction must equal the number of columns in C.

TF = issortedrows (C, col, direction) checks if the rows in the categorical array C are sorted according to the columns specified in col using the corresponding sorting direction specified in direction. In this case, the sign of the values in col is ignored. col and direction must have the same length, but not necessarily the same number of elements as the columns in C.

TF = issortedrows (…, 'MissingPlacement', MP) specifies where missing elements (<undefined>) are placed with any of the following options specified in MP:

  • 'auto', which is the default, places missing elements last for ascending sort and first for descending sort.
  • 'first' places missing elements first.
  • 'last' places missing elements last.

categorical: out = isundefined (C)

TF = isundefined (C) returns a logical array TF of the same size as C containing true for each corresponding element of C that does not have a value from one of the categories in C and false otherwise. <undefined> is the equivalent of NaN in numeric arrays.

categorical: TF = isvector (C)

TF = isvector (C) returns a logical scalar TF, which is true if the categorical array C is a vector and false otherwise. A vector is a 2-D array for which one of the dimensions is equal to 1 (either 1×N or N×1). By definition, a scalar is also a vector.

categorical: B = addcats (A, newcats)
categorical: B = addcats (…, 'After', catname)
categorical: B = addcats (…, 'Before', catname)

B = addcats (A, newcats) appends new categories specified in newcats to the categorical array A at the end of any existing categories. The output categorical array B does not contain elements that belong to the newly added categories.

B = addcats (…, 'After', catname) adds the categories after the existing category specified by catname.

B = addcats (…, 'Before', catname) adds the categories before the existing category specified by catname.

catname must be either a character vector, a cellstr scalar or a string scalar. newcats may be a cell array of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function, as long as it does not contain any duplicate names and does not reference an existing category in A.
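
The positional syntax can be sketched as follows, assuming the datatypes package is installed; the category names are invented for the example.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = categorical ({'low', 'high', 'low'}, {'low', 'high'});

% Insert a new, still unused category between 'low' and 'high'.
B = addcats (A, {'medium'}, 'After', 'low');
names = categories (B);
```

The elements of B are unchanged; only the category list grows.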

categorical: B = mergecats (A, oldcats)
categorical: B = mergecats (A, oldcats, newcat)

B = mergecats (A, oldcats) merges two or more categories specified by oldcats into a single category with the same name as oldcats(1). In case of ordinal categorical arrays, the categories listed in oldcats must be in consecutive order. All elements of A corresponding to the categories listed in oldcats are re-indexed to correspond to oldcats(1) in B.

B = mergecats (A, oldcats, newcat) merges the categories listed in oldcats into a single new category named as specified by newcat.

newcat must be either a character vector, a cellstr scalar or a string scalar. oldcats may be a cell array of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function. Any names in oldcats that do not reference an existing category are ignored.
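
A minimal sketch of both mergecats syntaxes, assuming the datatypes package is installed; the colour names are placeholders.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = categorical ({'red', 'crimson', 'blue', 'red'});

% Merge 'crimson' into 'red'; the merged category keeps the first name.
B = mergecats (A, {'red', 'crimson'});

% Same merge, but the resulting category gets a new name.
D = mergecats (A, {'red', 'crimson'}, 'warm');
```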

categorical: B = removecats (A)
categorical: B = removecats (A, oldcats)

B = removecats (A) removes all unused categories from categorical array A. The output categorical array B has the same size and values as A, but potentially fewer categories.

B = removecats (A, oldcats) removes the categories specified by oldcats. The elements of B that correspond to the removed categories are undefined.

oldcats may be a cell array of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function. Any names in oldcats that do not reference an existing category are ignored.
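
The two removecats syntaxes behave differently with respect to the data, which the following sketch tries to show. It assumes the datatypes package is installed; the data are arbitrary.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

% 'c' is declared as a category but never used by any element.
A = categorical ({'a', 'b', 'a'}, {'a', 'b', 'c'});

B = removecats (A);          % drops only the unused category 'c'
D = removecats (A, {'b'});   % elements equal to 'b' become <undefined>
lost = isundefined (D);
```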

categorical: B = renamecats (A, newnames)
categorical: B = renamecats (A, oldnames, newnames)

B = renamecats (A, newnames) renames all the categories in A, without changing any of its values, with the names specified in newnames. newnames may be a cell array of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function, as long as it has the same number of elements as the categories in A.

B = renamecats (A, oldnames, newnames) renames the categories of A specified in oldnames with the names specified in newnames. Both oldnames and newnames may be cell arrays of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function, as long as they have the same number of elements. oldnames must specify a subset of existing categories in A.
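
A brief sketch of renaming all categories versus a subset, assuming the datatypes package is installed; the names are made up.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = categorical ({'a', 'b', 'a'});

% Rename every category; newnames follows the order of categories (A).
B = renamecats (A, {'alpha', 'beta'});

% Rename only the category 'b'.
D = renamecats (A, {'b'}, {'bravo'});
```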

categorical: B = reordercats (A)
categorical: B = reordercats (A, neworder)

B = reordercats (A) reorders the categories of A in alphanumeric order.

B = reordercats (A, neworder) reorders the categories of A according to the order specified by neworder, which may be a cell array of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function, as long as it contains the same set of names as the existing categories in A.
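
The following sketch reorders categories without touching the element values. It assumes the datatypes package is installed; the data are arbitrary.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

% Start from a deliberately unsorted category order.
A = categorical ({'b', 'a', 'c'}, {'c', 'b', 'a'});

B = reordercats (A);                    % alphanumeric order
D = reordercats (A, {'b', 'c', 'a'});   % explicit order
```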

categorical: B = setcats (A, newcats)

B = setcats (A, newcats) sets categories in the categorical array B according to the elements of the input array A and the categories specified by newcats according to the following rules:

  • Any element of A that corresponds to a category listed in newcats is copied to B with the same categorical value.
  • Any categories of A not listed in newcats are not copied to B and the corresponding elements of B are undefined.
  • New categories listed in newcats that are not present in A are added in B, but without any elements equal to these new categories.

newcats may be a cell array of character vectors or any type of array that can be converted to a cell array of character vectors with the cellstr function.
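
The three rules above can be sketched in one call, assuming the datatypes package is installed; the category names are invented for the example.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

A = categorical ({'a', 'b', 'c', 'a'});

% 'a' and 'c' are kept, 'b' is dropped, 'd' is added without elements.
B = setcats (A, {'a', 'c', 'd'});
dropped = isundefined (B);
```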

categorical: C = times (A, B)

C = times (A, B) is the equivalent of the syntax C = A .* B and returns a categorical array whose categories are the Cartesian product of the categories in A and B and each element is indexed to a new category which is the combination of the categories of the corresponding elements in A and B.

A and B must be of common size or scalar categorical arrays.

categorical: TF = eq (A, B)

TF = eq (A, B) is the equivalent of the syntax TF = A == B and returns a logical array of the same size as the largest input with its elements set to true where the corresponding elements of A and B are equal and set to false where they are not. A and B must be size compatible: they have the same size, one of them is a scalar, or, for every dimension, their sizes are equal or one of them is 1.

If categorical arrays A and B are ordinal, they must have the same set and ordering of categories. If neither are ordinal, the category names of each pair of elements are compared. Hence, they do not need to have the same set of categories.

One of the input arguments can also be a character vector, a cellstr scalar or a string scalar as long as the other is a categorical array. In this case, a logical array of the same size as the categorical array is returned in which every element is tested for equality by comparing its category with that specified by the string argument.

Undefined elements always return false, since they are not comparable to any other categorical values including other undefined elements.
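
The treatment of undefined elements and the comparison against a category name can be sketched as follows. The example assumes the datatypes package is installed; the valueset and category names are made up.

```octave
pkg load datatypes   % assumption: the datatypes package is installed

% NaN is not in the valueset, so the third element is <undefined>.
A = categorical ([1, 2, NaN], [1, 2], {'one', 'two'});

tf_self = (A == A);       % an undefined element is not equal to itself
tf_name = (A == 'one');   % compare every element with a category name
```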

categorical: TF = ge (A, B)

TF = ge (A, B) is the equivalent of the syntax TF = A >= B and returns a logical array of the same size as the largest input with its elements set to true where the corresponding elements of A are greater than or equal to B and set to false where they are not. A and B must be size compatible: they have the same size, one of them is a scalar, or, for every dimension, their sizes are equal or one of them is 1.

If categorical arrays A and B are both ordinal, they must have the same set and ordering of categories. Unordered categorical arrays cannot be compared for greater than or equal to inequality.

One of the input arguments can also be a character vector, a cellstr scalar or a string scalar as long as the other is a categorical array. In this case, a logical array of the same size as the categorical array is returned in which every element is tested for greater than or equal to inequality by comparing its category with that specified by the string argument.

Undefined elements always return false, since they are not comparable to any other categorical values including other undefined elements.

categorical: TF = gt (A, B)

TF = gt (A, B) is the equivalent of the syntax TF = A > B and returns a logical array of the same size as the largest input with its elements set to true where the corresponding elements of A are greater than B and set to false where they are not. A and B must be size compatible: they have the same size, one of them is a scalar, or, for every dimension, their sizes are equal or one of them is 1.

If categorical arrays A and B are both ordinal, they must have the same set and ordering of categories. Unordered categorical arrays cannot be compared for greater than inequality.

One of the input arguments can also be a character vector, a cellstr scalar or a string scalar as long as the other is a categorical array. In this case, a logical array of the same size as the categorical array is returned in which every element is tested for greater than inequality by comparing its category with that specified by the string argument.

Undefined elements always return false, since they are not comparable to any other categorical values including other undefined elements.

categorical: TF = le (A, B)

TF = le (A, B) is the equivalent of the syntax TF = A <= B and returns a logical array of the same size as the largest input with its elements set to true where the corresponding elements of A are less than or equal to B and set to false where they are not. A and B must be size compatible: they have the same size, one of them is a scalar, or, for every dimension, their sizes are equal or one of them is 1.

If categorical arrays A and B are both ordinal, they must have the same set and ordering of categories. Unordered categorical arrays cannot be compared for less than or equal to inequality.

One of the input arguments can also be a character vector, a cellstr scalar or a string scalar as long as the other is a categorical array. In this case, a logical array of the same size as the categorical array is returned in which every element is tested for less than or equal to inequality by comparing its category with that specified by the string argument.

Undefined elements always return false, since they are not comparable to any other categorical values including other undefined elements.

categorical: TF = lt (A, B)

TF = lt (A, B) is the equivalent of the syntax TF = A < B and returns a logical array of the same size as the largest input with its elements set to true where the corresponding elements of A are less than B and set to false where they are not. A and B must be size compatible: they have the same size, one of them is a scalar, or, for every dimension, their sizes are equal or one of them is 1.

If categorical arrays A and B are both ordinal, they must have the same set and ordering of categories. Unordered categorical arrays cannot be compared for less than inequality.

One of the input arguments can also be a character vector, a cellstr scalar or a string scalar as long as the other is a categorical array. In this case, a logical array of the same size as the categorical array is returned in which every element is tested for less than inequality by comparing its category with that specified by the string argument.

Undefined elements always return false, since they are not comparable to any other categorical values including other undefined elements.

categorical: TF = ne (A, B)

TF = ne (A, B) is the equivalent of the syntax TF = A != B and returns a logical array of the same size as the largest input with its elements set to true where the corresponding elements of A and B are not equal and set to false where they are equal. A and B must be size compatible: they have the same size, one of them is a scalar, or, for every dimension, their sizes are equal or one of them is 1.

If categorical arrays A and B are ordinal, they must have the same set and ordering of categories. If neither are ordinal, the category names of each pair of elements are compared. Hence, they do not need to have the same set of categories.

One of the input arguments can also be a character vector, a cellstr scalar or a string scalar as long as the other is a categorical array. In this case, a logical array of the same size as the categorical array is returned in which every element is tested for inequality by comparing its category with that specified by the string argument.

Undefined elements always return true, since they are not comparable to any other categorical values including other undefined elements.

categorical: C = min (A)
categorical: [C, index] = min (A)
categorical: C = min (A, [], dim)
categorical: C = min (A, [], vecdim)
categorical: C = min (A, [], 'all')
categorical: [C, index] = min (A, [], …)
categorical: C = min (A, B)
categorical: […] = min (…, missingflag)

C = min (A) returns the smallest element in ordinal categorical vector A. If A is a matrix, min (A) returns a row vector with the smallest element from each column. For multidimensional arrays, min (A) operates along the first non-singleton dimension.

[C, index] = min (A) also returns the indices of the minimum values in index, which has the same size as C. When the operating dimension contains more than one minimal element, the index of the first one is returned.

C = min (A, [], dim) operates along the dimension specified by dim.

C = min (A, [], vecdim) operates on all the elements contained in the dimensions specified by vecdim, which must be a numeric vector of non-repeating positive integers. Any values in vecdim indexing dimensions larger than the actual array A are ignored.

C = min (A, [], 'all') operates on all dimensions and returns the smallest element in A.

[C, index] = min (A, [], …) also returns the indices of the minimum values in index, using any of the previous syntaxes.

C = min (A, B) returns an ordinal categorical array C with the smallest elements from A and B, which both must be ordinal categorical arrays of compatible sizes with the same set and ordering of categories. Compatible size means that A and B can be the same size, one can be scalar, or for every dimension, their dimension sizes must be equal or one of them must be 1.

[…] = min (…, missingflag) specifies how to handle undefined elements in any of the previous syntaxes. missingflag must be a character vector or a string scalar with one of the following values:

  • 'omitundefined', which is the default, ignores all undefined elements and returns the minimum of the remaining elements. If all elements along the operating dimension are undefined, then it returns an undefined element. 'omitnan' may also be used as equivalent to 'omitundefined'.
  • 'includeundefined' returns an undefined element if there are any undefined elements along the operating dimension. 'includenan' may also be used as equivalent to 'includeundefined'.
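
The following sketch illustrates min on an ordinal array. It assumes the datatypes package is loaded and that the constructor accepts an 'Ordinal' Name/Value option, which is not described in this section; the variable names are made up for the example.

```octave
pkg load datatypes

# Ordinal array with the ordering small < medium < large.
# The 'Ordinal' option is an assumption about the constructor.
sizes = categorical ({'medium', 'small', 'large', 'small'}, ...
                     {'small', 'medium', 'large'}, 'Ordinal', true);

# Smallest element and the index of its first occurrence.
[smallest, index] = min (sizes);

# Element-wise minimum against a scalar with the same ordered categories.
capped = min (sizes, sizes(1));
```

Here sizes(1) is 'medium', so capped should never exceed 'medium'.
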
categorical: B = mink (A, K)
categorical: B = mink (A, K, dim)
categorical: [B, index] = mink (…)

B = mink (A, K) returns the K smallest elements in categorical vector A. If A is a matrix, mink (A, K) returns the K smallest elements from each column. For multidimensional arrays, mink (A, K) returns the K smallest elements along the first non-singleton dimension.

B = mink (A, K, dim) returns the K smallest elements in categorical array A along the dimension specified by dim.

[B, index] = mink (…) also returns the indices of the K smallest elements in index, using any of the previous syntaxes.

categorical: C = max (A)
categorical: [C, index] = max (A)
categorical: C = max (A, [], dim)
categorical: C = max (A, [], vecdim)
categorical: C = max (A, [], 'all')
categorical: [C, index] = max (A, [], …)
categorical: C = max (A, B)
categorical: […] = max (…, missingflag)

C = max (A) returns the largest element in ordinal categorical vector A. If A is a matrix, max (A) returns a row vector with the largest element from each column. For multidimensional arrays, max (A) operates along the first non-singleton dimension.

[C, index] = max (A) also returns the indices of the maximum values in index, which has the same size as C. When the operating dimension contains more than one maximal element, the index of the first one is returned.

C = max (A, [], dim) operates along the dimension specified by dim.

C = max (A, [], vecdim) operates on all the elements contained in the dimensions specified by vecdim, which must be a numeric vector of non-repeating positive integers. Any values in vecdim that exceed the number of dimensions of A are ignored.

C = max (A, [], 'all') operates on all dimensions and returns the largest element in A.

[C, index] = max (A, [], …) also returns the indices of the maximum values in index, using any of the previous syntaxes.

C = max (A, B) returns an ordinal categorical array C with the largest elements from A and B, which both must be ordinal categorical arrays of compatible sizes with the same set and ordering of categories. Compatible size means that A and B can be the same size, one can be scalar, or for every dimension, their dimension sizes must be equal or one of them must be 1.

[…] = max (…, missingflag) specifies how to handle undefined elements in any of the previous syntaxes. missingflag must be a character vector or a string scalar with one of the following values:

  • 'omitundefined', which is the default, ignores all undefined elements and returns the maximum of the remaining elements. If all elements along the operating dimension are undefined, then it returns an undefined element. 'omitnan' may also be used as equivalent to 'omitundefined'.
  • 'includeundefined' returns an undefined element if there are any undefined elements along the operating dimension. 'includenan' may also be used as equivalent to 'includeundefined'.
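
To show the effect of missingflag, the sketch below builds an ordinal array containing one undefined element. It assumes the datatypes package is loaded and that the 'Ordinal' constructor option is available; the names levels, m_omit and m_incl are illustrative.

```octave
pkg load datatypes

# NaN becomes <undefined>; the values 1, 2, 3 are named low, mid, high.
levels = categorical ([2, 1, NaN, 3], [1, 2, 3], ...
                      {'low', 'mid', 'high'}, 'Ordinal', true);

m_omit = max (levels);                          # default 'omitundefined'
m_incl = max (levels, [], 'includeundefined');  # undefined elements propagate

# An undefined value is not equal even to itself.
is_undef = ne (m_incl, m_incl);
```
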
categorical: B = maxk (A, K)
categorical: B = maxk (A, K, dim)
categorical: [B, index] = maxk (…)

B = maxk (A, K) returns the K largest elements in categorical vector A. If A is a matrix, maxk (A, K) returns the K largest elements from each column. For multidimensional arrays, maxk (A, K) returns the K largest elements along the first non-singleton dimension.

B = maxk (A, K, dim) returns the K largest elements in categorical array A along the dimension specified by dim.

[B, index] = maxk (…) also returns the indices of the K largest elements in index, using any of the previous syntaxes.

categorical: B = median (A)
categorical: B = median (A, dim)
categorical: B = median (A, vecdim)
categorical: B = median (A, 'all')
categorical: B = median (…, missingflag)

B = median (A) returns the median of the elements in ordinal categorical vector A. If A is a matrix, median (A) returns a row vector with the median element from each column. For multidimensional arrays, median (A) operates along the first non-singleton dimension. B is also ordinal with the same ordered categories as A. For an even number of elements along the operating dimension, the returned median is the category midway between the two middle elements or, when two categories lie midway between them, the larger of those two.

B = median (A, dim) operates along the dimension specified by dim.

B = median (A, vecdim) operates on all the elements contained in the dimensions specified by vecdim, which must be a numeric vector of non-repeating positive integers. Any values in vecdim that exceed the number of dimensions of A are ignored.

B = median (A, 'all') operates on all dimensions and returns the median element in A.

B = median (…, missingflag) specifies how to handle undefined elements in any of the previous syntaxes. missingflag must be a character vector or a string scalar with one of the following values:

  • 'omitundefined' ignores all undefined elements and returns the median of the remaining elements. If all elements along the operating dimension are undefined, then it returns an undefined element. 'omitnan' may also be used as equivalent to 'omitundefined'.
  • 'includeundefined', which is the default, returns an undefined element if there are any undefined elements along the operating dimension. 'includenan' may also be used as equivalent to 'includeundefined'.
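
Unlike min and max, median includes undefined elements by default. A small sketch of the difference, assuming the datatypes package is loaded and the 'Ordinal' constructor option is available (the variable names are illustrative):

```octave
pkg load datatypes

# Five defined ratings plus one <undefined> (from NaN).
ratings = categorical ([1, 3, 2, NaN, 3, 1], [1, 2, 3], ...
                       {'low', 'mid', 'high'}, 'Ordinal', true);

med_default = median (ratings);                 # 'includeundefined' is the default
med_omit = median (ratings, 'omitundefined');   # median of low low mid high high
```

An odd number of defined elements is used on purpose, so the result does not depend on how ties between the two middle elements are resolved.
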
categorical: M = mode (A)
categorical: [M, F] = mode (A)
categorical: [M, F, C] = mode (A)
categorical: […] = mode (A, dim)
categorical: […] = mode (A, vecdim)
categorical: […] = mode (A, 'all')

M = mode (A) returns the most frequent element in the categorical vector A. If A is a matrix, mode (A) returns a row vector with the most frequent element from each column. For multidimensional arrays, mode (A) operates along the first non-singleton dimension. M is also a categorical array with the same categories as A. For multiple elements with the same maximum frequency along the operating dimension, the element from the category that occurs first in A is returned.

[M, F] = mode (A) also returns a numeric array F, which has the same size as M and contains the number of occurrences of each corresponding element of M.

[M, F, C] = mode (A) also returns a cell array C, which has the same size as M; each of its elements is a sorted categorical vector of all the values that share the maximum frequency of the corresponding element of M.

[…] = mode (A, dim) operates along the dimension specified by dim.

[…] = mode (A, vecdim) operates on all the elements contained in the dimensions specified by vecdim, which must be a numeric vector of non-repeating positive integers. Any values in vecdim that exceed the number of dimensions of A are ignored.

[…] = mode (A, 'all') operates on all dimensions and returns the most frequent element in A.
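
A brief sketch of mode with its frequency output, assuming the datatypes package is loaded; the input data is invented for the example and has a single most frequent category.

```octave
pkg load datatypes

votes = categorical ({'b', 'a', 'b', 'c', 'c', 'a', 'b'});

# M is the most frequent element, F its number of occurrences.
[M, F] = mode (votes);
```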

categorical: N = histcounts (A)
categorical: N = histcounts (A, cats)
categorical: N = histcounts (…, 'Normalization', normtype)
categorical: [N, cats] = histcounts (…)

N = histcounts (A) returns a numeric vector N with the number of elements of each category in A. A can be a categorical array of any dimensions, but it is converted internally to a single column vector.

N = histcounts (A, cats) returns the number of elements only for the categories of A specified in cats, which may be a categorical array, a string array, or a cell array of character vectors, as long as it specifies unique existing categories in A.

N = histcounts (…, 'Normalization', normtype) specifies how to normalize the histogram values returned in N with any of the following options specified in normtype:

  • 'count', which is the default, returns the number of elements in each category.
  • 'countdensity' is the same as 'count', since the bin width in categorical arrays is always equal to 1.
  • 'probability' returns the number of elements in each category relative to the total number of elements in A.
  • 'pdf' is the same as 'probability', since the bin width in categorical arrays is always equal to 1.
  • 'cumcount' returns the cumulative number of elements in each category and all previous categories.
  • 'cdf' returns the cumulative number of elements in each category and all previous categories relative to the total number of elements in A.

[N, cats] = histcounts (…) also returns the corresponding categories of A for each count in N. cats is a cell array of character vectors with the same size as N.
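
The normalization options can be compared on a small input. This sketch assumes the datatypes package is loaded; the variable names are illustrative.

```octave
pkg load datatypes

A = categorical ({'a', 'b', 'b', 'c', 'b', 'a'});

# Raw counts per category, together with the category names.
[N, cats] = histcounts (A);

# Fraction of elements per category; the values sum to 1.
P = histcounts (A, 'Normalization', 'probability');

# Running total of the counts over the categories.
CC = histcounts (A, 'Normalization', 'cumcount');
```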

categorical: B = sort (A)
categorical: B = sort (A, dim)
categorical: B = sort (A, direction)
categorical: B = sort (A, dim, direction)
categorical: B = sort (…, 'MissingPlacement', MP)
categorical: [B, index] = sort (A, …)

B = sort (A) sorts the categorical array A in ascending order. The sorted array B has the same categories as A. If A is a matrix, sort (A) sorts each column of A in ascending order. For multidimensional arrays, sort (A) sorts along the first non-singleton dimension.

B = sort (A, dim) sorts along the dimension specified by dim.

B = sort (A, direction) also specifies the sorting direction, which can be either 'ascend' (default) or 'descend'.

B = sort (…, 'MissingPlacement', MP) specifies where to place the missing elements (<undefined>) returned in B with any of the following options specified in MP:

  • 'auto', which is the default, places missing elements last for ascending sort and first for descending sort.
  • 'first' places missing elements first.
  • 'last' places missing elements last.

[B, index] = sort (A, …) also returns a sorting index containing the original indices of the elements in the sorted array.

  • If A is a vector, then index contains the original linear indices of the elements in the sorted vector B such that B = A(index).
  • If A is an M×N matrix and dim = 1, then index contains the original row indices of the elements in the sorted matrix B such that for j = 1:N, B(:,j) = A(index(:,j),j).
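
The placement of undefined elements can be sketched as follows, assuming the datatypes package is loaded and that numeric input yields the categories '1', '2', '3' in that order. The variable names are illustrative.

```octave
pkg load datatypes

A = categorical ([3, NaN, 1, 2]);    # second element is <undefined>

# Ascending sort; with 'auto' placement the undefined element goes last.
[B, index] = sort (A);

# Force the undefined element to the front instead.
[Bf, indexf] = sort (A, 'MissingPlacement', 'first');
```

In both cases the sorted array can be recovered from the input as A(index) or A(indexf).
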
categorical: B = sortrows (A)
categorical: B = sortrows (A, col)
categorical: B = sortrows (A, direction)
categorical: B = sortrows (A, col, direction)
categorical: B = sortrows (…, 'MissingPlacement', MP)
categorical: [B, index] = sortrows (A, …)

B = sortrows (A) sorts the rows of the 2-D categorical array A in ascending order. The sorted array B has the same categories as A.

B = sortrows (A, col) sorts A according to the columns specified by the numeric vector col, which must explicitly contain non-zero integers whose absolute values index existing columns in A. Positive elements sort the corresponding columns in ascending order, while negative elements sort the corresponding columns in descending order.

B = sortrows (A, direction) also specifies the sorting direction, which can be either 'ascend' (default) or 'descend' applying to all columns in A. Alternatively, direction can be a cell array of character vectors specifying the sorting direction for each individual column of A, in which case the number of elements in direction must equal the number of columns in A.

B = sortrows (A, col, direction) sorts the categorical array A according to the columns specified in col using the corresponding sorting direction specified in direction. In this case, the sign of the values in col is ignored. col and direction must have the same length, but not necessarily the same number of elements as the columns in A.

B = sortrows (…, 'MissingPlacement', MP) specifies where to place the missing elements (<undefined>) returned in B with any of the following options specified in MP:

  • 'auto', which is the default, places missing elements last for ascending sort and first for descending sort.
  • 'first' places missing elements first.
  • 'last' places missing elements last.

[B, index] = sortrows (A, …) also returns an index vector containing the original row indices of A in the sorted matrix B such that B = A(index,:).
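
A small sketch of row sorting with mixed directions, assuming the datatypes package is loaded. The categories of the example matrix are the sorted unique names a, b, w, x, y; the variable names are illustrative.

```octave
pkg load datatypes

A = categorical ({'b', 'x'; 'a', 'y'; 'b', 'w'});

# Sort by column 1, then column 2, both ascending.
[B, index] = sortrows (A);

# Column 1 ascending, ties broken by column 2 descending.
[B2, index2] = sortrows (A, [1, -2]);
```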

categorical: B = topkrows (A, K)
categorical: B = topkrows (A, K, col)
categorical: B = topkrows (A, K, direction)
categorical: B = topkrows (A, K, col, direction)

B = topkrows (A, K) returns the top K rows of the 2-D categorical array A sorted in descending order as a group.

B = topkrows (A, K, col) returns the top K rows of the 2-D categorical array A sorted according to the columns specified by the numeric vector col, which must explicitly contain non-zero integers whose absolute values index existing columns in A. Positive elements sort the corresponding columns in ascending order, while negative elements sort the corresponding columns in descending order.

B = topkrows (A, K, direction) returns the top K rows of the 2-D categorical array A sorted according to direction, which can be either 'ascend' or 'descend' (default) applying to all columns in A. Alternatively, direction can be a cell array of character vectors specifying the sorting direction for each individual column of A, in which case the number of elements in direction must equal the number of columns in A.

B = topkrows (A, K, col, direction) returns the top K rows of the 2-D categorical array A sorted according to the columns specified in col using the corresponding sorting direction specified in direction. In this case, the sign of the values in col is ignored. col and direction must have the same length, but not necessarily the same number of elements as the columns in A.

categorical: B = unique (A)
categorical: B = unique (A, 'rows')
categorical: [B, ixA, ixB] = unique (…)
categorical: … = unique (…, order)
categorical: … = unique (…, occurrence)

B = unique (A) returns the unique values of the categorical array A in the categorical vector B sorted according to the order of categories in A. B retains the same categories as A. If A is a column vector, then B is also a column vector, otherwise unique returns a row vector.

B = unique (A, 'rows') returns the unique rows of the categorical matrix A in the categorical matrix B sorted according to the order of categories in A. B retains the same categories as A.

[B, ixA, ixB] = unique (…) also returns index vectors ixA and ixB such that B = A(ixA) and A = B(ixB), unless the 'rows' optional argument is given, in which case B = A(ixA,:) and A = B(ixB,:).

… = unique (…, order) also specifies the order of the returned unique values. order may be either 'sorted', which is the default behavior, or 'stable', in which case the unique values are returned in order of appearance.

… = unique (…, occurrence) also specifies which index is returned in ixA when there are repeated values or rows (if opted) in the input categorical array. occurrence may be either 'first', which is the default and returns the index of the first occurrence of each unique value, or 'last', in which case the index of the last occurrence of each unique value is returned.
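
The index outputs and the 'stable' option can be sketched as follows, assuming the datatypes package is loaded; the input and variable names are illustrative.

```octave
pkg load datatypes

A = categorical ({'b', 'a', 'b', 'c', 'a'});

# Sorted unique values with first-occurrence indices.
[B, ixA, ixB] = unique (A);     # B = A(ixA), A = B(ixB)

# Unique values in order of appearance.
S = unique (A, 'stable');
```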

categorical: C = intersect (A, B)
categorical: C = intersect (A, B, 'rows')
categorical: [C, ixA, ixB] = intersect (…)
categorical: … = intersect (…, order)

C = intersect (A, B) returns the unique common values of the categorical arrays A and B. Either A or B input arguments may be a character vector, a string array, or a cell array of character vectors, which is promoted to a categorical array prior to set intersection. If both A and B are row vectors, then C is also a row vector, otherwise intersect returns a column vector.

If categorical arrays A and B are ordinal, they must have the same set and ordering of categories, which is transferred to C. If neither is ordinal, the category names of each pair of elements are compared (they do not need to have the same set of categories), in which case the categories in C are the sorted union of the categories in A and B.

C = intersect (A, B, 'rows') returns the unique common rows of the categorical matrices A and B, which must have the same number of columns. By default, the rows in categorical matrix C are in sorted order.

[C, ixA, ixB] = intersect (…) also returns index vectors ixA and ixB such that C = A(ixA) and C = B(ixB), unless the 'rows' optional argument is given, in which case C = A(ixA,:) and C = B(ixB,:).

… = intersect (…, order) also specifies the order of the returned unique values. order may be either 'sorted', which is the default behavior, or 'stable', in which case the unique values are returned in order of appearance.
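
As a sketch of intersect with its index outputs, assuming the datatypes package is loaded (variable names are illustrative):

```octave
pkg load datatypes

A = categorical ({'a', 'b', 'c'});
B = categorical ({'b', 'c', 'd'});

# Common values and their positions in each input.
[C, ixA, ixB] = intersect (A, B);   # C = A(ixA) and C = B(ixB)

# A cell array of character vectors is promoted to categorical first.
D = intersect (A, {'c', 'z'});
```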

categorical: C = setdiff (A, B)
categorical: C = setdiff (A, B, 'rows')
categorical: [C, ixA] = setdiff (…)
categorical: … = setdiff (…, order)

C = setdiff (A, B) returns the unique values of the categorical array A that are not in the categorical array B. Either A or B input arguments may be a character vector, a string array, or a cell array of character vectors, which is promoted to a categorical array prior to set difference. If both A and B are row vectors, then C is also a row vector, otherwise setdiff returns a column vector.

If categorical arrays A and B are ordinal, they must have the same set and ordering of categories, which is transferred to C. If neither is ordinal, the category names of each pair of elements are compared (they do not need to have the same set of categories), in which case the categories in C are the sorted union of the categories in A and B.

C = setdiff (A, B, 'rows') returns the unique rows of the categorical matrix A that are not in the categorical matrix B. A and B must have the same number of columns. By default, the rows in categorical matrix C are in sorted order.

[C, ixA] = setdiff (…) also returns the index vector ixA such that C = A(ixA), unless the 'rows' optional argument is given, in which case C = A(ixA,:).

… = setdiff (…, order) also specifies the order of the returned unique values. order may be either 'sorted', which is the default behavior, or 'stable', in which case the unique values are returned in order of appearance.

categorical: C = setxor (A, B)
categorical: C = setxor (A, B, 'rows')
categorical: [C, ixA, ixB] = setxor (…)
categorical: … = setxor (…, order)

C = setxor (A, B) returns the unique values that are in either of the categorical arrays A and B, but not in both. Either A or B input arguments may be a character vector, a string array, or a cell array of character vectors, which is promoted to a categorical array prior to set exclusive-or. If both A and B are row vectors, then C is also a row vector, otherwise setxor returns a column vector.

If categorical arrays A and B are ordinal, they must have the same set and ordering of categories, which is transferred to C. If neither is ordinal, the category names of each pair of elements are compared (they do not need to have the same set of categories), in which case the categories in C are the sorted union of the categories in A and B.

C = setxor (A, B, 'rows') returns the unique rows that are in either of the categorical matrices A and B, but not in both. A and B must have the same number of columns. By default, the rows in categorical matrix C are in sorted order.

[C, ixA, ixB] = setxor (…) also returns index vectors ixA and ixB such that A(ixA) and B(ixB) are disjoint sets whose union is C, unless the 'rows' optional argument is given, in which case the same holds for the rows A(ixA,:) and B(ixB,:).

… = setxor (…, order) also specifies the order of the returned unique values. order may be either 'sorted', which is the default behavior, or 'stable', in which case the unique values are returned in order of appearance.

categorical: C = union (A, B)
categorical: C = union (A, B, 'rows')
categorical: [C, ixA, ixB] = union (…)
categorical: … = union (…, order)

C = union (A, B) returns the unique values that are in either or both of the categorical arrays A and B. Either A or B input arguments may be a character vector, a string array, or a cell array of character vectors, which is promoted to a categorical array prior to set union. If both A and B are row vectors, then C is also a row vector, otherwise union returns a column vector.

If categorical arrays A and B are ordinal, they must have the same set and ordering of categories, which is transferred to C. If neither is ordinal, the category names of each pair of elements are compared (they do not need to have the same set of categories), in which case the categories in C are the sorted union of the categories in A and B.

C = union (A, B, 'rows') returns the unique rows that are in either or both of the categorical matrices A and B, which must have the same number of columns. By default, the rows in categorical matrix C are in sorted order.

[C, ixA, ixB] = union (…) also returns index vectors ixA and ixB such that A(ixA) and B(ixB) are disjoint sets whose union is C, unless the 'rows' optional argument is given, in which case the same holds for the rows A(ixA,:) and B(ixB,:).

… = union (…, order) also specifies the order of the returned unique values. order may be either 'sorted', which is the default behavior, or 'stable', in which case the unique values are returned in order of appearance.
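
The difference between setdiff, setxor and union is easiest to see on the same pair of inputs. A minimal sketch, assuming the datatypes package is loaded (variable names are illustrative):

```octave
pkg load datatypes

A = categorical ({'a', 'b', 'c'});
B = categorical ({'b', 'c', 'd'});

only_in_A = setdiff (A, B);   # values of A missing from B
not_shared = setxor (A, B);   # values in exactly one of A and B
everything = union (A, B);    # values in A, in B, or in both
```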

categorical: C = cat (dim, A, B, …)

C = cat (dim, A, B, …) concatenates categorical arrays A, B, … along dimension dim. All input arrays must have the same size except along the operating dimension dim. Any of the input arrays may also be string arrays or cell arrays of character vectors of compatible size.

If any input array is an ordinal categorical array, then all inputs must be ordinal categorical arrays with the same set and ordering of categories. In this case, C is also an ordinal categorical array with the same set and ordering of categories. If none of the input arrays are ordinal, then they do not need to have the same set of categories. In this case, categorical array C contains the union of the categories from all input arrays. Protected categorical arrays can only be concatenated with other arrays that have the same set of categories but not necessarily in the same order.

categorical: C = horzcat (A, B, …)

C = horzcat (A, B, …) is the equivalent of the syntax C = [A, B, …] and horizontally concatenates the categorical arrays A, B, …. All input arrays must have the same size except along the second dimension. Any of the input arrays may also be string arrays or cell arrays of character vectors of compatible size.

If any input array is an ordinal categorical array, then all inputs must be ordinal categorical arrays with the same set and ordering of categories. In this case, C is also an ordinal categorical array with the same set and ordering of categories. If none of the input arrays are ordinal, then they do not need to have the same set of categories. In this case, categorical array C contains the union of the categories from all input arrays. Protected categorical arrays can only be concatenated with other arrays that have the same set of categories but not necessarily in the same order.

categorical: C = vertcat (A, B, …)

C = vertcat (A, B, …) is the equivalent of the syntax C = [A; B; …] and vertically concatenates the categorical arrays A, B, …. All input arrays must have the same size except along the first dimension. Any of the input arrays may also be string arrays or cell arrays of character vectors of compatible size.

If any input array is an ordinal categorical array, then all inputs must be ordinal categorical arrays with the same set and ordering of categories. In this case, C is also an ordinal categorical array with the same set and ordering of categories. If none of the input arrays are ordinal, then they do not need to have the same set of categories. In this case, categorical array C contains the union of the categories from all input arrays. Protected categorical arrays can only be concatenated with other arrays that have the same set of categories but not necessarily in the same order.
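
A short sketch of concatenating nominal arrays with different category sets. It assumes the datatypes package is loaded and that the class provides a categories method returning the category names, which is not described in this section; the variable names are illustrative.

```octave
pkg load datatypes

A = categorical ({'a', 'b'});
B = categorical ({'c', 'a'});

H = horzcat (A, B);            # 1x4, same as [A, B]
V = vertcat (A, B);            # 2x2, same as [A; B]

# A cell array of character vectors may be mixed in.
D = cat (1, A, {'b', 'd'});    # 2x2

# The result carries the union of the input categories.
names = categories (H);
```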

categorical: B = repmat (A, n)
categorical: B = repmat (A, d1, …, dN)
categorical: B = repmat (A, dimvec)

B = repmat (A, n) returns a categorical array B containing n copies of the input categorical array A along every dimension of A.

B = repmat (A, d1, …, dN) returns an array B containing copies of A along the dimensions specified by the list of scalar integer values d1, …, dN, which specify how many copies of A are made in each dimension.

B = repmat (A, dimvec) is equivalent to the previous syntax with dimvec = [d1, …, dN].

categorical: B = repelem (A, n)
categorical: B = repelem (A, d1, …, dN)

B = repelem (A, n) returns a categorical vector B containing repeated elements of the input A, which must be a categorical vector. If n is a scalar, each element of A is repeated n times along the non-singleton dimension of A. If n is a vector, it must have the same number of elements as A, in which case it specifies the number of times to repeat each corresponding element of A.

B = repelem (A, d1, …, dN) returns an array B with each element of A repeated according to the list of input arguments d1, …, dN, each corresponding to a different dimension 1:ndims (A) of the input array A. d1, …, dN must be either scalars or vectors with the same length as the corresponding dimension of A containing non-negative integer values specifying the number of repetitions of each element along the corresponding dimension.
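
The difference between tiling whole arrays and repeating individual elements can be sketched as follows, assuming the datatypes package is loaded (variable names are illustrative):

```octave
pkg load datatypes

A = categorical ({'x', 'y'});

T = repmat (A, 2, 3);       # tile the 1x2 array into a 2x6 array
R = repelem (A, 2);         # repeat each element twice
V = repelem (A, [1, 3]);    # repeat 'x' once and 'y' three times
```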

categorical: B = repelems (A, R)

B = repelems (A, R) returns a categorical vector B containing repeated elements of the input A, which must be a categorical vector. R must be a 2×N matrix of integers. Entries in the first row of R correspond to the linear indexing of the elements in A to be repeated. The corresponding entries in the second row of R specify the repeat count of each element.

categorical: B = reshape (A, d1, …, dN)
categorical: B = reshape (A, …, [], …)
categorical: B = reshape (A, dimvec)

B = reshape (A, d1, …, dN) returns a categorical array B with specified dimensions d1, …, dN, whose elements are taken columnwise from the categorical array A. The product of d1, …, dN must equal the total number of elements in A.

B = reshape (A, …, [], …) returns a categorical array B with one dimension unspecified, which is calculated automatically so that the product of dimensions in B matches the total number of elements in A, which must be divisible by the product of the specified dimensions. An empty matrix ([]) is used to flag the unspecified dimension.

B = reshape (A, dimvec) is equivalent to the first syntax with dimvec = [d1, …, dN].

categorical: B = circshift (A, n)
categorical: B = circshift (A, n, dim)

B = circshift (A, n) circularly shifts the elements of the categorical array A according to n. If n is a nonzero integer scalar, then the elements of A are shifted by n elements along the first non-singleton dimension of A. If n is a vector, it must not be longer than the number of dimensions of A, with each value of n corresponding to a dimension in A. The sign of the value(s) in n specifies the direction in which the elements of A are shifted.

B = circshift (A, n, dim) circularly shifts the elements of the categorical array A along the dimension specified by dim. In this case, n must be a scalar integer value.
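
A minimal sketch of reshape and circshift on a six-element row vector, assuming the datatypes package is loaded (variable names are illustrative):

```octave
pkg load datatypes

A = categorical ({'a', 'b', 'c', 'd', 'e', 'f'});

# Two rows, number of columns computed automatically (2x3).
# Elements are taken columnwise, so the first row holds a, c, e.
M = reshape (A, 2, []);

# Shift by one position along the row; the last element wraps around.
S = circshift (A, 1);

# A negative shift moves elements in the opposite direction.
L = circshift (A, -2);
```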

categorical: B = permute (A, dims)

B = permute (A, dims) returns the generalized transpose of the categorical array A by rearranging its dimensions according to the permutation vector specified in dims.

dims must index all the dimensions 1:ndims (A) of the input array A, in any order, but only once. The Nth dimension of A gets remapped to the dimension in B specified by dims(N).

categorical: A = ipermute (B, dims)

A = ipermute (B, dims) returns the inverse of the generalized transpose performed by the permute function. The expression ipermute (permute (A, dims), dims) returns the original array A.

dims must index all the dimensions 1:ndims (B) of the input array B, in any order, but only once. The dimension of B specified in dims(N) gets remapped to the Nth dimension of A.
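
The round trip between permute and ipermute can be sketched on a small matrix, assuming the datatypes package is loaded (variable names are illustrative):

```octave
pkg load datatypes

A = reshape (categorical ({'a', 'b', 'c', 'd', 'e', 'f'}), 2, 3);

# Swap the two dimensions: the 2x3 array becomes 3x2.
B = permute (A, [2, 1]);

# Undo the permutation with the same dims vector.
A2 = ipermute (B, [2, 1]);
```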

categorical: B = transpose (A)

B = transpose (A) is the equivalent of the syntax B = A.' and returns the transpose of the categorical matrix A.

categorical: B = ctranspose (A)

B = ctranspose (A) is the equivalent of the syntax B = A' and returns the transpose of the categorical matrix A. For categorical arrays, ctranspose is identical to transpose.