## Abstract

We provide a construction for categorical representation learning and introduce the foundations of ‘categorifier’. The central theme in representation learning is the idea of ‘everything to vector’. Every object in a dataset S can be represented as a vector in R^{n} by an encoding map E : Obj(S) → R^{n}. More importantly, every morphism can be represented as a matrix by E : Hom(S) → R^{n×n}. The encoding map E is generally modeled by a deep neural network. The goal of representation learning is to design appropriate tasks on the dataset to train the encoding map (assuming that an encoding is optimal if it universally optimizes the performance on various tasks). However, this is still a set-theoretic approach. The goal of the current article is to promote representation learning to a new level via a category-theoretic approach. As a proof of concept, we provide an example of a text translator equipped with our technology, showing that our categorical learning model outperforms the current deep learning models by a factor of 17. The content of the current article is part of a US provisional patent application filed by QGNai, Inc.
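To make the shapes in the abstract concrete, the following is a minimal toy sketch, not the article's model. It assigns each object a vector in R^{n} and each morphism an n×n matrix, and treats composition of morphisms as a matrix product. All names (`object_vectors`, `morphism_matrices`, `compose`, `apply`) and all numeric values are hypothetical and hand-picked rather than learned by a neural network. The sketch also assumes that the matrix of a morphism sends the source vector to the target vector, which the abstract does not state.

```python
# Toy illustration only: hand-picked values stand in for a trained encoder E.
N = 2  # embedding dimension n (hypothetical choice)

# E on objects: each object name maps to a vector in R^n.
object_vectors = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [-1.0, 0.0],
}

# E on morphisms: each morphism (source, target) maps to an n x n matrix.
ROT90 = [[0.0, -1.0], [1.0, 0.0]]  # rotation by 90 degrees
morphism_matrices = {
    ("a", "b"): ROT90,
    ("b", "c"): ROT90,
}


def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    size = len(p)
    return [
        [sum(p[i][k] * q[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def apply(m, v):
    """Apply matrix m to vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def compose(g, f):
    """Matrix assigned to the composite 'g after f' (assumed: matrix product)."""
    return matmul(morphism_matrices[g], morphism_matrices[f])


# The composite a -> b -> c is represented by the product of the two matrices.
composite = compose(("b", "c"), ("a", "b"))
```

In this toy, applying `composite` to the vector of `a` gives the vector of `c`, so the matrix representation respects composition. A learned encoder would satisfy such a condition only approximately.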

Original language | English (US) |
---|---|
Article number | 015016 |
Journal | Machine Learning: Science and Technology |
Volume | 3 |
Issue number | 1 |
DOIs | |
State | Published - Mar 2022 |
Externally published | Yes |

## Keywords

- Categorical representation learning
- Category theory
- Natural language processing (NLP)

## ASJC Scopus subject areas

- Artificial Intelligence
- Human-Computer Interaction
- Software