Ingesting data is a crucial part of building AI systems. An AI system needs to be able to do two things: use context and abstract. The context comes from the ingested data.
Ingestion is the process of translating data into something the AI engine understands. Traditionally this meant fitting the data onto a specific model. Another approach, however, is to ingest in terms of relations and data fields, and to think in modalities.
Modality
Think of multi-modal learning. The idea is that information is experienced through a particular modality. A person, for example, can be seen, heard, felt, smelled, and so on.
The same holds for pieces of information about an entity: they can exist as text, location, imagery, time, etc. The fundamental realization is that entities span multiple modalities and that modalities represent the interface to the surrounding world.
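To make the idea of an entity spanning several modalities concrete, here is a minimal sketch in Python. The `Entity` class, its `observe` method, and the sample values are hypothetical names chosen for illustration; they are not taken from any particular AI engine.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Entity:
    """A single entity described through several modalities (illustrative only)."""
    entity_id: str
    # Maps a modality name ("text", "location", ...) to the observed value.
    modalities: Dict[str, Any] = field(default_factory=dict)

    def observe(self, modality: str, value: Any) -> None:
        """Record what the entity looks like through one modality."""
        self.modalities[modality] = value

    def modality_names(self) -> List[str]:
        """Return the modalities this entity has been observed in, sorted by name."""
        return sorted(self.modalities)


# The same entity, seen through three different modalities (made-up sample values).
cafe = Entity("cafe-42")
cafe.observe("text", "A small cafe near the harbour")
cafe.observe("location", (55.676, 12.568))
cafe.observe("time", "2020-01-01T08:00:00")
```

The point of the design is that the entity itself stays the same while each modality contributes one view of it, so new modalities can be added without changing the entity's identity.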
Types of Ingestion
In its simplest form, ingestion is a data transformation from a simple source into an internal representation. An example is tabular data, which needs no context to be understood and ingested.
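A context-free ingestion of tabular data could look roughly like the sketch below. It assumes the source is CSV text and that the internal representation is simply a list of field-to-value dictionaries; the function name `ingest_table` and the sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, List


def ingest_table(csv_text: str) -> List[Dict[str, str]]:
    """Turn CSV text into a list of records, one dict per row.

    Each row is self-describing through the header, so no outside
    context is needed to interpret it.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Made-up sample data: a header line followed by two rows.
rows = ingest_table("name,city\nAda,London\nAlan,Manchester\n")
```

Every value is kept as a string here; a real pipeline would probably also map each column to a data field with a type, but that step is left out of the sketch.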
In more complex cases, we need contextual information to ingest the data. The canonical example is entity linking when extracting data from written text.
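The following is a toy sketch of why entity linking needs context, not a real entity-linking algorithm. It assumes a tiny hand-written knowledge base (`KNOWLEDGE_BASE`, hypothetical) in which each candidate entity carries a few keywords, and it picks the candidate whose keywords overlap most with the surrounding text.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical knowledge base: mention -> candidate entities -> context keywords.
KNOWLEDGE_BASE: Dict[str, Dict[str, Set[str]]] = {
    "paris": {
        "Paris (city, France)": {"france", "city", "seine", "capital"},
        "Paris (mythology)": {"troy", "helen", "prince", "trojan"},
    }
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def link_entity(mention: str, context: str) -> Optional[str]:
    """Pick the candidate whose keywords overlap most with the context.

    Returns None for unknown mentions. Ties go to the first candidate listed.
    """
    candidates = KNOWLEDGE_BASE.get(mention.lower())
    if not candidates:
        return None
    words = tokenize(context)
    return max(candidates, key=lambda name: len(candidates[name] & words))


city = link_entity("Paris", "Paris is the capital of France")
myth = link_entity("Paris", "Paris abducted Helen and fled to Troy")
```

The mention "Paris" is identical in both sentences; only the surrounding words decide which entity it is linked to, which is exactly the contextual information that tabular data does not require.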