Data Ingestion overview
In Adobe Experience Platform, data ingestion is the transportation of data from assorted sources to a storage medium where it can be accessed, used, and analyzed by an organization. Data ingestion in Experience Platform can be grouped into two main categories: streaming ingestion and batch ingestion.
Within streaming and batch ingestion, there are several methods that you can use to bring your data into Experience Platform, including connecting to a variety of sources and ingesting data from them.
Read this document for an overview of the many different ways that data can be ingested into Experience Platform.
Streaming ingestion
You can use streaming ingestion to send data from client-side and server-side devices to Experience Platform in real time. Experience Platform supports the use of data inlets to stream incoming experience data, which is persisted in streaming-enabled datasets within the data lake. Data inlets can be configured to automatically authenticate the data they collect, ensuring that the data comes from a trusted source.
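As a rough illustration of the idea above, the following Python sketch assembles (but does not send) a single streamed record addressed to a data inlet, optionally attaching a credential for an authenticated inlet. The function name `build_stream_request`, the placeholder URL, and the body field names `datasetId` and `record` are all assumptions made for this example; they are not the documented Experience Platform payload format, which is described in the streaming ingestion overview.

```python
import json
from typing import Any, Dict, Optional


def build_stream_request(
    inlet_url: str,
    dataset_id: str,
    record: Dict[str, Any],
    access_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble, without sending, an HTTP request for one streamed record.

    Every field name here is a hypothetical placeholder, not the
    documented Experience Platform message format.
    """
    headers = {"Content-Type": "application/json"}
    # An authenticated inlet would expect a credential with each message.
    if access_token is not None:
        headers["Authorization"] = f"Bearer {access_token}"

    body = {
        "datasetId": dataset_id,  # streaming-enabled target dataset (assumed name)
        "record": record,         # one event, sent as soon as it occurs
    }
    return {
        "method": "POST",
        "url": inlet_url,
        "headers": headers,
        "body": json.dumps(body),
    }


request = build_stream_request(
    inlet_url="https://example.com/inlet/HYPOTHETICAL_ID",  # placeholder URL
    dataset_id="example-dataset",
    record={"eventType": "page.view", "timestamp": "2024-01-01T00:00:00Z"},
    access_token="example-token",
)
```

The point of the sketch is the shape of the interaction: each record travels on its own, as it happens, rather than waiting to be grouped with others.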
For more information, read the streaming ingestion overview.
Batch ingestion
In Experience Platform, a batch is a set of data collected over a period of time and processed together as a single unit. Datasets are made up of batches. You can use batch ingestion to ingest data into Experience Platform as batch files. Once ingested, batches provide metadata that describes the number of records successfully ingested, as well as any failed records and associated error messages.
Manually uploaded data files, such as flat CSV files (mapped to XDM schemas) and Parquet files, must be ingested using this method.
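To make the batch concept more concrete, here is a minimal Python sketch of processing a set of records as a single unit and then reporting how many succeeded, how many failed, and why. The names `BatchSummary`, `ingest_batch`, and `require_email` are hypothetical and exist only for this example; they do not correspond to any Experience Platform API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class BatchSummary:
    """Hypothetical stand-in for the metadata a completed batch exposes."""
    records_ingested: int = 0
    records_failed: int = 0
    errors: List[str] = field(default_factory=list)


def ingest_batch(
    records: List[Dict[str, Any]],
    validate: Callable[[Dict[str, Any]], None],
) -> BatchSummary:
    """Process all records together and summarize the outcome."""
    summary = BatchSummary()
    for index, record in enumerate(records):
        try:
            validate(record)
        except ValueError as exc:
            # A failed record is counted and its error message is kept.
            summary.records_failed += 1
            summary.errors.append(f"record {index}: {exc}")
        else:
            summary.records_ingested += 1
    return summary


def require_email(record: Dict[str, Any]) -> None:
    """Example validation rule (an assumption for this sketch)."""
    if not record.get("email"):
        raise ValueError("missing email")


summary = ingest_batch(
    [{"email": "a@example.com"}, {"email": ""}, {"email": "b@example.com"}],
    require_email,
)
print(summary.records_ingested, summary.records_failed, summary.errors)
```

In this sketch, one bad record does not stop the rest of the batch; it is recorded in the summary alongside the successful count.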
For more information, read the batch ingestion overview.
Sources
You can also ingest data by connecting to Experience Platform sources. Experience Platform maintains a catalog of data sources that you can connect to and ingest data from. These sources can be native Adobe applications, such as the Adobe Analytics source or the Marketo Engage source. You can also connect to third-party sources, such as the Amazon S3 source and the Google Cloud Storage source.
Sources are grouped into categories such as cloud storage, databases, and CRM systems. A given source may support batch or streaming ingestion.
With sources, you can ingest data from a range of external systems that serve different use cases. Additionally, ingesting data through a source lets you authenticate against the external data source, configure an ingestion schedule, and manage ingestion throughput.
For more information, read the sources overview.
ML-assisted schema creation
To quickly integrate new data sources, you can now use machine learning algorithms to generate a schema from sample data. This automation simplifies the creation of accurate schemas, reduces errors, and speeds up the process from data collection to analysis and insights.
See the ML-assisted schema creation guide for more information on this workflow.
Data Prep
While data prep is not a method of ingestion, it is an important part of the data ingestion process. Use data prep functions to map, transform, and validate data to and from Experience Data Model (XDM) before creating a dataflow to ingest your data into Experience Platform. Data prep appears as the "Mapping" step in the Experience Platform user interface during the data ingestion process.
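The map, transform, and validate steps can be pictured with a small Python sketch. It is not the Data Prep function library; the `MAPPING` table, the helper names, and the source column names are assumptions made for this example, and the dotted target paths are used only to illustrate nesting flat source columns into a hierarchical record.

```python
from typing import Any, Callable, Dict, Tuple

# source column -> (dotted target path, transform); all entries are illustrative
MAPPING: Dict[str, Tuple[str, Callable[[str], Any]]] = {
    "first_name": ("person.name.firstName", str.strip),
    "email": ("personalEmail.address", lambda value: value.strip().lower()),
}


def set_path(target: Dict[str, Any], dotted_path: str, value: Any) -> None:
    """Write value into a nested dict, creating intermediate objects."""
    *parents, leaf = dotted_path.split(".")
    node = target
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def map_row(row: Dict[str, str]) -> Dict[str, Any]:
    """Validate that each mapped column exists, transform it, and place it."""
    mapped: Dict[str, Any] = {}
    for column, (path, transform) in MAPPING.items():
        if column not in row:
            raise ValueError(f"missing source column: {column}")
        set_path(mapped, path, transform(row[column]))
    return mapped


mapped = map_row({"first_name": " Ada ", "email": "Ada@Example.com "})
print(mapped)
```

Keeping the mapping as data rather than code mirrors the idea of a separate mapping step: the same routine can serve different sources by swapping the table.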
For more information, read the data prep overview.
Streaming ingestion methods
The following table outlines the variety of methods that you can use to ingest streaming data to Experience Platform.
Batch ingestion methods
The following table outlines the variety of methods that you can use to ingest batch data to Experience Platform.
Next steps and additional resources
This document provided a brief introduction to the different aspects of data ingestion in Experience Platform. Continue to the overview documentation for each ingestion method to familiarize yourself with its capabilities, use cases, and best practices. You can also supplement your learning by watching the ingestion overview video below. For information on how Experience Platform tracks the metadata for ingested records, see the Catalog Service overview.