Set up the Matatika platform to deliver and process your data in S3 Parquet in minutes.
S3 Parquet refers to Apache Parquet files stored in Amazon S3, a combination used for storing and processing large amounts of data in distributed computing environments.
Parquet is a columnar storage format that supports efficient compression and encoding, which reduces storage costs and speeds up analytical queries. It works with Amazon S3 and with big data processing tools such as Apache Spark and Hadoop, making it a popular choice for big data applications.
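To illustrate why a columnar layout compresses well, here is a toy sketch in Python. It is not how Parquet is implemented (the real format uses more elaborate schemes, such as dictionary encoding and a hybrid of run-length encoding and bit-packing); the sample rows and the helper names `to_columns` and `run_length_encode` are made up for illustration.

```python
from itertools import groupby

# Illustrative sample data: four row-oriented records.
rows = [
    {"country": "UK", "status": "active", "amount": 10},
    {"country": "UK", "status": "active", "amount": 12},
    {"country": "UK", "status": "closed", "amount": 7},
    {"country": "FR", "status": "closed", "amount": 7},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)
encoded = {name: run_length_encode(values) for name, values in columns.items()}

for name, runs in encoded.items():
    print(name, runs)
```

Because each column holds values of one type that often repeat, storing columns together gives an encoder long runs to collapse, which is much harder when values from different fields are interleaved row by row.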
The path to the S3 bucket and object where the Parquet data is stored.
The access key ID for the AWS account that has permission to access the S3 bucket.
The secret access key for the AWS account that has permission to access the S3 bucket.
The name of the Athena database where the Parquet data will be queried.
Whether or not to add metadata to each record in the Parquet data.
Whether or not to convert the schema of the Parquet data to a string format.
A mapping of column names to stream names for the Parquet data.
Configuration options for the stream maps.
Whether or not to flatten nested structures in the Parquet data.
The maximum depth to which nested structures will be flattened.
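The flattening options above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the setting names in `config` are hypothetical keys inferred from the descriptions, the `__` separator is an assumption, and serializing anything deeper than the limit to a JSON string is one plausible behavior rather than a documented one.

```python
from __future__ import annotations

import json
from typing import Any

# Hypothetical setting names inferred from the descriptions above;
# check the plugin's own documentation for the real keys.
config = {
    "s3_path": "s3://example-bucket/example/prefix",
    "athena_database": "example_db",
    "flattening_enabled": True,
    "flattening_max_depth": 1,
}


def flatten(
    record: dict[str, Any],
    max_depth: int,
    sep: str = "__",  # assumed separator between parent and child keys
    _depth: int = 0,
    _prefix: str = "",
) -> dict[str, Any]:
    """Flatten nested dicts up to max_depth levels; deeper dicts become JSON strings."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{_prefix}{sep}{key}" if _prefix else key
        if isinstance(value, dict) and _depth < max_depth:
            # Still within the depth limit: expand the nested object into columns.
            flat.update(flatten(value, max_depth, sep, _depth + 1, name))
        elif isinstance(value, dict):
            # Depth limit reached: keep the nested object as a single JSON string.
            flat[name] = json.dumps(value, sort_keys=True)
        else:
            flat[name] = value
    return flat


record = {"id": 1, "user": {"name": "Ada", "address": {"city": "London"}}}

if config["flattening_enabled"]:
    record = flatten(record, config["flattening_max_depth"])

print(record)
```

With a maximum depth of 1, `user` is expanded into `user__name` and `user__address`, while the nested `address` object stays a single string column. Raising the depth to 2 would produce `user__address__city` instead.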
Collect and process data from hundreds of sources and tools with S3 Parquet.