Running Your Data Import Locally
Time required: 15 minutes
Prerequisites
You must have:
- Ownership of, or administrative access to, a workspace
- Git installed
- Python >=3.7 installed
- Meltano installed (virtual environment recommended)
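One quick way to confirm the tooling prerequisites is to check that each command is available on your PATH. The following is a minimal sketch, assuming a POSIX shell and that Python is exposed as `python3` on your system; `check_tool` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Assumes Python is exposed as python3; adjust if your system uses `python`.
# If Meltano lives in a virtual environment, activate it before running this.
for tool in git python3 meltano; do
    check_tool "$tool" || true
done
```

This only confirms that the commands exist. To confirm the Python version requirement, run `python3 --version` and check that it reports 3.7 or later.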
Introduction
Data import pipelines in Matatika workspaces are run using Meltano. Each Matatika workspace is backed by a GitHub-hosted repository containing a Meltano project, which you can clone locally to run data import pipelines outside the Matatika platform.
Setup
- Within the Matatika app, switch to the workspace that contains the data import pipeline you wish to run locally
- Navigate to the workspace ‘Settings’ page and copy the repository URL
- Clone the workspace to your local system
git clone https://github.com/MatatikaBytes/example-workspace
- Change into the cloned directory and create a new .env file
- Head back to the Matatika app and navigate to the workspace ‘Lab’ then ‘Pipelines’ page, and expand the data import pipeline you wish to run locally
- Select the ‘Environment’ tab and click the .env text field to copy the environment configuration
- Paste the copied environment configuration into the .env file you created earlier
TAP_EXAMPLE_CLIENT_ID=clientid
TAP_EXAMPLE_CLIENT_SECRET=clientsecret
TAP_EXAMPLE_START_DATE=2022-01-01T00:00
TARGET_EXAMPLE_HOST=example.host.com
TARGET_EXAMPLE_PORT=1234
TARGET_EXAMPLE_DB=db
TARGET_EXAMPLE_SCHEMA=schema
TARGET_EXAMPLE_USERNAME=username
TARGET_EXAMPLE_PASSWORD=password
Your local workspace repository should now be set up similarly to this one: GitHub Example Link
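Before running anything, it can help to confirm that the .env file defines every setting the pipeline expects. The following is a minimal sketch, assuming a POSIX shell and the illustrative variable names used in this guide; `missing_env_keys` is a hypothetical helper, and the list of keys should be replaced with the ones copied from your own pipeline's ‘Environment’ tab.

```shell
#!/bin/sh
# missing_env_keys is a hypothetical helper: given an env file and a list of
# variable names, it prints each name that has no KEY=... line in the file.
missing_env_keys() {
    env_file=$1
    shift
    for key in "$@"; do
        if ! grep -q "^${key}=" "$env_file"; then
            echo "$key"
        fi
    done
}

# Example usage with a few of the illustrative names from this guide.
# Run from the root of the cloned workspace, where .env was created.
if [ -f .env ]; then
    missing_env_keys .env TAP_EXAMPLE_CLIENT_ID TAP_EXAMPLE_CLIENT_SECRET TARGET_EXAMPLE_HOST
fi
```

No output means every listed key is present. The check looks only for the key name at the start of a line, so it does not validate the values themselves.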
Running Locally
(activate your virtual environment if you are using one for Meltano)