Adding sources, destinations and pipelines to your project

Adding a new entity to an existing dlt+ project is easy: run the dlt <entity_type> <entity_name> add command. Depending on the entity you are adding, different options are available. To see all options for adding a destination, for example, you can run dlt destination add --help. Let's add a source, a destination, and a pipeline to a new project one by one, replicating the default project we created in the previous chapter.
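Spelled out as a pattern (angle brackets are placeholders, not literal arguments):

# general pattern
dlt <entity_type> <entity_name> add [arguments]
# for example, list all options for adding a destination
dlt destination add --help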

Add an empty project

Delete all the files in the tutorial folder and run the following command to create an empty project:

dlt project init

This will create a project without any sources, destinations, datasets, or pipelines. The project will be named after the folder.
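All entities of a dlt+ project are registered in the dlt.yml manifest inside the project folder. As an illustrative sketch only (the exact contents of a freshly initialized file may differ between dlt+ versions), the empty project starts without any of the entries that the following commands will add:

# dlt.yml (illustrative sketch, not verbatim)
sources: {}
destinations: {}
pipelines: {}
datasets: {}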

Add all entities

Now we can add all of our entities individually. This way we can also give each entity its own name, which is useful when, for example, a project has multiple destinations of the same type.

Add a source:

# add a new arrow source called "my_arrow_source"
dlt source my_arrow_source add arrow
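Under the hood this registers the source in dlt.yml. Roughly sketched (the generated file may contain additional keys and comments), the new entry looks like this:

# dlt.yml
sources:
  my_arrow_source:
    type: arrow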

Add a destination:

# add a new duckdb destination called "my_duckdb_destination"
dlt destination my_duckdb_destination add duckdb

If you also want to create a dataset automatically, you can use the --dataset-name flag:

# add a new duckdb destination called "my_duckdb_destination"
# this will also create a new dataset called "my_duckdb_destination_dataset"
dlt destination my_duckdb_destination add duckdb --dataset-name my_duckdb_destination_dataset
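As a rough sketch of the effect on dlt.yml (exact generated keys may differ by version), the destination and, when the flag is given, the dataset show up along these lines:

# dlt.yml
destinations:
  my_duckdb_destination:
    type: duckdb
datasets:
  my_duckdb_destination_dataset:
    destination:
      - my_duckdb_destination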

Now we can add a pipeline that uses the source and destination we just added:

# add a new pipeline called "my_pipeline" which loads from my_arrow_source and saves to my_duckdb_destination
# we select the my_duckdb_destination_dataset with the optional flag
dlt pipeline my_pipeline add my_arrow_source my_duckdb_destination --dataset-name my_duckdb_destination_dataset
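The pipeline entry in dlt.yml ties the three together, roughly like this (illustrative sketch; field names may vary by version):

# dlt.yml
pipelines:
  my_pipeline:
    source: my_arrow_source
    destination: my_duckdb_destination
    dataset_name: my_duckdb_destination_dataset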

Run the pipeline

As in the first chapter, we can now run the pipeline:

dlt pipeline my_pipeline run

And inspect the dataset:

dlt dataset my_duckdb_destination_dataset row-counts
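If you prefer to inspect the data from Python, here is a minimal sketch using plain dlt's dataset access API (dlt.attach, Pipeline.dataset, row_counts). It assumes that this API is available in your dlt version and that it behaves the same for a pipeline that was run from a dlt+ project:

import dlt

# attach to the pipeline we just ran and get a handle on its dataset
pipeline = dlt.attach("my_pipeline")
dataset = pipeline.dataset()

# print per-table row counts as a pandas DataFrame
print(dataset.row_counts().df())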

Learn more

Next chapter: Configuration and profiles

