Iceberg

The iceberg destination provides additional features on top of the filesystem destination in OSS dlt. This page documents only those additional features; refer to the OSS dlt documentation for standard functionality.

delete-insert merge strategy with iceberg table format

The delete-insert merge strategy is available for resources that use the iceberg table format:

import dlt

@dlt.resource(
    primary_key="id",  # merge_key also works; primary_key and merge_key may be used together
    write_disposition={"disposition": "merge", "strategy": "delete-insert"},
)
def my_resource():
    yield [
        {"id": 1, "foo": "foo"},
        {"id": 2, "foo": "bar"},
    ]
...

pipeline = dlt.pipeline("loads_iceberg", destination="iceberg")

Table format

The iceberg destination automatically assigns the iceberg table format to every resource it loads. You can still fall back to storing plain files (in the format specified by file_format) by setting table_format to native on a resource.

Configuration

The iceberg destination looks for its configuration under destination.iceberg. Otherwise, it is configured in the same way as the filesystem destination.

[destination.iceberg]
bucket_url = "s3://[your_bucket_name]" # replace with your bucket name

[destination.iceberg.credentials]
aws_access_key_id = "please set me up!" # copy the access key here
aws_secret_access_key = "please set me up!" # copy the secret access key here
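
The same settings can also be supplied through environment variables, since dlt resolves configuration keys from upper-cased names joined with double underscores. The snippet below is a minimal sketch of that naming rule; the helper env_key is a hypothetical name introduced here for illustration, and the values are the same placeholders as in the TOML above.

```python
import os

def env_key(*path: str) -> str:
    """Build a dlt-style environment variable name from a config key path.

    env_key is an illustrative helper, not part of dlt: it upper-cases each
    part of the path and joins the parts with double underscores.
    """
    return "__".join(part.upper() for part in path)

# Placeholder values mirroring the TOML example; replace them with real ones.
settings = {
    ("destination", "iceberg", "bucket_url"): "s3://[your_bucket_name]",
    ("destination", "iceberg", "credentials", "aws_access_key_id"): "please set me up!",
    ("destination", "iceberg", "credentials", "aws_secret_access_key"): "please set me up!",
}

# Export the settings before the pipeline is created so they can be picked up.
for path, value in settings.items():
    os.environ[env_key(*path)] = value
```

In practice you would usually export these variables in the shell or in your deployment environment rather than set them from Python.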

You can still use the regular filesystem configuration by creating the destination with destination_name set to filesystem:

from dlt_plus.destinations import iceberg

dest_ = iceberg(destination_name="filesystem")

Known limitations

  • Compound keys are not supported: use a single primary_key and/or a single merge_key.
    • As a workaround, you can transform your resource data with add_map to add a new column that contains a hash of the key columns, and use that column as primary_key or merge_key.
  • Nested tables are not supported: avoid complex data types or disable nesting.
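
The compound-key workaround above can be sketched as a small row transform. This is only an illustration under stated assumptions: the helper make_key_hasher, the column name key_hash, and the example columns customer_id and order_id are hypothetical, and SHA-256 over a JSON-encoded list is one possible hashing choice rather than a prescribed one.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Sequence

Row = Dict[str, Any]

def make_key_hasher(key_columns: Sequence[str], target: str = "key_hash") -> Callable[[Row], Row]:
    """Return a transform that stores a hash of the key columns in `target`.

    Illustrative helper, not part of dlt.
    """
    def add_key_hash(row: Row) -> Row:
        # Encode the key values as a JSON list so that value boundaries are
        # unambiguous, e.g. (1, "23") and (12, "3") produce different hashes.
        payload = json.dumps([row.get(col) for col in key_columns], default=str)
        row[target] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        return row
    return add_key_hash

# Hypothetical compound key made of two columns.
add_key_hash = make_key_hasher(["customer_id", "order_id"])
example = add_key_hash({"customer_id": 7, "order_id": "A-1", "amount": 10})

# With dlt, the transform would be attached to a resource that declares
# primary_key="key_hash" (or merge_key="key_hash"):
#   my_resource.add_map(add_key_hash)
```

The hash is deterministic, so the same key values always map to the same key_hash, which is what the merge needs in order to match incoming rows against existing ones.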
