Converting an existing project to use dg

info

dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.

Suppose we have an existing Dagster project. The project defines a Python package with a single Dagster asset, which is exposed in a top-level Definitions object in my_existing_project/definitions.py. We'll consider two cases: a project managed with uv and pyproject.toml, and a project managed with pip and setup.py.

tree
.
├── my_existing_project
│   ├── __init__.py
│   ├── assets.py
│   ├── definitions.py
│   └── py.typed
├── pyproject.toml
└── uv.lock

2 directories, 6 files

Before proceeding, we'll make sure we have an activated and up-to-date virtual environment in the project root. Having the virtual environment located in the project root is recommended (particularly when using uv) but not required.

If you don't have a virtual environment yet, run:

uv sync

Then activate it:

source .venv/bin/activate

Install dependencies

Install the dg command line tool into your project virtual environment.

uv add dagster-dg-cli

Update project structure

Add dg configuration

The dg command recognizes Dagster projects through the presence of TOML configuration. This may be either a pyproject.toml file with a tool.dg section or a dg.toml file. Let's add this configuration:

Since our project already has a pyproject.toml file, we can just add the requisite tool.dg section to the file:

pyproject.toml
...
[tool.dg]
directory_type = "project"

[tool.dg.project]
root_module = "my_existing_project"
code_location_target_module = "my_existing_project.definitions"

There are three settings:

  • directory_type = "project": This is how dg identifies your package as a Dagster project. This is required.
  • project.root_module = "my_existing_project": This points to the root module of your project. This is also required.
  • project.code_location_target_module = "my_existing_project.definitions": This tells dg where to find the top-level Definitions object in your project. It defaults to <root_module>.definitions, so setting it here is not strictly necessary. We include it to be explicit: existing projects might define the top-level Definitions object in a different module, in which case this setting is required.

Now that these settings are in place, you can interact with your project using dg. If we run dg list defs we can see the sole existing asset in our project:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ my_asset │ default │ │ │ │ │
│ │ └──────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────┘

Add a dagster_dg_cli.plugin entry point

We're not quite done adding configuration. dg uses the Python entry point API to expose custom component types and other scaffoldable objects from user projects. Our entry point declaration will specify a submodule as the location where our project exposes plugin objects. By convention, this submodule is named <root_module>.components. In our case, it will be my_existing_project.components. Let's create this submodule now:

mkdir my_existing_project/components && touch my_existing_project/components/__init__.py

tip

See the plugin guide for more on dg plugins.

We'll need to add a dagster_dg_cli.plugin entry point to our project and then reinstall the project package into our virtual environment. The reinstallation step is crucial. Python entry points are registered at package installation time, so if you simply add a new entry point to an existing editable-installed package, it won't be picked up.

Entry points can be declared in either pyproject.toml or setup.py:

Since our package metadata is in pyproject.toml, we'll add the entry point declaration there:

pyproject.toml
...
[project.entry-points]
"dagster_dg_cli.plugin" = { my_existing_project = "my_existing_project.components"}
...

Then we'll reinstall the package. Note that uv sync will not reinstall our package, so we'll use uv pip install instead:

uv pip install --editable .
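
If the package metadata lives in setup.py instead (the pip case), the same entry point goes in the entry_points argument of setuptools' setup(). The sketch below only builds that mapping. The plugin_entry_points helper and its submodule parameter are hypothetical names used for illustration.

```python
def plugin_entry_points(root_module: str, submodule: str = "components") -> dict:
    """Build a setuptools entry_points mapping for a dg plugin module.

    Hypothetical helper: setuptools expects {group: ["name = module", ...]}.
    """
    target = f"{root_module}.{submodule}"
    return {"dagster_dg_cli.plugin": [f"{root_module} = {target}"]}


print(plugin_entry_points("my_existing_project"))
# {'dagster_dg_cli.plugin': ['my_existing_project = my_existing_project.components']}
```

In setup.py this mapping would be passed as setup(..., entry_points=plugin_entry_points("my_existing_project")). The package would then be reinstalled with pip install --editable . so that the entry point is registered.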

If we've done everything correctly, we should now be able to run dg list plugin-modules and see the module my_existing_project.components, which we have registered as an entry point, listed in the output.

dg list plugin-modules
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Module ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster │
│ my_existing_project.components │
└────────────────────────────────┘

We can now scaffold a new component type in our project and it will be available to dg commands. First create the component type:

dg scaffold component Foo

Creating a Dagster component type at /.../my-existing-project/my_existing_project/components/foo.py.
Scaffolded files for Dagster component type at /.../my-existing-project/my_existing_project/components/foo.py.

Then run dg list components to confirm that the new component type is available:

dg list components
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Key ┃ Summary ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster.DefinitionsComponent │ An arbitrary set of dagster definitions. │
├──────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ dagster.DefsFolderComponent │ A folder which may contain multiple submodules, each │
│ │ which define components. │
├──────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ dagster.PipesSubprocessScriptCollectionComponent │ Assets that wrap Python scripts executed with Dagster's │
│ │ PipesSubprocessClient. │
├──────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ my_existing_project.components.Foo │ COMPONENT SUMMARY HERE. │
└──────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────┘

You should see my_existing_project.components.Foo listed in the output.

Create a defs directory

Part of the dg experience is autoloading definitions. This means automatically picking up any definitions that exist in a particular module. We are going to create a new submodule named my_existing_project.defs (defs is the conventional name for the module where definitions live in dg) from which we will autoload definitions.

mkdir my_existing_project/defs
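
To give a feel for what "automatically picking up" means, here is a conceptual sketch in plain Python. It is not how dg or Dagster implements autoloading. It only shows the general idea of walking a package, importing each submodule, and collecting top-level objects that match a predicate. The collect_from_package helper and the demo_defs package are hypothetical, and the predicate (callable) stands in for "is a Dagster definition".

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path
from types import ModuleType
from typing import Callable


def collect_from_package(
    package: ModuleType, predicate: Callable[[object], bool]
) -> dict:
    """Import every submodule of `package` and collect matching public objects."""
    found = {}
    prefix = package.__name__ + "."
    for info in pkgutil.walk_packages(package.__path__, prefix):
        module = importlib.import_module(info.name)
        for name, value in vars(module).items():
            if not name.startswith("_") and predicate(value):
                found[f"{info.name}.{name}"] = value
    return found


# Build a throwaway package on disk so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_defs"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")
    (pkg_dir / "autoloaded.py").write_text(
        "def autoloaded_asset():\n    return 1\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        package = importlib.import_module("demo_defs")
        found = collect_from_package(package, callable)
    finally:
        sys.path.remove(tmp)

print(sorted(found))  # ['demo_defs.autoloaded.autoloaded_asset']
```

The point of the sketch is that nothing in the package has to be listed by hand: adding a new file to the directory is enough for its contents to be discovered.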

Modify top-level definitions

Autoloading is provided by a function that returns a Definitions object. Because we already have some other definitions in our project, we'll combine those with the autoloaded ones from my_existing_project.defs.

To do so, you'll need to modify your definitions.py file, or whichever file contains your top-level Definitions object.

You'll autoload definitions using load_defs, then merge them with your existing definitions using Definitions.merge. You pass load_defs the defs module you just created:

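import dagster as dg
import my_existing_project.defs
from my_existing_project.assets import my_asset

# Merge the explicitly listed definitions with those autoloaded from the
# my_existing_project.defs module. The location of load_defs is assumed to
# be dagster.components here; it may differ between Dagster versions.
defs = dg.Definitions.merge(
    dg.Definitions(assets=[my_asset]),
    dg.components.load_defs(my_existing_project.defs),
)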

Now let's add an asset to the new defs module. Create my_existing_project/defs/autoloaded_asset.py with the following contents:

import dagster as dg


@dg.asset
def autoloaded_asset(): ...

Finally, let's confirm the new asset is being autoloaded. Run dg list defs again and you should see both the new autoloaded_asset and old my_asset:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ autoloaded_asset │ default │ │ │ │ │
│ │ ├──────────────────┼─────────┼──────┼───────┼─────────────┤ │
│ │ │ my_asset │ default │ │ │ │ │
│ │ └──────────────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────────────┘

Now your project is fully compatible with dg!

Next steps