Run Scenarios to Automate Your Work on Data

Dataiku Launch Program - Step 4

Automation Overview

Watch Your Work Get Itself Done

Dataiku is built as a platform to run data projects in production. That means running them when new data comes in, connecting them to your existing infrastructure to bring data in or out, sending notifications, and more.

To encourage collaboration, many of these capabilities are available code-free through visual interfaces.

Discover Automation

Scenarios

Schedule Your Project to Run Daily

Create a scenario in the project whose Flow you want to run on a schedule. If you don’t have a project yet, create one first.

A scenario has:

  • Steps: the actions you configure the scenario to run.
  • Triggers: the conditions that define when the scenario executes.
  • Reporters: the notifications that send information or alerts about a scenario run through a variety of channels.

Discover Scenarios

Scenario Steps

Decide What the Scenario Will Do

Add steps to your scenario. For example, you can:

  • Build or clear a dataset
  • Train a model
  • Verify data quality rules or run checks
  • Send messages
  • Refresh the cache of charts and dashboards

Scenario steps run sequentially. However, you can control whether a step runs based on the outcome of a check.
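As a rough illustration of sequential steps with conditional execution, here is a minimal pure-Python sketch. It does not use the Dataiku API: the `Step` class, the `run_if` values, and the step names are all hypothetical, chosen only to show how a later step can be skipped or forced depending on what happened before it.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """Hypothetical stand-in for a scenario step (not a Dataiku class)."""
    name: str
    action: Callable[[], bool]    # returns True on success, False on failure
    run_if: str = "previous_ok"   # "previous_ok" or "always" (assumed values)


def run_scenario(steps: List[Step]) -> List[str]:
    """Run steps in order and return the names of the steps that actually ran."""
    ran = []
    all_ok = True
    for step in steps:
        # Skip a step that requires earlier steps to have succeeded.
        if step.run_if == "previous_ok" and not all_ok:
            continue
        ran.append(step.name)
        if not step.action():
            all_ok = False
    return ran


steps = [
    Step("build_dataset", lambda: True),
    Step("run_checks", lambda: False),                     # simulated failing check
    Step("train_model", lambda: True),                     # skipped after the failure
    Step("send_message", lambda: True, run_if="always"),   # runs regardless
]
executed = run_scenario(steps)
print(executed)
```

In this sketch the model is never trained on data that failed its check, while the message step still runs so that someone hears about the failure.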

Scenario steps documentation

Triggers

How Do You Want to Trigger Your Automation?

A time-based trigger launches the scenario at regular intervals. Example: repeat every 30 minutes.

A dataset change trigger starts the scenario whenever a change is detected in a dataset.

A SQL query change trigger runs a query at a specified interval and starts the scenario when the output of the query changes. A custom Python trigger executes a Python script that decides when to activate the scenario.
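The idea of starting a scenario when a query's output changes can be sketched in a few lines. This is only an illustration under stated assumptions, not how Dataiku implements its triggers: the `QueryChangeTrigger` class, the `orders` table, and the fingerprinting approach are all hypothetical, and an in-memory SQLite database stands in for a real connection.

```python
import hashlib
import sqlite3


def query_fingerprint(conn: sqlite3.Connection, sql: str) -> str:
    """Hash the full result of a query so two results can be compared cheaply."""
    rows = conn.execute(sql).fetchall()
    return hashlib.sha256(repr(rows).encode("utf-8")).hexdigest()


class QueryChangeTrigger:
    """Hypothetical trigger: fires when the query output differs from the last poll."""

    def __init__(self, conn: sqlite3.Connection, sql: str):
        self.conn = conn
        self.sql = sql
        self.last = query_fingerprint(conn, sql)

    def poll(self) -> bool:
        current = query_fingerprint(self.conn, self.sql)
        changed = current != self.last
        self.last = current
        return changed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("INSERT INTO orders VALUES (1, 10.0)")
trigger = QueryChangeTrigger(conn, "SELECT COUNT(*), MAX(id) FROM orders")

first_poll = trigger.poll()    # no change since the trigger was created
conn.execute("INSERT INTO orders VALUES (2, 25.5)")
second_poll = trigger.poll()   # the new row changes the query output
third_poll = trigger.poll()    # nothing new since the previous poll
print(first_poll, second_poll, third_poll)
```

In a real setup, `poll()` would be called on a fixed interval, and a `True` result would start the scenario run.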

Triggers documentation

Reporters

Report On Your Scenario Run

Send an email, a Slack message, or another notification based on the outcome of your scenario run, so you can close the loop and stay on top of potential issues.

Reporters documentation

Automation Challenges

Be Smart With Automation

Your project doesn’t stop with your flow or with hitting play on your scenario; you also need to think about what happens afterward. As you automate workflows, you’re exposed to risks: ingesting poor-quality data, dataset schemas changing when extra columns are added, or models drifting and becoming obsolete.

Automated runs can also be costly for your organization. This is why you may not have access rights to set up scenarios and will have to work with an admin, data engineer, or data ops person at your organization.

Metrics and Checks

Keep Everything Under Control

Metrics are measurements computed on objects such as datasets, folders, or models. Use them to monitor how an object evolves, for example:

  • The number of missing values of a column,
  • The size of a folder,
  • The accuracy of a model.

You can then set up checks based on these metrics, such as whether data is missing, whether the size of a model stays reasonable, or whether model accuracy stays above a given threshold.
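To make the relationship between a metric and a check concrete, here is a minimal pure-Python sketch. It is not the Dataiku API: the function names, the sample rows, the accuracy value, and the thresholds are all hypothetical, and the two-valued "OK"/"ERROR" outcome is a simplification chosen for the example.

```python
from typing import Any, Dict, List


def missing_count(rows: List[Dict[str, Any]], column: str) -> int:
    """Metric: number of rows where the column is absent or None."""
    return sum(1 for row in rows if row.get(column) is None)


def check_at_most(value: float, maximum: float) -> str:
    """Check: the metric must not exceed a maximum."""
    return "OK" if value <= maximum else "ERROR"


def check_at_least(value: float, minimum: float) -> str:
    """Check: the metric must not fall below a minimum."""
    return "OK" if value >= minimum else "ERROR"


# Made-up sample data for illustration only.
rows = [{"age": 34}, {"age": None}, {"age": 51}, {}]
model_accuracy = 0.87  # assumed value, not a real result

missing_age = missing_count(rows, "age")
missing_status = check_at_most(missing_age, maximum=1)
accuracy_status = check_at_least(model_accuracy, minimum=0.80)
print(missing_age, missing_status, accuracy_status)
```

The metric is just a number that can be tracked over time; the check turns that number into a pass or fail signal that a scenario step can act on.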

Learn More About Metrics & Checks
Discover metrics and checks

Flow Views

Check Your Flow for Scenarios

Use Flow Views to filter the Flow and keep an eye on every item in your project that is included in a scenario.

Read more on all the different Flow Views
Explore Flow Views

Dashboards

Build Visualizations to Share Insights

From your navigation, access your dashboards to share metrics, charts, datasets, team discussions, or even interactive web applications. With dashboard refresh, anybody you share your dashboard with can access the freshest data.

Learn all about building Dashboards
Discover Dashboards
