Command Line Interface¶
Airflow has a rich command line interface that supports many types of operations on a DAG, as well as starting services and supporting development and testing.
usage: airflow [-h]
{resetdb,render,variables,connections,pause,task_failed_deps,version,trigger_dag,initdb,test,unpause,dag_state,run,list_tasks,backfill,list_dags,kerberos,worker,webserver,flower,scheduler,task_state,pool,serve_logs,clear,upgradedb}
...
Positional Arguments¶
subcommand | Sub-command help. Possible choices: resetdb, render, variables, connections, pause, task_failed_deps, version, trigger_dag, initdb, test, unpause, dag_state, run, list_tasks, backfill, list_dags, kerberos, worker, webserver, flower, scheduler, task_state, pool, serve_logs, clear, upgradedb |
Sub-commands:¶
resetdb¶
Burn down and rebuild the metadata database
airflow resetdb [-h] [-y]
Named Arguments¶
-y, --yes | Do not prompt to confirm reset. Use with care! Default: False |
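For example, a non-interactive reset (a sketch; this destroys all metadata, so it is only sensible on a disposable development environment):

```shell
# Wipe and recreate the Airflow metadata database without a confirmation prompt.
# WARNING: this deletes all DAG runs, task instances, variables, and connections.
airflow resetdb -y
```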
render¶
Render a task instance’s template(s)
airflow render [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
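As a sketch, where `tutorial`, `templated`, and the dags path are placeholders for your own DAG id, task id, and folder:

```shell
# Print the rendered template fields of one task instance,
# looking for the DAG file under a non-default dags folder.
airflow render -sd ~/airflow/dags tutorial templated 2018-01-01
```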
variables¶
CRUD operations on variables
airflow variables [-h] [-s KEY VAL] [-g KEY] [-j] [-d VAL] [-i FILEPATH]
[-e FILEPATH] [-x KEY]
Named Arguments¶
-s, --set | Set a variable |
-g, --get | Get value of a variable |
-j, --json | Deserialize JSON variable Default: False |
-d, --default | Default value returned if variable does not exist |
-i, --import | Import variables from JSON file |
-e, --export | Export variables to JSON file |
-x, --delete | Delete a variable |
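A typical round trip might look like the following sketch (`my_key`, `my_value`, and the file path are placeholders):

```shell
# Store a variable, read it back, fall back to a default, and export everything.
airflow variables -s my_key my_value          # set
airflow variables -g my_key                   # get
airflow variables -g missing_key -d fallback  # prints the default if unset
airflow variables -e /tmp/variables.json      # export all variables as a JSON file
```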
connections¶
List/Add/Delete connections
airflow connections [-h] [-l] [-a] [-d] [--conn_id CONN_ID]
[--conn_uri CONN_URI] [--conn_extra CONN_EXTRA]
[--conn_type CONN_TYPE] [--conn_host CONN_HOST]
[--conn_login CONN_LOGIN] [--conn_password CONN_PASSWORD]
[--conn_schema CONN_SCHEMA] [--conn_port CONN_PORT]
Named Arguments¶
-l, --list | List all connections Default: False |
-a, --add | Add a connection Default: False |
-d, --delete | Delete a connection Default: False |
--conn_id | Connection id, required to add/delete a connection |
--conn_uri | Connection URI, required to add a connection without conn_type |
--conn_extra | Connection Extra field, optional when adding a connection |
--conn_type | Connection type, required to add a connection without conn_uri |
--conn_host | Connection host, optional when adding a connection |
--conn_login | Connection login, optional when adding a connection |
--conn_password | Connection password, optional when adding a connection |
--conn_schema | Connection schema, optional when adding a connection |
--conn_port | Connection port, optional when adding a connection |
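For instance, adding a connection by URI, listing, and deleting it (the connection id and URI below are placeholders):

```shell
# Register a Postgres connection by URI, list all connections, then remove it.
airflow connections -a --conn_id my_postgres \
    --conn_uri postgres://user:pass@db.example.com:5432/mydb
airflow connections -l
airflow connections -d --conn_id my_postgres
```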
pause¶
Pause a DAG
airflow pause [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
task_failed_deps¶
Returns the unmet dependencies for a task instance from the perspective of the scheduler. In other words, it explains why a task instance doesn’t get scheduled and then queued by the scheduler, and then run by an executor.
airflow task_failed_deps [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
trigger_dag¶
Trigger a DAG run
airflow trigger_dag [-h] [-sd SUBDIR] [-r RUN_ID] [-c CONF] [-e EXEC_DATE]
dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
-r, --run_id | Helps to identify this run |
-c, --conf | JSON string that gets pickled into the DagRun’s conf attribute |
-e, --exec_date | The execution date of the DAG |
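A sketch of a manual trigger with a custom run id and a conf payload (the run id, JSON keys, and `example_dag` are placeholders; tasks can read the payload from the DagRun’s conf attribute):

```shell
# Trigger an out-of-schedule DAG run and pass it a JSON conf payload.
airflow trigger_dag -r manual_reload_1 -c '{"start": "2018-01-01"}' example_dag
```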
test¶
Test a task instance. This will run a task without checking for dependencies or recording its state in the database.
airflow test [-h] [-sd SUBDIR] [-dr] [-tp TASK_PARAMS]
dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
-dr, --dry_run | Perform a dry run Default: False |
-tp, --task_params | Sends a JSON params dict to the task |
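For example (a sketch; `tutorial`, `print_date`, and the params dict are placeholders):

```shell
# Run one task in isolation: no dependency checks, nothing written to the DB.
airflow test tutorial print_date 2018-01-01
# Pass ad-hoc params, and use a dry run to render without executing.
airflow test -dr -tp '{"limit": 10}' tutorial print_date 2018-01-01
```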
unpause¶
Resume a paused DAG
airflow unpause [-h] [-sd SUBDIR] dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
dag_state¶
Get the status of a dag run
airflow dag_state [-h] [-sd SUBDIR] dag_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
run¶
Run a single task instance
airflow run [-h] [-sd SUBDIR] [-m] [-f] [--pool POOL] [--cfg_path CFG_PATH]
[-l] [-A] [-i] [-I] [--ship_dag] [-p PICKLE]
dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
-m, --mark_success | Mark jobs as succeeded without running them Default: False |
-f, --force | Ignore previous task instance state, rerun regardless if task already succeeded/failed Default: False |
--pool | Resource pool to use |
--cfg_path | Path to config file to use instead of airflow.cfg |
-l, --local | Run the task using the LocalExecutor Default: False |
-A, --ignore_all_dependencies | Ignores all non-critical dependencies, including ignore_ti_state and ignore_task_deps Default: False |
-i, --ignore_dependencies | Ignore task-specific dependencies, e.g. upstream, depends_on_past, and retry delay dependencies Default: False |
-I, --ignore_depends_on_past | Ignore depends_on_past dependencies (but respect upstream dependencies) Default: False |
--ship_dag | Pickles (serializes) the DAG and ships it to the worker Default: False |
-p, --pickle | Serialized pickle object of the entire dag (used internally) |
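For example, forcing a rerun of one task instance regardless of its recorded state (a sketch; `tutorial` and `print_date` are placeholder ids):

```shell
# Rerun a single task instance with the LocalExecutor, ignoring its previous state.
airflow run -f -l tutorial print_date 2018-01-01
```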
list_tasks¶
List the tasks within a DAG
airflow list_tasks [-h] [-t] [-sd SUBDIR] dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-t, --tree | Tree view Default: False |
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
backfill¶
Run subsections of a DAG for a specified date range
airflow backfill [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-m] [-l]
[-x] [-a] [-i] [-I] [-sd SUBDIR] [--pool POOL]
[--delay_on_limit DELAY_ON_LIMIT] [-dr]
dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-t, --task_regex | The regex to filter specific task_ids to backfill (optional) |
-s, --start_date | Override start_date YYYY-MM-DD |
-e, --end_date | Override end_date YYYY-MM-DD |
-m, --mark_success | Mark jobs as succeeded without running them Default: False |
-l, --local | Run the task using the LocalExecutor Default: False |
-x, --donot_pickle | Do not attempt to pickle the DAG object to send over to the workers, just tell the workers to run their version of the code. Default: False |
-a, --include_adhoc | Include dags with the adhoc parameter. Default: False |
-i, --ignore_dependencies | Skip upstream tasks, run only the tasks matching the regexp. Only works in conjunction with task_regex Default: False |
-I, --ignore_first_depends_on_past | Ignores depends_on_past dependencies for the first set of tasks only (subsequent executions in the backfill DO respect depends_on_past). Default: False |
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
--pool | Resource pool to use |
--delay_on_limit | Amount of time in seconds to wait when the limit on maximum active dag runs (max_active_runs) has been reached before trying to execute a dag run again. Default: 1.0 |
-dr, --dry_run | Perform a dry run Default: False |
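A sketch of a typical backfill (the task regex, dates, and `example_dag` are placeholders):

```shell
# Preview what a backfill of the first week of January would do.
airflow backfill -s 2018-01-01 -e 2018-01-07 -dr example_dag
# Then run only the tasks matching "load_.*", letting the first set of
# task instances ignore depends_on_past.
airflow backfill -t "load_.*" -s 2018-01-01 -e 2018-01-07 -I example_dag
```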
list_dags¶
List all the DAGs
airflow list_dags [-h] [-sd SUBDIR] [-r]
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
-r, --report | Show DagBag loading report Default: False |
kerberos¶
Start a kerberos ticket renewer
airflow kerberos [-h] [-kt [KEYTAB]] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-l LOG_FILE]
[principal]
Positional Arguments¶
principal | kerberos principal Default: airflow |
Named Arguments¶
-kt, --keytab | keytab Default: airflow.keytab |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
worker¶
Start a Celery worker node
airflow worker [-h] [-p] [-q QUEUES] [-c CONCURRENCY] [-cn CELERY_HOSTNAME]
[--pid [PID]] [-D] [--stdout STDOUT] [--stderr STDERR]
[-l LOG_FILE]
Named Arguments¶
-p, --do_pickle | Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code. Default: False |
-q, --queues | Comma delimited list of queues to serve Default: default |
-c, --concurrency | The number of worker processes Default: 16 |
-cn, --celery_hostname | Set the hostname of celery worker if you have multiple workers on a single machine. |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
webserver¶
Start an Airflow webserver instance
airflow webserver [-h] [-p PORT] [-w WORKERS]
[-k {sync,eventlet,gevent,tornado}] [-t WORKER_TIMEOUT]
[-hn HOSTNAME] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-A ACCESS_LOGFILE] [-E ERROR_LOGFILE]
[-l LOG_FILE] [--ssl_cert SSL_CERT] [--ssl_key SSL_KEY] [-d]
Named Arguments¶
-p, --port | The port on which to run the server Default: 8080 |
-w, --workers | Number of workers to run the webserver on Default: 4 |
-k, --workerclass | Possible choices: sync, eventlet, gevent, tornado The worker class to use for Gunicorn Default: sync |
-t, --worker_timeout | The timeout for waiting on webserver workers Default: 120 |
-hn, --hostname | Set the hostname on which to run the web server Default: 0.0.0.0 |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-A, --access_logfile | The logfile to store the webserver access log. Use '-' to print to stderr. Default: - |
-E, --error_logfile | The logfile to store the webserver error log. Use '-' to print to stderr. Default: - |
-l, --log-file | Location of the log file |
--ssl_cert | Path to the SSL certificate for the webserver |
--ssl_key | Path to the key to use with the SSL certificate |
-d, --debug | Use the server that ships with Flask in debug mode Default: False |
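For example, serving the UI on a non-default port as a daemon with TLS (a sketch; the port and certificate paths are placeholders):

```shell
# Run the webserver in the background on port 8443 over HTTPS.
airflow webserver -p 8443 -D \
    --ssl_cert /etc/ssl/certs/airflow.crt --ssl_key /etc/ssl/private/airflow.key
```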
flower¶
Start a Celery Flower
airflow flower [-h] [-hn HOSTNAME] [-p PORT] [-fc FLOWER_CONF] [-a BROKER_API]
[--pid [PID]] [-D] [--stdout STDOUT] [--stderr STDERR]
[-l LOG_FILE]
Named Arguments¶
-hn, --hostname | Set the hostname on which to run the server Default: 0.0.0.0 |
-p, --port | The port on which to run the server Default: 5555 |
-fc, --flower_conf | Configuration file for flower |
-a, --broker_api | Broker API |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
scheduler¶
Start a scheduler instance
airflow scheduler [-h] [-d DAG_ID] [-sd SUBDIR] [-r RUN_DURATION]
[-n NUM_RUNS] [-p] [--pid [PID]] [-D] [--stdout STDOUT]
[--stderr STDERR] [-l LOG_FILE]
Named Arguments¶
-d, --dag_id | The id of the dag to run |
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
-r, --run-duration | Set number of seconds to execute before exiting |
-n, --num_runs | Set the number of runs to execute before exiting Default: -1 |
-p, --do_pickle | Attempt to pickle the DAG object to send over to the workers, instead of letting workers run their version of the code. Default: False |
--pid | PID file location |
-D, --daemon | Daemonize instead of running in the foreground Default: False |
--stdout | Redirect stdout to this file |
--stderr | Redirect stderr to this file |
-l, --log-file | Location of the log file |
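A common deployment pattern is to daemonize the scheduler but bound its lifetime so a process supervisor restarts it with fresh state (a sketch; the run count and pid path are placeholders):

```shell
# Run the scheduler in the background, exiting after 100 scheduler loops
# so a supervisor such as systemd can restart it cleanly.
airflow scheduler -n 100 -D --pid /var/run/airflow-scheduler.pid
```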
task_state¶
Get the status of a task instance
airflow task_state [-h] [-sd SUBDIR] dag_id task_id execution_date
Positional Arguments¶
dag_id | The id of the dag |
task_id | The id of the task |
execution_date | The execution date of the DAG |
Named Arguments¶
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
pool¶
CRUD operations on pools
airflow pool [-h] [-s NAME SLOT_COUNT POOL_DESCRIPTION] [-g NAME] [-x NAME]
Named Arguments¶
-s, --set | Set pool slot count and description, respectively |
-g, --get | Get pool info |
-x, --delete | Delete a pool |
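For instance (a sketch; the pool name, slot count, and description are placeholders):

```shell
# Create (or update) a pool with 5 slots, inspect it, then remove it.
airflow pool -s etl_pool 5 "Limits concurrent ETL tasks"
airflow pool -g etl_pool
airflow pool -x etl_pool
```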
clear¶
Clear a set of task instances, as if they never ran
airflow clear [-h] [-t TASK_REGEX] [-s START_DATE] [-e END_DATE] [-sd SUBDIR]
[-u] [-d] [-c] [-f] [-r] [-x] [-dx]
dag_id
Positional Arguments¶
dag_id | The id of the dag |
Named Arguments¶
-t, --task_regex | The regex to filter specific task_ids to clear (optional) |
-s, --start_date | Override start_date YYYY-MM-DD |
-e, --end_date | Override end_date YYYY-MM-DD |
-sd, --subdir | File location or directory from which to look for the dag Default: /home/docs/airflow/dags |
-u, --upstream | Include upstream tasks Default: False |
-d, --downstream | Include downstream tasks Default: False |
-c, --no_confirm | Do not request confirmation Default: False |
-f, --only_failed | Only failed jobs Default: False |
-r, --only_running | Only running jobs Default: False |
-x, --exclude_subdags | Exclude subdags Default: False |
-dx, --dag_regex | Search dag_id as regex instead of exact string Default: False |
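A sketch of a typical use, rerunning failed work in a date range (`example_dag` and the dates are placeholders):

```shell
# Reset failed task instances in the range so the scheduler reruns them,
# including their downstream tasks, without prompting for confirmation.
airflow clear -f -d -c -s 2018-01-01 -e 2018-01-07 example_dag
```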