import error alert emails #10968
Comments
Why? Also, why do you want to keep track of import errors? What is so special about them compared with, for example, a DAG that has no import errors but always fails on some other exception?
Import errors and cluster policy violations mean "this DAG is DOA", but there is no mechanism for sending alert emails about them. This can be particularly confusing when any amount of dynamic behavior or a DAG factory pattern is involved. Contrived example: if a cluster policy limits the number of tasks in a DAG and 1 in 10 of your generated DAGs violates it, you won't be alerted unless you look at the UI. A DAG that always fails on some other exception (e.g. within a task) already has a mechanism for emailing on task failure (TaskInstance email alerts). Perhaps you have yet another failure mode in mind that I'm not thinking of, one that is neither an import error nor TaskInstance-specific.
I think this is a nice feature; maybe someone will implement it.
Description
Send alert emails on import errors.
Use case / motivation
This feature will be especially useful for admins who use ClusterPolicyViolation, or for those who have pesky folks pushing bad DAGs that clutter the UI with notifications that should be triaged to the on-call.
Related Issues
I believe that import_errors is a sad place in the Airflow codebase and now might be the time to refactor it.
I have questions about how best to keep track of the import error emails we have sent (so as not to spam the recipients with emails on every scheduler loop).
It seems to me we can either keep a history of import errors in the existing table (this would require a fundamental change to the scheduler logic, which clears this table on each loop, and a change to how the UI uses this table) or create a new table for this purpose (e.g. import_error_history, with a bool field for email_sent).
I'm curious to hear from the community whether there are other use cases or improvements we should consider in refactoring the import errors.
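To make the separate-table option more concrete, here is a minimal sketch of how an import_error_history table with an email_sent flag could deduplicate alerts across scheduler loops. It is not Airflow code: it uses the standard-library sqlite3 module as a stand-in for the metadata database, and the column names other than email_sent, the helper names (open_history, alert_once), and the send_email callback are all hypothetical.

```python
import hashlib
import sqlite3

# Hypothetical schema: only the table name and email_sent come from the proposal.
SCHEMA = """
CREATE TABLE IF NOT EXISTS import_error_history (
    filename   TEXT    NOT NULL,
    error_hash TEXT    NOT NULL,
    stacktrace TEXT    NOT NULL,
    email_sent INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (filename, error_hash)
)
"""


def open_history(path=":memory:"):
    """Open the stand-in database and make sure the history table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def alert_once(conn, filename, stacktrace, send_email):
    """Record an import error and email about it only the first time it is seen.

    Returns True if an email was sent on this call, False otherwise.
    """
    # Identify an error by file plus a hash of its stack trace, so a new
    # error in the same file still triggers a fresh alert.
    error_hash = hashlib.sha256(stacktrace.encode("utf-8")).hexdigest()

    # Unlike a table cleared on every loop, rows here persist between loops.
    conn.execute(
        "INSERT OR IGNORE INTO import_error_history "
        "(filename, error_hash, stacktrace) VALUES (?, ?, ?)",
        (filename, error_hash, stacktrace),
    )
    row = conn.execute(
        "SELECT email_sent FROM import_error_history "
        "WHERE filename = ? AND error_hash = ?",
        (filename, error_hash),
    ).fetchone()
    if row[0]:
        return False  # already alerted for this exact error

    send_email(filename, stacktrace)  # hypothetical callback supplied by caller
    conn.execute(
        "UPDATE import_error_history SET email_sent = 1 "
        "WHERE filename = ? AND error_hash = ?",
        (filename, error_hash),
    )
    conn.commit()
    return True
```

Calling alert_once on every loop for each failing file would then send one email per distinct error rather than one per loop. Whether to key on the stack trace hash, and when to expire old rows, are open design choices this sketch does not settle.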
cc: @mik-laj