4. Celery

Celery is an asynchronous task queue/job queue based on distributed message passing.

To learn more about Celery, see this article from OpenStack. The Celery docs are also thorough and detailed, and the Getting Started tutorial gives a quick feel for everything. Notes on the Celery docs can be found in this Google doc.

4.1. Creating Celery tasks

  • All Celery tasks must be added under pearlcertification.celery_tasks.<site> (where <site> is cert, green_door, pqs or api), depending on which site configuration they need in order to run. For example, a task that requires models specific to Green Door must go in a new module under pearlcertification.celery_tasks.green_door. Long-running tasks (those that take more than 5 minutes to run) go under pearlcertification.celery_tasks.long_running instead.

  • The task function must be decorated using the correct Celery app: config.<site>.celery_app.<app_name> or, for long-running tasks, config.celery_app.long_running_app. For a task created for the cert site, the app is config.cert.celery_app.cert_app (typically applied as @cert_app.task).

  • Finally, the path to the task's module must be added to config.<site>.settings.CELERY_IMPORTS if it is not already listed there, or to config.cert.settings.CELERY_INCLUDE if it is a long-running task.
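
The three rules above can be summarised as a small lookup. This is only a sketch: task_placement and TaskPlacement are hypothetical names that do not exist in the codebase, and because this section only names the app for the cert site (cert_app), the helper returns the module that holds the app rather than guessing app names for the other sites.

```python
from typing import NamedTuple

# Site names as listed in this section.
SITES = ("cert", "green_door", "pqs", "api")


class TaskPlacement(NamedTuple):
    task_package: str    # package that should contain the new task module
    app_location: str    # where the Celery app used for decorating lives
    settings_entry: str  # setting that must list the task module's path


def task_placement(site: str, long_running: bool = False) -> TaskPlacement:
    """Hypothetical helper: look up where a new task belongs."""
    if long_running:
        # Long-running tasks share one package and one app, whatever the site.
        return TaskPlacement(
            task_package="pearlcertification.celery_tasks.long_running",
            app_location="config.celery_app.long_running_app",
            settings_entry="config.cert.settings.CELERY_INCLUDE",
        )
    if site not in SITES:
        raise ValueError(f"unknown site: {site!r}")
    return TaskPlacement(
        task_package=f"pearlcertification.celery_tasks.{site}",
        app_location=f"config.{site}.celery_app",  # module only; app name varies
        settings_entry=f"config.{site}.settings.CELERY_IMPORTS",
    )


print(task_placement("green_door"))
print(task_placement("cert", long_running=True))
```

Reading the result for a given site gives the three places to touch when adding a task: the package for the new module, the app to decorate with, and the setting to update.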

4.2. Current Celery Setup

  • 5 Celery workers: one worker per site (cert, green door, pqs, api) and a fifth worker for long-running tasks. A dedicated worker per site is needed because each site uses a different Django config.

  • The broker is Redis. 5 queues are created: one dedicated to each site (cert, green door, pqs, api) and a fifth for long-running tasks.

  • Result backend is Redis

  • Monitoring (what workers are running, which tasks were executed, task states, queues state, …) is done by Flower
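
As an illustration of the broker, result backend and per-site queue described above, a per-site settings module might contain something like the following. This is a sketch under assumptions, not the project's actual configuration: the Redis URL, the database number and the queue name "cert" are made up, and the setting names assume the Celery app reads Django settings with the CELERY namespace.

```python
# Hypothetical excerpt of a per-site settings module (e.g. for the cert site).
# Setting names assume the app loads Django settings with namespace="CELERY".

REDIS_URL = "redis://localhost:6379/0"  # assumed host, port and DB number

CELERY_BROKER_URL = REDIS_URL        # Redis as the message broker
CELERY_RESULT_BACKEND = REDIS_URL    # Redis as the result backend
CELERY_TASK_DEFAULT_QUEUE = "cert"   # assumed name of the cert site's queue
```

With a default queue per site, each worker only needs to consume its own queue (the -Q option of the celery worker command), which keeps tasks from one site away from workers loaded with another site's Django config.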

4.3. Celery Beat

Celery Beat triggers tasks periodically based on a schedule, much like cron. django-celery-beat is used so that cron jobs can be defined at runtime. All cron jobs that currently use django-cron will be moved to Celery Beat. In development, Celery Beat runs automatically whenever a worker is running. In staging, we will have one container for Celery. In production, we will have one systemd service for Celery Beat.
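
For reference, wiring django-celery-beat into a Django project usually comes down to two settings. The excerpt below is a sketch, not a copy of our settings: it assumes the CELERY settings namespace, and the INSTALLED_APPS list is abbreviated to the one relevant entry.

```python
# Hypothetical settings excerpt; assumes the CELERY settings namespace.

INSTALLED_APPS = [
    # ... the project's existing apps go here ...
    "django_celery_beat",  # provides the schedule models editable at runtime
]

# Make Celery Beat read its schedule from the database rather than a static dict,
# so periodic tasks can be added or changed without redeploying.
CELERY_BEAT_SCHEDULER = "django_celery_beat.schedulers:DatabaseScheduler"
```

Once this is in place, periodic tasks and their crontab schedules can be managed through the Django admin pages that django-celery-beat registers.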

4.4. Celery for local development

By default, Celery is deactivated in local development: workers are not started and tasks are executed synchronously rather than asynchronously. To use Celery locally:

  • set the env var USE_CELERY to True

  • start the needed worker in the command line with runcertworker, rungreendoorworker, runpqsworker, runapiworker or run_long_running_worker

  • all the workers can also be started in the background as a systemd service with systemctl start celery.service. Note that running all workers at once is memory-intensive, so be sure to provision enough memory for the Vagrant VM.

  • Redis is always running in the background (much like MySQL), so there is nothing to set up for it.

  • To monitor workers, queues and running tasks, start Flower with runflower, then go to http://localhost:5555/
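
To make the USE_CELERY switch concrete, here is one way a settings module could turn the env var into synchronous or asynchronous execution. This is a sketch of the idea only: env_flag is a hypothetical helper, the set of accepted spellings is an assumption, and the project's actual mechanism may differ. CELERY_TASK_ALWAYS_EAGER is Celery's setting for running tasks inline in the calling process (again assuming the CELERY settings namespace).

```python
import os


def env_flag(name, default=False, environ=os.environ):
    """Hypothetical helper: read a boolean flag from an environment mapping."""
    raw = environ.get(name)
    if raw is None:
        return default
    # Accept a few common spellings of "true"; anything else counts as False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


USE_CELERY = env_flag("USE_CELERY")

# When Celery is off, tasks run synchronously in the caller and no worker is needed.
CELERY_TASK_ALWAYS_EAGER = not USE_CELERY
```

With this approach, calling a task with .delay() behaves the same in both modes from the caller's point of view; only where and when the task body runs changes.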

4.5. Celery for Staging and Production

Notes can be found in the Google doc.