<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>workflows Archives - OVHcloud Blog</title>
	<atom:link href="https://blog.ovhcloud.com/tag/workflows/feed/" rel="self" type="application/rss+xml" />
	<link>https://blog.ovhcloud.com/tag/workflows/</link>
	<description>Innovation for Freedom</description>
	<lastBuildDate>Fri, 06 Mar 2020 23:00:14 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://blog.ovhcloud.com/wp-content/uploads/2019/07/cropped-cropped-nouveau-logo-ovh-rebranding-32x32.gif</url>
	<title>workflows Archives - OVHcloud Blog</title>
	<link>https://blog.ovhcloud.com/tag/workflows/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Doing BIG automation with Celery</title>
		<link>https://blog.ovhcloud.com/doing-big-automation-with-celery/</link>
		
		<dc:creator><![CDATA[Bartosz Rabiega]]></dc:creator>
		<pubDate>Fri, 06 Mar 2020 16:14:18 +0000</pubDate>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Automation]]></category>
		<category><![CDATA[celery]]></category>
		<category><![CDATA[ceph]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[workflows]]></category>
		<guid isPermaLink="false">https://www.ovh.com/blog/?p=17100</guid>

					<description><![CDATA[Intro TL;DR: You might want to skip the intro and jump right into “Celery &#8211; Distributed Task Queue”. Hello! I’m Bartosz Rabiega, and I’m part of the R&#38;D/DevOps teams at OVHcloud. As part of our daily work, we’re developing and maintaining the Ceph-as-a-Service project, in order to provide highly available, solid, distributed storage for various [&#8230;]]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Intro</h2>



<p><strong>TL;DR</strong>: You might want to skip the intro and jump right into “Celery &#8211; Distributed Task Queue”.</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img fetchpriority="high" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/2A010EF2-2666-42D4-91C1-F1FAE33148FE-1024x537.png" alt="" class="wp-image-17420" width="512" height="269" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/2A010EF2-2666-42D4-91C1-F1FAE33148FE-1024x537.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/2A010EF2-2666-42D4-91C1-F1FAE33148FE-300x157.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/2A010EF2-2666-42D4-91C1-F1FAE33148FE-768x403.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/2A010EF2-2666-42D4-91C1-F1FAE33148FE.png 1200w" sizes="(max-width: 512px) 100vw, 512px" /></figure></div>



<p>Hello! I’m Bartosz Rabiega, and I’m part of the R&amp;D/DevOps teams at OVHcloud. As part of our daily work, we’re developing and maintaining the Ceph-as-a-Service project, in order to provide highly available, solid, distributed storage for various applications. We’re dealing with 60PB+ of data, across 10 regions, so as you might imagine, we’ve got quite a lot of work ahead in terms of replacing broken hardware, handling natural growth, provisioning new regions and datacentres, evaluating new hardware, optimising software and hardware configurations, researching new storage solutions, and much more!</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/B87CD670-7779-4325-92D9-F30A1C8C71A2.png" alt="" class="wp-image-17382" width="705" height="471" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/B87CD670-7779-4325-92D9-F30A1C8C71A2.png 940w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/B87CD670-7779-4325-92D9-F30A1C8C71A2-300x200.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/B87CD670-7779-4325-92D9-F30A1C8C71A2-768x513.png 768w" sizes="(max-width: 705px) 100vw, 705px" /></figure></div>



<p>Because of the wide scope of our work, we need to offload as many repetitive tasks as possible. And we do that through automation.</p>



<h2 class="wp-block-heading">Automating your work</h2>



<p>To some extent, every manual process can be described as a set of actions and conditions. If we could get something to perform the actions and check the conditions automatically, we would be able to automate the process, resulting in an automated workflow. Take a look at the example below, which shows some generic steps for manually replacing hardware in our project.</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img decoding="async" width="1024" height="291" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/E9662233-9498-4F2F-9A7E-B640F85EE295-1024x291.png" alt="" class="wp-image-17389" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/E9662233-9498-4F2F-9A7E-B640F85EE295-1024x291.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/E9662233-9498-4F2F-9A7E-B640F85EE295-300x85.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/E9662233-9498-4F2F-9A7E-B640F85EE295-768x218.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/E9662233-9498-4F2F-9A7E-B640F85EE295-1536x436.png 1536w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/E9662233-9498-4F2F-9A7E-B640F85EE295.png 1677w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure></div>



<p>Hmm… What could help us do this automatically? Doesn’t a computer sound like a perfect fit? 🙂 There are many ways to force computers to process automated workflows, but first we need to define some building blocks (let’s call them tasks) and get them to run sequentially or in parallel (i.e. a workflow). Fortunately, there are software solutions that can help with that, among which is Celery.</p>



<h2 class="wp-block-heading">Celery &#8211; Distributed Task Queue</h2>



<p>Celery is a well-known and widely adopted piece of software that allows us to process tasks asynchronously. The description of the project on its main page (<a href="http://www.celeryproject.org/" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">http://www.celeryproject.org/</a>) may sound a little bit enigmatic, but we can narrow down its basic functionality to something like this:</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/4749B2AA-AA5B-4BEF-BA3A-FC0B67FCD447-1024x539.png" alt="" class="wp-image-17414" width="768" height="404" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/4749B2AA-AA5B-4BEF-BA3A-FC0B67FCD447-1024x539.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/4749B2AA-AA5B-4BEF-BA3A-FC0B67FCD447-300x158.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/4749B2AA-AA5B-4BEF-BA3A-FC0B67FCD447-768x404.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/4749B2AA-AA5B-4BEF-BA3A-FC0B67FCD447.png 1294w" sizes="auto, (max-width: 768px) 100vw, 768px" /></figure></div>



<p>Such machinery is perfectly suited to tasks like sending emails asynchronously (i.e. &#8216;fire and forget&#8217;), but it can also be used for different purposes. So what other tasks could it handle? Basically, any tasks you can implement in Python (the main Celery language)! I won’t go too much into the details, as they are available in the Celery documentation. What matters is that since we can implement any task we want, we can use that to create the building blocks for our automation.</p>



<p>There is one more important thing&#8230; Celery natively supports combining such tasks into workflows (Celery primitives: chains, groups, chords, etc.). So let’s get through some examples&#8230;</p>



<p>We’ll use the following task definition &#8211; a single task that prints its <em>args</em> and <em>kwargs</em>:</p>



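<pre class="wp-block-code"><code class="">from celery import Celery, chain, chord, group

# The application name is arbitrary; configure the broker to suit your setup
celery_app = Celery("automation")

@celery_app.task
def noop(*args, **kwargs):
    # Task accepts any arguments, prints them and returns True
    print(args, kwargs)
    return True</code></pre>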



<p>Now we can execute the task asynchronously, using the following code:</p>



<pre class="wp-block-code"><code class="">task = noop.s(777)
task.apply_async()</code></pre>



<p>The elementary tasks can be parametrised and combined into a complex workflow using Celery primitives such as “chain”, “group”, and “chord”. See the examples below. In each of them, the left side shows a visual representation of a workflow, while the right side shows the code snippet that generates it. The green box is the starting point, after which the workflow execution progresses vertically.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow">
<h4 class="wp-block-heading">Chain &#8211; a set of tasks processed sequentially</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/705AD975-048B-4E6A-8BFF-F68775C9C5C7.png" alt="" class="wp-image-17394" width="92" height="320"/></figure></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<pre class="wp-block-code"><code class="">workflow = (
    chain([noop.s(i) for i in range(3)])
)</code></pre>
</div>
</div>



<h4 class="wp-block-heading">Group &#8211; a set of tasks processed in parallel</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/B112B87E-2813-46DD-9105-4B528BB3C110.png" alt="" class="wp-image-17396" width="317" height="169" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/B112B87E-2813-46DD-9105-4B528BB3C110.png 633w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/B112B87E-2813-46DD-9105-4B528BB3C110-300x160.png 300w" sizes="auto, (max-width: 317px) 100vw, 317px" /></figure>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<pre class="wp-block-code"><code class="">workflow = (
    group([noop.s(i) for i in range(5)])
)</code></pre>
</div>
</div>



<h4 class="wp-block-heading">Chord &#8211; a group of tasks chained to the following task</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/4E75C373-2CE1-4A68-8599-245E768167A4.png" alt="" class="wp-image-17397" width="311" height="223" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/4E75C373-2CE1-4A68-8599-245E768167A4.png 621w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/4E75C373-2CE1-4A68-8599-245E768167A4-300x215.png 300w" sizes="auto, (max-width: 311px) 100vw, 311px" /></figure>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<pre class="wp-block-code"><code class="">workflow = chord(
        [noop.s(i) for i in range(5)],
        noop.s()  # callback, runs once the whole group has finished
)

# Equivalent:
workflow = chain([
        group([noop.s(i) for i in range(5)]),
        noop.s()
])</code></pre>
</div>
</div>
</div></div>



<p>An important point: the execution of a workflow will always stop in the event of a failed task. As a result, a chain won’t be continued if a task fails in the middle of it. This gives us quite a powerful framework for implementing some neat automation, and that’s exactly what we’re using for Ceph-as-a-Service at OVHcloud! We’ve implemented lots of small, flexible, parameterisable tasks, which we combine to reach a common goal. Here are some real-life examples of elementary tasks, used for the automatic removal of old hardware:</p>



<ul class="wp-block-list"><li>Change the weight of a Ceph node (used to increase or decrease the amount of data on the node; triggers a data rebalance)</li><li>Set service downtime (a data rebalance triggers monitoring probes, but this is expected, so set downtime for this particular monitoring entry)</li><li>Wait until Ceph is healthy (wait until the data rebalance is complete &#8211; repeating task)</li><li>Remove the Ceph node from the cluster (the node is empty, so it can simply be uninstalled)</li><li>Send info to technicians in the DC (the hardware is ready to be replaced)</li><li>Add a new Ceph node to the cluster (install a new, empty node)</li></ul>



<p>We parametrise these tasks and tie them together, using Celery chains, groups and chords to create the desired workflow. Celery then does the rest by asynchronously executing the workflow.</p>
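<p>To make the composition concrete, here is a minimal plain-Python model of a chain built from a few of the steps above. It is only a sketch: it does not use Celery, and every name in it (<code>run_chain</code>, <code>change_weight</code>, <code>wait_until_healthy</code>, <code>remove_node</code>, the node names) is a hypothetical stand-in rather than our production code. It illustrates two properties described above: tasks are parametrised up front, and a chain is not continued after a failed task.</p>

```python
from functools import partial


def change_weight(log, node, weight):
    # Stand-in for "change the weight of a Ceph node"; it only records the call.
    log.append(f"set weight of {node} to {weight}")


def wait_until_healthy(log, checks_needed):
    # Stand-in for the repeating "wait until Ceph is healthy" task:
    # pretend the cluster becomes healthy after a fixed number of checks.
    for attempt in range(1, checks_needed + 1):
        log.append(f"health check {attempt}")


def remove_node(log, node, fail=False):
    # Stand-in for "remove the Ceph node"; it can be told to fail.
    if fail:
        raise RuntimeError(f"could not remove {node}")
    log.append(f"removed {node}")


def run_chain(steps):
    # Run the steps one after another and stop at the first failure.
    for index, step in enumerate(steps):
        try:
            step()
        except Exception:
            return index  # position of the failed step
    return None  # every step succeeded


log = []
steps = [
    partial(change_weight, log, "node-1", 0),
    partial(wait_until_healthy, log, 2),
    partial(remove_node, log, "node-1", fail=True),
    partial(change_weight, log, "node-2", 1),  # never reached
]
failed_at = run_chain(steps)
print(failed_at, log)
```

<p>In this run the third step fails, so <code>failed_at</code> is 2 and the weight of the second node is never changed.</p>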



<h2 class="wp-block-heading">Big workflows and Celery</h2>



<p>As our infrastructure grows, so do our automated workflows, with more tasks per workflow and more complex workflows&#8230; What do we mean by a big workflow? A workflow consisting of 1,000-10,000 tasks. To visualise this, take a look at the following examples:</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow">
<h4 class="wp-block-heading">A few chords chained together (57 tasks in total)</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<div class="wp-block-image"><figure class="aligncenter"><img decoding="async" src="https://lh4.googleusercontent.com/XZWOfqmSMu68u7GcbvceB0mc8_HA_v8higDeoG08dlO5oTlRd9R98QBSlf4sMLPuiFB2RPVgM-6i7vG86jtAxMCrKSLTkt0nK4z5JSbYE4QkXF96qkXh3uSJYj1X82UUm-agBMxu" alt=""/></figure></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<pre class="wp-block-code"><code class="">workflow = chain([
    noop.s(0),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    noop.s()
])</code></pre>
</div>
</div>



<h4 class="wp-block-heading">More complex graph structure built from chains and groups (23 tasks in total)</h4>



<div class="wp-block-columns is-layout-flex wp-container-core-columns-is-layout-28f84493 wp-block-columns-is-layout-flex">
<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<div class="wp-block-image"><figure class="aligncenter"><img decoding="async" src="https://lh5.googleusercontent.com/gUQlIa5Nmb4a5oNDbojhBtukEn--6dSxlKrn-enggXk9eCtuBvgVBTxecwAczOMghEoZ0zOtKuz0nohZTsj01QqVBxkbX8bxqyVVvYjC6B1sfrpXN8pferDSgg-RE6TB6v5SOBdL" alt=""/></figure></div>
</div>



<div class="wp-block-column is-layout-flow wp-block-column-is-layout-flow">
<pre class="wp-block-code"><code class=""># | is the chain operator in Celery
workflow = (
    group(
        group(
            group([noop.s() for i in range(5)]),
            chain([noop.s() for i in range(5)])
        ) |
        noop.s() |
        group([noop.s() for i in range(5)]) |
        noop.s(),
        chain([noop.s() for i in range(5)])
    ) |
    noop.s()
)</code></pre>
</div>
</div>
</div></div>
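<p>The task totals in the two headings above can be checked with a few lines of plain Python. This is only a counting sketch under simple assumptions: nested lists stand in for chains, groups and chords (the kind of primitive does not matter for counting), and the names <code>TASK</code>, <code>chord_of</code> and <code>count_tasks</code> are made up for this illustration.</p>

```python
TASK = "task"


def count_tasks(workflow):
    # A leaf is a single task; any list is a chain, group or chord
    # made of sub-workflows, so we simply sum over its parts.
    if workflow == TASK:
        return 1
    return sum(count_tasks(part) for part in workflow)


def chord_of(size):
    # A chord: `size` parallel tasks followed by one callback task.
    return [[TASK] * size, TASK]


# First example: one task, five chords of ten tasks each, one final task.
chords_example = [TASK] + [chord_of(10) for _ in range(5)] + [TASK]

# Second example: an outer group with two branches, then a final task.
graph_example = [
    [
        [
            [[TASK] * 5, [TASK] * 5],  # group(group of 5, chain of 5)
            TASK,
            [TASK] * 5,
            TASK,
        ],
        [TASK] * 5,  # second branch: a chain of 5
    ],
    TASK,
]

print(count_tasks(chords_example), count_tasks(graph_example))  # prints: 57 23
```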



<p>As you can probably imagine, visualisations get quite big and messy when 1,000 tasks are involved! Celery is a powerful tool, and has lots of features that are well-suited for automation, but it still struggles when it comes to processing big, complex, long-running workflows. Orchestrating the execution of 10,000 tasks, with a variety of dependencies, is no trivial thing. There are several issues we encountered when our automation grew too big:</p>



<ul class="wp-block-list"><li>Memory issues during workflow building (client side)</li><li>Serialisation issues (client -&gt; Celery backend transfer)</li><li>Nondeterministic, broken execution of workflows</li><li>Memory issues in Celery workers (Celery backend)</li><li>Disappearing tasks</li><li>And more&#8230;</li></ul>



<p>Take a look at some GitHub tickets:</p>



<ul class="wp-block-list"><li><a href="https://github.com/celery/celery/issues/5000" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://github.com/celery/celery/issues/5000</a></li><li><a href="https://github.com/celery/celery/issues/5286" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://github.com/celery/celery/issues/5286</a></li><li><a href="https://github.com/celery/celery/issues/5327" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://github.com/celery/celery/issues/5327</a></li><li><a href="https://github.com/celery/celery/issues/3723" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://github.com/celery/celery/issues/3723</a></li></ul>



<p>Using Celery for our particular use case became difficult and unreliable. Celery’s native support for workflows doesn’t seem to be the right choice for handling 100/1,000/10,000 tasks. In its current state, it’s just not enough. So here we stand, in front of a solid, concrete wall… Either we somehow fix Celery, or we rewrite our automation using a different framework.</p>



<h2 class="wp-block-heading">Celery &#8211; to fix&#8230; or to fix?</h2>



<p>Rewriting all of our automation would be possible, although relatively painful. Since I’m a rather lazy person, perhaps attempting to fix Celery wasn’t an entirely bad idea? So I took some time to dig through Celery’s code, and managed to find the parts responsible for building workflows, and executing chains and chords. It was still a little bit difficult for me to understand all the different code paths handling the wide range of use cases, but I realised it would be possible to implement a clean, straightforward orchestration that would handle all the tasks and their combinations in the same way. What’s more, I had a hunch that it wouldn&#8217;t take too much effort to integrate it into our automation (let’s not forget the main goal!).</p>



<p>Unfortunately, introducing new orchestration into the Celery project would probably be quite hard, and would most likely break some backwards compatibility. So I decided to take a different approach &#8211; writing an extension or a plugin that wouldn’t require changes in Celery. Something pluggable, and as non-invasive as possible. That’s how Celery Dyrygent emerged&#8230;</p>



<h2 class="wp-block-heading">Celery Dyrygent</h2>



<p><a href="https://github.com/ovh/celery-dyrygent" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://github.com/ovh/celery-dyrygent</a></p>



<h3 class="wp-block-heading">How to represent a workflow</h3>



<p>You can think of a workflow as a directed acyclic graph (DAG), where each task is a separate graph node. When it comes to acyclic graphs, it is relatively easy to store and resolve dependencies between nodes, which leads to straightforward orchestration. Celery Dyrygent was implemented based on these features. Each task in the workflow has a unique identifier (Celery already assigns task IDs when a task is pushed for execution) and each one of them is wrapped into a workflow node. Each workflow node consists of a task signature (a plain Celery signature) and a list of IDs for the tasks it depends on. See the example below:</p>



<figure class="wp-block-image size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/F4601B45-EB13-4710-9325-B9684BF77918-1024x533.png" alt="" class="wp-image-17400" width="512" height="267" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/F4601B45-EB13-4710-9325-B9684BF77918-1024x533.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/F4601B45-EB13-4710-9325-B9684BF77918-300x156.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/F4601B45-EB13-4710-9325-B9684BF77918-768x400.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/F4601B45-EB13-4710-9325-B9684BF77918.png 1172w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure>
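<p>The same idea can be sketched in a few lines of plain Python. Treat this as an illustration only: <code>WorkflowNode</code> and <code>ready_nodes</code> are hypothetical names chosen for this example, not necessarily the classes used inside Celery Dyrygent, and the signatures are plain strings rather than real Celery signatures.</p>

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowNode:
    # Hypothetical workflow node: a task ID, a task signature (just a
    # string here) and the IDs of the tasks it depends on.
    task_id: str
    signature: str
    depends_on: list = field(default_factory=list)


# A small diamond-shaped DAG: A runs first, then B and C in parallel, then D.
nodes = {
    "A": WorkflowNode("A", "noop(0)"),
    "B": WorkflowNode("B", "noop(1)", depends_on=["A"]),
    "C": WorkflowNode("C", "noop(2)", depends_on=["A"]),
    "D": WorkflowNode("D", "noop()", depends_on=["B", "C"]),
}


def ready_nodes(nodes, finished):
    # A node is ready when it has not finished yet and all of the
    # tasks it depends on have.
    return sorted(
        node.task_id
        for node in nodes.values()
        if node.task_id not in finished
        and all(dep in finished for dep in node.depends_on)
    )


print(ready_nodes(nodes, finished=set()))       # ['A']
print(ready_nodes(nodes, finished={"A"}))       # ['B', 'C']
print(ready_nodes(nodes, finished={"A", "B"}))  # ['C']
```

<p>Because every node only stores the IDs it depends on, resolving what can run next is a simple lookup, whatever mix of chains, groups and chords produced the graph.</p>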



<h3 class="wp-block-heading">How to process a workflow</h3>



<p>So we know how to store a workflow in a clean and easy way. Now we just need to execute it. How about using&#8230; Celery? Why not? For this, Celery Dyrygent introduces a <strong>workflow processor</strong> task (an ordinary Celery task). This task wraps a whole workflow and schedules the execution of primitive tasks, according to their dependencies. Once the scheduling part is over, the task repeats itself (it &#8216;ticks&#8217; with some delay).</p>



<p>Throughout the whole processing cycle, the workflow processor retains the state of the entire workflow internally, and updates that state with each repetition. You can see an orchestration example below:</p>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/CE6EE688-92F2-4BA5-9A6B-147BD956A0F0-1024x553.png" alt="" class="wp-image-17416" width="512" height="277" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/CE6EE688-92F2-4BA5-9A6B-147BD956A0F0-1024x553.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/CE6EE688-92F2-4BA5-9A6B-147BD956A0F0-300x162.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/CE6EE688-92F2-4BA5-9A6B-147BD956A0F0-768x415.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/CE6EE688-92F2-4BA5-9A6B-147BD956A0F0.png 1470w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/7764C3D5-1EF9-44A9-A588-4C37A275570B-1024x553.png" alt="" class="wp-image-17417" width="512" height="277" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/7764C3D5-1EF9-44A9-A588-4C37A275570B-1024x553.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/7764C3D5-1EF9-44A9-A588-4C37A275570B-300x162.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/7764C3D5-1EF9-44A9-A588-4C37A275570B-768x415.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/7764C3D5-1EF9-44A9-A588-4C37A275570B.png 1470w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<div class="wp-block-image"><figure class="aligncenter size-large is-resized"><img loading="lazy" decoding="async" src="https://www.ovh.com/blog/wp-content/uploads/2020/03/F2E6717E-B355-46AB-AD73-6C98B6CE4B19-1024x553.png" alt="" class="wp-image-17418" width="512" height="277" srcset="https://blog.ovhcloud.com/wp-content/uploads/2020/03/F2E6717E-B355-46AB-AD73-6C98B6CE4B19-1024x553.png 1024w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/F2E6717E-B355-46AB-AD73-6C98B6CE4B19-300x162.png 300w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/F2E6717E-B355-46AB-AD73-6C98B6CE4B19-768x415.png 768w, https://blog.ovhcloud.com/wp-content/uploads/2020/03/F2E6717E-B355-46AB-AD73-6C98B6CE4B19.png 1470w" sizes="auto, (max-width: 512px) 100vw, 512px" /></figure></div>



<p>The workflow processor stops its execution in two cases:</p>



<ul class="wp-block-list"><li>Once the whole workflow finishes, with all tasks successfully completed</li><li>When it can’t proceed any further, due to a failed task</li></ul>
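<p>The ticking loop and its two stop conditions can be simulated in plain Python. This is a simplified sketch, not Celery Dyrygent’s actual implementation: <code>process_workflow</code> and its parameters are hypothetical, tasks “run” instantly between ticks, and the dependencies are assumed to form an acyclic graph. In the real thing, the processor is a Celery task that reschedules itself with a delay and reads task states from Celery.</p>

```python
def process_workflow(dependencies, failing=frozenset(), max_ticks=100):
    # `dependencies` maps each task ID to the IDs it depends on.
    # A task scheduled in one tick is seen as finished (or failed)
    # at the start of the next tick.
    finished, failed, running = set(), set(), set()
    for tick in range(1, max_ticks + 1):
        # 1. Collect the results of tasks scheduled in the previous tick.
        for task_id in running:
            (failed if task_id in failing else finished).add(task_id)
        running = set()
        # 2. Stop once every task has completed successfully.
        if len(finished) == len(dependencies):
            return "SUCCESS", tick
        # 3. Schedule every task whose dependencies have all finished.
        for task_id, deps in dependencies.items():
            if task_id in finished or task_id in failed:
                continue
            if all(dep in finished for dep in deps):
                running.add(task_id)
        # 4. Nothing could be scheduled: a failed task blocks the rest.
        if not running:
            return "FAILURE", tick
    return "TIMEOUT", max_ticks


diamond = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
print(process_workflow(diamond))                 # ('SUCCESS', 4)
print(process_workflow(diamond, failing={"B"}))  # ('FAILURE', 3)
```

<p>Note that in the failing run, task C still completes: the processor only gives up once nothing else can be scheduled.</p>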



<h3 class="wp-block-heading">How to integrate</h3>



<p>So how do we use this? Fortunately, I was able to find a way to use Celery Dyrygent quite easily. First of all, you need to inject the workflow processor task definition into your Celery application:</p>



<pre class="wp-block-code"><code class="">from celery import Celery
from celery_dyrygent.tasks import register_workflow_processor

app = Celery()  # your Celery application instance
workflow_processor = register_workflow_processor(app)</code></pre>



<p>Next, you need to convert your Celery-defined workflow into a Celery Dyrygent workflow:</p>



<pre class="wp-block-code"><code class="">from celery import chain, chord
from celery_dyrygent.workflows import Workflow

celery_workflow = chain([
    noop.s(0),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    chord([noop.s(i) for i in range(10)], noop.s()),
    noop.s()
])

workflow = Workflow()
workflow.add_celery_canvas(celery_workflow)</code></pre>



<p>Finally, simply execute the workflow, just as you would an ordinary Celery task:</p>



<pre class="wp-block-code"><code class="">workflow.apply_async()</code></pre>



<p>That’s it! You can always go back if you wish, as the small changes are very easy to undo.</p>



<h3 class="wp-block-heading">Give it a try!</h3>



<p>Celery Dyrygent is free to use, and its source code is available on GitHub (<a href="https://github.com/ovh/celery-dyrygent" data-wpel-link="external" target="_blank" rel="nofollow external noopener noreferrer">https://github.com/ovh/celery-dyrygent</a>). Feel free to use it, improve it, request features, and report any bugs! It has a few additional features not described here, so I&#8217;d encourage you to take a look at the project’s readme file. For our automation requirements, it&#8217;s already a solid, battle-tested solution. We’ve been using it since the end of 2018, and it has processed thousands of workflows, consisting of hundreds of thousands of tasks. Here are some production stats, from June 2019 to February 2020:</p>



<ul class="wp-block-list"><li>936,248 elementary tasks executed</li><li>11,170 workflows processed</li><li>4,098 tasks in the biggest workflow so far</li><li>~84 tasks per workflow, on average</li></ul>



<p>Automation is always a good idea!</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
