Road Forward Towards Deployment As An EBRAINS CWL Workflow
Introduction
Describing Cobrawap as a CWL workflow is a crucial step towards deploying it on the EBRAINS workflow system. In this article, we discuss the current state of the Cobrawap workflow, the issues identified so far, and the solutions proposed to enable the deployment of Cobrawap as an EBRAINS CWL workflow.
Current State of Cobrawap
Cobrawap is a Snakemake-based workflow that relies on a Cobrawap YAML config file to define input parameters and a list of blocks to run for each stage. The current development branch allows for an alternative invocation of the cobrawap run command based on CWL (Common Workflow Language). This command creates a tool-CWL YAML file for each block based on the Python command-line arguments and a workflow-CWL YAML file that chains the individual blocks as indicated in the Cobrawap YAML file.
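To make the idea of deriving a tool-CWL description from a block's command-line arguments concrete, here is a minimal sketch. It is not Cobrawap's actual generator: the function name parser_to_tool_cwl, the script name my_block.py, and the example options are hypothetical, and the outputs section is left empty because it cannot be derived from the argument parser alone.

```python
import argparse

# Mapping from Python argument types to CWL primitive types.
CWL_TYPES = {str: "string", int: "int", float: "float", bool: "boolean"}


def parser_to_tool_cwl(parser, script):
    """Build a minimal CWL CommandLineTool description (as a dict) from an argparse parser."""
    inputs = {}
    # Note: _actions is a private argparse attribute, used here only for brevity.
    for action in parser._actions:
        if not action.option_strings or action.dest == "help":
            continue  # skip positional arguments and the automatic --help option
        cwl_type = CWL_TYPES.get(action.type, "string")
        if not action.required:
            cwl_type += "?"  # a trailing "?" marks an optional CWL input
        inputs[action.dest] = {
            "type": cwl_type,
            "inputBinding": {"prefix": action.option_strings[-1]},
        }
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        "baseCommand": ["python", script],
        "inputs": inputs,
        "outputs": {},  # assumption: outputs would be filled in separately
    }


# Hypothetical block parser, for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument("--data", type=str, required=True)
parser.add_argument("--threshold", type=float, default=0.5)

tool = parser_to_tool_cwl(parser, "my_block.py")
```

The resulting dictionary could be written out with any YAML library to obtain the tool-CWL file for the block.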
Identified Issues
Inconsistent Parameter Names
The mapping of Cobrawap YAML config parameters to the command-line options of individual blocks is not standardized. As a result, the cobrawap run command has to perform complex parsing and mapping of parameters when it constructs the workflow-CWL YAML file. Additionally, the workflow-CWL file cannot be used as a config file, since it contains parameters that control the execution flow (e.g., inputs and outputs) and that should be hidden from the user.
Solution: Refactor the Cobrawap config file to use a hierarchical system of stage/block/parameter name that reflects the parameters of each individual block.
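As a sketch of what such a hierarchy could look like, the config below nests parameters as stage/block/parameter name, so that the options of a block can be read off directly without a custom mapping. The stage, block, and parameter names and the helper block_cli_args are illustrative assumptions, not the actual Cobrawap schema.

```python
def block_cli_args(config, stage, block):
    """Turn the parameters stored under config[stage][block] into command-line options."""
    args = []
    for name, value in config[stage][block].items():
        args += [f"--{name}", str(value)]
    return args


# Hypothetical hierarchical config: stage -> block -> parameter name.
config = {
    "stage02_processing": {
        "detrending": {"order": 1},
        "frequency_filter": {"highpass_frequency": 0.1, "lowpass_frequency": 5},
    }
}

detrending_args = block_cli_args(config, "stage02_processing", "detrending")
filter_args = block_cli_args(config, "stage02_processing", "frequency_filter")
```

Because each parameter name matches the corresponding command-line option of its block, the same lookup would serve both the Snakemake and the CWL code paths.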
Manual Construction of Workflow Order
Currently, two separate pieces of custom code determine the block order: one in the Snakemake ruleset and one in the cobrawap run code. This duplication may lead to inconsistencies, and it is prone to configuration errors that are not reported to the user.
Solution: Identify a common description of block interdependencies that is used by both the Snakemake ruleset and the CWL creator in cobrawap run. This requires further discussion on how "rigid" or "flexible" the composition of allowed workflows should be.
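One possible shape for such a common description is a single dependency table from which the execution order is derived, so that neither the Snakemake ruleset nor the CWL creator has to encode it separately. The sketch below uses Python's standard graphlib module and implements a deliberately "rigid" variant that rejects any selection with a missing dependency; the block names and the table itself are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical table: each block maps to the blocks whose output it consumes.
BLOCK_DEPENDENCIES = {
    "background_subtraction": set(),
    "detrending": {"background_subtraction"},
    "frequency_filter": {"detrending"},
}


def execution_order(selected_blocks, dependencies):
    """Return the selected blocks in a valid execution order, or raise ValueError."""
    selected = set(selected_blocks)
    unknown = selected - set(dependencies)
    if unknown:
        raise ValueError(f"unknown blocks: {sorted(unknown)}")
    for block in selected_blocks:
        missing = dependencies[block] - selected
        if missing:
            # "Rigid" policy: report the problem instead of silently reordering.
            raise ValueError(f"{block} requires {sorted(missing)}")
    graph = {block: dependencies[block] for block in selected_blocks}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(["detrending", "background_subtraction"], BLOCK_DEPENDENCIES)
```

A more "flexible" policy could instead allow optional blocks to be skipped; which behavior is wanted is exactly the open question noted above.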
Enabling Execution of Cobrawap Blocks on the EBRAINS Workflow System
To enable the execution of Cobrawap blocks on the EBRAINS workflow system, we propose the following steps:
- Outsource tool-CWL file generation: The generation of tool-CWL files is moved to either a separate script or a separate cobrawap generate-cwl command.
- Create a GitHub action: A GitHub action generates the CWL files for each block after each successful pull request to master/main. If possible, these CWL files are added to the repository. From then on, cobrawap run --workflow-engine=cwl uses these precompiled tool-CWL files by default instead of building them on the fly.
- Release CWL files and Python scripts: When a new version is released, a separate GitHub action pushes the released CWL files and the Python scripts belonging to the blocks to a separate repository for the EBRAINS workflow components (i.e., Cobrawap blocks). Starting from this repository, T4.3 develops a way to curate these CWL files further and make them available to the EBRAINS workflow system.
- Submit workflow-CWL file for execution: cobrawap run gets a new parameter, --workflow-engine=ebrains, which builds the workflow-CWL file and submits it for execution to the EBRAINS workflow service. This step needs to check that the Cobrawap versions match before the workflow-CWL description is executed using the tool-CWL files registered with the service. Input filenames in the config file must be adapted accordingly.
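The version check in the last step can be sketched as follows. How the versions of the registered tool-CWL files are obtained from the EBRAINS workflow service is not specified here, so the sketch simply assumes they are available as a mapping from block name to Cobrawap version; the function name and the example data are hypothetical.

```python
def ensure_versions_match(local_version, registered_tools):
    """Raise RuntimeError if any registered tool-CWL file was built from another Cobrawap version.

    registered_tools is assumed to map block names to the Cobrawap version
    their registered tool-CWL file was released with.
    """
    mismatched = sorted(
        name for name, version in registered_tools.items() if version != local_version
    )
    if mismatched:
        raise RuntimeError(
            f"Cobrawap {local_version} does not match registered blocks: {mismatched}"
        )


# Hypothetical data, for illustration only.
registered = {"detrending": "1.0.0", "frequency_filter": "0.9.0"}
try:
    ensure_versions_match("1.0.0", registered)
    error_message = None
except RuntimeError as err:
    error_message = str(err)
```

Failing before submission keeps a version mismatch from surfacing only as an obscure error on the remote service.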
Potential Foreseeable Problem
Inclusion of custom workflow blocks is difficult, e.g., for data entry. Therefore, the data entry stage may need to be performed on the local machine.
Conclusion
The deployment of Cobrawap as an EBRAINS CWL workflow is a crucial step towards enabling the execution of Cobrawap blocks on the EBRAINS workflow system. By addressing the identified issues and implementing the proposed solutions, we can ensure a seamless deployment of Cobrawap on the EBRAINS workflow system.
Future Work
Further discussion is required on how "rigid" or "flexible" the composition of allowed workflows should be. Additionally, the development of a way to curate CWL files further and make them available to the EBRAINS workflow system is necessary.
References
- [1] EBRAINS 2.0 Event in Heidelberg, March 2025.
- [2] Cobrawap documentation.
- [3] CWL documentation.
Introduction
In our previous article, we discussed the current state of the Cobrawap workflow, the identified issues, and the proposed solutions to enable the deployment of Cobrawap as an EBRAINS CWL workflow. In this article, we will address some of the frequently asked questions (FAQs) related to the deployment of Cobrawap as an EBRAINS CWL workflow.
Q: What is the current state of the Cobrawap workflow?
A: The Cobrawap workflow is currently a Snakemake-based workflow that relies on a Cobrawap YAML config file to define input parameters and a list of blocks to run for each stage. The current development branch allows for an alternative invocation of the cobrawap run command based on CWL (Common Workflow Language).
Q: What are the identified issues with the current workflow?
A: The identified issues include inconsistent parameter names, manual construction of workflow order, and the need for a more standardized approach to workflow execution.
Q: How will the deployment of Cobrawap as an EBRAINS CWL workflow address these issues?
A: The proposed solutions include refactoring the Cobrawap config file to use a hierarchical system of stage/block/parameter name, identifying a common description of block interdependencies, and outsourcing tool-CWL file generation to a separate script or command.
Q: What is the role of the GitHub action in the deployment process?
A: The GitHub action will generate the CWL files for each block after each successful pull request to master/main. If possible, these CWL files will be added to the repository, and cobrawap run --workflow-engine=cwl will use these precompiled tool-CWL files by default instead of building them on the fly.
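The "precompiled by default, on the fly otherwise" behavior can be sketched as a simple lookup with a fallback. The directory layout (one file named after the block with a .cwl suffix) and the function names are assumptions for illustration, not the actual Cobrawap implementation.

```python
import tempfile
from pathlib import Path


def resolve_tool_cwl(block, precompiled_dir, build_on_the_fly):
    """Return the precompiled tool-CWL file for a block, or build one if none exists."""
    candidate = Path(precompiled_dir) / f"{block}.cwl"
    if candidate.is_file():
        return candidate
    return build_on_the_fly(block)


# Demonstration with a temporary directory standing in for the repository.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "detrending.cwl").write_text("cwlVersion: v1.2\n")
    fallback_calls = []

    def fake_builder(block):
        # Stand-in for on-the-fly generation; only records that it was called.
        fallback_calls.append(block)
        return Path(tmp) / "generated" / f"{block}.cwl"

    found_name = resolve_tool_cwl("detrending", tmp, fake_builder).name
    built_parent = resolve_tool_cwl("normalization", tmp, fake_builder).parent.name
```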
Q: How will the deployment of Cobrawap as an EBRAINS CWL workflow enable the execution of Cobrawap blocks on the EBRAINS workflow system?
A: Cobrawap blocks will be executed on the EBRAINS workflow system by submitting the workflow-CWL file to the EBRAINS workflow service. Before execution, the Cobrawap versions are checked for a match; the workflow-CWL description is then executed using the tool-CWL files registered with the service.
Q: What are the potential foreseeable problems with the deployment of Cobrawap as an EBRAINS CWL workflow?
A: The main foreseeable problem is that custom workflow blocks, e.g., for data entry, are difficult to include. The data entry stage may therefore need to be performed on the local machine.
Q: What is the next step in the deployment process?
A: The next step in the deployment process is to develop a way to curate CWL files further and make them available to the EBRAINS workflow system.
Q: Who is responsible for the development of the CWL files and Python scripts for the EBRAINS workflow components?
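A: The CWL files are generated by a GitHub action in the Cobrawap repository, and on each release a separate GitHub action pushes them, together with the Python scripts belonging to the blocks, to a separate repository for the EBRAINS workflow components. T4.3 is responsible for developing a way to curate these CWL files further and make them available to the EBRAINS workflow system.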
Q: What is the expected outcome of the deployment of Cobrawap as an EBRAINS CWL workflow?
A: The expected outcome of the deployment of Cobrawap as an EBRAINS CWL workflow is the seamless execution of Cobrawap blocks on the EBRAINS workflow system.