REANA-Commons


REANA-Commons is a component of the REANA reusable and reproducible research data analysis platform. It provides common utilities and schemas shared by the REANA cluster components.

Features

  • common API clients for internal communication

  • centralised OpenAPI specifications for REANA components

  • AMQP connection management and communication

  • utility functions for cluster components

Usage

Detailed information on how to install and use REANA can be found at docs.reana.io.

Configuration

REANA Commons configuration.

reana_commons.config.COMMAND_DANGEROUS_OPERATIONS = ['sudo ', 'cd /']

Operations in workflow commands considered dangerous.
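As an illustration, a command could be screened against this list with a plain substring check. This is a minimal sketch and not REANA's actual validation code; the helper name find_dangerous_operations is hypothetical, and the list is copied from the default value shown above.

```python
# Default value copied from the configuration entry above.
COMMAND_DANGEROUS_OPERATIONS = ["sudo ", "cd /"]


def find_dangerous_operations(command):
    """Return the dangerous operations found in ``command`` (hypothetical helper)."""
    return [op for op in COMMAND_DANGEROUS_OPERATIONS if op in command]


safe = find_dangerous_operations("python analysis.py --input data.csv")
risky = find_dangerous_operations("sudo rm -rf results && cd /tmp")
print(safe)   # []
print(risky)  # ['sudo ', 'cd /']
```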

reana_commons.config.DEFAULT_WORKSPACE_PATH = '/var/reana'

Default workspace path defined by the admin.

reana_commons.config.HTCONDOR_JOB_FLAVOURS = {'espresso': 1200, 'longlunch': 7200, 'microcentury': 3600, 'nextweek': 604800, 'testmatch': 259200, 'tomorrow': 86400, 'workday': 28800}

HTCondor job flavours and their respective runtime in seconds.
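The mapping can be read as "flavour name to maximum runtime in seconds". The sketch below, which assumes that one wants the shortest flavour covering a given runtime, shows one way to query it; smallest_flavour_for is a hypothetical helper and not part of reana_commons.

```python
from datetime import timedelta

# Default value copied from the configuration entry above.
HTCONDOR_JOB_FLAVOURS = {
    "espresso": 1200,
    "longlunch": 7200,
    "microcentury": 3600,
    "nextweek": 604800,
    "testmatch": 259200,
    "tomorrow": 86400,
    "workday": 28800,
}


def smallest_flavour_for(runtime_seconds):
    """Return the shortest flavour whose limit covers the runtime (hypothetical)."""
    candidates = [
        (limit, name)
        for name, limit in HTCONDOR_JOB_FLAVOURS.items()
        if limit >= runtime_seconds
    ]
    if not candidates:
        raise ValueError("No HTCondor flavour allows %d seconds" % runtime_seconds)
    return min(candidates)[1]


print(smallest_flavour_for(5000))                           # longlunch
print(timedelta(seconds=HTCONDOR_JOB_FLAVOURS["workday"]))  # 8:00:00
```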

reana_commons.config.INTERACTIVE_SESSION_TYPES = ['jupyter']

List of supported interactive systems.

reana_commons.config.K8S_CERN_EOS_AVAILABLE = None

Whether EOS is available in the current cluster or not.

This is a configuration set by the system administrators through Helm values at cluster creation time.

reana_commons.config.K8S_CERN_EOS_MOUNT_CONFIGURATION = {'volume': {'hostPath': {'path': '/var/eos'}, 'name': 'eos'}, 'volumeMounts': {'mountPath': '/eos', 'mountPropagation': 'HostToContainer', 'name': 'eos'}}

Configuration to mount EOS in Kubernetes objects.

For more information see the official documentation at https://clouddocs.web.cern.ch/containers/tutorials/eos.html.

reana_commons.config.K8S_USE_SECURITY_CONTEXT = True

Whether to use Kubernetes security contexts or not.

This (enabled by default) runs workflows as the WORKFLOW_RUNTIME_USER_UID and WORKFLOW_RUNTIME_USER_GID. It should be set to False for systems (like OpenShift) that assign ephemeral UIDs.

reana_commons.config.KRB5_CONFIGMAP_NAME = 'reana-krb5-conf'

Kerberos configMap name.

reana_commons.config.KRB5_CONTAINER_IMAGE = 'docker.io/reanahub/reana-auth-krb5:1.0.1'

Default docker image of KRB5 sidecar container.

reana_commons.config.KRB5_INIT_CONTAINER_NAME = 'krb5-init'

Name of KRB5 init container.

reana_commons.config.KRB5_RENEW_CONTAINER_NAME = 'krb5-renew'

Name of KRB5 sidecar container used for ticket renewal.

reana_commons.config.KRB5_STATUS_FILE_CHECK_INTERVAL = 15

Time interval in seconds between checks to the status file.

reana_commons.config.KRB5_STATUS_FILE_LOCATION = '/krb5_cache/status_file'

Status file path used to terminate KRB5 renew container when the main job finishes.

reana_commons.config.KRB5_TICKET_RENEW_INTERVAL = 21600

Time interval in seconds between renewals of the KRB5 ticket.

reana_commons.config.KRB5_TOKEN_CACHE_FILENAME = 'krb5_{}'

Name of the Kerberos token cache file.

reana_commons.config.KRB5_TOKEN_CACHE_LOCATION = '/krb5_cache/'

Directory of Kerberos tokens cache, shared between job/engine & KRB5 container. It should match default_ccache_name in krb5.conf.

reana_commons.config.KUBERNETES_MEMORY_FORMAT = '(?:(?P<value_bytes>\\d+)|(?P<value_unit>(\\d+[.])?\\d+)(?P<unit>[EPTGMK])(?P<binary>i?))$'

Kubernetes valid memory format regular expression e.g. Ki, M, Gi, G, etc.

reana_commons.config.KUBERNETES_MEMORY_UNITS = ['E', 'P', 'T', 'G', 'M', 'K']

Kubernetes valid memory units.
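To show how the regular expression and the unit list fit together, here is a minimal sketch that converts a memory string into bytes. The function memory_to_bytes is hypothetical and is not claimed to match REANA's own conversion code; it assumes that units with the i suffix are powers of 1024 and plain units are powers of 1000, as in Kubernetes resource quantities.

```python
import re

# Values copied from the two configuration entries above.
KUBERNETES_MEMORY_FORMAT = (
    r"(?:(?P<value_bytes>\d+)|(?P<value_unit>(\d+[.])?\d+)"
    r"(?P<unit>[EPTGMK])(?P<binary>i?))$"
)
KUBERNETES_MEMORY_UNITS = ["E", "P", "T", "G", "M", "K"]


def memory_to_bytes(value):
    """Convert a Kubernetes memory string to bytes (hypothetical helper)."""
    match = re.match(KUBERNETES_MEMORY_FORMAT, value)
    if match is None:
        raise ValueError("Invalid Kubernetes memory value: %r" % value)
    if match.group("value_bytes") is not None:
        # Plain number without a unit: already expressed in bytes.
        return int(match.group("value_bytes"))
    base = 1024 if match.group("binary") else 1000
    # "K" is last in the list and gets exponent 1, "M" gets 2, and so on.
    unit_index = KUBERNETES_MEMORY_UNITS.index(match.group("unit"))
    exponent = len(KUBERNETES_MEMORY_UNITS) - unit_index
    return int(float(match.group("value_unit")) * base ** exponent)


print(memory_to_bytes("1024"))   # 1024
print(memory_to_bytes("256Mi"))  # 268435456
print(memory_to_bytes("1.5G"))   # 1500000000
```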

reana_commons.config.MQ_CONNECTION_STRING = 'amqp://test:1234@reana-message-broker.default.svc.cluster.local//'

Message queue (RabbitMQ) connection string.
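The default connection string combines the user, password and host settings documented in this section. A small sketch using only the standard library shows how its parts can be inspected; the trailing // is kept as-is, and in kombu-style AMQP URLs it usually denotes the default virtual host /.

```python
from urllib.parse import urlsplit

# Default value copied from the configuration entry above.
MQ_CONNECTION_STRING = (
    "amqp://test:1234@reana-message-broker.default.svc.cluster.local//"
)

parts = urlsplit(MQ_CONNECTION_STRING)
print(parts.scheme)    # amqp
print(parts.username)  # test
print(parts.password)  # 1234
print(parts.hostname)  # reana-message-broker.default.svc.cluster.local
print(parts.port)      # None (no explicit port in the default string)
```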

reana_commons.config.MQ_DEFAULT_EXCHANGE = ''

Message queue (RabbitMQ) exchange.

reana_commons.config.MQ_DEFAULT_FORMAT = 'json'

Default serializing format (to consume/produce).

reana_commons.config.MQ_DEFAULT_QUEUES = {'jobs-status': {'durable': False, 'exchange': '', 'routing_key': 'jobs-status'}, 'workflow-submission': {'durable': True, 'exchange': '', 'max_priority': 100, 'routing_key': 'workflow-submission'}}

Default message queues.

reana_commons.config.MQ_HOST = 'reana-message-broker.default.svc.cluster.local'

Message queue (RabbitMQ) server host name.

reana_commons.config.MQ_MAX_PRIORITY = 100

Declare the queue as a priority queue and set the highest priority number.

reana_commons.config.MQ_PASS = '1234'

Message queue (RabbitMQ) password.

reana_commons.config.MQ_PORT = 5672

Message queue (RabbitMQ) service port.

reana_commons.config.MQ_PRODUCER_MAX_RETRIES = 3

Max retries to send a message.

reana_commons.config.MQ_USER = 'test'

Message queue (RabbitMQ) user name.

reana_commons.config.OPENAPI_SPECS = {'reana-job-controller': ('http://0.0.0.0:5000', 'reana_job_controller.json'), 'reana-server': ('http://0.0.0.0:80', 'reana_server.json'), 'reana-workflow-controller': ('http://reana-workflow-controller.default.svc.cluster.local:80', 'reana_workflow_controller.json')}

Mapping of REANA component names to their service address and OpenAPI specification file name.

class reana_commons.config.REANAConfig

REANA global configuration class.

classmethod load(kind)

REANA-UI configuration.

reana_commons.config.REANA_COMPONENT_NAMING_SCHEME = '{prefix}-{component_type}-{id}'

The naming scheme the components created by REANA should follow.

It is a Python format string which takes the following arguments:

  • prefix: the REANA_COMPONENT_PREFIX

  • component_type: one of REANA_COMPONENT_TYPES

  • id: unique identifier for the component, by default UUID4.

reana_commons.config.REANA_COMPONENT_PREFIX = 'reana'

REANA component naming prefix, e.g. my-prefix-job-controller.

Useful to find the correct fully qualified name of an infrastructure component and to correctly create new runtime pods.

reana_commons.config.REANA_COMPONENT_PREFIX_ENVIRONMENT = 'REANA'

Environment variable friendly REANA component prefix.

reana_commons.config.REANA_COMPONENT_TYPES = ['run-batch', 'run-session', 'run-job', 'secretsstore']

Type of REANA components.

Note: this list is used to validate the names of REANA components created on demand, which is why it doesn’t contain REANA infrastructure components.

  • run-batch: An instance of reana-workflow-engine-_

  • run-session: An instance of an interactive session

  • run-job: An instance of a workflow’s job

  • secretsstore: An instance of a user secret store
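The naming scheme, the prefix and the component types combine as in the following sketch. It is an illustration under the default values listed above; example_component_name is a hypothetical stand-in and not the library's own reana_commons.utils.build_unique_component_name.

```python
import uuid

# Default values copied from the configuration entries above.
REANA_COMPONENT_NAMING_SCHEME = "{prefix}-{component_type}-{id}"
REANA_COMPONENT_PREFIX = "reana"
REANA_COMPONENT_TYPES = ["run-batch", "run-session", "run-job", "secretsstore"]


def example_component_name(component_type, id=None):
    """Format a component name from the naming scheme (hypothetical helper)."""
    if component_type not in REANA_COMPONENT_TYPES:
        raise ValueError("Unknown component type: %r" % component_type)
    return REANA_COMPONENT_NAMING_SCHEME.format(
        prefix=REANA_COMPONENT_PREFIX,
        component_type=component_type,
        id=id or str(uuid.uuid4()),  # fall back to a fresh UUID4
    )


name = example_component_name("run-job", id="123456")
print(name)  # reana-run-job-123456
```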

reana_commons.config.REANA_COMPUTE_BACKENDS = {'htcondor': 'HTCondor', 'kubernetes': 'Kubernetes', 'slurm': 'Slurm'}

REANA supported compute backends.

reana_commons.config.REANA_CVMFS_PVC = {'metadata': {'name': 'reana-cvmfs', 'namespace': 'default'}, 'spec': {'accessModes': ['ReadOnlyMany'], 'resources': {'requests': {'storage': 1}}, 'storageClassName': 'reana-cvmfs'}}

PersistentVolumeClaim used to mount CVMFS repositories.

reana_commons.config.REANA_CVMFS_PVC_NAME = 'reana-cvmfs'

Name of the PersistentVolumeClaim used to mount CVMFS repositories.

reana_commons.config.REANA_CVMFS_STORAGE_CLASS_NAME = 'reana-cvmfs'

Name of the StorageClass used to mount CVMFS repositories.

reana_commons.config.REANA_DEFAULT_SNAKEMAKE_ENV_IMAGE = 'docker.io/snakemake/snakemake:v7.32.4'

Snakemake default job environment image.

reana_commons.config.REANA_INFRASTRUCTURE_COMPONENTS = ['ui', 'server', 'workflow-controller', 'cache', 'message-broker', 'db']

REANA infrastructure pods.

reana_commons.config.REANA_INFRASTRUCTURE_COMPONENTS_HOSTNAMES = {'cache': 'reana-cache.default.svc.cluster.local', 'db': 'reana-db.default.svc.cluster.local', 'message-broker': 'reana-message-broker.default.svc.cluster.local', 'server': 'reana-server.default.svc.cluster.local', 'ui': 'reana-ui.default.svc.cluster.local', 'workflow-controller': 'reana-workflow-controller.default.svc.cluster.local'}

REANA infrastructure pods hostnames.

Uses the FQDN of the infrastructure components (which should be behind a Kubernetes service) following the Kubernetes DNS-Based Service Discovery.

reana_commons.config.REANA_INFRASTRUCTURE_KUBERNETES_NAMESPACE = 'default'

Kubernetes namespace in which REANA infrastructure is currently deployed.

reana_commons.config.REANA_INFRASTRUCTURE_KUBERNETES_SERVICEACCOUNT_NAME = None

REANA infrastructure service account.

reana_commons.config.REANA_JOB_CONTROLLER_CONNECTION_CHECK_SLEEP = 10.0

How many seconds to wait between job controller connection checks.

reana_commons.config.REANA_JOB_HOSTPATH_MOUNTS = []

List of dictionaries composed of name, hostPath and mountPath.

  • name: name of the mount.

  • hostPath: path in the Kubernetes cluster host nodes that will be mounted into job pods.

  • mountPath: path inside job pods where hostPath will get mounted. This is optional; by default, the same path as the hostPath is used.

This configuration should be used only when one knows for sure that the specified locations exist in all the cluster nodes. For example, if all nodes in your cluster have a directory /usr/local/share/mydata, and you pass the following configuration:

REANA_JOB_HOSTPATH_MOUNTS = [
    {"name": "mydata",
     "hostPath": "/usr/local/share/mydata",
     "mountPath": "/mydata"},
]

All jobs will have /mydata mounted with the content of /usr/local/share/mydata from the Kubernetes cluster host node.
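To make the optional mountPath concrete, the sketch below turns such a list into Kubernetes-style volumes and volumeMounts dictionaries. It only illustrates the defaulting rule described above and is not REANA's actual implementation; to_kubernetes_volumes and the scratch entry are hypothetical.

```python
REANA_JOB_HOSTPATH_MOUNTS = [
    {"name": "mydata",
     "hostPath": "/usr/local/share/mydata",
     "mountPath": "/mydata"},
    # Hypothetical entry without mountPath: it falls back to hostPath.
    {"name": "scratch",
     "hostPath": "/scratch"},
]


def to_kubernetes_volumes(mounts):
    """Build (volumes, volume_mounts) lists from hostPath mount entries."""
    volumes, volume_mounts = [], []
    for mount in mounts:
        volumes.append(
            {"name": mount["name"], "hostPath": {"path": mount["hostPath"]}}
        )
        volume_mounts.append(
            {"name": mount["name"],
             "mountPath": mount.get("mountPath", mount["hostPath"])}
        )
    return volumes, volume_mounts


volumes, volume_mounts = to_kubernetes_volumes(REANA_JOB_HOSTPATH_MOUNTS)
print(volume_mounts[0]["mountPath"])  # /mydata
print(volume_mounts[1]["mountPath"])  # /scratch
```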

reana_commons.config.REANA_LOG_FORMAT = '%(asctime)s | %(name)s | %(threadName)s | %(levelname)s | %(message)s'

REANA components log format.

reana_commons.config.REANA_LOG_LEVEL = 20

Log verbosity level for REANA components.
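The numeric level 20 corresponds to INFO in the Python logging module. The following sketch applies the documented format and level to a standalone logger; the logger name reana-example and the in-memory stream are assumptions made for the illustration.

```python
import io
import logging

# Default values copied from the configuration entries above.
REANA_LOG_FORMAT = (
    "%(asctime)s | %(name)s | %(threadName)s | %(levelname)s | %(message)s"
)
REANA_LOG_LEVEL = 20

stream = io.StringIO()  # capture the output instead of writing to stderr
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(REANA_LOG_FORMAT))

logger = logging.getLogger("reana-example")  # hypothetical logger name
logger.setLevel(REANA_LOG_LEVEL)
logger.addHandler(handler)
logger.propagate = False

logger.debug("hidden")   # below level 20, so it is dropped
logger.info("visible")   # level 20, so it is emitted

log_output = stream.getvalue()
print(logging.getLevelName(REANA_LOG_LEVEL))  # INFO
print(log_output)
```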

reana_commons.config.REANA_MAX_CONCURRENT_BATCH_WORKFLOWS = 30

Upper limit on concurrent REANA batch workflows running in the cluster.

reana_commons.config.REANA_RESOURCE_HEALTH_COLORS = {'critical': 'red', 'healthy': 'green', 'warning': 'yellow'}

REANA mapping between resource health statuses and click-compatible colors.

reana_commons.config.REANA_RUNTIME_BATCH_KUBERNETES_NODE_LABEL = {}

Kubernetes label (with format label_name=label_value) which identifies the nodes where the runtime batch workflows should run.

If not set, the runtime pods run in any available node in the cluster.

reana_commons.config.REANA_RUNTIME_JOBS_KUBERNETES_NODE_LABEL = {}

Kubernetes label (with format label_name=label_value) which identifies the nodes where the runtime jobs should run.

If not set, the runtime pods run in any available node in the cluster.

reana_commons.config.REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES = []

Keep alive Kubernetes user runtime jobs depending on status.

Keep alive both batch workflow jobs and individual step jobs after termination when their statuses match one of the specified comma-separated values (possible values are: finished, failed). By default all jobs are cleaned up.

Example: REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES="finished,failed" would keep jobs that terminated successfully and jobs that failed.
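A comma-separated value like the one in the example can be turned into the list form shown in the default as sketched below. The helper parse_keep_alive_statuses is hypothetical, and rejecting unknown statuses is an assumption rather than documented behaviour.

```python
# Possible values according to the description above.
ALLOWED_STATUSES = {"finished", "failed"}


def parse_keep_alive_statuses(raw_value):
    """Split a comma-separated status string into a list (hypothetical helper)."""
    statuses = [item.strip() for item in raw_value.split(",") if item.strip()]
    unknown = sorted(set(statuses) - ALLOWED_STATUSES)
    if unknown:
        # Assumption: unknown statuses are treated as a configuration error.
        raise ValueError("Unsupported job statuses: %s" % ", ".join(unknown))
    return statuses


print(parse_keep_alive_statuses("finished,failed"))  # ['finished', 'failed']
print(parse_keep_alive_statuses(""))                 # []
```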

reana_commons.config.REANA_RUNTIME_KUBERNETES_NAMESPACE = 'default'

Kubernetes namespace in which REANA runtime pods should be running.

By default runtime pods will run in the same namespace as the infrastructure pods.

reana_commons.config.REANA_RUNTIME_KUBERNETES_SERVICEACCOUNT_NAME = None

REANA runtime service account.

If no runtime namespace is deployed it will default to the infrastructure service account.

reana_commons.config.REANA_RUNTIME_SESSIONS_KUBERNETES_NODE_LABEL = {}

Kubernetes label (with format label_name=label_value) which identifies the nodes where the runtime sessions should run.

If not set, the runtime sessions run in the same nodes as runtime jobs if REANA_RUNTIME_JOBS_KUBERNETES_NODE_LABEL is set, otherwise, they will be allocated in any available node in the cluster.

reana_commons.config.REANA_SHARED_PVC_NAME = 'reana-shared-persistent-volume'

Name of the shared CEPHFS PVC which will be used by all REANA jobs.

reana_commons.config.REANA_STORAGE_BACKEND = 'local'

Storage backend deployed in current REANA cluster ['local'|'cephfs'].

reana_commons.config.REANA_USER_SECRET_MOUNT_PATH = '/etc/reana/secrets'

Default path where user secrets are mounted in job pods and workflow engines.

reana_commons.config.REANA_WORKFLOW_ENGINES = ['yadage', 'cwl', 'serial', 'snakemake']

Available workflow engines.

reana_commons.config.REANA_WORKFLOW_NAME_ILLEGAL_CHARACTERS = ['.']

List of illegal characters for workflow name validation.

reana_commons.config.REANA_WORKFLOW_UMASK = 2

Umask used for workflow workspace.
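The value 2 is the octal umask 002, which removes write permission for "others" while keeping it for the owner and the group. The arithmetic can be checked with a short sketch; mode_after_umask is a hypothetical helper that mimics how the operating system applies a umask to a requested mode.

```python
REANA_WORKFLOW_UMASK = 2  # same value as 0o002


def mode_after_umask(requested_mode, umask=REANA_WORKFLOW_UMASK):
    """Return the permission bits left after applying the umask."""
    return requested_mode & ~umask


print(oct(mode_after_umask(0o666)))  # 0o664 (typical new file)
print(oct(mode_after_umask(0o777)))  # 0o775 (typical new directory)
```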

reana_commons.config.SHARED_VOLUME_PATH = '/var/reana'

Default shared volume path.

reana_commons.config.WORKFLOW_RUNTIME_GROUP_NAME = 'root'

Default OS group name for running job controller.

reana_commons.config.WORKFLOW_RUNTIME_USER_GID = 0

Default group id for running job controller/workflow engine apps & jobs.

If the group id is changed to a value different than zero, then also the WORKFLOW_RUNTIME_GROUP_NAME needs to be changed to a value different than root.

reana_commons.config.WORKFLOW_RUNTIME_USER_NAME = 'reana'

Default OS user name for running job controller.

reana_commons.config.WORKFLOW_RUNTIME_USER_UID = 1000

Default user id for running job controller/workflow engine apps & jobs.

reana_commons.config.WORKFLOW_TIME_FORMAT = '%Y-%m-%dT%H:%M:%S'

Time format for workflow starting time, created time etc.
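The format string follows the standard strftime directives, so it can be used both for rendering and for parsing timestamps. A short round trip with an arbitrary example date:

```python
from datetime import datetime

# Default value copied from the configuration entry above.
WORKFLOW_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"

started = datetime(2024, 3, 1, 14, 5, 9)  # arbitrary example timestamp
formatted = started.strftime(WORKFLOW_TIME_FORMAT)
parsed = datetime.strptime(formatted, WORKFLOW_TIME_FORMAT)

print(formatted)          # 2024-03-01T14:05:09
print(parsed == started)  # True
```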

reana_commons.config.WORKSPACE_PATHS = {}

Dictionary of available workspace paths with pairs of cluster_node_path:cluster_pod_mountpath.

reana_commons.config.default_workspace()

Obtain default workspace path.

reana_commons.config.kubernetes_node_label_to_dict(node_label)

Load Kubernetes node label to Python dict.
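The node label settings above use the label_name=label_value format and default to an empty dictionary. A conversion consistent with that description might look like the sketch below; node_label_to_dict is a hypothetical reimplementation written for illustration, and the label used in the example is made up.

```python
def node_label_to_dict(node_label):
    """Convert "label_name=label_value" to a dict (hypothetical sketch)."""
    if not node_label:
        # Unset or empty label: no node restriction.
        return {}
    label_name, label_value = node_label.split("=", 1)
    return {label_name: label_value}


print(node_label_to_dict("reana.io/node=runtime"))  # {'reana.io/node': 'runtime'}
print(node_label_to_dict(""))                       # {}
```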

reana_commons.config.reana_yaml_schema_file_path = '/home/docs/checkouts/readthedocs.org/user_builds/reana-commons/checkouts/latest/reana_commons/validation/schemas/reana_analysis_schema.json'

REANA specification schema location.

reana_commons.config.workspaces(paths)

Transform a list of mounted workspaces given as strings into a dictionary of cluster_node_path:cluster_pod_mountpath pairs.
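The transformation can be pictured with the following sketch, which assumes that every entry has the form cluster_node_path:cluster_pod_mountpath. The function workspaces_to_dict and the example paths are hypothetical and only illustrate the described mapping.

```python
def workspaces_to_dict(paths):
    """Map "node_path:pod_mountpath" strings to a dict (hypothetical sketch)."""
    # Split on the first colon only; entries without a colon raise ValueError.
    return dict(entry.split(":", 1) for entry in paths)


mounted = workspaces_to_dict(["/var/reana:/var/reana", "/mnt/data:/data"])
print(mounted)  # {'/var/reana': '/var/reana', '/mnt/data': '/data'}
```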

API

REANA API client

REANA REST API base client.

class reana_commons.api_client.BaseAPIClient(service, http_client=None)

REANA API client code.

class reana_commons.api_client.JobControllerAPIClient(service, http_client=None)

REANA-Job-Controller http client class.

check_if_cached(job_spec, step, workflow_workspace)

Check if job result is in cache.

check_status(job_id)

Check status of a job.

get_logs(job_id)

Get logs of a job.

submit(workflow_uuid='', image='', cmd='', prettified_cmd='', workflow_workspace='', job_name='', cvmfs_mounts='false', compute_backend=None, kerberos=False, kubernetes_uid=None, kubernetes_memory_limit=None, unpacked_img=False, voms_proxy=False, rucio=False, htcondor_max_runtime='', htcondor_accounting_group='', slurm_partition='', slurm_time='', kubernetes_job_timeout: int | None = None)

Submit a job to RJC API.

Parameters:
  • workflow_uuid – UUID of the workflow.

  • job_name – Name of the job.

  • image – Identifier of the Docker image which will run the job.

  • cmd – String which represents the command to execute. It can be modified by the workflow engine, e.g. by prepending cd /some/dir/.

  • prettified_cmd – Original command submitted by the user.

  • workflow_workspace – Path to the workspace of the workflow.

  • cvmfs_mounts – String with CVMFS volumes to mount in job pods.

  • compute_backend – Job compute backend.

  • kerberos – Decides if kerberos should be provided for job container.

  • voms_proxy – Decides if grid proxy should be provided for job container.

  • rucio – Decides if a rucio environment should be provided for job.

  • kubernetes_uid – Overwrites the default user id in the job container.

  • kubernetes_memory_limit – Overwrites the default memory limit in the job container.

  • unpacked_img – Decides if unpacked images should be used.

  • htcondor_max_runtime – Maximum runtime of a HTCondor job.

  • htcondor_accounting_group – Accounting group of a HTCondor job.

  • slurm_partition – Partition of a Slurm job.

  • slurm_time – Maximum timelimit of a Slurm job.

  • kubernetes_job_timeout – Timeout for the job in seconds.

Returns:

Returns a dict with the job_id.

reana_commons.api_client.get_current_api_client(component)

Proxy which returns current API client for a given component.

REANA Kubernetes API client

Kubernetes API Client.

reana_commons.k8s.api_client.create_api_client(api: str = 'BatchV1')

Create Kubernetes API client using config.

Parameters:

api – String which represents which Kubernetes API to spawn. By default BatchV1.

Returns:

Kubernetes Python client object for a specific API, e.g. BatchV1.

REANA Kubernetes volumes.

reana_commons.k8s.volumes.create_cvmfs_persistent_volume_claim()

Create CVMFS persistent volume claim.

reana_commons.k8s.volumes.get_k8s_cvmfs_volumes(cvmfs_repositories)

Get volume mounts and volumes needed to mount CVMFS in pods.

Parameters:

cvmfs_repositories – List of CVMFS repositories to be mounted.

reana_commons.k8s.volumes.get_reana_shared_volume()

Return REANA shared volume as k8s spec.

Depending on the configured storage backend REANA will use just a local volume in the host VM or a persistent volume claim which provides access to a network file system.

Returns:

k8s shared volume spec as a dictionary.

reana_commons.k8s.volumes.get_shared_volume(workflow_workspace)

Get shared CephFS/hostPath volume to a given job spec.

Parameters:

workflow_workspace – Absolute path to the job’s workflow workspace.

Returns:

Tuple consisting of the Kubernetes volumeMount and the volume.

reana_commons.k8s.volumes.get_workspace_volume(workflow_workspace)

Get shared CephFS/hostPath workspace volume to a given job spec.

Parameters:

workflow_workspace – Absolute path to the job’s workflow workspace.

Returns:

Tuple consisting of the Kubernetes volumeMount and the volume.

REANA AMQP Publisher

REANA-Commons module to manage AMQP connections on REANA.

class reana_commons.publisher.BasePublisher(queue, routing_key, connection=None, exchange=None, durable=False, max_priority=None)

Base publisher to MQ.

close()

Close connection.

class reana_commons.publisher.WorkflowStatusPublisher(**kwargs)

Progress publisher to MQ.

publish_workflow_status(workflow_uuid, status, logs='', message=None)

Publish workflow status using the configured.

Parameters:
  • workflow_uuid – String which represents the workflow UUID.

  • status – Integer which represents the status of the workflow, this is defined in the reana-db Workflow models.

  • logs – String which represents the logs which the workflow has produced as output.

  • message – Dictionary with additional information to attach, such as the overall progress of the workflow.

class reana_commons.publisher.WorkflowSubmissionPublisher(**kwargs)

Workflow submission publisher.

publish_workflow_submission(user_id, workflow_id_or_name, parameters, priority=0, min_job_memory=0, retry_count: int | None = None)

Publish workflow submission parameters.

REANA AMQP Consumer

REANA-Commons module to manage AMQP consuming on REANA.

class reana_commons.consumer.BaseConsumer(queue=None, connection=None, message_default_format=None)

Base RabbitMQ consumer.

get_consumers(Consumer, channel)

Map consumers to specific queues.

Parameters:
  • Consumer – A kombu.Consumer class to use for instantiating consumers.

  • channel – A kombu.transport.virtual.AbstractChannel.

on_message(body, message)

Implement this method to manipulate the data received.

Parameters:
  • body – The received message already decoded in the specified format.

  • message – A kombu.transport.virtual.Message.

REANA Serial workflow utilities

REANA Workflow Engine Serial implementation utils.

reana_commons.serial.check_htcondor_max_runtime(specification)

Check if the field htcondor_max_runtime has a valid input.

Parameters:

specification – REANA specification of the workflow.

reana_commons.serial.serial_load(workflow_file, specification, parameters=None, original=None, **kwargs)

Validate and return an expanded REANA Serial workflow specification.

Parameters:

workflow_file – A specification file compliant with REANA Serial workflow specification.

Returns:

A dictionary which represents the valid Serial workflow with all parameters expanded.

REANA utilities

REANA-Commons utils.

reana_commons.utils.build_caching_info_message(job_spec, job_id, workflow_workspace, workflow_json, result_path)

Build the caching info message with correct formatting.

reana_commons.utils.build_progress_message(total=None, running=None, finished=None, failed=None, cached=None)

Build the progress message with correct formatting.

reana_commons.utils.build_unique_component_name(component_type, id=None)

Use REANA component type and id to build a human-readable component name.

Parameters:
  • component_type – One of reana_commons.config.REANA_COMPONENT_TYPES.

  • id – Unique identifier, if not specified a new UUID4 is created.

Returns:

String representing the component name, e.g. reana-run-job-123456.

reana_commons.utils.calculate_file_access_time(workflow_workspace)

Calculate access times of files in workspace.

reana_commons.utils.calculate_hash_of_dir(directory, file_list=None)

Calculate hash of directory.

reana_commons.utils.calculate_job_input_hash(job_spec, workflow_json)

Calculate md5 hash of job specification and workflow json.
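The idea of hashing a job specification together with the workflow JSON can be sketched with the standard library as follows. The serialisation REANA actually uses is not documented here, so the JSON layout and the helper name job_input_hash are assumptions made for this illustration.

```python
import hashlib
import json


def job_input_hash(job_spec, workflow_json):
    """Return an md5 hex digest of both inputs (hypothetical sketch)."""
    # Assumption: sort keys so that equal dictionaries hash identically.
    payload = json.dumps(
        {"job_spec": job_spec, "workflow_json": workflow_json},
        sort_keys=True,
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


first = job_input_hash({"cmd": "echo hi", "image": "busybox"}, {"steps": 1})
second = job_input_hash({"image": "busybox", "cmd": "echo hi"}, {"steps": 1})
print(first == second)  # True: key order does not matter
print(len(first))       # 32
```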

reana_commons.utils.check_connection_to_job_controller(port=5000)

Check connection from workflow engine to job controller.

reana_commons.utils.click_table_printer(headers, _filter, data, colours=None)

Generate space separated output for click commands.

reana_commons.utils.copy_openapi_specs(output_path, component)

Copy generated and validated openapi specs to reana-commons module.

reana_commons.utils.format_cmd(cmd)

Return command in a valid format.

reana_commons.utils.get_disk_usage(directory, summarize=False, search=None, to_human_readable_units=None)

Retrieve directory disk usage information.

Parameters:
  • directory – Disk usage directory.

  • summarize – Displays a total size of a directory.

  • search – Filter parameters to show only files that match certain filtering.

  • to_human_readable_units – Callback to transform bytes to human readable units.

Returns:

List of dicts with file name and size.

reana_commons.utils.get_disk_usage_info_paths(absolute_path, command, name_filter)

Retrieve the path for disk usage information.

Parameters:
  • absolute_path – System path to reana filesystem.

  • command – Command to get the disk usage from reana filesystem.

  • name_filter – Name filter parameters if any.

Returns:

List of disk usage info containing the file path and size.

reana_commons.utils.get_files_recursive_wildcard(directory_path, path)

Get file(s) fitting the wildcard from the workspace.

Parameters:
  • directory_path – Directory to get files from.

  • path – Wildcard pattern to use for the extraction.

Returns:

List of paths sorted by length.

reana_commons.utils.get_quota_resource_usage(resource: Dict, human_readable_or_raw: str) → Tuple[str, str | None]

Return quota resource usage and health.

Parameters:
  • resource – Dict representing quota resource obtained from get_quota_usage()

  • human_readable_or_raw – One of (“human_readable”, “raw”)

Returns:

Tuple containing quota resource usage string and resource health, e.g. (“1 MiB out of 10 MiB used (10%)”, “healthy”).

reana_commons.utils.get_usage_percentage(usage: int, limit: int) → str

Usage percentage.
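For orientation, a percentage string such as the 10% in "1 MiB out of 10 MiB used (10%)" can be derived as in the sketch below. The rounding to whole percent and the rejection of non-positive limits are assumptions; usage_percentage is a hypothetical stand-in for the documented function.

```python
def usage_percentage(usage, limit):
    """Return usage as a whole-number percentage string (hypothetical sketch)."""
    if limit <= 0:
        # Assumption: a percentage is only meaningful for a positive limit.
        raise ValueError("limit must be positive")
    return "%d%%" % round(usage * 100 / limit)


one_mib = 1024 ** 2
print(usage_percentage(one_mib, 10 * one_mib))  # 10%
print(usage_percentage(0, 10 * one_mib))        # 0%
```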

reana_commons.utils.get_workflow_status_change_verb(status: str) → str

Give the correct verb conjugation depending on status tense.

Parameters:

status – String which represents the status the workflow changed to.

reana_commons.utils.is_directory(directory_path, path)

Whether the given path matches a directory or not.

Parameters:
  • directory_path – Directory to check files from.

  • path – Optional wildcard pattern to use for the check.

Returns:

Full path if it is a directory, False if not.

reana_commons.utils.remove_upper_level_references(path)

Remove references to directories above ./.

Collapse separators/up-level references avoiding references to paths outside working directory.

Parameters:

path – User provided path to a file or directory.

Returns:

Returns the corresponding sanitized path.
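One common way to achieve this kind of sanitisation is to normalise the path as if it were rooted at / and then drop the leading slash, so that no .. component can escape the working directory. The sketch below shows that technique; strip_upper_level_references is a hypothetical name and the code is not claimed to be identical to the library function.

```python
import posixpath


def strip_upper_level_references(path):
    """Collapse separators and ".." so the path stays inside the working dir."""
    # Anchoring at "/" makes normpath discard any ".." that would go above it.
    return posixpath.normpath("/" + path).lstrip("/")


print(strip_upper_level_references("../../etc/passwd"))           # etc/passwd
print(strip_upper_level_references("data/../results/./plot.png"))  # results/plot.png
print(strip_upper_level_references("./a//b"))                      # a/b
```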

reana_commons.utils.run_command(cmd, display=True, return_output=False, stderr_output=False)

Run given command on shell in the current directory.

Exit in case of troubles.

Parameters:
  • cmd (str) – shell command to run

  • display (bool) – should we display command to run?

  • return_output (bool) – shall the output of the command be returned?

REANA errors

REANA Commons errors.

exception reana_commons.errors.MissingAPIClientConfiguration

REANA Server URL is not set.

exception reana_commons.errors.REANAConfigDoesNotExist(message)

Validation error.

exception reana_commons.errors.REANAEmailNotificationError(message)

Email notification error.

exception reana_commons.errors.REANAJobControllerSubmissionError(message)

REANA Job submission exception.

exception reana_commons.errors.REANAKubernetesMemoryLimitExceeded(message)

Kubernetes memory value exceeds max limit.

exception reana_commons.errors.REANAKubernetesWrongMemoryFormat(message)

Kubernetes memory value has wrong format.

exception reana_commons.errors.REANAMissingWorkspaceError(message)

Missing workspace error.

exception reana_commons.errors.REANAQuotaExceededError(message='User quota exceeded.')

Quota exceeded error.

exception reana_commons.errors.REANASecretAlreadyExists

The referenced secret already exists.

exception reana_commons.errors.REANASecretDoesNotExist(missing_secrets_list=None)

The referenced REANA secret does not exist.

exception reana_commons.errors.REANAValidationError(message)

Validation error.

exception reana_commons.errors.REANAWorkspaceError

Error accessing and managing a workspace.

Changelog

0.9.8 (2024-03-01)

Build

  • python: change extra names to comply with PEP 685 (#446) (9dad6da)

  • python: require smart-open<7 for Python 3.6 (#446) (17fd581)

  • python: restore snakemake reports extra (#446) (904178f)

Continuous integration

  • commitlint: allow release commit style (#447) (1208ccf)

0.9.7 (2024-02-20)

Build

Documentation

  • authors: complete list of contributors (#442) (4a74c10)

0.9.6 (2024-02-13)

Features

  • config: allow customisation of runtime group name (#440) (5cec305)

  • snakemake: upgrade to Snakemake 7.32.4 (#435) (20ae9ce)

Bug fixes

  • cache: handle deleted files when calculating access times (#437) (698900f)

Code refactoring

Continuous integration

  • commitlint: addition of commit message linter (#432) (a67906f)

  • commitlint: check for the presence of concrete PR number (#438) (d3035dc)

  • release-please: initial configuration (#432) (687f2f4)

  • shellcheck: check all shell scripts recursively (#436) (709a685)

  • shellcheck: fix exit code propagation (#438) (85d9a2a)

0.9.5 (2023-12-15)

  • Fixes installation by pinning bravado-core to versions lower than 6.1.1.

0.9.4 (2023-11-30)

  • Changes the REANA specification schema to use the draft-07 version of the JSON Schema specification.

  • Changes validation of REANA specification to expose functions for loading workflow input parameters and workflow specifications.

  • Changes validation of REANA specification to make the environment property mandatory for the steps of serial workflows.

  • Changes validation of REANA specification to raise a warning for unexpected properties for the steps of serial workflows.

  • Changes CVMFS support to allow users to automatically mount any available repository.

  • Fixes the mounting of CVMFS volumes for the REANA deployments that use non-default Kubernetes namespace.

0.9.3 (2023-09-26)

  • Adds support for Python 3.12.

  • Adds the OpenAPI specification support for prune_workspace endpoint that allows deleting files that are neither inputs nor outputs from the workspace.

  • Adds support for tests.files in reana.yaml allowing to specify Gherkin feature files for testing runnable examples.

  • Changes the OpenAPI specification to include the run_stopped_at property in the workflow progress information returned by the workflow list and workflow status endpoints.

  • Changes the OpenAPI specification to include the maximum_interactive_session_inactivity_period value to the info endpoint.

  • Changes the email sending utility to allow configuring authentication and encryption options.

  • Changes validation of REANA specification to emit warnings about unknown properties.

  • Fixes the verbs used to describe changes to the status of a workflow in order to avoid incorrect grammatical phrases such as workflow has been failed.

  • Fixes the loading of Snakemake and CWL workflow specifications when no parameters are specified.

  • Fixes the OpenAPI specification of GitLab OAuth endpoint return statuses.

  • Fixes container image names to be Podman-compatible.

  • Fixes the email sending utility to not send emails when notifications are disabled globally.

0.9.2.1 (2023-07-19)

  • Changes PyYAML dependency version bounds in order to fix installation on Python 3.10+.

0.9.2 (2023-02-10)

  • Fixes wcmatch dependency version specification.

0.9.1 (2023-01-18)

  • Changes Kerberos renew container’s configuration to log each ticket renewal.

0.9.0 (2022-12-13)

  • Adds support for Python 3.11.

  • Adds support for Rucio.

  • Adds REANA specification validation and loading logic from reana-client.

  • Adds common utility functions for managing workspace files.

  • Adds OpenAPI specification support for launch endpoint that allows running workflows from remote sources.

  • Adds OpenAPI specification support for get_workflow_retention_rules endpoint that allows retrieving the workspace file retention rules of a workflow.

  • Adds generation of Kerberos init and renew container’s configuration.

  • Adds support for Unicode characters inside email body.

  • Changes OpenAPI specification to include missing response schema elements and some other small enhancements.

  • Changes the Kubernetes Python client to use the networking/v1 API.

  • Changes REANA specification loading functionality to allow specifying different working directories.

  • Changes REANA specification to allow enabling Kerberos for the whole workflow.

  • Changes REANA specification to allow specifying retention_days for the workflow.

  • Changes REANA specification to allow specifying slurm_partition and slurm_time for Slurm compute backend jobs.

  • Changes the loading of Snakemake specifications to preserve the current working directory.

  • Fixes the submission of jobs by stripping potential leading and trailing whitespaces in Docker image names.

0.8.5 (2022-02-23)

  • Adds retry_count parameter to WorkflowSubmissionPublisher.

0.8.4 (2022-02-08)

  • Adds new configuration variable to toggle Kubernetes security context. (K8S_USE_SECURITY_CONTEXT)

  • Changes installation to revert Yadage dependency versions.

0.8.3 (2022-02-04)

  • Changes installation to remove upper version pin on kombu.

0.8.2 (2022-02-01)

  • Adds support for Python 3.10.

  • Adds workflow name validation utility.

  • Changes Snakemake loaded specification to include compute backends.

  • Changes OpenAPI specification to return supported compute backends in the info endpoint.

  • Fixes file system usage calculation on CephFS shares in get_disk_usage utility function.

0.8.1 (2021-12-21)

  • Adds OpenAPI specification support for kubernetes_job_timeout handling.

  • Changes OpenAPI specification for cluster health status endpoint.

  • Changes Yadage dependencies to allow 0.21.x patchlevel-version updates.

  • Changes installation to require Python-3.6 or higher versions.

0.8.0 (2021-11-22)

  • Adds get_disk_usage utility function to calculate disk usage for a directory.

  • Adds Yadage workflow specification loading utilities.

  • Adds workspace validation utilities.

  • Adds Snakemake workflow engine integration.

  • Adds custom objects API instance to k8s client.

  • Adds available workflow engines configuration.

  • Adds environment variable to define time between job controller connection checks.

  • Adds cluster health status endpoint.

  • Adds OpenAPI specifications with respect to user quotas.

  • Changes workflow-submission queue to be a priority queue and allows setting the priority number on workflow submission.

  • Changes OpenAPI specifications with respect to turning workspaces endpoint into info.

  • Changes publisher logging level on error callback.

  • Removes support for Python 2.

0.7.5 (2021-07-02)

  • Adds support for glob patterns when listing workflow files.

  • Adds support for specifying kubernetes_memory_limit for Kubernetes compute backend jobs.
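
As a rough illustration of the glob-pattern support mentioned in this release, the sketch below filters a list of workspace-relative paths with the Python standard library. The helper name filter_files and the sample paths are assumptions made for the example; the actual REANA implementation may differ.

```python
from fnmatch import fnmatchcase


def filter_files(paths, pattern):
    """Return the paths that match a shell-style glob pattern.

    Hypothetical helper for illustration only; not the REANA-Commons code.
    """
    return [path for path in paths if fnmatchcase(path, pattern)]


files = ["results/plot.png", "results/data.csv", "code/fit.py"]
print(filter_files(files, "results/*.png"))  # prints ['results/plot.png']
```

fnmatchcase is used instead of fnmatch so that matching stays case-sensitive on every platform.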

0.7.4 (2021-03-17)

  • Adds new functions to serialise/deserialise job commands between REANA components.

  • Changes reana_ready function location to REANA-Server.

0.7.3 (2021-02-22)

  • Adds new configuration variable to toggle runtime user jobs clean up depending on their statuses. (REANA_RUNTIME_KUBERNETES_KEEP_ALIVE_JOBS_WITH_STATUSES)

  • Adds central class to instantiate workflow engines with more resilience. (workflow_engine.create_workflow_engine_command)

0.7.2 (2021-02-02)

  • Adds support for Python 3.9.

  • Fixes minor code warnings.

  • Fixes a helper function that calculates directory hashes.

  • Changes OpenAPI specifications with respect to sign-up form.

  • Changes OpenAPI specifications with respect to email confirmation.

  • Changes CI system to include Python flake8 checker.

0.7.1 (2020-11-09)

  • Adds support for restarting Yadage workflows (through accept_metadir operational option).

  • Allows htcondor_max_runtime and htcondor_accounting_group to be specified for HTCondor jobs.

  • Adds new field in REANA-Server OpenAPI spec to return server version.

  • Changes CI system from Travis to GitHub Actions.

0.7.0 (2020-10-20)

  • Adds new utility to send emails.

  • Adds centralised validation utility for workflow operational options.

  • Adds new configuration variable to set the maximum number of running workflows. (REANA_MAX_CONCURRENT_BATCH_WORKFLOWS)

  • Adds new configuration variable to set prefix of REANA cluster component names. (REANA_COMPONENT_PREFIX)

  • Adds new configuration variable for the runtime pod node selector label. (REANA_RUNTIME_KUBERNETES_NODE_LABEL)

  • Adds new configuration variable to define the Kubernetes namespace in which REANA infrastructure components run. (REANA_INFRASTRUCTURE_KUBERNETES_NAMESPACE)

  • Adds new configuration variable to define the Kubernetes namespace in which REANA runtime components run. (REANA_RUNTIME_KUBERNETES_NAMESPACE)

  • Adds possibility to specify unpacked container images for running jobs.

  • Adds support for initfiles operational option for the Yadage workflow engine.

  • Fixes memory leak in Bravado client instantiation.

  • Changes CephFS Persistent Volume Claim name. (REANA_SHARED_PVC_NAME)

  • Changes default logging level to INFO.

  • Changes default CVMFS volume list to include LHCb Gaudi related workflows.

  • Changes code formatting to respect black coding style.

  • Changes underlying requirements to use Kubernetes Python library version 11.

  • Changes underlying requirements to use latest CVMFS CSI driver version.

  • Changes documentation to single-page layout.

0.6.1 (2020-05-25)

  • Upgrades Kubernetes Python client.

0.6.0 (2019-12-19)

  • Adds new API for GitLab integration.

  • Adds new Kubernetes client API for ingresses.

  • Adds new APIs for management of user secrets.

  • Adds EOS storage Kubernetes configuration.

  • Adds HTCondor and Slurm compute backends.

  • Adds support for streaming file uploads.

  • Allows unpacked CVMFS and CMS open data volumes.

  • Adds Serial workflow step name and compute backend.

  • Adds support for Python 3.8.

0.5.0 (2019-04-16)

  • Centralises log level and log format configuration.

  • Adds new utility to inspect the disk usage on a given workspace. (get_workspace_disk_usage)

  • Introduces the module to share Celery tasks across REANA components. (tasks.py)

  • Introduces common Celery task to determine whether REANA can execute new workflows depending on a set of conditions such as running job count. (reana_ready, check_predefined_conditions, check_running_job_count)

  • Allows the AMQP consumer to be configurable with multiple queues.

  • Introduces new queue for workflow submission. (workflow-submission)

  • Introduces new publisher for workflow submissions. (WorkflowSubmissionPublisher)

  • Centralises Kubernetes API client configuration and initialisation.

  • Adds Kubernetes-specific configuration for CVMFS volumes as utils.

  • Introduces a new method, copy_openapi_specs, to automatically move validated OpenAPI specifications from components to REANA Commons openapi_specifications directory.

  • Centralises interactive session types.

  • Introduces central REANA errors through the errors.py module.

  • Skips SSL verification for all HTTPS requests performed with the BaseAPIClient.

0.4.0 (2018-11-06)

  • Aggregates OpenAPI specifications of REANA components.

  • Improves AMQP re-connection handling. Switches from pika to kombu.

  • Enhances test suite and increases code coverage.

  • Changes license to MIT.

0.3.1 (2018-09-04)

  • Adds parameter expansion and validation utilities for parametrised Serial workflows.

0.3.0 (2018-08-10)

  • Initial public release.

  • Provides basic AMQP pub/sub methods for REANA components.

  • Utilities for caching used in different REANA components.

  • Click formatting helpers.

Contributing

Bug reports, issues, feature requests, and other contributions are welcome. If you find a demonstrable problem that is caused by the REANA code, please:

  1. Search for already reported problems.

  2. Check if the issue has been fixed or is still reproducible on the latest master branch.

  3. Create an issue, ideally with a test case.

If you create a pull request fixing a bug or implementing a feature, you can run the tests to ensure that everything is operating correctly:

$ ./run-tests.sh

Each pull request should preserve or increase code coverage.

License

MIT License

Copyright (C) 2018, 2019, 2020, 2021, 2022, 2023, 2024 CERN.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

In applying this license, CERN does not waive the privileges and immunities granted to it by virtue of its status as an Intergovernmental Organization or submit itself to any jurisdiction.

Authors

The list of contributors in alphabetical order: