This object property is used to annotate a step in the workflow, a group of steps, a subworkflow, or a workflow with a data motif.

This object property is used to annotate a step in the workflow, a group of steps, a subworkflow, or a workflow with a workflow motif.

Certain activities such as analysis or visualizations could be performed through interaction with stateful (web) services that allow for creation of jobs over remote grid environments. These are typically performed via invocation of multiple operations at a service endpoint.

This motif is used to characterize the workflows that perform an atomic unit of functionality, which effectively requires no sub-workflow usage. Typically these workflows are designed to be included in other workflows. Atomic workflows are the main mechanism of modularizing functionality within scientific workflows.

The usage of sub-workflows appears as a motif for exploiting modular functionality from multiple workflows. This motif refers to all those workflows that have one or more sub-workflows included in them (in some cases, these sub-workflows offer different views of the global workflow).

This motif is used to refer to activities that do not require human inputs during their execution.

This motif refers to a rather broad category of tasks in diverse domains. A significant number of workflows are designed with the purpose of analyzing different features of input data, ranging from simple comparisons between datasets to complex protein analysis to see if two molecules can be docked successfully.

We have observed the steps for cleaning and curating data as a separate category from data preparation and filtering. Typically these steps are undertaken by sophisticated tooling/services, or by human interactions. A cleaning step essentially preserves and enriches the content of data (e.g., by a user’s annotation of a result with additional information, detecting and removing inconsistencies in the data, etc.).
A data motif describes the data manipulation and/or transformation carried out by a step in the workflow, a collection of steps in the workflow, or a subworkflow.

Certain analysis activities that are performed via external tools or services require the submission of data to a location accessible by the service/tool (i.e., a web or a local directory respectively). In such cases the workflow contains dedicated step(s) for the upload/transfer of data to these locations. The same applies to the outputs, in which case a data download/retrieval step is used to chain the data to the next steps of the workflow.

The datasets brought into a pipeline may not be ... Data could further be filtered, sampled, or could be subject to extraction of various subsets. In addition to filtering, certain tasks are dedicated to merging data sets created by different branches of workflows. Sorting or grouping results is also observed under this category.

Data, as it is originally retrieved, may need several transformations before it can be used in a workflow step. The steps in the workflow performing such transformations can be annotated using the Data Preparation motif.

Workflows exploit heterogeneous data sources, remote databases, repositories, or other web resources, mostly exposed via SOAP or REST services. Scientific data deposited in these repositories are retrieved through query and retrieval steps inside workflows. Certain tasks within the workflow are responsible for retrieving data from such external sources into the workflow environment.

Being able to show the results is as important as producing them in some workflows. Scientists use visualizations to show the conclusions of their experiments and to take important decisions in the pipeline itself. Therefore certain steps in workflows are dedicated to the generation of plots and graph outputs from input data.

Heterogeneity of formats in data representation is a known issue in many scientific disciplines.
Workflows that bring together multiple access or analysis activities usually contain steps for format transformations. Typically, these steps, also known as “Shims”, preserve the contents of data while converting its representation format.

This motif is used to characterize the activities that require human inputs during their execution.

Data access and analysis steps that are handled by external services or tools typically require well-formed query strings or structured requests as input parameters. Certain tasks in workflows are dedicated to the generation of these queries through an aggregation of multiple parameters.

Motif observed among workflows.

This motif refers to those groups of steps in the workflow that correspond to repetitive patterns of combining tasks.

Motif observed within the workflow.

Outputs of data access or analysis steps could be subject to data extraction or splitting to allow the conversion of data from the service-specific format to the workflow's internal data-carrying structures (nested lists).

Unlike Asynchronous Invocation, Synchronous Invocation requires a single step for service or tool invocation.

A workflow motif describes how a data motif is realized (i.e., implemented) within a workflow.

This motif is used to characterize workflows that are used to operate over different input parameter types. An example is performing an analysis over a String input parameter, or performing it over the contents of a specified File. Overloading is a direct response to the heterogeneity of environments in which workflows are used.
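
The String-versus-File example of overloading can be illustrated with a minimal Python sketch. It is not taken from the ontology or from any workflow system: the function name `analyze` and the toy word-count "analysis" are hypothetical, and `functools.singledispatch` is only one possible way to select an implementation by input type.

```python
from functools import singledispatch
from pathlib import Path


@singledispatch
def analyze(source):
    """Hypothetical analysis entry point; rejects unsupported input types."""
    raise TypeError(f"unsupported input type: {type(source).__name__}")


@analyze.register
def _(source: str) -> int:
    # String input: analyze the value directly (here, a toy word count).
    return len(source.split())


@analyze.register
def _(source: Path) -> int:
    # File input: read the file contents, then reuse the string analysis.
    return analyze(source.read_text(encoding="utf-8"))
```

Calling `analyze("a b c")` runs the string variant, while `analyze(Path("input.txt"))` reads the named file first and then applies the same analysis to its contents.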