
Storage considerations among variables, temporary storage, cloud caching, and Cloud Datastore in Jitterbit Integration Studio

Introduction

Integration Studio provides multiple approaches for handling data in your integrations. These range from passing simple values between operations to storing and retrieving data. Each approach is designed for specific use cases and performance requirements. This page outlines the recommended approaches and is not a comprehensive list of all approaches.

Variables for data passing

Variables are designed for passing values, configuration settings, and small amounts of data between different components in your integration. Variables are useful when you need to share information like session IDs, configuration parameters, or calculated values across scripts, transformations, and operations. These variable types are available for use:

Type | Scope | Best for
Local | Single script | Calculations and temporary values
Global | Operation chain | Passing data between operations
Project | Entire project | Configuration and credentials
Jitterbit | System-defined | Runtime information

For examples and detailed information about each variable type, see their individual documentation pages.
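
As a quick illustration, here is a hedged sketch of how these variable types might appear in a single script (the global and project variable names are hypothetical):

  // Local variable: visible only within this script
  rowCount = 100;

  // Global variable: prefixed with $, available for the rest of the operation chain
  $session.id = Guid();

  // Project variable: defined at the project level and referenced like a global variable
  apiUrl = $org_api_url;

  // Jitterbit variable: system-defined runtime information
  WriteToOperationLog("Running operation: " + $jitterbit.operation.name);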

Storage connectors for data persistence

Storage connectors handle the storing and retrieving of files and persistent data within your integrations. These storage connectors are useful when you're working with actual data files, need to temporarily store processing results, or require data that persists beyond a single operation chain:

Connector | Persistence | Size limits | Best for
Variable | Operation chain | 50 MB | Small files and testing
Temporary Storage | Operation chain | 50 GB (cloud) | Large files and processing
Cloud Datastore (Beta) | Varies by storage type | Varies by purchased tier (see Limits) | Lookup tables, cross-operation data, and status-based workflows

Variable

Variable endpoints (Read and Write activities, which read from or write to global variables) are easy to code and reduce complexity. However, they have certain limitations.

We recommend using a Variable endpoint for scenarios where an integration works with small datasets, such as web service requests and responses, or small files with a few hundred records.

When the dataset reaches the megabyte range, the Variable endpoint becomes slower, and performance degradation starts to occur once the data exceeds 4 MB in size.

When the dataset is in the larger multi-megabyte range, there is a risk of data truncation. We recommend limiting your use of Variable endpoints to 50 MB to be conservative and prevent any risk of truncation.

Using Variable endpoints in asynchronous operations requires special consideration. A dataset used in a Variable endpoint within an asynchronous operation is limited to 7 KB; exceeding that limit can result in truncation. See the RunOperation function for a description of calling an operation asynchronously.
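
For example, a hedged sketch of staging a small payload in a global variable before triggering another operation (the operation name is hypothetical):

  // Keep the payload well under the 7 KB limit when the downstream operation is run asynchronously
  $request.payload = "id,status\n1001,pending";
  RunOperation("<TAG>operation:Process Pending Records</TAG>");  // see the RunOperation documentation for the asynchronous form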

Variable endpoints enable reuse and reduce complexity

Using a Variable endpoint for small datasets can enable reuse and reduce complexity. For example, when building operation chains, each operation can have activities that function as sources (Read activities) and targets (Write activities). Instead of building individual source or target combinations for each operation, you can use a common Variable target and source.

To increase reusability and standardization, you can build a reusable script that logs the content of the variable. This approach can also be accomplished using temporary storage, but additional scripting is needed to initialize the path and filename.
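
For instance, a minimal sketch of such a reusable logging script, assuming the shared global variable is named $memory (the same name used in the test-data example below):

  // Reusable logging script: writes the current content of the shared variable to the operation log
  WriteToOperationLog("Variable content: " + $memory);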

When using a Variable endpoint, its scope is the operation chain. Variable endpoint values are unique to a particular operation chain and are cleared when the operation chain completes. This is not the case with a Temporary Storage endpoint (described in the following section).

When performing operation unit testing, using a Variable endpoint is useful for loading test data. You can add a script at the beginning of the operation chain to write test data:

  $memory = "a,b,c";

In contrast, writing data to a Temporary Storage endpoint looks like this:

  WriteFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_write/Write</TAG>", "a,b,c");

  FlushFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_write/Write</TAG>");

Likewise, reading data is simpler with a Variable endpoint:

  myLocalVar = $memory;

In contrast, this is how you read data from a Temporary Storage endpoint:

  myLocalVar = ReadFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_read/Read</TAG>");

Temporary Storage

Temporary Storage endpoints are frequently used in operations on both cloud and private agents. These are distinct from Local Storage endpoints, which can only be used on private agents.

When using a Temporary Storage endpoint, temporary files are written to and read from the operating system's default temp directory on the agent that is performing the work:

  • In the case of a single private agent, the temporary storage directory is that private agent server's default temp directory.
  • If you are using more than one private agent in a private agent group, the temporary storage directory is the default temp directory of the specific private agent server doing the work.
  • As cloud agents are clustered in a cloud agent group, the temporary storage directory is the default temp directory of the specific cloud agent server doing the work.

When using either a private agent group with multiple agents or a cloud agent group, temporary storage operations will stay on the same server as long as they're part of an operation chain. However, if you have two separate chains where Chain A writes to temporary storage file myfile and Chain B later reads from myfile, Chain B might not access the same agent server that Chain A used for writing.

Note

Chained operations will always run on the same agent as the parent operation, regardless of synchronicity.

When using temporary storage, keep these guidelines in mind:

  • When using private agents, to make your project upgrade-proof, use temporary storage in such a way that moving from a single private agent to a multi-agent private agent group does not require refactoring.

  • To ensure temporary storage reads and writes happen on the same agent server, keep all Read and Write activities that reference the same temporary storage within a single operation chain. This applies whether you're using a private agent group with multiple agents or a cloud agent group.

  • Temporary storage on private agents is deleted after 24 hours by default by the Jitterbit file cleanup service. The cleanup service frequency can be configured through the private agent configuration file under the [FileCleanup] section. However, on cloud agents, temporary files may be deleted immediately.

  • Cloud agents have a temporary storage file size limit of 50 GB per file. Temporary files larger than 50 GB are possible only when using private agents.

  • When writing to temporary storage, the default is to overwrite files. This can be changed with the Append to File checkbox in a Temporary Storage Write activity. When appending, you usually need to delete or archive the file after the source is read. A simple way to do this is to use the post-processing options Delete File or Rename File in a Temporary Storage Read activity.

  • Filename keywords are available for use when creating a filename.

    Example

    You can use the [unique] filename keyword in a Temporary Storage Write activity to automatically generate unique filenames and prevent file overwrites. For example, naming your file processing_[unique].csv creates files like processing_20240820143052123.csv.

  • A file in temporary storage can be read by building a script with the ReadFile function. For example: ReadFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_read/Read</TAG>"). Bear in mind that this works reliably only if there is a single private agent.
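
Pulling the preceding snippets together, a hedged sketch of a script that writes, flushes, and reads the same temporary storage within a single operation chain (the activity paths mirror the examples above):

  // All of these calls reference the same temporary storage and run within one operation chain,
  // so the reads and writes happen on the same agent server
  WriteFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_write/Write</TAG>", "a,b,c");
  FlushFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_write/Write</TAG>");
  fileContent = ReadFile("<TAG>activity:tempstorage/Temporary Storage/tempstorage_read/Read</TAG>");
  WriteToOperationLog("Temporary file content: " + fileContent);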

Cloud Datastore (Beta)

Cloud Datastore (Beta) provides a persistent storage solution where data persistence varies by storage type, unlike temporary storage solutions that clear data after an operation completes.

Cloud Datastore (Beta) addresses two main use cases:

  • Key-value pair solutions: Store lookup data such as US > United States that can be referenced across multiple operations and projects.
  • Data processing based on status workflows: Manage data that moves through different processing states, with automated cleanup for processed records.

Cloud Datastore (Beta) supports two storage types with different persistence characteristics:

  • Lookup by Key: Data persists until explicitly deleted, ideal for reference data and lookup tables.
  • Lookup by Value: Data with status workflows where records are retained for a maximum of 90 days. However, once data reaches Processed status, it's automatically deleted after 60 days.

Warning

Cloud Datastore (Beta) is not recommended for storing sensitive information such as passwords or endpoint credentials, as the data returned by its activities is in plain text.

Cloud caching functions

Beyond the core variables and storage connectors, Integration Studio offers cloud caching functions to enhance your integration performance and functionality.

The cloud caching functions ReadCache and WriteCache are used to assign data spaces that are available across projects and across environments. A cached value is visible to all operations running in the same scope until it expires, regardless of how that operation was started or which agent it runs on. By caching data in Harmony, rather than relying on local or agent-specific data stores such as Temporary Storage or Variable connectors, data can be shared between separate operations and across projects.
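
As a minimal sketch (the cache name and the $token variable are hypothetical; see the ReadCache and WriteCache documentation for the optional expiration and scope parameters):

  // Store a value in the Harmony cloud cache so that other operations and projects can read it
  WriteCache("login.token", $token);

  // Later, in a different operation or project, retrieve the cached value
  $token = ReadCache("login.token");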

These are additional uses of cloud caching:

  • Data can be shared between asynchronous operations within a project.
  • Errors generated across different operations can be stored in a common cache. By accumulating operation results in this manner, more comprehensive alerts can be built.
  • Login tokens can be shared across operations.