Analytics Data Replication

I was very interested to read about the remote links capability for the analytics server in this blog.
I wanted to understand what the blog means when it says that remote links ingest data:

  • Does it mean that the data is physically replicated to the remote analytics server (so that additional storage is needed according to the size of the source bucket)?

  • If data is replicated, does it mean that I cannot run a query (such as the Customer to Order join in the blog example) until replication has reached steady state?

Hi @JonathanGoldberg,

  • Yes, data is physically replicated for remote links (as it is for local links). In both cases the goal is to isolate the analytical workload from the operational workload. We achieve this goal by physically replicating the data and by separating the computation for analytical workloads from the computation for operational workloads.
  • Queries can be run at any time during replication. If replication has not reached steady state, the queries will run on the data that is available so far. To ensure that the replicated data is sufficient to answer a query, set the scan_consistency parameter to request_plus (using the REST API, the SDK, or the Analytics Workbench).
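
For the REST API route, a request might be built roughly as follows. This is only a sketch under some assumptions: the host, the default Analytics port 8095, and the customers/orders dataset and field names are placeholders for illustration, and authentication is left out.

```python
import json
import urllib.request

# Assumed endpoint for illustration: a local node with the Analytics
# service listening on its default port. Adjust for your cluster.
ANALYTICS_URL = "http://localhost:8095/analytics/service"


def build_analytics_request(statement, url=ANALYTICS_URL, request_plus=True):
    """Build (but do not send) a POST request for an Analytics query.

    With request_plus=True the payload carries
    "scan_consistency": "request_plus"; otherwise the parameter is
    omitted and the query runs on whatever data is available.
    """
    payload = {"statement": statement}
    if request_plus:
        payload["scan_consistency"] = "request_plus"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical dataset and field names, loosely modelled on a
# customer-to-order join.
JOIN_QUERY = (
    "SELECT c.name, o.orderno "
    "FROM customers c JOIN orders o ON c.custid = o.custid;"
)

request = build_analytics_request(JOIN_QUERY)
# To actually run it, add credentials and call:
# urllib.request.urlopen(request)
```

Leaving out scan_consistency (request_plus=False here) gives the other behaviour described above: the query runs immediately on the data replicated so far.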

Thanks so much, this was very helpful.