In recent years, we have witnessed an insatiable thirst for computational power and better performance, especially from data scientists and analysts. To meet this demand, vendors have developed innovative tools and techniques. This innovation has enabled tremendous improvements across different levels of architecture such as databases design, application design, technology architecture and business architecture. In this article, we will discuss one of the key aspects of architecture design that is integration (or sync) strategies.
Integrating data from multiple sources is very common in any organization, especially for the purposes of business intelligence in municipalities. Similarly, in GIS, it is frequently necessary to interface one’s system with external applications—not only for reporting or data analytics but for day-to-day operations as well. Although integrations are largely influenced by available software and tools within your current IT infrastructure, data sync strategies discussed in this blog remain relevant irrespective of the IT infrastructure and environment. These sync strategies can very well be applied when integration with third-party systems from ArcGIS Enterprise, from Cityworks to ERP systems, from ArcGIS Indoors to work management/asset management solutions or ArcGIS Roads and Highways with work management systems.
We will explore common sync strategies between applications, especially around the three main categories of integration with special focus on Cityworks and ArcGIS:
- One-way syncs
- Two-way syncs
- Real-time syncs
We will cover various techniques and methods in each category while remaining agnostic regarding software and tools for ETL—for example, syncing of records incrementally in one direction between Cityworks and other applications. The focus will be on the approach and strategy that realizes a functional requirement rather than diving into any specifics of any integration software. This first blog mainly covers two areas of Cityworks integration: Master data and Transaction data:
- Data coming into Cityworks through integration with Payroll data such as an “Employee” Master data
- Data going out of Cityworks, such as Labour Costs integration with Payroll or Equipment Costs integration with Finance
Cityworks has been successfully integrated with financial systems including Agresso UNIT4, JD Edwards, Microsoft Great Plains Dynamics, Microsoft Navision, Oracle PeopleSoft, SAP, SmartStream, Bellamy, Kronos and many other systems to track resource costs such as labour/payroll transactions, material inventory transactions, equipment costs and contractor costs, etc. Integration enables the organization to identify cost recovery from external chargeable or internal recoverable work against their tangible (and intangible) capital assets. It also permits the tracking of resource costing within the Operations solution as a Sub Ledger and to the ERP General Ledger. This functionality allows municipal staff to identify asset ownership costs and manage budget information to the required levels as well as provide a summarized budget.
Esri Canada has implemented interfaces for many of the above systems with Cityworks using various system interfacing techniques. These techniques include basic based batch import to SOA/web services-based (see the section below on Integration mechanism supported by Cityworks). Various middleware products have also been used for interfaces such as iWay, FME, BizTalk, SSIS (SQL Server Integration Services), custom Web application / Web API’s and others, to develop interfaces with Cityworks.
Cityworks provides RESTful API, Action Manager for Web hooks and Activity update and provides SDKs for software development in various language like C#, Python and Typescript. Cityworks Integrations with third-party solutions will depend of various factors but generally dictated by data governance and system of record. The system of record is the primary location of the given data element, for example see the table below:
|Data Element||System of Record|
|Budgeted projects, materials||Financial information system|
|Customer data, meter reads||CRM or customer information system|
|Geospatial attributes / assets||GIS|
One Way Sync
Designing and implementing an incremental sync requires more effort than simply syncing all records every time and prevents one from exceeding API limits. We will cover the incremental sync in detail between Cityworks and other apps. An incremental (or “delta”) sync is one that only processes the data records that have changed (created or modified) since the last time the integration ran as opposed to processing the entire data set every time. It is the preferred and common approach for most integration scenarios because, by limiting the number of records being processed to only what has changed, it allows the integration to run more quickly and efficiently, which in turn allows it to run more frequently to keep systems up to date. Here are some recommended approaches to identify the best strategy for your situation between the three 1) Extract by “FLAG” field value and 2) Extract by Last Modified Date 3) Change data capture.
Master data and transaction data from authoritative sources typically involve records being synced incrementally in one direction. For instance, master data, such as employees and pay codes from the payroll system (ERP) are updated in Cityworks. In this payroll system being the authoritative source, data steward and data custodian.
Master data and transaction data from authoritative sources typically involve records being synced incrementally in one direction. For instance, master data, such as employees and pay codes from the payroll system (ERP) are updated in Cityworks. In this payroll being the authoritative source, data steward and data custodian.
Consider an example of an employee interface, Cityworks core identifies employee by employee name in several components of the software (e.g. Employee Relates, Work Order “Submit To”). It is critical that the combination of First Name, Middle Initial, and Last Name columns form a unique identifier for the employee record within Cityworks. When a sync process runs, the fields indicated above can help create, update, and deactivate employees in Cityworks. Sync strategy can be greatly enhanced by leveraging “Custom Codes” and “Custom Data fields”. Custom codes can be created and managed using Cityworks Designer. They can be used to configure default settings for any interface process runs, for example, Default active directory domain and license settings can be configured at the end of the interface process run.
More examples of implementations using one-way sync for master data or transaction data will be discussed below. The diagram below lists some of the examples of One-Way sync master data integrations from various source systems to Cityworks.
The figure above shows the various data elements of “Master Data” from source systems into Cityworks.
Let’s dive deeper into the three approaches for one-way sync between Cityworks and other systems:
- Extract by “FLAG” field value
- Extract by Last Modified Date
- Change Data capture
Extract by Flag Field Data Value
Consider the situation in which the interface will need to keep Cityworks updated with employee data from an external Payroll or ERP system. In order to keep Employee master in sync with Payroll, employee records are extracted based on the value of some status or “flag”. Upon successful completion, the integration updates this field to a different value, so it will not be extracted again. This option provides a lot of flexibility and lets end-users easily control the sync behavior by modifying values within the end application. In my view, this is the best approach of all three given that the third-party software—such as Payroll in this case—allows you to edit one of their fields used for the flag or use an intermediate table to give more flexibility between systems.
Moreover, the main advantage of this method is that it allows records to be synced indeterminate of one another. If one employee record in the group fails, the other can be updated successfully and will not be reprocessed. Also, if the field is accessible to end-users, it allows them to easily reset, retry, or ignore individual employee records in a self-service fashion rather than rely on the integration team. This also keeps the integration stateless, meaning nothing is stored on the integration layer, such as the last run date.
From a technical architecture perspective, the interface can be designed to select records with certain values for the flag fields and perform the sync. After successfully syncing the record, the interface can then ensure the process updates the flag field in the source application appropriately. The flag field can be of various forms and will depend on the external source system. It can be actual an “Status” field (integration specific or business-related), a true/false “ready to sync” field, the existence of an “external ID”, etc. It could be a combination of fields and values as well. This process is ideal for records that should be synced once and only once, such as transaction data like Equipment costs or Labour hours, etc.
The figure above shows the various data elements of “Transaction Data” from a source system into Cityworks
Extract by Last Modified Date
This method of one-way sync is simpler and considered valuable when there is no way of using the flag fields option mentioned above as well as when the external applications automatically capture the timestamp of when records are created or modified. This provides great value when you only want to get records that have been created or modified since the last time integration ran. In this approach, records are extracted based on the value of their last modified date field being greater than another date, such as the last time the integration ran or the most recently synced records.
Basically, the process captures the current system timestamp when the integration begins and saves it upon successful completion. To avoid any time discrepancies (usually small) it is best to capture the most recent last modified date from the records themselves and maintain that value upon successful completion. Because the timestamp is from the source application itself, you can be sure no records will be missed due to server time differences. This integration design is simpler than the flag field approach because you do not need to upgrade the source application at the end of the sync.
Change Data Capture
Change Data Capture is one of the last resorts available to us when you cannot use a flag field, there is no last modified date and the application or API only gives you the entire data set every time. With this option, the integration must remember the entire data set from the last execution in a “cache” or intermediate table to be able to compare the current data set against it. Through this comparison, the records that were added, modified or removed can be identified. This can be used for master data and can cautiously be used for transaction data as well. It is important to ensure that the entire dataset is obtained from the source each time and this makes the application stateful. Some business intelligence and integration software such as SQL server integration services, SAP Business objects, Talend etc., does provide “Change data capture” as an out-of-the-box tool that offers an effective solution to the challenge of efficiently performing incremental loads from source tables to the data warehouses.
Getting data from point A and point B comes with a number of options and considerations. Keep these incremental sync techniques in mind when designing your next one-way integration and you will be well on your way to building a lean, mean, efficient sync. In the next blog article, we will discuss application integration techniques on two-way syncs and in real-time sync and how it is used today when integrating with ArcGIS Indoors and ArcGIS Roads & Highways.