The 5 Types of Integrations and When to Use Them
I can’t remember a single enterprise-level ERP implementation I have been on that had no integrations. Virtually every enterprise implementation will require some type of integration. Unfortunately, integrations also bring the highest level of risk to a project, most often because architectural best practices are not followed for the integrations.
This article will discuss the 5 types of commonly seen integrations and the pros and cons of each type. I wanted to call these integration patterns, but by software pattern standards these 5 types of integrations are too high-level. There are various patterns used to implement these integration types, but I don’t think there is enough definition here to call any of these a pattern.
1. File Transfer Integrations
The file transfer integration type is when a source system generates an output payload that is stored, sent, and processed as a file. This integration type is implemented using various mechanisms that include a shared network storage location, an FTP site, and/or even cloud storage providers like Google Drive or Dropbox.
File transfer is by far the most common integration type I have seen. This is most likely because it is easy to program and, historically, it was one of the only options supported by legacy applications. When data was exchanged by carrying a tape or disk from one system to another, the natural method was a file transfer integration type.
Advantages of These Integrations
Many Applications Have Built-in Support
The ability to export and/or import CSV and Excel files is very prevalent today. For example, I have worked with several systems that provide a user interface for creating an automated import that takes an Excel file in a defined layout and processes its records. This can be scheduled to run periodically.
Simple to Program
It is easy to write code that handles files. There are some "gotchas" that I will discuss in the disadvantages, but simply taking a file, moving it, copying it, deleting it, and sending it to an FTP site or cloud storage is all simple. Processing the file may be a different story, but file handling is easy.
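To illustrate how simple the file handling itself is, here is a minimal Python sketch of the pick-up/deliver/archive steps mentioned above. The folder layout and file names are assumptions for illustration, not part of any particular product.

```python
# Minimal sketch of routine file-transfer handling: deliver a dropped
# file to the target system's inbox, then archive the original so it
# is not processed twice. Folder names are hypothetical.
import shutil
from pathlib import Path

def transfer_file(dropped: Path, target_inbox: Path, archive: Path) -> Path:
    """Copy a dropped file to the target inbox, then move it to the archive."""
    target_inbox.mkdir(parents=True, exist_ok=True)
    archive.mkdir(parents=True, exist_ok=True)
    shutil.copy2(dropped, target_inbox / dropped.name)  # deliver to target
    archived = archive / dropped.name
    shutil.move(str(dropped), archived)                 # keep a copy for reprocessing
    return archived
```

As the article notes, this part is easy; the hard part is what the target system does with the file once it arrives.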
Disadvantages of These Integrations
Difficult to Do Updates and Deletes
Several times I have seen scenarios where a source system "drops" a file with its current list of customers (or products or other master data items). The target system then needs to deduce which records are new customers, which records have changed, and which records are missing because they were deleted. The logic for this is not as easy as it may sound. While handling the file transfer is easy to program, the bulk nature of file integration can make the business logic quite complicated.
File Handling Is Fragile
One of the most frequent comments I hear from users who depend on file-based integrations is: "the integration doesn't always work." The reason is most commonly the fragile nature of file handling. It can be something as simple as a scheduled transfer that didn't complete before the target system attempted to process the file, or something major like running out of disk space.
Inefficient for Transactions
File transfer integrations work well in batch scenarios like periodic archiving of large recordsets. However, if you want to send orders from an e-commerce system to the fulfillment system in near real time, a file transfer integration type is not an efficient mechanism. Consider a scenario I have encountered in the agriculture industry. A scale system delivers its output transactions as a file, with the option to batch the transactions and send them periodically or to write each transaction as its own file. If you want to process the transactions nightly, the file transfer integration type is adequate. However, if you want to process in near real time, you must check regularly for the existence of files, which is an expensive operation. If you check every minute, you may find one or more files to process 25% of the time. That means 75% of the file handling is wasted process time.
Once the integration picks up a file, it has to either archive or delete it so it isn't processed again. If the file processing fails, the integration has to retrieve the file from the archive to try again.
Security
I debated whether to bring this up because the issue is easily mitigated by using a secured network resource for file storage. However, I have often seen file transfer integrations use a public folder or, in some cases, a local folder on a PC. In these cases, the data being transferred is exposed to many people who have no business accessing it. The file could be deleted or tampered with, and your integration would have no way of knowing.
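The add/edit/delete deduction described above amounts to comparing the newly dropped snapshot against the previous one, which the target system must keep just to detect deletes. A minimal sketch, assuming each record has a stable key (real master data comparisons are usually messier):

```python
# Deduce adds, changes, and deletes between two full-file snapshots.
# Each snapshot maps a record key (e.g., customer id) to its data.
def diff_snapshots(previous: dict, current: dict):
    """Return (adds, changes, deletes) between two snapshots."""
    adds = {k: v for k, v in current.items() if k not in previous}
    changes = {k: v for k, v in current.items()
               if k in previous and previous[k] != v}
    deletes = set(previous) - set(current)  # keys that vanished were deleted
    return adds, changes, deletes
```

Even this tidy version hides real-world complications: renamed keys, partial files, and records that differ only in fields the target doesn't care about.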
When To Use
There are times when it is appropriate to use a file transfer integration type. Here are some of the times when you should use this integration type.
The Only Available Option
Frankly, sometimes this is the only option available. If a legacy application provides file export and/or import as its only way to exchange data, you may be forced into this option.
Easily Supported by Both Systems
I have worked with systems that "watch" file folders and provide all the logic necessary to parse the files and handle the complexities of adds and edits. In one case, the same system also provided a Simple Object Access Protocol (SOAP) interface. The cost to develop a SOAP integration was much higher than the cost of a file integration, so it made the most sense to use a file transfer integration type.
Non-transactional, Single-Direction, Time-Insensitive Data Transfer
If you need to move or copy non-transactional data from one system to another and the data is not time-sensitive, it may be appropriate to use a file transfer integration type.
When Not to Use
You Must Write the Integration
If there is no existing integration and you are designing one, you should consider several other options before deciding to use a file transfer integration type.
Data Synchronization
If you need to handle adds, edits, and deletes, especially bidirectionally, you should not use a file transfer integration type.
Time Sensitive
If the data is time-sensitive or you need near-real-time processing, you should not use a file transfer integration type.
Multiple Targets
If you want to send the same information to multiple locations, the file transfer integration type may not be the ideal solution. There are other factors to consider when making this determination. However, if you implement an integration that sends a single file to a single output location and multiple target systems need to access that file, you are using the wrong integration type. The result will be a very fragile system.
2. Direct Database Connection Integrations
Another common integration type that I frequently stumble upon is utilizing a shared database. There are several ways to implement this integration type.
In some cases, the source and target applications utilize the same database instance for their respective data. Often, when this occurs, one or both of the systems were developed in-house.
In other cases, the target application reads (and sometimes writes) data directly in the source system's database. We see this mostly with an in-house developed application integrating with a legacy, on-premises system.
I have also seen a couple of applications utilize a data table in a shared database instance to stage data for integration. Basically, it mimicked a message queue.
Advantages of These Integrations
Performance
Although other technologies are beginning to catch up, generally speaking, direct database access will outperform other connected technologies. One primary reason is that database connections tend to be made on a local area network, which allows for significantly more bandwidth than cloud connections. Having said that, in many cases the overall performance of a database-connected integration may be inferior to a properly implemented, event-driven messaging integration. But if your only options are a direct connection or some other form of CRUD-based access (e.g., OData), the database connection will provide better performance.
Security
There is no inherent security advantage that databases have over many other integration types. However, I wanted to point out that it is relatively easy to secure the data being exchanged between database-connected systems.
Simple Programming
One reason direct database connection integration types are used so often is that people know how to use them. Database access is a very common skill among developers, even those maintaining legacy systems. This is not true for some other options.
Disadvantages of These Integrations
Concurrency
Although concurrency issues are not common in my experience, I have encountered them when troubleshooting database connection integrations. Using other methods does not entirely eliminate these issues, but if you are reading and/or writing database records directly, handling concurrency becomes your responsibility and you need to plan for it.
Internal Knowledge Required
One of the most common issues I have seen with direct database connection integrations is a lack of understanding of the data model and/or the business logic that uses it. I have seen integrations that assume edits in the source system modify the record in place when, in fact, the source system marks the existing record as expired (using an "effective" date-time column) and creates a new record. If you don't understand the date effectivity pattern used by the application logic, your integration will be completely incorrect. Any time you access data directly from an application database, you need to understand the related data model (including referential integrity) and the business logic using that data model.
Server Access Required
Although this is not universally true, many on-premises legacy systems use databases that require a local network connection. When this is the case, it can be difficult to connect from a cloud instance, and you may need an on-premises integration application or a gateway.
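The date effectivity pattern mentioned above is worth a concrete sketch: an "edit" expires the old row and inserts a new one, so the current value is the result of a query, not a single row. The field names here are assumptions for illustration.

```python
# Sketch of date-effectivity versioning: each logical record is a
# series of rows, each valid from 'effective_from' until 'effective_to'
# (None means still current). An integration that ignores this will
# read expired rows as if they were live data.
from datetime import datetime

def current_version(rows, now):
    """Return the row whose effectivity window covers 'now', or None."""
    for row in rows:
        starts = row["effective_from"] <= now
        ends = row["effective_to"] is None or now < row["effective_to"]
        if starts and ends:
            return row
    return None
```

An integration that naively selects "the customer row" from such a table, without filtering on the effectivity window, will happily propagate an address that was superseded months ago.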
When to Use
The Only Available Option
I have encountered legacy systems where the only way to access the data needed for an integration is through the database containing that data. Sometimes you simply have no other options. In these cases, you may need to use a direct database connection integration type. If you do, take steps to mitigate the potential issues mentioned above.
Read-Only and On-Prem
If you need to integrate an application that runs on the same network as the source application and you only need to read data from the source application, a direct database connection may be an appropriate integration type. I would still recommend exploring other options and using this one only if it is your best option.
When Not to Use
Don't Have Internal Knowledge
If you have no knowledge of the internal business logic that uses the database, you should avoid this integration type if possible. There are too many ways for your integration to be invalid or incorrect without knowing how the applications use their respective data models.
Cloud Computing
If you are building a modern integration using cloud technology, you should avoid creating a direct database connection, especially to on-premises applications. Although there are ways to make this work with gateways and other technologies, you should really try to find other options.
3. Point-to-Point Service Call Integrations
Most modern applications provide a service layer called an Application Programming Interface (API). We use a Point-to-Point Service Call integration type by writing code in one system that uses the API of the other system to perform the integration tasks. The API can take the form of a Remote Procedure Call (RPC) interface, a SOAP service, or a REST service.
One example of this could be an ERP application that exposes an address change API for a customer record. You may have an e-commerce site or customer portal that allows a customer to update their address. A Point-to-Point Service Call integration would have the e-commerce or customer portal call the ERP application API directly to propagate the address changes captured by the customer-facing application.
Another implementation may have middleware recognize when a customer changes their address, use an API on the customer-facing application to obtain the changes, and then call the ERP application API to propagate the address change on behalf of the customer-facing application.
In both cases, the integration has two specific endpoints: one for the source system and another for the target system. The integration code talks directly to the target system via a service call.
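The address-change example above might look like the following sketch. The endpoint URL, payload fields, and the injected `post` callable are all hypothetical stand-ins; a real integration would use the ERP vendor's documented API and an actual HTTP client.

```python
# Sketch of a point-to-point service call: map the portal's record to
# the target API's payload and post it to one hard-coded endpoint.
# The URL template and field names are assumptions for illustration.
import json

ERP_ADDRESS_API = "https://erp.example.com/api/customers/{id}/address"  # hypothetical

def to_erp_payload(portal_record):
    """Translate the portal's address fields into the ERP API's shape."""
    return {"street": portal_record["addr1"],
            "city": portal_record["city"],
            "postalCode": portal_record["zip"]}

def propagate_address_change(portal_record, post):
    """Call the target endpoint; 'post' is injected so tests can fake it."""
    url = ERP_ADDRESS_API.format(id=portal_record["customer_id"])
    return post(url, json.dumps(to_erp_payload(portal_record)))
```

Notice how the source code must know the target's URL, payload shape, and error behavior; that knowledge is exactly the coupling discussed in the disadvantages below.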
Advantages of These Integrations
Quick and Simple (With Tool Help)
If you only have a small number of integrations to build and they don't share the same data, a Point-to-Point integration can be quick to build. You simply point a code generation tool at the endpoint metadata (e.g., WSDL or Swagger/OpenAPI) and you have client code that can interact with that endpoint. If you don't have such tools available, there may be a lot more to consider, but there are very few scenarios where tooling cannot do much of the heavy lifting for you.
Facilitates Service Encapsulation
In some cases, the reason for the integration can be addressed through a service layer in one application that the integrating application calls. If you are writing both sides of the integration, you can build a service in one application that does everything you want and then call that service from the other application. This can be relatively inexpensive and easily encapsulated.
Disadvantages of These Integrations
Cost of Multipoint Integrations
One disadvantage of the Point-to-Point Service Call integration appears when you need to integrate the same data into multiple applications. For example, if you use the same customer information in multiple applications, you will have to integrate from one source system to multiple target systems. Each endpoint requires a certain amount of overhead to manage its configuration, connect, react to various return information, and close the connection. If the data formats differ, the source data may also need to be transformed from the source format to each target format. Even if you reuse much of the same code, this is not free. Unfortunately, it is not always possible to execute the same code, and that adds further code management complications.
Tight Coupling
Coupling is an interdependency between software modules or applications. Tightly coupled applications may create ripple effects when one application changes, and it is generally accepted that tight coupling increases the total cost of ownership over time. Integrations based on Point-to-Point Service Calls have a higher level of coupling than integrations based on some form of asynchronous messaging pattern.
When to Use
Single Integration With Limited Budget
It wasn't that long ago that Point-to-Point Service Calls were considered a best practice. I still see this form of integration being built today, and I may even recommend it under certain conditions. I recently wrote an integration like this. We track our time in a cloud-based time clock application, and we needed to integrate it with our project billing system. The billing system provides an API that adds a time entry to a project, built specifically to accomplish the task I needed. All I had to do was build an integration that took the data from the time clock application and called the API to add it to the appropriate project. We only need to run this once a week, and it can be run, monitored, and verified by a user in less than five minutes. We really didn't need any other integration type. If we ever need that same time clock data in another application, however, I would have to add code to send it to that application using another endpoint with another API, and the money saved with Point-to-Point Service Calls would start to dwindle.
When Not to Use
Several Integrations
If you have several integrations using the same data, you should avoid a Point-to-Point Service Call integration if you can. Even if the data differs, several distinct integrations may make Point-to-Point Service Calls the wrong choice. For example, if you have data coming from a Human Capital Management (HCM) system, a Payroll system, an e-Commerce system, and a Point of Sale (POS) system, all integrating into an Enterprise Resource Planning (ERP) system, you will likely benefit from an integration framework as middleware to handle all the integrations. And if you have the same or similar data going to or from multiple systems, you will definitely benefit from such middleware.
4. Point-to-Point Messaging Integrations
You may be wondering, “what is the difference between Point-to-Point Service Calls and Point-to-Point Messaging?” As the name suggests, the primary difference is messaging versus service calls. With Point-to-Point Service Calls, one application (or a piece of middleware) uses a proprietary API to call a service on another application. With Point-to-Point Messaging, one application (or a piece of middleware) sends a message to a queue for another application (or a proxy for that application).
Point-to-Point Messaging can be implemented as asynchronous fire-and-forget processing, meaning the message producer sends a message to a queue and does not expect a response. It can also handle asynchronous request/reply processing by having the producer send a message on one queue and then wait on a reply queue for the response from the receiver.
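Both styles can be illustrated with in-process queues standing in for a real broker. This is a minimal sketch of the pattern only; a production system would use durable queues in messaging middleware rather than Python's in-memory `queue` module.

```python
# Sketch of the two point-to-point messaging styles: fire-and-forget
# (producer sends and moves on) and request/reply (producer waits on
# a separate reply queue). In-memory queues stand in for a broker.
import queue

order_queue = queue.Queue()  # producer -> receiver
reply_queue = queue.Queue()  # receiver -> producer (request/reply only)

def fire_and_forget(message):
    order_queue.put(message)  # send it; no response is expected

def receiver_step():
    """One receiver iteration: consume a message, post an acknowledgement."""
    msg = order_queue.get(timeout=1.0)
    reply_queue.put({"ack": msg["id"]})
```

The key property on display: exactly one receiver consumes each message, which is both the strength and the limitation discussed below.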
Advantages of These Integrations
Ensures Only One Receiver
If you have a requirement to guarantee that only a specific endpoint receives a specific message, then Point-to-Point Messaging may be the proper integration. With Point-to-Point Messaging, a single message is sent specifically to a single endpoint.
Disadvantages of These Integrations
Only One Receiver
If you have a requirement to send the same message to multiple endpoints, then Point-to-Point Messaging is not the proper integration for you. I have been involved with implementations that start out with a single destination but later need multiple endpoints to consume the same message. That is not a job for a Point-to-Point Messaging integration.
When to Use
I don’t want to be redundant, but you should use this messaging type if your requirements include a guarantee that only a specific endpoint receives a specific message. This is an excellent integration type to meet that requirement.
When Not to Use
If your implementation requires multiple endpoints receiving the same message or you anticipate that it will in the future, you should not use Point-To-Point Messaging.
5. Publish and Subscribe Messaging Integrations
I thought about calling this integration type “Multicast Messaging,” but the reality is that Publish and Subscribe (Pub/Sub) is the multicast implementation I most often see. There are other ways to provide multicast functionality, but I have seen very few of them actually used.
Pub/Sub Messaging is a type of integration that allows the sending application to be completely decoupled from the receiving application. Basically, the source application sends and forgets the message. The Pub/Sub Messaging system (commonly implemented as a service bus) determines if any applications care about the message and if so, how to deliver the message to all applications that are subscribed to the channel (commonly called a topic).
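The send-and-forget topic delivery described above can be sketched with a tiny in-memory broker. This illustrates the pattern only, not any particular service bus product; real brokers add durable queues, acknowledgements, and delivery guarantees.

```python
# Minimal in-memory sketch of pub/sub: the publisher knows only the
# topic name, and the broker fans each message out to every
# subscriber's queue. Zero subscribers is perfectly legal.
from collections import defaultdict

class Broker:
    def __init__(self):
        self.topics = defaultdict(list)  # topic name -> subscriber inboxes

    def subscribe(self, topic):
        inbox = []  # stands in for a durable per-subscriber queue
        self.topics[topic].append(inbox)
        return inbox

    def publish(self, topic, message):
        """Send and forget: deliver to all current subscribers of the topic."""
        for inbox in self.topics[topic]:
            inbox.append(message)
```

Notice that the publisher's code never changes when subscribers are added or removed, which is the decoupling advantage discussed next.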
Advantages of These Integrations
Decoupling
The primary advantage of the Pub/Sub Messaging integration type is that it provides great decoupling of applications. The source application requires no knowledge of the applications that receive its messages.
Scaling
This may be redundant because scaling comes primarily from decoupling, and it is a good reason to decouple applications. But the reality is that Pub/Sub Messaging integrations can scale massively. You can literally have millions of subscribers (target applications) with no impact on the publisher (source application) and no impact on each subscriber. The main scaling challenge lies in the service bus and/or messaging middleware, which are commonly provided by cloud providers and built for scale, so the provider solves that problem for you.
Resiliency
In my experience, Pub/Sub systems tend to be very resilient messaging systems. Because they are implemented as stand-alone middleware, they tend to be robust and "always on." Because they hold messages in queues for subscribers, there is no requirement for the target system to be "always on." Although the code implementation of these systems can be complex, the operations are quite simple; there is not much to go wrong. I have had systems that run millions of messages per day with virtually no downtime and no lost messages.
No Lost Messages
The reliable transfer of data is the most important aspect of an integration, yet one of the most common issues I have seen with integrations is lost data. Pub/Sub Messaging systems can be configured to never lose messages. You may want to "throw away" unhandled messages after a period of time, based on the integration requirements, but you don't have to. This can be handy for synchronization systems that are only occasionally connected.
Disadvantages of These Integrations
Message Size
Most Pub/Sub systems have a limitation on message size, and it is usually quite small. You often cannot send files with this type of integration. There are ways to work around this (such as a message containing a URL from which to retrieve the file), but those workarounds add complexity.
No Direct Reply
This isn't really a disadvantage of Pub/Sub Messaging alone; it is a challenge of any asynchronous integration. If you don't "wait" for a reply but you require one, you will need code that correlates the correct reply to the correct message. With other forms of integration, this can be handled quite readily with a separate task that sends the message (or service call, file, etc.) and "waits" for the reply. With Pub/Sub Messaging, the level of decoupling does not allow this correlation to be handled so easily.
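The reply-correlation work described above usually relies on a correlation id carried by both the outbound message and its eventual reply. A minimal sketch under that assumption; the message field names are hypothetical.

```python
# Sketch of correlating asynchronous replies to their requests:
# each outbound message carries a unique correlation id, and a
# pending-request table matches incoming replies back to requests.
import uuid

pending = {}  # correlation id -> original request

def send_request(payload, outbox):
    """Tag the outbound message with a correlation id and remember it."""
    corr_id = str(uuid.uuid4())
    pending[corr_id] = payload
    outbox.append({"correlation_id": corr_id, "body": payload})
    return corr_id

def handle_reply(reply):
    """Match a reply to its pending request; unknown ids are ignored."""
    return pending.pop(reply["correlation_id"], None)
```

In a real system the pending table also needs a timeout policy, since a reply may simply never arrive.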
When to Use
Frankly, I use this type of integration whenever I can. There are many ways to work around the message size and reply correlation challenges, and there are so many advantages to this type of integration. This is my first choice. I may choose not to use it for various reasons, but I always contemplate it before rejecting this type of integration.
Messaging Multiple Targets
If you need to send a specific message to potentially multiple targets, Pub/Sub is usually your best choice. As I mentioned above, there are other multicast methods, but Pub/Sub is most often the best of them, especially if the list of targets is dynamic.
When Not to Use
Simple Integrations
If you have a single, quite simple integration and there is no anticipation that the data will need to flow to multiple applications, you should probably use a simpler type of integration.
Synchronization or Correlation Required
If your integration requirements include synchronization or correlation, Pub/Sub Messaging may not be the right integration type. Although these requirements can be met with it, doing so adds complications that other forms of integration eliminate.
Integrating Large Payloads
If your integration requirements include the transfer of large data payloads such as videos or high-resolution images, Pub/Sub Messaging may not be the best integration type.
Summary
Integrations are difficult to get “right.” The “right” integration depends on many combinations of requirements. This article does not provide enough information for you to make a definitive decision on the best integration for your scenario. However, I hope that it helps you understand why there are so many options and helps you begin to know what questions you need to ask before making that decision.
If you would like some help evaluating integration options for your specific requirements, please reach out to me. I’d be happy to assist you if I can.
About the Author
Tory Bjorklund, a seasoned leader in the consulting, manufacturing, and software sectors, currently holds the position of CEO at Victoria Fide. With a remarkable career that spans roles such as CEO, CTO, CIO, and Chief Software Architect, Tory consistently demonstrates his bold and visionary thinking. His enthusiasm for harnessing technology to transform businesses is evident, and he fervently advocates for reshaping conventional norms in digital transformation through Making Change Positive. Connect with him on LinkedIn to follow his journey.