Well-designed solutions that include 3rd party or external data are hard to build. At one time or another, every developer has been in the following situation: the client has external data that needs to be integrated directly into their website project. Problems arise when the data is in a constant state of flux, or when its format has to be adjusted based on what the client is currently providing. My goal is to help you decide how to design your solutions in a way that helps both current you and future you.
While I’m going to be discussing the technical decisions to be made, it is important to always consider any timeline or budgetary constraints when working with a client. Everyone has an important role to play, and the developer or technical architect is usually expected to provide the pros and cons of each technical implementation, along with some very rough estimates of the effort involved. A good solution typically involves coming together as a team to make a decision that is in the best interests of the client, even if it hurts our developer sensibilities.
The Three Common Forms of External Data
We’ll focus on the three most common forms of data consumption and storage that Episerver and Ektron developers run into:
- All in the CMS
- Integration through 3rd Party APIs
- Custom storage/middleware interface
These cover the largest number of integrations and scenarios you’re likely to see during a site implementation. They each come with their own problems and advantages that can easily influence the appropriate decision.
All in the CMS
The first and most frequently implemented scenario is to port all the data directly into the CMS and use the CMS as the primary store for that information. Usually, this is introduced as a solution when the client is migrating from an existing system that already stores this information, and it would be simple to script that content into the new CMS. As Episerver and Ektron developers, we tend to treat those systems as Swiss Army knives. When you’re working with small quantities of data, or data that is relatively stable, this is a good thing. Having a single location for all the data you need means that you can rely on a single set of API calls when building out your content. This approach is most appropriate for data that is fairly static, needs an easy editing interface that doesn’t currently exist, and doesn’t need to be published to other systems or reused.
The downside is that migrating to a different system, or providing this data to a 3rd party or application, can be difficult (as an example: migration from Ektron to Episerver). It’s important to consider whether the data will be reused in the relatively near future. If the client is likely to consider a redesign or restructure within a year or two, it might be worth the investment to provide a secondary, decoupled interface to this data to speed up any later migration.
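As a rough illustration of what a decoupled interface to CMS-held data could look like, here is a minimal Python sketch. Everything in it is hypothetical: `LocationRecord`, `LocationStore`, and `InMemoryLocationStore` are illustrative names, not Episerver or Ektron APIs, and a real implementation would wrap the CMS's own content calls behind the same interface.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, List, Optional, Protocol


@dataclass(frozen=True)
class LocationRecord:
    """Hypothetical record type; the fields are illustrative only."""
    id: str
    name: str
    city: str


class LocationStore(Protocol):
    """The decoupled interface: callers depend on this, not on the CMS."""

    def get(self, record_id: str) -> Optional[LocationRecord]: ...

    def all(self) -> Iterable[LocationRecord]: ...


class InMemoryLocationStore:
    """Stand-in for a CMS-backed store (assumption: a real one would
    translate these calls into the CMS's own content API)."""

    def __init__(self, records: Iterable[LocationRecord]) -> None:
        self._records = {r.id: r for r in records}

    def get(self, record_id: str) -> Optional[LocationRecord]:
        return self._records.get(record_id)

    def all(self) -> List[LocationRecord]:
        return list(self._records.values())


def export_json(store: LocationStore) -> str:
    """Export every record through the interface only, so a migration
    script keeps working when the backing store changes."""
    return json.dumps([asdict(r) for r in store.all()], sort_keys=True)
```

Because `export_json` only knows about `LocationStore`, the same export could run against the old CMS and the new one, which is where the time savings during a migration would come from.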
Integration through 3rd Party APIs
The second approach involves pulling data from a 3rd party API. Usually, this means that the client has another interface that they use to edit or maintain this information. Hopefully, that API has a well-documented set of calls (whether via web services or some other mechanism of consumption) to facilitate this exchange of information. If the 3rd party provider doesn’t have these calls well documented, this would be a good time to raise a hand and ask them to provide that documentation before moving forward. Often the documentation exists in some format. If it doesn’t, and you’re asked to specify the format, data, and access preferences yourself, you should sit down with the client and map out exactly what data is needed in which settings.
In either case, you’ll want to evaluate the way in which the data is going to be used. Sometimes clients have different expectations for how data is presented on the web versus how they edit and maintain it. It’s important to have a plan of action for what format you need the data in, and for any adjustments that need to be made to facilitate its usage on the site. Don’t be afraid to take your time. It is important for you and the client to get to the bottom of expectations, and that can sometimes mean having hard conversations about data formats and the realistic ways that data can be used in its current state.
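One way to make that plan of action concrete is a small mapping step between the format the provider returns and the format the site presents. The Python sketch below assumes a made-up payload with `EVENT_NM` and `START_DT` fields; those names, `EventView`, and `to_event_view` are all hypothetical and stand in for whatever the client's API actually provides.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping


@dataclass(frozen=True)
class EventView:
    """Presentation-side model: only what the page needs to render."""
    title: str
    starts_on: str  # ISO date string, e.g. "2024-03-05"


def to_event_view(raw: Mapping[str, Any]) -> EventView:
    """Map one record from the (hypothetical) provider format to the
    format the site presents. Fails loudly on records it cannot use."""
    title = str(raw.get("EVENT_NM") or "").strip()
    if not title:
        raise ValueError("record has no usable title")
    # Assumption: the provider sends dates as compact YYYYMMDD strings.
    starts = datetime.strptime(str(raw["START_DT"]), "%Y%m%d")
    return EventView(title=title.title(), starts_on=starts.date().isoformat())
```

Keeping the mapping in one function means that when the client changes what they provide, the adjustment happens in a single place rather than in every template that displays the data.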
Custom storage/middleware interface
This third solution is usually a hard sell. It involves pulling all the information the client needs into a single repository that is organized specifically around the needs of presentation on the website. This can be more cost intensive, and it requires very thorough knowledge of both the client’s existing data and the middleware’s capabilities. Sometimes this is overkill, because the experience the client provides on their website closely matches their business data requirements. In these cases, developing a set of 3rd party APIs for consumption is usually sufficient for the client’s needs and defers some of the cost to another project or internal team. In the scenario where the client needs to present the data in a format that is not consistent with their business needs, you’ll be best served by investigating your own custom integration or middleware.
I’ve covered the negatives above: this solution can be costly, it requires very specific knowledge of the client’s data, and it puts the onus on you to make sure that everything is communicating smoothly. It isn’t all bad, though.
This is also the most flexible of all the solutions. You’re likely to see significant increases in speed, along with direct access to the data. You can provide any caching layer you require, and you’re free to provide this information to any 3rd party system or service you or the client require. In the case of large and complex data sets, this can also allow you to tightly couple the data to CMS content that would otherwise require some sort of search or API-specific integration to associate. All in all, large data sets can be made decidedly less complex and tailored to the specific needs of the client. A well-normalized dataset can also be adjusted easily to support any number of presentation or business needs.
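To give a feel for the caching layer mentioned above, here is a minimal time-to-live cache sketched in Python. It is only one of many possible designs and is not taken from any particular middleware product; `TtlCache`, its `loader` callback, and the injectable `clock` are hypothetical names chosen for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Caches loader results per key for a fixed number of seconds."""

    def __init__(
        self,
        loader: Callable[[str], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # fetches from the underlying data store
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]        # still fresh: skip the loader
        value = self._loader(key)  # missing or expired: reload
        self._entries[key] = (now + self._ttl, value)
        return value
```

In a setup like this, the site would call `cache.get(...)` instead of querying the store directly, and the TTL becomes a dial the team can tune based on how often the client's data actually changes.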
Stop, take some time to think
No matter which solution you’re leaning toward, the biggest and most important step you can take is to really think it through. Ask the client the important questions, and make sure you consider future you, or the future developer who isn’t you and might not have all the domain-specific knowledge needed to move forward.