Cloud Development Challenges – Part II
This is the second part of my thoughts on the unique challenges of cloud development. The first part, Cloud Development Challenges – Part 1, discussed issues of platform versioning and rate limiting, among other things.
After a few attempts to standardize the authentication process, nearly all cloud vendors have settled on OAuth2. For those not familiar with it, the OAuth flow, as it's called, is actually pretty straightforward. Your application redirects the user to the cloud app's sign-in page. The user identifies themselves to the service in the usual way and grants access to the original application. The application then receives an access token, a refresh token, and an access expiration time. At that point, the calling application can use the access token to interact with the cloud system on behalf of the authenticated user.
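To make the flow concrete, here is a minimal sketch of the two client-side pieces: building the redirect URL and pulling the three values out of the token response. The endpoint and parameter values are hypothetical placeholders; the field names follow the generic OAuth2 (RFC 6749) conventions, though individual vendors vary.

```python
import json
from urllib.parse import urlencode

# Hypothetical authorization endpoint for illustration only.
AUTH_URL = "https://cloud.example.com/oauth2/authorize"

def build_authorize_url(client_id, redirect_uri, scope, state):
    """Construct the URL the application redirects the user to."""
    params = {
        "response_type": "code",  # authorization code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,           # CSRF protection
    }
    return AUTH_URL + "?" + urlencode(params)

def parse_token_response(body):
    """Extract the three values from the JSON token response:
    access token, refresh token, and expiry (in seconds)."""
    data = json.loads(body)
    return data["access_token"], data["refresh_token"], data["expires_in"]
```

After the user signs in and grants access, the service redirects back with a code, which the application exchanges at the vendor's token endpoint for a response that `parse_token_response` would then unpack.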
A lot of cloud services use OAuth as their sole authentication mode. The challenge for high-end service applications like SeeUnity's Echo or Velocity products is that managing those precious tokens in a high-speed, multi-threaded service can be tricky. Access tokens last only a short time before they have to be refreshed, usually a matter of minutes or an hour, and refresh tokens can be used only once. If you're running a migration pumping millions of documents into some cloud DMS, the last thing you want is to stop every hour and have a user re-enter credentials. Managing those tokens in real time is a must.
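One way to handle this in a multi-threaded service is a single shared token holder that refreshes proactively, a little before expiry, under a lock, so that the single-use refresh token is only ever spent by one thread. This is a sketch under assumptions: `refresh_fn` is a hypothetical caller-supplied callable that exchanges the current refresh token for a new token triple.

```python
import threading
import time

class TokenManager:
    """Thread-safe holder that refreshes the access token shortly
    before it expires, so worker threads never present a stale token."""

    def __init__(self, access, refresh, expires_in, refresh_fn, margin=60):
        self._lock = threading.Lock()
        self._access = access
        self._refresh = refresh
        self._expires_at = time.monotonic() + expires_in
        self._refresh_fn = refresh_fn  # assumption: returns (access, refresh, expires_in)
        self._margin = margin          # refresh this many seconds early

    def get_token(self):
        with self._lock:
            if time.monotonic() >= self._expires_at - self._margin:
                # Refresh tokens are single-use, so only the thread
                # holding the lock may spend the current one.
                access, refresh, expires_in = self._refresh_fn(self._refresh)
                self._access = access
                self._refresh = refresh
                self._expires_at = time.monotonic() + expires_in
            return self._access
```

Worker threads call `get_token()` before each API request; the refresh happens transparently, and the migration never has to stop for a user to re-enter credentials.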
For products like Echo or Velocity, a common technique to access ALL the data in a DMS is through some master account. This works well when everything in a system is considered shared, or at least sharable, and therefore accessible by a single user.
However, a lot of cloud systems support the idea of each user having their own personal area, accessed only by the owner. Prominent examples include OneDrive, Google Drive, Box, and most other file sharing apps. Obviously, making each user authenticate for some background process isn’t practical.
What we're seeing in these cases is the concept of a system account. These are typically set up and enabled by an administrator at the highest level, and are specifically meant for service applications like SeeUnity's. Once SeeUnity is provided with this special account, all user personal areas, as well as shared areas, can be accessed in the background. Note that this is slightly different from impersonation, where one account is granted access to act as multiple users.
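One common way vendors expose a system account programmatically is the OAuth2 client-credentials grant, where the application authenticates as itself rather than on behalf of a user. The sketch below builds the generic RFC 6749 request body; the actual endpoint, scope names, and any extra vendor-specific fields are assumptions that vary per platform.

```python
from urllib.parse import urlencode

def system_account_token_request(client_id, client_secret, scope):
    """Build the form-encoded body for a client-credentials token
    request, one common mechanism behind vendor 'system accounts'.
    No user redirect or sign-in page is involved, which is what
    makes it suitable for background services."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
```

The resulting token represents the application itself, with whatever admin-granted reach the vendor allows, which is how a background process can reach every user's personal area without each user authenticating.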
I get the feeling a lot of development teams have had some trouble getting used to a new way of interacting with cloud systems programmatically. In the past, you would take a combination of a marginally documented API, access to the database, maybe access to the file system, a look at system logs and performance counters, and manage to figure it all out.
Some vendors even used direct database access to update data directly, like updating metadata or access control lists. SeeUnity has always avoided this for a variety of reasons, including potential future changes in data schemas, unintended side effects in the core DMS system, and possible vendor support issues.
With cloud systems, direct access and updates essentially go away. Your sole interaction with the system is through a user interface and an API. The reliance on the API for all programmatic access means it has to be well documented, it has to be fast enough for high-volume applications like the ones from SeeUnity, returned error messages have to be robust, and support from the API vendor has to be competent.
The silver lining in all of this is that there's a lot more standardization of APIs centered on the REST model. A really good example is date-time format and meaning. In most REST APIs, a date-time value is represented in the industry-standard ISO 8601 format and expressed in UTC. There are plenty of other examples of this standardization.
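As a small illustration of why that standardization helps, an ISO 8601 UTC timestamp from any such API can be parsed the same way regardless of vendor. The timestamp value here is made up for the example.

```python
from datetime import datetime, timezone

# Typical REST API timestamp: ISO 8601, UTC, with a trailing "Z".
raw = "2024-03-15T09:30:00Z"

# datetime.fromisoformat accepts "+00:00"; older Python versions
# don't accept a bare "Z" suffix, so normalize it first.
dt = datetime.fromisoformat(raw.replace("Z", "+00:00"))

# The result is a timezone-aware datetime in UTC, directly
# comparable with timestamps from any other ISO 8601 API.
assert dt == datetime(2024, 3, 15, 9, 30, tzinfo=timezone.utc)
```

Compare that with the legacy world of per-vendor date strings and ambiguous local times, where every connector needed its own parsing logic.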