Over the years, I have had the opportunity to work on a number of large-scale content management projects. Frequent readers and listeners of the podcast know I spent a few years developing a research discovery system that included scanning tens of thousands of handwritten research and lab notebooks. Back in the mid-1990s, we had to cobble together various frameworks and tools that were in the early stages of development, and that made it difficult for us to provide a simple, easy-to-manage search system. Another problem was that the only way to really make the documents available was a tedious process: we had to scan each document, have librarians codify the material, have technical teams re-create hand drawings, and then upload all that content to a file server.
Process and content often go hand-in-hand
Fast-forward to today and there are companies that have solved many of the problems of getting your documents online, searchable, and discoverable. The interfaces have gone from clunky to usable, although there are often still a number of manual steps to getting a document online. Very often, research documents start in electronic form and are automatically uploaded and made available to those with the rights to search and view the content. Drawings and formulas are moving from sketches on a piece of paper to electronic form, with machine learning and handwriting recognition automatically converting content in real time.
That does not mean we have even come close to solving content management challenges. For example, I work with various clients that require me to complete the document, make sure it is properly formatted so their systems “understand” it, and only then upload it. In other cases, I have to fill out a form of some sort before uploading the document. It asks things like “what is the title of the document?” and has me pick from a list of categories such as “design specification” or “business plan,” and so on. What happens in those situations is that people tend to skip uploading the document and simply share it via email for others to review.
Very often, people need documents to do their jobs. Salespeople need to know they are opening the latest version of a sales deck. Government agencies need to know you have properly completed a form and followed the proper processes in order for that document to get approved so, for example, you can get a license. That means when you think about content or document management, you also need to consider that some documents have business-critical processes behind them, and we should not just think about documents as forms or Microsoft Word documents either.
Rethinking content, documents, and processes
If I stood in a room full of IT people and asked “what is content and document management?” I think most would answer with comments about it being a place to store important documents, categorize that content, and make it discoverable. That answer is similar to what I described earlier in this article, and that is often how I think of the topic as well. However, in speaking with John for this podcast, he reminded me that content is much bigger than Office documents and team sites.
For one visit to a doctor and for a single prescription order, there are dozens of manual and automated processes that take the place of forms, which are, at their core, documents. In trading you have transactional documents, tallying up to (I’m guessing) billions of documents a year. Those types of documents have to be processed in real time.
Completing a market trade may take some human interaction, no matter how small. Designing and building the process for completing a form and handling it properly can take programmers, information workers, and testers a long time. Then, after all that complicated work is done, how do you search for the content and unlock its value and potential?
The future of content management
John Newton and his company, Alfresco, are trying to solve some of the bigger questions and problems we face in managing content and business processes. We had a great conversation to define the content services market, learn how these systems operated in the past, look at how we are using them now, and then drill down into what is next. That “what is next” part is pretty exciting.
Stick around to the end of the podcast to learn how machine learning and artificial intelligence can be applied to solve billion-dollar problems, while not forgetting the simpler things in life, like vastly reducing the friction of sharing documents.
Thank you to John Newton for providing his time to discuss the content services market. Connect with John on Twitter @johnnewton.
Special thanks to Sara Black from BoSpar for making the introduction.