DevOps The Hard Way
A tale of frustration. But I won.
Dateline March 2017
I'm going to be doing more management and 'glue' business the next year or so. Part of this business is selling and personifying the value of DevOps. Like Cloud, this is something that is insufficiently understood at a deep(er) and nuanced level. So, as is customary, I'm going to tell a story.
The story, like most of mine, comes from an experience that burned. Something that left scars and had me wondering how people could get into this mess. And so, like the man says, share your scars. The situation was that I was on the very cutting edge of what I could do with Essbase + Essbase Studio. The requirement for drill-through was fairly obvious. We all knew the limits of how much data we can squirrel into a multidimensional cube. So I used Essbase Studio to map back to the Oracle DB and bring back some records. Now it turns out that my customers wanted something on the order of 10,000 records in this detail. Well, that doesn't seem like much. You could grab 10,000 records across a dozen columns, cut and paste them from one Excel spreadsheet to another, right? That should only take a few seconds. Not the drill-through. That data had to come over the network. Well, you could copy a spreadsheet with a dozen MB of data from a network drive to your desktop, right? That should only take a minute. Not the drill-through. We had to fulfill a query request from a database.
My queries were taking 7 minutes and 30 seconds and then dying.
I had to find out why. Thus began my painful birth into DevOps. The first thing I had to learn was the difference between a view and a materialized view. Well, that wasn't so difficult to learn. But I had always assumed that my DBA was materializing data for me. Well, he didn't have enough disk space to do that for ad-hoc queries. So that meant I had to learn the procedure for requesting new disk space from the DBAs. How much did I need? I don't know. A terabyte? Impossible! Impossible? I can go to Best Buy and get a terabyte. Yeah, but one live terabyte means four other terabytes according to our backup and DR, and we're at the limit of the current server, which means we'd have to get a SAN device and... well, how about an NFS drive? Nope. Can't have an NFS drive; that would slow down everything. I need local storage. Well, we'll get back to you. But how can you be sure that the database is the bottleneck? I don't know.
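For anyone who hasn't hit the view versus materialized view distinction yet, here is a minimal sketch in Python using the standard-library sqlite3 module. SQLite has no materialized views, so I'm emulating one with CREATE TABLE ... AS SELECT; on Oracle the real thing is CREATE MATERIALIZED VIEW. The table and column names (ledger, region, amount) are made up for illustration and have nothing to do with the actual project.

```python
import sqlite3

def build_demo():
    """Create an in-memory database with a plain view and an emulated materialized view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO ledger VALUES (?, ?)",
        [("west", 100), ("west", 50), ("east", 70)],
    )
    query = "SELECT region, SUM(amount) AS total FROM ledger GROUP BY region"
    # A view stores only the query text; the query is re-run on every read.
    conn.execute(f"CREATE VIEW v_totals AS {query}")
    # Emulated materialized view (assumption: SQLite stand-in for Oracle's feature).
    # The result rows themselves are stored, which is what costs storage.
    conn.execute(f"CREATE TABLE mv_totals AS {query}")
    return conn

def totals(conn, name):
    """Return {region: total} read from the named view or table."""
    return dict(conn.execute(f"SELECT region, total FROM {name}"))

conn = build_demo()
# New data arrives after the materialized copy was built.
conn.execute("INSERT INTO ledger VALUES ('east', 30)")
print(totals(conn, "v_totals"))   # the view recomputes and sees the new row
print(totals(conn, "mv_totals"))  # the stored copy is stale until it is rebuilt
```

The stored copy answers quickly because the work was done up front, which is exactly why it eats disk and has to be refreshed.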
I had to find out where. What is timing out? Was it the Excel add-in? No. Was it the Java middleware? Maybe. Who knows how to read the profile of the Java middleware? Well, there's no documentation for that, you'll have to call the engineers at Oracle. OK. Open up a service request and get an appointment. Who has access to the middle tier? Get access to the middle tier so you can log on. Oh, by the way, the one support engineer is in Mumbai. That means you stay late, past 7pm Pacific time, to get your answers when he's available. OK, change the profile, add in this line for the timeout. That didn't work? Oh, you have to get the latest patch. Will it work with the version of Essbase Studio we're running here? Oh snap, we're going to have to burn a new version for you, but you're going to have to upgrade your Java app server. OK, now the explicit timeout is 15 minutes.
Still times out.
I had to find out how. What is the mechanism that creates the timeout? Get this new tool called Fiddler, it will help you debug the HTTP stream. Debugging HTTP streams? Well, maybe it's the size of the download that's stopping things. OK, did that. It's not the size. Well, the corporate standard timeout is 10 minutes. What corporate standard? The corporate standard on the firewalls between the users and the data center. Well, can we get an exception? Maybe.
So it basically took six weeks for me to deal with the various network engineers, database admins, support staff and their management, and to prod them all to buy what I was trying to sell, which was the viability of this entire project. My only leverage was that I was consistently riding herd on the problem and I was a very expensive third-party contractor. The project was late, and the mounting cost of defending business as usual in the various departments was the only thing that motivated people to go to extraordinary lengths to solve the problem. Everybody wanted the problem to be somebody else's problem. And until we found out exactly what the problem was, everyone kept pointing fingers, right up to the last possible minute. It turned out to be a default in one of the load balancers: everyone assumed it was set to 10 minutes, but it communicated 7.5 minutes as an override to the other. Those machines required firmware upgrades as well.
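The lesson generalizes: a request that crosses several hops dies at the shortest timeout anywhere on the path, no matter what the others are set to. Here is a minimal sketch in Python using the numbers from this story. The hop labels are my own shorthand for illustration, not actual device names or configuration keys.

```python
def effective_timeout(hops):
    """Return the (name, minutes) pair for the hop with the shortest timeout.

    hops: list of (name, timeout_in_minutes) pairs along the request path.
    """
    if not hops:
        raise ValueError("need at least one hop")
    return min(hops, key=lambda hop: hop[1])

# Hypothetical labels; the minutes are the ones from the story.
path = [
    ("java middleware (explicit timeout)", 15.0),
    ("firewall (corporate standard)", 10.0),
    ("load balancer (override)", 7.5),
]

name, minutes = effective_timeout(path)
print(f"{name} cuts the request off at {minutes} minutes")
```

Raising the middleware timeout to 15 minutes could never have helped, because it was never the smallest number in the chain. The hard part, of course, was that nobody had the whole list in one place.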
I have been accustomed, throughout my entire career in BI, to being responsible for the entire data supply chain. That I could do. But middle-tier service configurations, firewall settings and DR disk availability were all above my pay grade. I was not paid to know and I was too expensive to be paid to learn. In that way, I'm accustomed to being like the wily developer whose time is too valuable to waste learning these operational details. At the same time, I was equally demanding of all those dependencies. Give me more memory on the app server! Open up the damned ports I want! Get more disk, you lummox! Of course, let me not forget the memory constraints on the end user machines.
All of this was a terrestrial implementation and it had other setbacks too, but it was a fascinating six-month engagement. I of course learned a lot about these other systems with respect to how they affected my entire piece of the data warehousing applications. I sensed that I had the capacity to understand, but I'd never remember unless I had some responsibility and permission to make changes. That would be impossible without the cloud. But even when I had the cloud, it took more than just having control of the associated systems; it took really understanding how they worked. That's a story for another day. What was clear was that it was very difficult to manage all of the departmental areas, and get the priority within those departments (at their various locations) to solve a showstopper problem in this one application. It was 2011 and we were testing the very limits of the IT capabilities of a global corporation. DevOps might be a cool thing to talk about with web startups, i.e., a DevOps engineer would be cool for your website, but I saw the fundamental management problem that had everything to do with the way multimillion-dollar enterprise applications were built and maintained, and essentially why they were one-shot deals.
How Essbase Helped
Staying on the edge prepares you for the next thing. It's not always a revolution.
For a long time, I was an Essbase guy. Like since before `my.yahoo.com`. I'm writing about how I came to understand the importance of DevOps as a kind of testimony about a culmination of the lessons of my prior positions. If you don't know, Essbase was voted one of the top 10 compute technologies of the 2000s by Information Age. And I quote:
The multi-dimensional database technology that put online analytical processing (OLAP) on the business intelligence map. Developed by Arbor Software (now part of Hyperion Solutions), it spurred the creation of scores of rival OLAP products – and billions of OLAP cubes.
There were two essential qualities to Essbase implementations that spoiled me and gave me a particular set of instincts that serve me now.
Essbase was important, and it was disruptive.
Essbase was always in production.
Let's talk about the second thing first. Essbase was built with the understanding that business rules were always going to be changing as business changed. So Essbase employed a semantic layer that could be edited in real time and then applied to its data models immediately. While it took some expertise to understand the batch time implications of any particular code, Essbase never supported a kind of migration framework to move its metadata from DEV to QA to PROD. This annoyed a lot of people, especially those fun folks who were the DBAs and ERP managers whose data was upstream from our Essbase cubes. Moreover, Essbase itself made the making of multidimensional models easier and cheaper than those built with traditional older tech. So the very idea of having multiple business models was part and parcel of this. So: multiple OLAP cubes, deployable in production, each with multiple scenario dimensions, and the facility supporting dynamic definitions of data-driven business rules. It was, quite frankly, more flexibility than most finance and planning organizations could wrap their heads around.
These days we call that kind of adaptability in production systems 'Continuous Deployment', which is a DevOps term. Now what Essbase did not have was Continuous Integration with systems outside of its aegis. But it did have tight semantic integration with its 'Essbase Ready' clients. Which meant that if master data changed in the database model, no additional changes had to be made to the reporting components. I am not aware of any next-gen databases that do that. That ability was part and parcel of the Essbase server engineers designing an API for their clients rather than just a pipe of text.
This combination of deployability and tight integration allowed us Essbase guys to grab data, get it into the model quickly and evolve the model right in front of our customers. That was why Essbase saved so many lost data warehouse projects. We surfaced changes quickly. We allowed our customers to explore data immediately and made multiple looks of flowing data happen on a daily basis, by design.
--
Essbase's importance and disruptive nature may sound subjective but it was not. I had the privilege of working for a hot Valley company that proved that it could IPO and make money. So when people decided to buy our database, it was a big deal and a high priority project. More often than not we were swapping out systems to the benefit of the financial guys and amidst the howls of IT. So when we in the field service group were assigned to get it up and running, the hot visibility was on us. Essbase lived up to the hype. In 3 years I never lost a trial.
The point is that when companies decided to buy Essbase, they went all in. It wasn't an experiment. It had to replace the company's financial reporting, all of it, or it was a failure. Essbase was originally not scalable to retail solutions, and that's one reason competing companies like MicroStrategy were able to survive. But for the right fit, our customers swore by it. It changed the way people looked at IT because it just worked and it eliminated a large number of problems. These benefits were clearly presented up front before companies made their purchasing decisions. They bought the technology with the understanding that it would change their behavior, make business processes faster and improve transparency. It wasn't about having Essbase because all the cool kids had Essbase. It wasn't a market-sweeping phenomenon. After all, we were competing in a crowded market. But for people who could see how this technology could improve their business, the way was clear.
I cannot say that what I learned technically using Essbase or any of the BI tools themselves helped me understand the broad variety of open source technologies that are associated with DevOps. Those were subjects that required their own mastery. But these two phenomena having to do with the effect of the right technology on the business were instructive in my understanding of the conflicts and benefits that would arise upon implementation. I now know the effects of these changes in the organization and what kind of management approaches are necessary. In short, I have the experience of breaking ITIL standards and old habits of getting things done in medium and large organizations. Needless to say, you cannot simply drop technology into production and expect behaviors to change, but when you understand how functional behaviors need to change for the good of the business, it goes a long way in gaining acceptance of radical tech and process change.
PS. You can check, but I should be the person who first created the Essbase entry on Wikipedia.



