Tribune Media rebuilds IT from the ground up and is living the dream

With everything software controlled, workloads can be moved at will, and VMware’s NSX offsets the need for more than $1 million of network gear

Tribune Media CIO David Giambruno got the rare opportunity to build IT for a $2 billion company from the ground up after the Tribune Company was split in two in 2014. The split created Tribune Publishing and Tribune Media, the latter now the largest independent broadcasting company, with 42 TV stations, a movie studio and divisions involved in everything from sports metadata to real estate. Giambruno lives in Raleigh, North Carolina, and has offices in Raleigh, New York and Chicago. Network World Editor in Chief John Dix recently talked to Giambruno about what is possible when you’re given a blank sheet of paper. Perhaps not surprisingly, the answer resembles the fabled Software Defined Data Center.

Tribune Media CIO David Giambruno

It must have been a monumental job to divvy up a company while it was still in flight, so to speak.

When you’re splitting a company you have to keep it running, so there was a decision made that Publishing was going to keep all of the legacy [gear] and we would have a transition services agreement to support us in the interim.  Then essentially I got to build greenfield. I had to consolidate all our “stuff” from 54 data centers into something new. There are very few times in your career you’re totally unencumbered, so I looked around and said, “What can we do?”

We knew we had to build the traditional backend to support the services side of IT, so DNS, email, all of that stuff no one ever sees. There’s no cosmic joy in backend systems. Everybody just expects them to work, like electricity. To visualize the job of splitting the company I used a grilled cheese sandwich metaphor: It looks nice and neat, but when you cut it and pull it apart you get all the gooey, messy stuff. Every corner-cutting, field-expedient fix from the last 30 years comes back to haunt you. The only downside of the metaphor was getting the picture for the PowerPoint. My 8-year-old daughter and I had to make five grilled cheese sandwiches to get the right shot, and I had to eat them all.

But the fun was figuring out how we were going to build that next generation platform.  How were we going to keep costs down, how were we going to automate it and avoid boxing ourselves in? Technically that meant building a platform that could adapt, consume, scale, eject and execute with predictable precision using cloud, containers, XaaS and whatever Silicon Valley throws out for the next several years.  Financially it meant disconnecting capability and cost.  If I want to add five things I don’t want to have to pay five bucks, I want to pay two bucks. 

So we decided to build a private cloud since we only had five months to split everything.  The first target was to get to 90% virtualized from 60% so we could move everything.  And one of the first things we did was a bakeoff between VMware and OpenStack.  We had 26 people doing OpenStack (because it was cool) and four people working on VMware, and the goal was to get 1,000 servers running inside of a month. At the end of the first week the VMware team was done and dusted.  At the end of the month the OpenStack people still had nothing.  Choice made.

So we started the process of lighting up VMware and migrating our applications. Essentially we were running two horses: one being the infrastructure and all of its services, the second moving all of the applications. The team building the entire infrastructure backend was only nine people. That was it. I’m incredibly proud of them.

I presume you were migrating to an x86 hardware environment?

All x86. No mainframes, AS/400s, etc. It’s all Wintel platforms, and only five physical servers don’t run virtualized. Otherwise we’re running roughly 1,200 virtual servers on 79 physical hosts.

When we were getting ready to build the network, one vendor quoted me $1.5 million to do my core. We ended up going with NSX from VMware and $70,000 worth of Juniper in the core because, with NSX, I can use much less expensive stackable hardware, since much of the intelligence for logical redundancy sits in software. Juniper also has the best XML parser.

And if I understand it right, one of the reasons you could consolidate so much of your compute resources was because you offloaded some workloads to the cloud?

Right.  I don’t have PeopleSoft Financials or PeopleSoft HR anymore.  Those are 800-pound gorillas and we replaced them with Workday Financials and Workday HR, which are SaaS services, and Anaplan for budgeting and FP&A [Financial Planning & Analysis]. So that huge erg of horsepower that took up a chunk of the data center is no longer onsite. In raw numbers, about 80% of our applications are still on-prem and 20% are in the cloud, but in a compute sense we’re about 50%/50%. 

Literally, this is like a $2 billion startup, or re-start. And we did the whole thing in five months. Management’s assignment to me was to build a “Frictionless enterprise,” which is a very succinct and clear goal.  The power in that vision is the focus it enables me to give my team.

People say, “Wow. You did all this?” But it is like Captain Kirk and Star Trek’s Kobayashi Maru test. I say, “Yeah, but I cheated.” It’s just what is possible now. I did not have the baggage of a legacy enterprise and I had a clear purpose of mission. The outcome was like going from the Flintstones to the Jetsons.

We set up the environment, got everything running and ready by the end of May, 2014.  We went live August 4, moved all apps, and collapsed 54 data centers onto seven racks with nine help desk calls.  It’s one of these funny visuals because we’re a pretty big company.  You walk into my data center and expect to see rows and rows of stuff; it’s literally seven racks.  I have to take people in and show them, saying, “Really, that’s it … My data center that got caught in the dryer.”

The magic of an internal cloud, to me, is that all of my data is in a single place. All the other benefits pale compared to the ability to have all my data in one place and to copy it at will anywhere, giving me what I call indiscriminate compute. As we put in our API layer, it becomes really easy to move information in and out, to control that information. We’re still going through the whole micro-segmentation piece, but the ability to wrap our data in a common security profile and push that out externally changes the operating metaphors.

I use the term indiscriminate compute, but it is really compute, storage and the network -- being able to move and extend that anywhere the business needs while knowing where it is, what it’s doing and who has access, so it still stands up to an audit.
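One way to picture a security profile that travels with the workload is sketched below. This is a minimal illustration, not NSX’s actual API; all of the type and function names are hypothetical, and the point is simply that policy follows the compute rather than a network location, which is what keeps the audit answerable.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SecurityProfile:
    """Policy attached to the workload itself, not to a network location."""
    allowed_ports: tuple[int, ...]
    owners: tuple[str, ...]   # who has access: the audit trail
    classification: str       # e.g. "internal", "regulated"

@dataclass
class Workload:
    name: str
    location: str             # "on-prem", "aws", "azure", ...
    profile: SecurityProfile

def move(workload: Workload, destination: str) -> Workload:
    """Move compute anywhere the business needs; the profile moves with it,
    so "where is it, what is it doing, who has access" answers the same way."""
    return Workload(workload.name, destination, workload.profile)

# Example: the same profile governs the workload on-prem and in AWS.
profile = SecurityProfile(allowed_ports=(443,), owners=("app-team",), classification="internal")
app = Workload("billing-api", "on-prem", profile)
app_in_cloud = move(app, "aws")
assert app_in_cloud.profile == app.profile
```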

If I want to take my internal servers and go to AWS or Azure or some other provider down the road, we’ll be able to do that.  We’ve already pushed some stuff to AWS as a test.  We just don’t have a need right now for public cloud because we have capacity and because of latency problems in public clouds.  That’s quickly going away but it’s still there.  I always joke, bandwidth is cheap, but latency is priceless.

Going back to NSX, did you discover anything along the way that NSX couldn’t do?

In the beginning the hardest part was not the technology, it was the ecosystem.  It’s fairly young, and the hardest part was getting other vendors to offer virtual instances of their physical hardware.  Everybody talks about virtual appliances, but it’s a big shift.  “I’ve been creating physical boxes forever and now you just want me to give you a piece of software?”  It is a total frame of reference shift for the suppliers and their business and revenue models.

We even had problems just getting SKUs.  So you really have to work on the ecosystem.   That was really the only frustration. The technology itself worked.  But we got to walk into it. We weren’t lighting up 50,000 nodes.  We slowly lit stuff up, we learned, we got better at it.  I highly recommend the crawl, walk, run approach … but it is eminently doable.  You do need the right people. I am blessed with an awesome team that loves the challenge and has that one key quality: curiosity.  Curiosity has to be fostered, and that comes down to leadership and supporting your team. 

We’re all about simplicity. I call it the Southwest of computing. Southwest Airlines uses one type of airplane, so they have one set of mechanics and one set of parts. So what I strive for is, get really good at a set of technology, own it and wield it and get the most out of it we possibly can.  Wield the technology.  I don’t worry about vendor lock-in because my threat is binary.  What I mean by that is, if you make us really, really mad, we’ll just take everything and rip it out and replace it.  We work very hard at getting really good relationships with our vendors.  But if it goes bad it’s divorce court.  You’re not going to lose 10% of my business; it’s all gone. 

So that’s the way I approach it because I think simplicity wins in the long run. Workday is similar in that everybody runs on the same version.  It’s like an apartment building where you get the same floor plan.  You can change your paint and your sink but pretty much everything else is the same.  We just went through a Workday 25 upgrade and I’m used to SAP and Oracle upgrades that take months of prep, lots of money and lots of consultants.  This was a team of eight, two weeks, a four-hour upgrade, you’re done.  You go, “Wow.  That was easy.”

The same metaphor applies to infrastructure now. We’re still bound by applications, but it is really about how you wield the tech: how you disconnect the cost, become scalable, and run the things that need to run in the background without anyone getting sticker shock from cost or effort at every change. The best compliment I get from the management team is that they don’t have to think about me.

Give us some perspective on how you’re benefiting here in the new world.

Before we split the company in half we had 585-ish people in IT, so you would assume I would end up with 200 to 300 people. I am running everything, infrastructure, apps, support and development, with 43 people. Even better, I didn’t have to fire anyone. Only 25 people transitioned from the combined company to Tribune Media.

Wow.

But the thing I look at most is business alignment, and I look at it really simply: It is your ability to do more with less. I am not revenue generating, but it doesn’t mean we can’t be innovative and engaging.  My team uses technology to give the business a competitive advantage: speed.  The more I can focus my team on delivering projects, the better the business is.

I still need the “bump in the night, worst-case-scenario” team, but the paradigm shift keeps reducing operational risk and enabling my team to work on business projects. This is quantified in the number of projects we deliver. The new capabilities enable us to get the infrastructure out of the way so we can do more. In 2014 I believe we got a little over 140 projects done. Year-to-date we’ve crushed it with over 245. And these are massive projects. We’ve built an entire backend for a company. We’ve built everything from shared services to big data. With the infrastructure out of the way, everything becomes easier. One of the best examples is the speed at which we deployed Workday Financials. We had the fastest go-live ever for a company our size.

What you see is people getting more done faster, and that changes people’s frame of reference about what can be done and how long it takes. One of our team’s rallying cries is, “Everything begins and ends with an IP address.” That combines with my team’s penchant for automation.

Mike Cannella, one of our cloud engineers, worked with Infoblox and VMware to create an awesome integration. Now it’s literally a click of a button to automate VM provisioning, birth to death. Click, and a VM gets an IP address that’s entered in DNS (depending on its naming standard: production, dev, test), it gets a life span, and it gets an owner. If it’s a test box, the owner starts getting nagged at 75 days to see if they still need it, and if they don’t respond it just gets deleted at 90 days. This is where the operating metaphor of a Software Defined Data Center shines.
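To make that birth-to-death flow concrete, here is a minimal sketch of the lifecycle policy in Python. The 75- and 90-day thresholds come from the interview; everything else, including the allocate_ip and register_dns stand-ins, is hypothetical and not the actual Infoblox or VMware integration.

```python
from dataclasses import dataclass
from datetime import date

# Thresholds described in the interview: nag the owner at 75 days,
# delete an unclaimed test box at 90.
NAG_AFTER_DAYS = 75
DELETE_AFTER_DAYS = 90

def allocate_ip(environment: str) -> str:
    # Hypothetical stand-in for the IPAM (Infoblox) side of the integration.
    return {"production": "10.1.0.10", "dev": "10.2.0.10", "test": "10.3.0.10"}[environment]

def register_dns(name: str, ip: str, environment: str) -> None:
    # Hypothetical stand-in for DNS registration keyed to the naming standard.
    print(f"DNS: {name}.{environment}.example.com -> {ip}")

@dataclass
class VM:
    name: str
    owner: str
    environment: str  # "production", "dev", or "test"
    created: date

def provision(name: str, owner: str, environment: str) -> VM:
    """One click: allocate an IP, register DNS, record an owner and a birth date."""
    ip = allocate_ip(environment)
    register_dns(name, ip, environment)
    return VM(name, owner, environment, date.today())

def lifecycle_action(vm: VM, today: date) -> str:
    """Birth-to-death policy: production is exempt; test and dev boxes age out."""
    if vm.environment == "production":
        return "keep"
    age = (today - vm.created).days
    if age >= DELETE_AFTER_DAYS:
        return "delete"      # owner never responded; reclaim the VM
    if age >= NAG_AFTER_DAYS:
        return "nag_owner"   # start asking whether it is still needed
    return "keep"
```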

How are you handling storage?

Storage is in major flux. One of my cloud engineers, Ben Gent, is really awesome at storage, and he gave me a PowerPoint titled, “How do I fire myself?” That obviously took a certain degree of courage, but it reflects the new operating possibilities in storage.

He goes, “Here’s what I want to do. I want to complete the virtualization stack with VSAN and set up a storage sandwich.  Pure Storage (flash) on top to address high performance needs, VSAN on cheap servers in the middle to support the bulk of our needs, and Cohesity on the bottom for backups, replication, de-duplication and recovery, and then we’ll automate the whole thing so the help desk can provision storage and give you millions back in savings over the next three years.”    

Can’t really argue with that. I asked him how long it would take, and it turns out he already had it running in the lab. So we are moving to a model where 25-30% of our capacity runs on flash, the remaining 70% runs on VSAN, and Cohesity, which is coming online soon, does the deduplication and disaster recovery.
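Viewed as a placement policy, the storage sandwich reduces to a simple routing rule. Here is a minimal sketch; the tier roles come from the interview, while the decision inputs are made up for illustration:

```python
def place_volume(is_backup: bool, iops_sensitive: bool) -> str:
    """Route a volume to a tier of the "storage sandwich".

    Tier roles are as described above; the inputs are illustrative.
    """
    if is_backup:
        return "cohesity"    # bottom: backup, replication, dedupe, disaster recovery
    if iops_sensitive:
        return "pure-flash"  # top: roughly 25-30% of capacity, high performance
    return "vsan"            # middle: the bulk (~70%) on cheap servers

# Example: a latency-sensitive database volume lands on flash.
assert place_volume(is_backup=False, iops_sensitive=True) == "pure-flash"
```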
