The Commonwealth Bank has slashed the time for operating system upgrades from 6 months to 10 minutes after mounting public cloud costs and disillusionment with conventional virtual machines drove it towards a bare-metal infrastructure built on OpenStack private cloud tools.
The bank’s commitment to OpenStack – which has also been adopted by institutions including Bankwest, Macquarie Bank and Melbourne University – has rapidly evolved in response to a growing realisation that the cost benefits of public cloud services begin to dissipate as scale increases.
“Once you get past 1000 servers and you’re running at a high utilisation rate, the economics quickly flip on you and they don’t make sense,” Quinton Anderson, head of engineering and platform products with CBA, told attendees at the recent OpenStack Australia Day in Melbourne.
“You have to focus really hard on costs that you previously accepted because of convenience.”
Virtual machine-based private cloud environments also pose technical limitations, Anderson said, noting that VMs “are uncomfortable for us” because the bank is “very I/O-centric and VMs don’t necessarily represent the best performance”.
VMs are possessive about the architecture they sit on, deal with partitions in a “very specific way”, and tend to be large and unwieldy; their structure also obstructs regular upgrades.
“We have a fair footprint in the public cloud but there are reasons not to put everything there,” Anderson said.
“Once you take out the operational response, the VM starts to feel like it’s maybe not the best solution. We want to go for variability and use bare metal [servers]; it’s quite easy to buy racks of hardware that I’ve specified myself, and have the capex as opposed to paying off someone else’s capex.”
CBA has progressively turned to OpenStack, a Linux-based ecosystem of open source cloud services whose popularity has expanded dramatically in recent years on the back of maturing tools such as the Trove database service, Kolla and Magnum container services, Murano application catalogue, Zaqar message service, and Ironic for automated bare-metal server deployments, among others.
Use of OpenStack was not just about copying public cloud giant Amazon Web Services (AWS), Anderson said, since that company “retains hundreds of software engineers and data scientists to optimise their networks under the covers”.
“That was not a situation we wanted to be in – so the choice was made, as far as possible, to not simplify our requirements but to simplify our network.”
The bank has chased that vision by combining those OpenStack tools with other open source components – including the YARN resource manager, Mesos distributed kernel, Marathon container orchestrator, Docker containers, Vault secrets management, and Calico microsegmentation – to build a scalable infrastructure with hundreds of x86 servers supporting a big data environment that integrates tools such as Teradata data warehouses, Hadoop analytics, and IBM InfoSphere DataStage ETL.
The open source Apache Spark framework provides a technological base for the platform, which runs as an alternative to VM-based private clouds by loading operating system images directly onto the servers; a tiny kernel manages the interactions between servers.
CBA’s highly regulated, risk-averse operating environment has traditionally required the team to move carefully, favouring stability over upgrades. This created a conflict with the team’s architectural goals, which were predicated on flexibility.
“You need to be able to move the underlying platform forward because you’ve got to take on new application requirements,” Anderson said.
“Actual, tier-1 business systems depend on these things so you can’t just stand up a new app. You end up getting these really long upgrade cycles” that could see it take 6 months to roll out new OS instances, he said.
This has seen the team embrace the use of read-only images, which can be spun up within minutes in the OpenStack environment using the Ironic project, whilst ensuring that they are identical to previous production systems. Virtual routing and forwarding (VRF) functions restrict and manage the routing of data between systems.
“Ironic allows us to take some really unique stances around security,” Anderson said, including restrictions on folder access and SSH access, as well as the use of read-only file system partitions.
“If you can absolutely be sure about what’s running and you know it can’t change, you can do things like push it through version control processes and be sure that what’s running has been signed.
“This cryptographic trust gives you confidence on the application front. And while it gives you operational security attributes, it also gives you the ability to upgrade things.”
By pairing flexibility with appropriate security, OpenStack has helped the team reduce deployment time from 6 months to what is “effectively a 10 minute debug cycle”, and the team is working to get that down to “sub minutes”, Anderson said.
“Very complex tasks that would traditionally have been done by humans, the tooling is now doing on their behalf. Everything is codified.”
CBA’s success with OpenStack reflects a growing tide of interest in high-end environments, where large businesses are realising that they risk being overwhelmed by the demands of big data, analytics, security, the internet of things (IoT), and other workloads.
Demand for scalable, cost-effective bandwidth is driving that interest, OpenStack Foundation executive director Jonathan Bryce told attendees in his keynote speech.
“It’s about being able to have public cloud anywhere you need it,” he said.
“People used to think of cloud solely as virtualisation with extra services, automation, and elasticity. But now it’s much more nuanced in the way they see their deployments and bare metal has made a resurgence.
“When you look at the amount of data we produce in our environments, it’s clear that centralised data centres are not going to work as an architecture to handle all of that going forward.
“Thinking about how we handle processing closer to the sensors and devices generating that data will be an important step in the next five years.”