Sunday, June 28, 2015

Some thoughts on planning




Anyone working in the IT industry has seen, heard or said this phrase more than once. Unfortunately, if you are in operations, then someone else's inability to plan more than likely DOES become your emergency. This remains true even in companies preaching DevOps. The larger and more complex the company, the greater the chance that there is a team somewhere, whether it is called Operations, DevOps or something else, whose workload is driven almost entirely by interrupts.

These teams often adopt Kanban as a means of making the problem visible, but Kanban alone won't fix it. Kanban highlights the demands being made upon an operations team and can be useful in improving the workflow, but the real problem is in planning.

DevOps at its heart is about collaboration between teams, avoiding silos etc. In a few startups DevOps results in single teams made up of developers and operations people who are in on the plan from the beginning and everyone gets to celebrate at the same time due to a common understanding of "done". At its best DevOps encourages an understanding that a feature is only completed when it is running in production and in the hands of the end user.

In far too many cases, however, companies proclaiming they are doing DevOps have not redefined "done". In these companies, development teams consider work done when the code builds successfully, or when QA has signed off on it. The task of making this run in production is not considered part of the development process. As a result, what is required to make and keep this running in production is often not part of the plan.

The need for planning is certainly understood by software development teams and forms an integral part of agile methodologies such as Scrum. Developers would revolt if they spent four hours in a planning meeting only to be told three days into a two-week sprint, "forget the plan, something has come up and we need you to do this instead." Yet it is considered perfectly acceptable to assign tasks to an operations team towards the end of a two-week sprint for work that could easily have been predicted in planning.

DevOps Planning


Clearly there is a need for better planning. Many organizations seek to integrate operations people into development sprint planning sessions. This can help, but too often discussion focuses on feature requirements from a code perspective and doesn't leave room for considerations of infrastructure. It is often discovered late in the sprint that a new feature requires a change to the load balancer, for example, and this becomes an emergency change for the operations team. To allow operations teams to plan and to minimize interrupt-driven work, it is necessary to ensure that they are involved in the earliest stages of design discussions.

New features need to be considered in their entirety. Design discussions cannot be limited to code implementation. How does the new feature integrate into the application as a whole? What configuration changes are required? Are there new ingress/egress points? Can the existing infrastructure serve the new feature and meet performance requirements? These are all questions that need to be considered as early as possible if we are to make our operations teams more proactive than reactive.





Sunday, June 21, 2015

Monitoring and Visibility

I have been thinking a lot recently about the role of monitoring in a DevOps environment versus traditional operations.

Monitoring has traditionally been the purview of Operations. To the extent that Development was involved in monitoring, it was to provide the application metrics that could indicate a problem. These would then be sucked into a monitoring platform managed by operations, and appropriate alert thresholds set up with the necessary run books for how to respond.

With the evolution of DevOps, production support is often a joint responsibility. Lots of tools have emerged, providing more insight into the application from a user or code perspective, and these metrics can be integrated into the monitoring and alerting platform. It is not uncommon to have different escalation policies for different types of alerts, some going to developers first, others to ops first and frequently passing between the two.
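As a rough illustration of what per-alert-type escalation might look like, here is a minimal Python sketch. The alert type names, team names and the `next_responder` helper are all hypothetical, invented for this example; real alerting platforms have their own configuration formats.

```python
# Hypothetical escalation policies: each alert type lists the teams to page, in order.
ESCALATION_POLICIES = {
    "application_error": ["developers", "operations"],
    "infrastructure": ["operations", "developers"],
}

# Assumed fallback for alert types that have no explicit policy.
DEFAULT_POLICY = ["operations", "developers"]


def next_responder(alert_type, already_paged):
    """Return the next team to page for this alert, or None if the chain is exhausted."""
    chain = ESCALATION_POLICIES.get(alert_type, DEFAULT_POLICY)
    for team in chain:
        if team not in already_paged:
            return team
    return None
```

An application error would page developers first and pass to operations only if it goes unacknowledged, while an infrastructure alert would travel the other way.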

This collaborative effort can lead to a much quicker resolution of production issues, but it is still reactive. How can the monitoring tools we implement help teams proactively prevent production issues?

Visibility is often talked about as an integral part of DevOps. Tools such as Ansible or Puppet pride themselves on giving a clear view into how systems are managed. Code Coverage tools provide visibility into testing. Continuous Integration systems provide visibility into build stability. All of this helps to increase confidence among those working together to build and release systems that those systems will meet requirements: functionality, performance, security and stability.

Monitoring Tools should be seen as an integral part of this effort to provide visibility. They should present information in a simple visual format that can inform the development process.

Developers working on a new feature for example can check systems metrics to watch for issues such as a spike in CPU or memory that may indicate code inefficiencies not covered by unit tests. User simulations can indicate problems with response times deep within a web application. Are these the result of new code, or are we just seeing them now because we are looking more closely? Either way it is probably something we want to fix before releasing to production.

Taking benchmark snapshots with each release is important. Measuring against these benchmarks can be incorporated into the unit testing. When we are confident that we have a good understanding of what optimal performance looks like for an application, we can break the build if code changes degrade performance.
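To make the idea concrete, here is a minimal Python sketch of a test that fails the build when a release is noticeably slower or hungrier than the previous snapshot. The `Benchmark` fields, the 10% tolerance and the hard-coded numbers are assumptions for illustration; in practice the baseline would be loaded from wherever the release snapshots are stored.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Benchmark:
    """Performance snapshot for one release (hypothetical metrics)."""
    p95_response_ms: float
    peak_memory_mb: float


def regressions(baseline: Benchmark, current: Benchmark, tolerance: float = 0.10) -> list[str]:
    """Describe every metric that is more than `tolerance` worse than the baseline."""
    problems = []
    for name in ("p95_response_ms", "peak_memory_mb"):
        before = getattr(baseline, name)
        after = getattr(current, name)
        # Higher is worse for both metrics, so flag anything above the allowed margin.
        if after > before * (1 + tolerance):
            problems.append(f"{name}: {before} -> {after}")
    return problems


def test_no_performance_regression():
    # Illustrative numbers only; a real test would load the stored snapshot.
    baseline = Benchmark(p95_response_ms=200.0, peak_memory_mb=512.0)
    current = Benchmark(p95_response_ms=210.0, peak_memory_mb=520.0)
    assert regressions(baseline, current) == []
```

Run as part of the normal test suite, a failing assertion here breaks the build in the same way a failing unit test would.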

Thus our monitoring tools move from something that becomes important at the end of the process -- when code is deployed to production -- to an integral part of the development process itself, in true DevOps fashion.




Sunday, June 14, 2015

Defining DevOps



In this blog I will talk about all things DevOps. As the blog title suggests, these are my own views on what constitutes DevOps culture and practices. So it seems appropriate in this first post to present my own definition of DevOps.

For me DevOps describes best practices that have actually existed in the most successful agile companies for a long time. It is the process of removing walls between departments and ensuring a close collaboration between operations and engineering departments involved in the creation of systems.

What has become known as DevOps culture or practice is actually central to the success of agile development.

Agile, like the term DevOps, can mean all things to all people. According to the Agile Alliance:
"In the late 1990’s several methodologies began to get increasing public attention. Each had a different combination of old ideas, new ideas, and transmuted old ideas. But they all emphasized close collaboration between the programmer team and business experts; face-to-face communication (as more efficient than written documentation); frequent delivery of new deployable business value; tight, self-organizing teams; and ways to craft the code and the team such that the inevitable requirements churn was not a crisis."
Taking this definition of Agile, DevOps becomes a logical extension of Agile. Whereas the Agile Alliance speaks of “deployable business value”, DevOps speaks of “delivering operational systems.”

The challenge is to bring operational skill sets to projects earlier in order to encourage development to think about what is required to build scalable systems that can be managed operationally.

Hiring for DevOps


Coming from the operations side of DevOps, I am seeking to recruit ops guys who “get” DevOps. It has become almost obligatory to include the term DevOps engineer in a job description, if not in the actual title, in order to differentiate from the perception of old school sys admins who act as the gate keepers of production, waiting for code to be thrown over the wall to be deployed in their Kingdom.

A good candidate for working in a DevOps culture is not simply someone who knows Puppet or Chef or other tools. Chances are the right candidate will have a good tool set, but that’s not my main criterion for hiring. A good sys admin will have actually been using the main practices contained within the DevOps model without necessarily applying a name to them: breaking down walls between operations and engineering, automating processes, advocating continuous integration and frequent releases. These are all things that make a sys admin’s life a little less painful.

When recruiting developers for a DevOps environment, experience with configuration management and even specific tools, Puppet, Chef etc., becomes important. Also important is an understanding of a collaborative workflow: git pull requests and discussing with operations early in the development life cycle. These ways of working are often far newer to the dev side of the house than they are to operations.

Recruiting the right people on both sides of DevOps is the single most important factor in a successful implementation of DevOps. When tasked with introducing DevOps in an organization that is fully staffed, look for ways to bring in "new blood". In any company there is a natural turnover of staff. Take advantage of this to introduce the right type of people to the teams. These new forces, whether on the ops side or development, can then become advocates for DevOps within their own teams.

Culture


Many companies, particularly start-ups, spend a lot of time seeking to define company culture, values etc. Often this involves a top-down approach. Senior executives come up with some set of values that define a company culture that is then to be adopted by the ranks. For more successful companies, those values develop organically out of the day-to-day collaboration of people with shared interests and goals. It is no different with DevOps culture. It is extremely helpful to have executive buy-in to a DevOps program and support for the ways of working that go with it. But trying to define a DevOps culture top-down is a mistake. If we are really adopting DevOps methods and practices, the culture will flow from that.

The DevOps Team


Many blogs and presentations advise “Do NOT call your team DevOps or yourself a DevOps manager”. This arises from a couple of major concerns.

DevOps is a culture and as such requires buy-in from the entire company. No team has the single responsibility for introducing a new culture.
There is a very real danger that a DevOps team becomes just another silo between Dev and Ops.

These are very valid concerns, and if you are building DevOps in a new startup I would strongly encourage you not to call a team DevOps. Implementing DevOps in a large enterprise with dispersed operations teams who have traditionally seen themselves as production gate keepers, all the while using DevOps terminology without understanding it, is a little different. In such an environment, introducing a DevOps team made up of people who have "done DevOps" before can have some advantages.

There can be some value in labeling a team DevOps in order to emphasize an outlook and commitment to DevOps practices. The team can establish itself as a center of excellence for DevOps best practices across the company. The name does not automatically make it a new silo, any more than not using the name guarantees against silos.

Measuring DevOps adoption


Companies are often asked what their current state of DevOps adoption is. In some cases they attempt to answer this in the manner of a check list: Are we doing CI? Do we have configuration management? And so on.

I don’t think this approach gets us very far. A better question would be, how close are we to solving the problems we are seeking to solve by adopting DevOps? For me, DevOps is not about everyone getting along and dev and ops hanging out at the bar together. That’s often useful and desirable but more importantly what we are trying to do is work together to solve the problems which call for a DevOps culture and practices.

I’ve seen various approaches to “implementing” DevOps, ranging from management sub-committees that discuss endlessly what DevOps means for a given company and what it should be doing, to guerrilla movements by devs or, more commonly, ops people to enforce change from below. What’s the correct route? Whatever works for you. It’s all about breaking down barriers to get the job done.

There are many ingredients to moving to a DevOps culture but to really succeed with DevOps, I believe we have to define goals. What are we trying to achieve with the DevOps program?

My goal is to deliver scalable systems that are supportable in production. This is something that programmers, QA testers, ops people and even product managers need to be on board with and play a role in, for DevOps, and ultimately the entire company, to be successful.