opsschool-curriculum/soft_skills_101.rst
Piotr Pies Ostrowski 52b08b7028
soft_skill_101 - time menagment wiki moved
Tom wiki moved from Google Code to Github - Code was killed by Google
2019-11-17 13:13:35 +01:00


Soft Skills 101
***************
The term "soft skills" seems to imply that these skills are somehow less
important than technical skills. In reality, soft skills are often
specifically sought-after by hiring managers. These skills are also
important for operations people seeking to advance to senior
engineering positions.
As much as technical people would like to believe that operations is a purely
technical profession, it is really about serving people. Operations, as the
title indicates, is about making things work for people. Operations people
design, build, and maintain services for people to use. It is all in a day's
work for an operations professional to translate, educate, inform, reason,
persuade, and generally act as a liaison between technology and the people who
use it.
Soft skills at the 101 level encompass communication skills, time
management, project management, and a basic understanding of DevOps
from an operations perspective. Soft skills at the 201 level lead
into general business skills including positioning, budgeting and
the financial process, using metrics effectively, demonstrating
impact, risk management, managing customer preference, and thinking
strategically.
Communication basics
=====================
Audience analysis is the first step to effective communication.
Perform a basic audience analysis by answering some simple
questions:
* Is the audience technical or non-technical?
* How much do they know about the topic?
* How much do they care about the topic?
* What is the intended message for this audience?
Before sending one email or setting up a meeting, answer these
questions. People are inundated with communication from email,
voicemail, Twitter, social media, internal web/wikis, text, IM, and
meeting requests.
Communicating internally
------------------------
Internal customers could be people who use general computing and
call a helpdesk for support, the organization's software developers
or engineering team, senior management, or researchers, students,
faculty, or others. The type of customers depends upon the type of
organization and the industry.
Working with internal customers could be as simple as being the
"Ops" side of a DevOps team or it could mean supporting a wide range
of technologies used by customers at varying levels of technical
understanding.
When operations focuses on a specific project or works with a
specific team, such as engineering or software development,
communication is generally specific to that work. It can take on
the form of meetings, video conferencing, chat sessions, and emails
between team members. A communications culture tends to develop in
these scenarios as team members figure out the best way to coordinate
with one another.
When operations focuses on more general IT support, communication
becomes more complicated for the operations people. Factors
such as audience analysis play a larger role in successful
communication with customers. Operations faces a potentially
wide array of communications scenarios:
* Announcing outages to general staff in a large organization
* Announcing upcoming maintenance to a set of staff impacted by a service outage
* Broadcasting a technical idea to a non-technical audience
* Contacting internal customers impacted by a security issue or vulnerability (e.g. Run this update. Install this patch.)
* Asking middle management across the organization to weigh in on a potential service change
* Offering a seminar, workshop, or class to assist customers with a new or modified service for a general audience
* Offering a seminar, workshop, or class to assist customers with a new or modified service for a non-technical audience
* Presenting the service catalog in a question-and-answer session
* Meeting with senior management to address an operations problem, budget shortfall, request more resources, or propose an architectural change
* Meeting with customers to address service problems
* Meeting with specific groups of customers to collect requirements for a special project
* Requesting feedback from customers either individually or as a group
* Meeting with customers who are engaged in the subject matter
* Meeting with customers who are disengaged or in attendance because it is mandatory
This list spans a wide range of communication modes, communication types, customers, and outcomes.
* **communication modes** email, meetings, larger presentations, surveys
* **communication types** persuasive communication, instructional, informational
* **diverse customer backgrounds** management, administrative staff, technical staff, IT-savvy, non-IT-savvy, interested, disinterested
* **desired outcomes** management decision, increased understanding, increased abilities, increased awareness
Communicating externally
------------------------
Communicating with external customers can offer additional challenges.
If the external customers are customers of the organization, there
is the possibility that dealings with them could result in a
complaint to upper management.
Reduce complaints by considering how to communicate with these
external customers. When communicating about a service outage, consider
timing of the outage, duration, and impact of the outage on these
external customers. Are most external customers in the same time zone? If
so, then the maintenance window could be outside of traditional working
hours. If external customers include international people in
varying timezones, the outage window may be the one that impacts
core customers the least.
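
The window-selection reasoning above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer counts, the whole-hour UTC offsets, the three-hour duration, and the 9:00 to 17:00 working day are all hypothetical, and the helper names ``impact`` and ``best_window_start`` are invented for this example.

```python
def impact(start_utc, customers_by_offset, duration_hours=3, workday=(9, 17)):
    """Count customer-hours of the outage that fall inside local working hours."""
    total = 0
    for offset, customers in customers_by_offset.items():
        for hour in range(duration_hours):
            # Convert each outage hour from UTC to the customers' local hour.
            local_hour = (start_utc + hour + offset) % 24
            if workday[0] <= local_hour < workday[1]:
                total += customers
    return total


def best_window_start(customers_by_offset, duration_hours=3):
    """Return the UTC start hour (0-23) with the lowest working-hours impact."""
    return min(
        range(24),
        key=lambda start: impact(start, customers_by_offset, duration_hours),
    )


# Hypothetical customer counts keyed by whole-hour UTC offset.
customers = {-5: 100, 1: 40, 9: 10}
start = best_window_start(customers)
print(f"Start at {start:02d}:00 UTC, impact: {impact(start, customers)} customer-hours")
```

A real decision would also weigh which customers are "core" and any contractual commitments, which a simple head count does not capture.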
Communicate the timing of service outages with management. It is
best if management knows that external customers are about to be
impacted by operations. Include a justification for the maintenance:
why it is necessary, why this outage window, why this duration, the
plan B in case the outage goes beyond the outage window, and the method
of communication with external customers. Not all of these pieces of
information may be necessary if operations already supports
external customers on a regular basis.
There is significant breadth and depth required to effectively
communicate.
Communication Modes
===================
Let's start by covering the two most common modes of communication
for operations: email and meetings.
Communicating via email
-----------------------
Before sending that email to the entire organization, who really
needs to know the information? People already get a lot of email;
for most it is information overload. How many customers
already complain about too many emails? Don't get filtered. Make
communication count.
Here are some best practices when using email to communicate:
* Shorter is better.
* Make the subject descriptive (e.g. "www.co.com outage, May 10 - 6-8 pm")
* Put the most important information at the top of the message (e.g. deadline, action, outage dates). People generally skim the first few lines to determine if the information pertains to them. Starting with a lengthy background risks alienating people before they read the important part of the message.
* State the audience at the top of the email (e.g. "All Macintosh Users") to let them know the message is directed at them.
* Consider including a link to an internal site with a lengthier writeup if needed.
* Limit the recipient list to only those people who need or want the information (management, administrative customers, developers, people impacted.). Build a list if necessary to avoid spamming the entire organization.
Sometimes email is the best way to communicate and sometimes not.
Decide when to use email and when to communicate another way.
Consider email appropriate in some situations:
* Attempting to reach a large audience
* The message or action is simple
* The message needs to reach them now
* Need to document the distribution of the information.
* Following up on a previous conversation, request, or action
Consider email less effective in other situations:
* Conversing back-and-forth with people to define or understand a complex issue
* Creating something new
* Drawing it on a whiteboard would provide better enlightenment
* There is a potential for much confusion or questions about the issue
* Asking for a management decision on a technical issue from non-technical management
* Trying to teach people
Sometimes email can be used in combination with other methods:
* After a meeting, send an email to the attendees to summarize action items or decisions. This can be an important tool to remind management of a decision made months earlier.
* Announce the time and location of a seminar or training class.
* Share status of an action taken as the result of a discussion or meeting.
Some common effective uses of email include the following:
* Notification of outages
* Warn of IT security threats (e.g. raise awareness of increased phishing attacks)
* Document a decision made by management in a meeting.
* Document the outcome of actions taken
* Provide status on previous assignments
* Announce training, seminars, and presentations by operations
* Provide customers with a link to access a new or modified service
The dreaded meeting
-------------------
If customers think they get too much email, some of them also
think they attend too many meetings. Some people, especially
managers, have corporate calendars that resemble a game of Tetris.
Coordinating an effective and productive meeting follows a simple
formula.
**Have a purpose.** Need a decision? Need a decision
now? Need to inform? Need to persuade?
**Be prepared!** Consider audience and be prepared to
answer questions relevant to their interest in the topic. Some of
this is covered in more depth at the Soft Skills 201 level.
**Communicate at the right level.** Leave out technical jargon if
meeting with a non-technical audience. Consider simplified
explanations, diagrams, and framing the content to address concerns
that the audience cares about. Operations is the translator of technical
information when meeting with a non-technical audience. Take that
role seriously.
**Set a duration.** Decide how much time is needed to present the
topic and answer questions. Make it as short as possible. Some
organizations default all meetings to one hour.
**Consider what the audience gets out of the meeting.** Should
the audience increase their knowledge or understanding on the topic?
Maybe they have no interest in the topic but are the final
decision maker due to funding levels, type of money, policy, role
within the organization, or other factors.
**Stick to the agenda.** Do not let the audience take the meeting
off course. In a 1:1 meeting, the audience might ask for IT support
for an unrelated problem. Agree to put someone on the problem after
the meeting, then return to the scheduled topic. In a larger
meeting, audiences can go off on tangents into related or even unrelated
areas. Be prepared to steer the meeting back on topic.
**Summarize.** Summarize the outcome in the last
few minutes of the meeting. It can be good to send an email to
summarize decisions made in the meeting in order to document the
outcome.
Required meetings
^^^^^^^^^^^^^^^^^
Sometimes attendees are mandated to attend meetings:
* Committees where members are selected by the organization to represent a subset of people. Committees are often too large and unproductive. The saying "languishing in committee" describes this cultural phenomenon.
* Management meetings where all members of a management team are required to meet at regular intervals to review topics that may or may not be relevant to everyone in the room.
* Training where all employees of an organization are required to complete a minimum set of hours on a particular topic.
The operations person tasked with leading one of these types of
meetings may find a less than enthusiastic audience. Apply the best
practices above and attempt to make these meetings productive. Even
without being the chairperson, sometimes keeping a meeting on topic
and looking for areas to be productive can reduce inefficiencies.
Alternative meeting styles
^^^^^^^^^^^^^^^^^^^^^^^^^^
Meetings do not always require scheduling a conference room for an
hour or more and everyone arriving with a laptop or a legal pad.
Consider stand up meetings or even short 10-minute slots on a
manager's calendar to provide a quick status update or respond to
a question that is best answered in person.
Special cases for operations
============================
There are some special communication challenges that operations
engineers face.
Communicating planned and unplanned outages
-------------------------------------------
Managing maintenance windows in the organization involves more
than choosing a date and time that works for operations.
Consider working around important events within the organization.
It takes extra planning and outreach to learn about these events,
but it is one way operations demonstrates that it is savvy to the
organization's needs. Wouldn't it be good to know if the organization
is about to roll out the next version of a product, perform a
year-end close-out, host a big conference, or stage a demo to an important
external stakeholder? There are no extra points for doing this,
but operations avoids losing respect within the organization for being
unaware of the organization's core business.
For outages that may impact a large percentage of the customers
or a critical service, it is a good practice to notify the organization
more than a week in advance. This serves a dual purpose: it alerts
people who might be out of the office the week before the actual
outage and it provides lead time to reschedule in case someone
responds with a critical activity that would conflict with the
outage. Send a reminder the day before or the day of the outage for
customers who missed the first message.
To send a followup email, simply forward the original email with
a short note at the top reminding people of the time and services
impacted.
Example: Planned outage notification
.. code-block:: console
All file cluster users,
Save your work before 7:00 pm Friday, January 10th for a planned
outage of the file cluster.
The file cluster will be taken off-line for scheduled maintenance
at 7:00 pm Friday, January 10th. We expect the outage to last until
10:00 pm.
Notify operations immediately if this interferes with time-critical work.
[provide a way to notify operations]
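
For teams that send these notices often, the structure of the example (audience first, action and time up top, a way to reach operations at the end) can be captured in a small template. The sketch below is one possible approach, not an established tool: the function names, the contact address, and the year in the usage example are assumptions made for illustration.

```python
from datetime import datetime


def clock(moment):
    """Format a time as '7:00 pm' (12-hour clock, no leading zero)."""
    return moment.strftime("%I:%M %p").lstrip("0").lower()


def outage_notice(audience, service, start, end, contact):
    """Build a (subject, body) pair with the most important details first."""
    day = f"{start:%A, %B} {start.day}"
    subject = f"{service} outage, {start:%B} {start.day} - {clock(start)} to {clock(end)}"
    body = "\n".join([
        f"{audience},",
        "",
        f"Save your work before {clock(start)} {day} for a planned outage of the {service}.",
        f"We expect the outage to last until {clock(end)}.",
        "",
        f"Notify operations immediately if this interferes with time-critical work: {contact}",
    ])
    return subject, body


# Hypothetical values; the year and contact address are placeholders.
subject, body = outage_notice(
    "All file cluster users",
    "file cluster",
    datetime(2025, 1, 10, 19, 0),
    datetime(2025, 1, 10, 22, 0),
    "ops@example.com",
)
print(subject)
print(body)
```

The same function can produce the reminder message by prepending a short note to the returned body.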
Fielding customer complaints
----------------------------
In the world of operations, customer complaints are a given. Operations can't
please everyone all the time. Every operations person has dealt with
unhappy customers so it is good to develop strong people skills.
It is important to face customer complaints, not avoid them.
Occasionally we have a customer who is a chronic complainer and the
operations staff dive under their desks when that person walks in
the office. A complaint should be treated as an opportunity to
hear a customer's perception of services. Complaints can be turned
into opportunities for improvement and can be a path to creating a
lasting relationship with customers.
People are often at their worst when reporting a complaint; emotions are
high due to lost data, a service outage, or frustration trying to
use technology. Now is not the time for operations to get emotional or
defensive about the work. Instead of reacting, follow these steps
to adeptly manage customer unhappiness and maybe increase customer
respect for operations as a whole.
* Listen without judgment
* Rephrase the concern to confirm understanding
* Agree to investigate if it isn't something resolvable now
* Leave the customer with the assurance that someone will get back to him/her with a solution or feedback.
* Get back to the customer even if it is to say

  * It was a one-off problem and here is why
  * We found a problem internally and it is now resolved
  * We are improving our processes to reduce the likelihood of it happening again
  * Or an explanation that simply provides feedback to the customer.

* And don't forget to thank the customer for taking the time to provide feedback
The reason to close the feedback loop is to show the customer that
operations did something as a result of the complaint. The customer
will know that someone in operations was concerned enough to
investigate and potentially resolve the root cause of the complaint.
It could have been inconsistencies in operations' internal procedures
or a skills gap. That's a bonus for operations and the customer
should know that the communication had a positive impact.
Try these techniques with chronic complainers. Sometimes all they
want is to be heard. Bring in IT operations management if someone
is repeatedly impacting operations with complaints or becomes
abusive. This advice stands if operations feels like the above
techniques are not working. Escalation to the next person in the
management chain is a valid procedural step in any of these instances.
.. TODO:: It might be interesting to put together an exercise where the student interacts with a fictional customer in some different scenarios. Depending on what the student does, the customer is happy or complains to the operations person or escalates the complaint up the management chain. How does the student respond? Could have multiple scenarios with different customers (a customer who causes his own problem then gets in the way, a customer who cannot wait, a customer who tries to fix the problem and makes it worse, a customer who uses the opportunity to speak to an operations person to dump 10 other requests on that person). This idea came to me from a series of books my kid has where you make a decision on page 10 that leads to either page 26 or page 40. Your decision could end the story or take you in a new direction. The books are full of these decision points so the story is rarely the same twice, kinda like customer support!
Time Management
===============
Time management is a critical skill for the operations professional.
Customer service requests and trouble tickets are up against project
work and infrastructure maintenance and enhancements. How does one
person prioritize and accomplish it all?
Recommended reading:
* Tom Limoncelli's book `Time Management for System Administrators <http://amzn.com/0596007833>`_
* Tom Limoncelli's `Time Management Wiki <https://github.com/TomOnTime/tomontime/wiki>`_
Tom Limoncelli also teaches a Time Management tutorial at the `USENIX
LISA conference <https://www.usenix.org/conferences>`_ and sometimes the
LOPSA community conferences: `Lopsa-East <http://lopsa-east.org>`_ and
`Cascadia <http://casitconf.org>`_
.. TODO:: does this section need a real writeup or are references to Tom's work enough?
Project Management
==================
Project management is a necessary skill for any mid-level operations
person. Start with small projects and work up to larger ones.
Be aware that project customers, or stakeholders, will often not know
what they truly want from a project or they ask for the moon. Review
the `project management triangle
<http://en.wikipedia.org/wiki/Project_management_triangle>`_ (good, cheap, fast: pick two).
Henry Ford is credited with saying about his customers "If I had asked
customers what they wanted, they would have said faster horses."
Whether or not he said it, it still captures the essence of requirements
gathering for operations projects. The operations professional is the
technology expert. The stakeholders know they want a certain
output or service. They may not know what that looks like or how to
achieve it. The challenge is to extract requirements from the
stakeholders then realize that these may not be the real or complete
requirements.
Enter project management. Project management should help to
frame the scope, resources, goals, and outcomes for the project.
Let's look at two different project management methodologies as
they apply to operations.
Waterfall
---------
Waterfall is a hierarchical form of project management that was adapted
from other industries for the software development world. In waterfall,
think of the phases of a project as a cascading waterfall. Each phase
must be completed before moving onto the next phase. The entirety of the
project is scoped from beginning to end including milestones and
final deliverables.
Technologies change, requirements change and scoping a large project
over a long period of time with what are commonly incomplete
requirements or faulty assumptions by stakeholders leads operations down
a path of delivering an incomplete or inaccurate solution at the end.
Waterfall breaks down in practice because it requires a promise of
delivery that may be several years out.
Also, by requiring each phase of a project to complete before moving
onto the next phase, bugs and issues are often not discovered until
late in the project. This causes delays and sometimes large amounts
of refactoring or re-architecting to go back and resolve these issues.
Detractors of the waterfall method point to its rigidity and
lack of testing during the development phase. One of the issues in
operations and development work is that stakeholders may not have
a solid grasp of requirements until they see a working prototype,
or iterations of working prototypes during the implementation of
the product. It is common for stakeholders in a project not to know
what technology can deliver until they see it. Many operations teams
are moving to Agile methods for several reasons, one of which is
that Agile development allows stakeholders to see working bits
of the product before the end and to modify requirements before
it's too late.
Agile
-----
Agile is a project management methodology. Agile started in 2001
when a group of software developers created the Agile Manifesto.
The `Agile Manifesto <http://agilemanifesto.org/>`_ states four values and is
accompanied by 12 principles of Agile. Agile is seen most often in the software
development world but it has crept into operations because of the
obvious benefits over waterfall. Common implementations of Agile
include: Scrum, Kanban, and the hybrid Scrumban that was created
to meet more operational needs. The idea behind Agile is continuous
release or delivery of a product. Instead of creating one big outcome
at the end of a project, Agile allows a team to release a partially
completed project for stakeholder review and requirements tweaking.
Another big benefit of Agile methodologies is the discovery of
problems early in the product development cycle when refactoring
can be done immediately before the end product is set in a particular
architectural direction that would make it costly to change.
Some documented benefits of agile include the following:
* Reduced process overhead
* Improved team and stakeholder communication and collaboration
* Errors and bugs are fixed in development instead of waiting till the product
is "complete" to address them.
* Stakeholders see the product as it is shaped and have the ability to adjust
requirements during development
* Project teams are empowered
* Can easily be combined with DevOps methodology to improve effectiveness of
development-into-operations
* If done well, can increase work output of teams (increased velocity)
* Everyone on the project can easily see where the project stands (e.g. Scrum
board or Kanban wall)
One thing to remember when implementing an Agile solution: adapt it as
needed. Each of the following has its own simple framework, but
organizations can use some or all of the implementation and even combine
Agile methods to achieve success.
Scrum
^^^^^
Scrum is the more prescriptive of the included methods. Scrum is
recognizable by Scrum boards, user stories, timeboxed sprints,
cross-functional teams, Scrum Master and Product Owner roles, the
burndown chart used for tracking project status, and the Scrum
meetings: daily stand-ups and retrospectives.
Some of the limiting factors of Scrum for operational teams include
timeboxing and tracking the burndown velocity of the team.
**Scrum board** - An electronic or physical board that is used to track
project status, actions that are in progress, upcoming work, and completed
work. A basic Scrum board will have three columns: Todo, In Progress,
and Done. Items in "Todo" are the upcoming work, and items in "In Progress"
are currently being worked during this sprint. "Done" is fairly
self-explanatory. Assignments can be tracked by sticky note on a white board
or via an electronic Scrum board. The Scrum board also has rows. These
are referred to as swimlanes. Rows can be labeled with project names
and it is common to have the very first swimlane titled "unplanned work"
for operations tasks that fall on the team.
**Electronic Scrum board** - Electronic Scrum board software can be great if
the team is geographically distributed. All members of the team can see
and update the board from remote locations. The downside of electronic
versions is getting the team to keep the application open and updated.
On the upside, burndown can be computed automatically, making it easier
for management to see progress.
**Physical Scrum board** - Often a whiteboard with a grid made of electrical
tape. The swimlanes and tasks are marked by sticky notes. The team names
can be post-it flags or some other marker. The downsides to a physical
board include manual tracking of burndown, stickies falling off the
board onto the floor (hint: Buy the Post-It super sticky notes or use
tape or magnets), and lastly distributed teams cannot see the board
easily. The upside to a physical board is visibility. The board can be
placed in a prominent location where the operations staff can see it
every day. This makes for easy daily stand-ups. It also allows members of
the team to walk up to the board and have conversations with other
members of the team about the work in progress.
**Sprint** - A sprint is a period of time, defined by the team, during which
work is done between Scrum meetings. Work is chunked into pieces small
enough to fit within the sprint window. A sprint window might be a week,
two weeks, four weeks, or whatever length of time seems to fit the
team. During the sprint, operations staff focus on the work agreed upon
at the beginning of the sprint. Organizations can define how unplanned
work will be dealt with during a sprint. Sometimes it is helpful to be
able to tell a customer that we can prioritize that project request in
two weeks at our next sprint meeting instead of feeling like operations
has to drop everything for a last minute request. Sprints are somewhat
rigid and can break down with operations because the work doesn't neatly
fit within a timeboxed window. The team will also provide time estimates
for each task.
**Daily Standup** - This is a short daily meeting with the team at the
Scrum board (virtual or physical). The person in the Scrum master role
leads the daily stand-up by asking each team member a few questions:
* What are you working on?
* Are there any impediments?
* Do you need anything to be successful?
Each member of the operations team now knows what is expected of him/her
for the day. Balance the expected work output with other team efforts
such as trouble tickets and outside projects.
**Burndown** - The burndown compares estimates of time against the actual time spent
working on a project's tasks. The resulting chart will show a project
approaching 0 as the level of effort needed to complete the project winds down.
Teams get better at estimating with experience. Burndown can also demonstrate
if a project is taking longer than planned or is ahead of schedule. Building a
burndown chart can involve a spreadsheet or graphing application. It is common
to build formulas in Excel that will automatically update a pivot chart showing
the project tracking. Some burndown charts are very complex and others are
simple. The organization has to decide how fancy to get with this tool.
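
As a rough illustration of the arithmetic behind a simple burndown chart, the sketch below tracks the remaining estimate day by day and compares it with a straight "ideal" line that reaches 0 on the last day of the sprint. The 40-hour estimate, the daily figures, and the five-day sprint are made-up numbers, and the function names are chosen for this example only.

```python
def burndown(total_estimate, completed_per_day):
    """Return the remaining estimate at the start and after each day of work."""
    remaining = [total_estimate]
    for done in completed_per_day:
        # Never drop below zero, even if more work was logged than estimated.
        remaining.append(max(0, remaining[-1] - done))
    return remaining


def ideal_line(total_estimate, days):
    """Return a straight-line burndown from the full estimate to 0 over `days`."""
    return [total_estimate * (days - day) / days for day in range(days + 1)]


# Hypothetical sprint: 40 estimated hours, four days of logged work so far.
actual = burndown(40, [8, 6, 10, 4])
ideal = ideal_line(40, 5)

for day, left in enumerate(actual):
    status = "behind" if left > ideal[day] else "on or ahead of"
    print(f"Day {day}: {left} hours left, {status} the ideal line ({ideal[day]})")
```

A spreadsheet does the same thing with one column for the ideal line and one for the actual remaining work.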
**User stories** - In Agile software development, user stories can be feature
requests, bugs, or modules the team plans to code for a product release.
In operations, user stories can be small or large projects. Larger
projects are usually broken down into smaller, more easily digestible
pieces; otherwise a project can park in a swimlane for an inordinately
long time, bringing down team morale and potentially impacting
productivity. Teams should see positive outcomes and accomplishments
across the swimlanes.
**Cross-functional teams** - In a development environment, a cross-functional
team could include developers, testers, management, and operations. The
purpose is to introduce DevOps to software development by including
roles that have a stake in the project at different levels. In
operations, a cross-functional team could include people from systems
administration, networking, security, and management.
Kanban
^^^^^^
Kanban is a much less prescriptive Agile implementation. Kanban can be
recognized by a similar task board to Scrum but often there are more
columns. Kanban's strength is the work in progress (WIP) limit. Kanban
doesn't require roles, timeboxing, or burndown tracking like Scrum.
Because there are no timeboxed sprints, work continuously moves across
the swimlanes on the Kanban board. Daily stand-ups are critical in Kanban
because there isn't a touchpoint at the end of a sprint to review
completed work effort. Kanban boards can have several additional columns
to assist in the management of this continuous work flow. An example
Kanban board may have "Coming soon", "Review", "Available", "In progress",
"Acceptance", and "Completed" columns. The purpose of these additional columns is to
enable teams to pull work into the "In progress" column as they finish
other work. The "In progress" column and other columns will have what is
called a WIP limit. There are a few schools of thought regarding WIP
limits. Each organization must experiment with the WIP limit until a
sweet spot is found for operations.
In Kanban for operations, the columns can be varied across teams or
organizations. These columns are only provided as an example. The
organization needs to find the Kanban workflow that works best for the
team. There are several good resources that explain various ways of
configuring a Kanban board. Sticking with the current example, let's
review the columns in an example Kanban board to understand
their purpose.
* Coming soon - these are tasks, projects, or user requests. They are
un-prioritized and may be big or small.
* Review - These are tasks that are prioritized by management or the team
during the daily stand-up. They are put "in the hopper" as work items that
should be reviewed and possibly broken into smaller pieces if they are too
large. The downside of too large is similar to Scrum when the user stories
are too broad. If an in-progress item sits in the active queue too long, it
takes up a WIP slot and can make it difficult to understand if the team is
making progress on that item.
* Available - This item has been reviewed, broken into a reasonably sized task
and approved by management or the team to be pulled into the active column at
the next opportunity.
* In progress - Similar to Scrum, these are the tasks being worked actively by
the team.
* Acceptance - When someone on the team considers a task complete, s/he moves
it to this column. Acceptance means it is discussed at the next daily stand-up
and possibly accepted as done by the team. Acceptance can also mean
stakeholder acceptance. This could also be a testing phase for something that
is rolling toward production. If something idles too long in this column, it
will hold up other work because of the WIP limit placed on this
column.
* Completed - These are tasks that are accepted as completed and put into
production.
* Impediments - Some boards might include a small section of a column to
identify impediments. Impediments are tasks that cannot begin because of
outside forces. Usually management intervention is required to resolve the
impediment. By separating these tasks on the board, the team sends a message
to management that this work requires outside intervention to move forward.
**Work in Progress (WIP) limits** WIP limits define the maximum number of
tasks that can appear in a given column on the Kanban board. The two
prevailing schools of thought are:
* 2n-1 - where n = the number of people on the operations team. The reason for
this is to enable team members to work together on some tasks but to give
enough tasks so team members stay busy.
* n-1 - where n = the number of people on the operations team. The reason for
this is to encourage collaboration on the team and not to overwhelm them with
too many tasks. If someone on the team completes all of their work, that
person should be able to pull the next available task from the "Available"
column.
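
As a quick worked example of the two rules above, the following Python sketch
computes both limits for a hypothetical team of five. The function names are
illustrative only and do not come from any Kanban tool.

```python
def wip_limit_2n_minus_1(team_size: int) -> int:
    """WIP limit under the 2n-1 rule, where n is the team size."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return 2 * team_size - 1


def wip_limit_n_minus_1(team_size: int) -> int:
    """WIP limit under the n-1 rule, where n is the team size."""
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    return team_size - 1


# Hypothetical team of five operations engineers.
team_size = 5
busy_limit = wip_limit_2n_minus_1(team_size)      # 2 * 5 - 1 = 9
collaborative_limit = wip_limit_n_minus_1(team_size)  # 5 - 1 = 4
```

For the same team, the 2n-1 rule allows more than twice as many concurrent
tasks as the n-1 rule, which is why the choice affects how often team members
end up working together.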
What is the risk of having a WIP limit too low or too high? A high WIP limit
might mean the team is taking on too much at one time. Each member of the team
may get overwhelmed with the amount of work. Keep in mind that tasks are
reviewed daily in the stand-up meetings and team members can pull new work from
the "Available" column when current work moves to "Acceptance." High WIP limits
mean that team members are less likely to work together on projects or tasks
because each person has his/her own work to complete. A WIP limit that is too
low could create a bottleneck, disallowing a team member from pulling new work
into the "In Progress" queue because other people on the team have hit the WIP
limit with their own work. The WIP limit is a sweet spot that the organization
needs to discover through experimentation.
Whenever there is a bottleneck in Kanban, the team can refocus its
efforts on the item stuck in the flow in order to unblock progress
across the board. WIP limits force this to occur: a WIP limit of 3 on the
acceptance column will not allow any more tasks to move to that column if
there are already 3 items waiting for acceptance. It
is a way to keep work moving across the board.
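
The blocking behavior described above can be sketched in a few lines of
Python. This is a minimal illustration, not how any particular Kanban tool
works: the ``KanbanBoard`` class, the ``WipLimitReached`` exception, the task
names, and the limits chosen are all assumptions made for the example.

```python
class WipLimitReached(Exception):
    """Raised when a task would push a column past its WIP limit."""


class KanbanBoard:
    def __init__(self, wip_limits):
        # wip_limits maps column name -> maximum task count (None = no limit).
        self.wip_limits = dict(wip_limits)
        self.columns = {name: [] for name in wip_limits}

    def _check_limit(self, column):
        limit = self.wip_limits[column]
        if limit is not None and len(self.columns[column]) >= limit:
            raise WipLimitReached(f"{column!r} is at its WIP limit of {limit}")

    def add(self, task, column):
        """Place a new task directly into a column, respecting its limit."""
        self._check_limit(column)
        self.columns[column].append(task)

    def move(self, task, source, target):
        """Move a task between columns, refusing if the target is full."""
        if task not in self.columns[source]:
            raise ValueError(f"{task!r} is not in {source!r}")
        self._check_limit(target)
        self.columns[source].remove(task)
        self.columns[target].append(task)


# Hypothetical board: acceptance column limited to 3 items.
board = KanbanBoard({"In progress": 4, "Acceptance": 3, "Completed": None})
for name in ("task-1", "task-2", "task-3"):
    board.add(name, "Acceptance")
board.add("task-4", "In progress")

try:
    board.move("task-4", "In progress", "Acceptance")
    blocked = False
except WipLimitReached:
    blocked = True  # acceptance is full, so the team must clear it first

# Once one item is accepted, the waiting task can move forward.
board.move("task-1", "Acceptance", "Completed")
board.move("task-4", "In progress", "Acceptance")
```

Refusing the move, rather than silently queuing it, mirrors the intent of a
WIP limit: the full column becomes visible and the team has to resolve it
before more work can flow in.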
Scrumban
^^^^^^^^
Scrumban is a hybrid of the two previously mentioned methodologies.
Operations teams seem to embrace Kanban or Scrumban because of the
flexibility of daily re-prioritizing and the WIP limits that keep the
team from getting overwhelmed.
A Scrumban implementation would take elements from both Scrum and Kanban.
For example, operations might decide to define some roles, keep the review and
retrospectives, and hold the daily stand-up from Scrum, while enforcing WIP
limits and implementing a continuous workflow from Kanban.
Agile Toolkit
^^^^^^^^^^^^^
`jira <http://www.atlassian.com/software/jira/overview>`_
The Tao of DevOps
=================
What is DevOps
--------------
DevOps seeks to include the IT operations team as an important
stakeholder in the development process. Instead of solely
coding to meet the stakeholder's requirements on time and on budget,
developers are also held responsible for how easily the product deploys, how few
bugs turn up in production, and how well it runs. Developers also focus
on providing software that operations can easily support once it's in
production. Instead of bringing operations into the conversation
after the product is complete, the DevOps methodology includes
operations in the development stream.
Development's view:
* Roll a product out to meet customer specifications within a certain timeframe
* Continuous delivery means recurring change as bugs are fixed and features
added
* Fast changing environments are needed to support dev
* Agility is key
Operations' view:
* Supporting the product for customers
* Keeping a handle on IT security
* Planning for deployment to production state
* Changes are slow/incremental
* Consistent environments are needed to support operations
* Stability is key
Why DevOps is important
-----------------------
In organizations where DevOps is not a priority, development is
often viewed as customer-focused by trying to solve problems and
deliver solutions while operations is viewed as a barrier to
development's mission. By combining these two often competing
mindsets, both sides can be satisfied. The result is a product
that potentially has fewer bugs, higher availability, increased
security, and a process for improved development over the life of
the product that works for both the developers and the operations
people.
It is also possible to implement a DevOps methodology in a pure
operations team. In this scenario the operations team also acts as
development because it stands up a web server, provisions virtual
machines, or codes configuration management systems. In this case,
operations needs to wear both the development and operations hats by
meeting customer needs while also addressing security and supportability
of the solution.
What isn't DevOps
-----------------
A person cannot be a DevOp. You don't hire a DevOp.
The importance of Documentation
===============================
What to document
----------------
* Runbooks? SOP? (cparedes: might be worthwhile even though we want to automate
SOPs away as much as possible - what should we check at 2 AM? What do folks
typically do in this situation if automation fails?)
* Architecture and design (cparedes: also maybe talk about *why* we chose that
design - what problems did we try to solve? Why is this a good solution?)
* How to manage documentation
Documentation through Diagrams
------------------------------
**Anecdote** At one job we had a single network engineer. He had a
habit of walking up to a whiteboard to explain something to the
systems folks. He would proceed to draw what we considered a
hyper-complex-looking diagram showing the current or future state
of some networking solution. We could never keep his configurations
in our heads like he did and he wasn't always around when we had a
question. One of us figured out that we should take a picture of
the whiteboard after he finished drawing. These pictures went into
the operations wiki. They weren't beautiful but they saved us time
when we could easily refer back to the pictures we took.
Diagrams don't always have to be professional Visio-quality to count as
documentation.
Functional diagrams
Technical diagrams
Working with other teams
========================