Dev huddles – The andon cord of software development

Toyota introduced the andon cord on its manufacturing lines so that anyone could stop the production line and alert the people around them when an abnormal situation arose. It is a cord that hangs overhead, within easy reach of everyone, and pulling it immediately gathers the attention of others. While the rest of the industry treated the production line as something that should never be stopped and left defects to quality control, Toyota put power in the hands of the people on the line to make the call and improve quality by attacking the problem at its source.

The idea was then widely adopted in different forms, such as ‘Help’ buttons, across various manufacturing sectors. It was easy to adopt in manufacturing because people can see what is going wrong and gather to fix the problem immediately. In software development it is not obvious when to pull the andon cord or how to collect people’s thoughts on what to solve.

Dev huddles are the answer to the andon cord in software development. Huddles are common in team sports: team members quickly huddle to celebrate a goal or discuss a plan. Over time the team also learns to communicate very effectively in fewer words and less time.

When to call for a dev huddle?
We need to call for a dev huddle when:

  • Our programming has deteriorated from flow to brute force or trial and error. The manuals did not resolve the problem, and even quick help from another team member did not.
  • We find badly written code and need to bring it to the team’s attention while it is still fresh in mind.
  • We are about to make a major change in the code base and everybody needs to be aware of the incoming change so that they are not surprised.

How to make sure that we don’t waste others’ time?

Interrupting the team is expensive, so before calling for a dev huddle we need to have all our show-and-tell pieces ready for discussion. For example, when stuck on a problem, quickly jot down on the whiteboard, in diagrams or words, the problem and the steps already tried that did not work; then call for the huddle. If a resolution or direction is not found within 10 to 15 minutes, break up the huddle and go do detailed research on the problem.

Why should dev huddles work?

Dev huddles work on the idea of crowd wisdom: in most cases the combined output of a group is better than that of its best individual. Ideas & solutions can come from anyone, provided they are given a good explanation of the problem. Dev huddles also help share knowledge in a terse & effective manner.

The big bridge that was replaced twice and no one noticed

I got a chance to walk on the Golden Gate Bridge and could not stop admiring the beauty of the bridge and its surroundings. A walk across shows what the bridge is subjected to: I arrived at one end when it was bright & sunny, and by the time I reached the other end it was misty & foggy. It is constantly pounded by waves & high-speed wind, and it carries a lot of traffic. It left me curious about the engineering behind it, and after a while I stumbled on a Nat Geo documentary titled ‘Impossible bridges – Golden Gate’.

Engineers once thought it was impossible to build such a bridge because the water was too deep, the winds were too fast and the distance was too long to span. Even if it were built, people wondered how it could be sustained. Nat Geo’s documentary showed how the crew worked over the years to strengthen the bridge and rework the floor without disrupting much of the traffic flow. There were innovations like suspended traveling platforms for working on the cables. The bridge withstood an earthquake and was strengthened to withstand stronger ones. At the end of the documentary the narrator says he is sure that every part of the bridge has been replaced at least twice, but thanks to the engineers’ skill people never knew that they were looking at a new bridge.

The same holds good for long-running software projects. Over time, much software changes a great deal internally depending on the people working on it, which relates to what Richard Gabriel calls ‘habitability’ in his book ‘Patterns of Software’. If the engineers who took over the maintenance of the Golden Gate had not replaced the bridge piece by piece over time, we would have a big heap of metal lying on the sea bed due to structural erosion. The software counterpart of that heap is the ‘big ball of mud’ that results from software rot.

The biggest reason given for software rot is that there is not enough time. Architecture is a long-term concern, a little beyond the learning horizon that Peter Senge describes in his book ‘The Fifth Discipline’. We are not able to observe the effects of software architecture in the near term because urgent business needs keep coming up, and because software is never tangible like a bridge; visualising it requires experience and skill. The advantage software has is that it can be made more robust even after it has been built hastily.

How do we find that the architecture & design needs a revamp?

  • If a change to the software looks simple and trivial but takes a bafflingly disproportionate amount of time to make.
  • If it is difficult to visualise the code and explain it to others in simple abstractions.
  • If a technology upgrade looks infeasible without a rewrite.
  • If there is too much dependency on individuals and context.

The revamp is not just about modular design & clean code; it also needs adequate test coverage and a robust continuous delivery mechanism, similar to how the Golden Gate’s engineers used suspended traveling platforms to carry out repair work on the cables without affecting traffic. A successful, well-maintained app that takes advantage of new technologies and changes hands would definitely have been rewritten many times over, and no one would have noticed.

How far can we pile on debt?

We are inept at perceiving the effects of debt: we grossly underestimate compounding and fall into a debt trap. Debt is tempting. It lets us own things that we cannot afford now by conveniently borrowing against our future effort. There is a saying in Tamil that buying something on loan is like lighting a fire with borrowed hay and having to repay the same volume in wood. If instant gratification takes priority, the debt that needs to be repaid will hit us hard and bring us to a grinding halt. Corporates love debt because, at a low cost of interest, they can create high-value output. This is where debt helps: if we are able to create something valuable enough to repay the debt with interest and still leave us with lots of money, then debt is good.

Developers apply the debt analogy to code and take on technical debt to get features to production fast. One major drawback is that most of us do not understand monetary debt and its impact well, even though it can be quantified, yet we take decisions on technical debt, which is difficult to quantify.

Let us look at an analogy. Neo has a trash can in his area that has to be cleared periodically. Clearing the trash can takes the same effort whether it holds a little trash or is full, so he prefers to wait till it fills up. The trash accumulates in such a way that it doubles every day if not cleared. A new person, Brio, takes Neo’s place. Neo explains that only a gram of trash is generated on the first day after clearing, but if left uncleared it doubles every day. He used to clear it every 10th day, as it is around 500 grams by that time.

Brio observes that the trash piling up from day to day is too trivial to care about. Around the tenth day he notices that the trash is 500 grams and barely fills the can, so he decides to come back 10 days later, forgetting that it doubles. Ten days later what was a small can of trash has turned into an unmanageable heap of about 500 kilograms. The power of compounding is too easy to miss in the initial days, and it hits hard those of us who are used to linear maths.
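The arithmetic behind the trash can story can be checked with a few lines of Python. This is only a sketch of the analogy; the function name `trash_grams` is made up for illustration, and it assumes exactly one gram on the first day and a clean doubling every day after that.

```python
def trash_grams(day: int, first_day_grams: int = 1) -> int:
    """Trash in the can on a given day (day 1 = first day after clearing),
    assuming it doubles every day and is never cleared."""
    return first_day_grams * 2 ** (day - 1)


# Print a few checkpoints between Neo's schedule and Brio's.
for day in (1, 5, 10, 15, 20):
    print(f"day {day:2d}: {trash_grams(day):>7,d} g")
```

Day 10 comes to 512 grams and day 20 to 524,288 grams, roughly 524 kilograms. Waiting twice as long did not double the trash; it multiplied it by 1,024.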

The same applies to technical debt in an application. Trade-off decisions have to be recorded and revisited at frequent intervals to see whether they are causing more harm than the convenience they provided. Technical debt also paves the way for ‘Broken Windows’. Software development is a social activity, and it is very easy for a broken window to become the new norm when code changes hands. Run your tech retrospectives; review code, design and architecture as an entire team at frequent intervals. That will help to manage technical debt better. Remember, it is not like corporate debt, which is manageable by numbers.

The shoe that fits perfectly is forgotten

As a student, getting a good pair of shoes was always a three-way decision between cost, comfort and the life of the shoes. Often comfort was sacrificed to get a new pair, and then the trouble started: I would spend a good amount of time adjusting to the new pair. Not a single day would pass without shoe bites bothering me, but the flip side was that I took good care of my shoes. Whenever I picked them up to make an adjustment I also ended up polishing them. Once my feet had adjusted to the new shoes I stopped giving them a second thought; the last time I polished them would have been when I was still making adjustments. With less care given once they started fitting well, the shoes usually needed replacement well before the end of their lifetime.

The case is similar to how people are treated professionally or personally. There is a tendency to devote attention to fixing relationships with troublemakers and toxic people, but not to give enough care to the ones who fit well into our sphere. There was an article in Forbes saying that people who stay at the same company for more than two years earn less than job hoppers whose performance was poor.

Why does this happen? We are not consciously geared towards this behaviour. Most of the time short-term priorities that need fixing kick in and grab more of our time, resulting in a state where the things that work well for us are not the ones getting the attention. The squeaky wheel gets the attention, the aching body part gets the balm, the biting shoe gets the polish. Eventually we become more reactive and rely on noise and markers to function, which is a very expensive lifestyle compared with preempting problems and making course corrections.

We should aim to keep things noise free, be it people, career or possessions. ‘Getting Things Done’ was one book that helped me channel my energy into listing the things I had in mind and keeping track of them. The five-step process described in the book helps to move chaos into order: capture everything that needs care, clarify whether an action is needed, organize, review the priorities periodically, and do things as per your priorities. Keep it stress free, and you will learn to care for your shoes even when they fit perfectly.

Information Snacking

Even today, some people in my home town equate being fat with prosperity. A fat person was someone who had steady access to food even in times of drought and famine, so dying of hunger was ruled out. People ate whatever they could so that starvation would not kill them. The food revolution increased the availability of food and people started to eat well; as everyone’s life expectancy improved, obesity became an issue. People were now dying because of overeating, unhealthy snacking and not maintaining a healthy lifestyle.

The same applies to information, where the difference between being wise and being knowledgeable is not well understood. Just a few decades ago in India, information flowed slowly through government-run TV channels, newspapers and books. Commute times were shorter and work schedules predictable, leaving more time to interact with people and enough time to read. I vividly remember the elders at home reading a novel in the evening, on weekends or while standing in a long queue. There were reading desks at home where people read and made notes.

If you summarize a book, you will often end up with a few lines that carry its message. Those lines have a much deeper meaning to us if we arrive at them ourselves after reading and summarizing our thoughts than if they arrive in a tweet.

We are right now in the era of information snacking: we are increasing our knowledge a great deal and have learned to drink it from a firehose, but we are not able to apply it effectively when needed. More and more people are addicted to ‘pull to refresh’ and snack on one-liners; emails are annotated with TL;DRs, and people prefer reading a lot of tweets to reading a book.

Manage the distractions before you are ruled by them.

TV was one big distraction, and it has been overtaken by social media. Staying away from these to make room for quality information consumption and time with family is difficult but not impossible. It is like going to the gym to tone up the body, complemented by a healthy diet. Start slowly by turning off all notifications, keeping only one or two channels like the telephone as an instant communication system; in other words, change to a pull mode rather than being pushed information. Dedicate slots for reading books, newspapers or other long texts, periodically clear out the social media backlog, and indulge yourself once in a while but come back to a healthy routine.

Why read long texts? Reading has helped me gather my thoughts, play out thought experiments, and write & articulate well; beyond that, it took my mind off the constant buzz that follows a work day. Maintain a list of books to read, motivate friends to read, and share your learnings and stories. There are so many books rich in content and only limited time to read them. Sharing the book experience with others is rewarding: we get lots of information from others in a condensed but more interactive form, and it encourages us to read and share more. Stay away from information snacking; it just feeds empty knowledge. Consume information with a planned diet.

 

Release planning unboxed

Requirements

Release planning is largely an empirical activity, and capturing the requirements effectively is the first step. Capture the thoughts of the product owner or client in a free flow, map them to user journeys and refine them into small requirement statements. Each of these granular requirements, which can be expressed in a sentence or two, is a basic building block of the final product. They are popularly called stories, and getting them right is the first step of successful release planning.

Visual triaging

Write each of the stories on index cards or stickies, preferably colour coded by parent feature, and put them up on a wall or a large table so that you can see all of them at once. This shows the spread of the requirements and lets you dive into deeper detail on ambiguous entries.

If the ambiguity does not resolve with deep dives, we quickly get the developers to do a spike (proof of concept) to validate our understanding. The proofs of concept have to be very quick but deep enough to give us the understanding needed to estimate the effort involved.

Sizing

When talking about sizing, people immediately try to force-fit their stories into the Fibonacci series or an arbitrary small, medium, large. Instead we need a real-world analogy for sizing the stories, as mentioned in one of my previous posts. For the last few years I have been using Scooter, Car and Bus as my sizing references for stories and, as I expected, most of my sizing meetings have shrunk to less than half of what they used to be with Fibonacci or T-shirt sizes. Outliers that do not fit in any of the categories are temporarily parked in a trailer or ship category if very large, or in a bicycle category if too small. The outliers are then revisited and the stories rewritten to see how they can be fitted into the sizing slots. Very large stories have always been unpredictable and skew the time-per-point metric, so we have to break them up before the end of the sizing phase.

Velocity estimation

Remove the sizing details from the stories, pick developers in a round-robin fashion and let each pick what can be done in five ideal days, that is, days with no meetings, no holidays, high availability of all machines and tools, and no dependencies. Each developer should do around 5 iterations of picking stories, and the result should be captured as below.

Developer 1

Week 1 S,S,M
Week 2 S,S,S,S
Week 3 M,M
Week 4 L
Week 5 S,L

Developer 2

Week 1 S,S,S,M
Week 2 M,S,S
Week 3 M,S,M
Week 4 L,S
Week 5 M,M

When we collect all the developers’ estimates and average them out, we can come up with a ratio of S:M:L like 1:2:4, 1:3:7 or anything else, not necessarily Fibonacci. If we had used Fibonacci there would be too many arguments about whether a story is a 2 or a 3; we avoid that argument by letting the velocity estimate give us the ratio. The ratio then becomes the points for each size: if it turns out to be 1:3:7, then our S is 1, M is 3 and L is 7.
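One possible way to "average out" the picks is sketched below in Python. It is an assumption on my part rather than a prescribed method: it treats every week of picks as adding up to five ideal days and uses a least-squares fit to find the number of days each size takes. The function names (`solve`, `size_ratio`) are illustrative only; the data is the two developers' picks listed above.

```python
from collections import Counter

IDEAL_DAYS_PER_WEEK = 5
SIZES = ("S", "M", "L")

# One inner list per ideal week, copied from the two developers above.
picks = [
    # Developer 1
    ["S", "S", "M"], ["S", "S", "S", "S"], ["M", "M"], ["L"], ["S", "L"],
    # Developer 2
    ["S", "S", "S", "M"], ["M", "S", "S"], ["M", "S", "M"], ["L", "S"], ["M", "M"],
]


def solve(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def size_ratio(weeks):
    """Least-squares estimate of days per size, scaled so that S equals 1."""
    rows = [[Counter(week)[size] for size in SIZES] for week in weeks]
    n = len(SIZES)
    # Normal equations (A^T A) x = A^T b, where every week sums to 5 ideal days.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * IDEAL_DAYS_PER_WEEK for r in rows) for i in range(n)]
    days = solve(ata, atb)
    return [d / days[0] for d in days]


ratio = size_ratio(picks)
points = dict(zip(SIZES, (round(x) for x in ratio)))
print(points)
```

For the sample picks the fitted ratio rounds to 1:2:4. With real teams the picks will be noisier, so treat the rounded ratio as a starting point for discussion rather than a measurement.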

Realism infusion

It is not possible to plan for work in ideal days; there will be meetings, holidays, sick time, vacations, downtime and so on, and we need to account for those. If the velocity estimate, which we can call the raw velocity, comes out to be 10 points per week, we deduct the appropriate amount to get a highly probable velocity. Typically this works out to around 70% of the raw velocity, but it depends entirely on the team.
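The adjustment is simple arithmetic, sketched here in Python. The 70% factor and the raw velocity of 10 points come from the paragraph above; the backlog of 120 points and the function names are hypothetical, added only to show how the probable velocity feeds into a rough release length.

```python
import math


def probable_velocity(raw_points_per_week, focus_percent=70):
    """Raw (ideal-day) velocity scaled down for meetings, holidays and so on."""
    return raw_points_per_week * focus_percent / 100


def weeks_needed(backlog_points, raw_points_per_week, focus_percent=70):
    """Whole weeks required to burn through the backlog at the probable velocity."""
    velocity = probable_velocity(raw_points_per_week, focus_percent)
    return math.ceil(backlog_points / velocity)


print(probable_velocity(10))   # raw velocity of 10 points per week
print(weeks_needed(120, 10))   # hypothetical backlog of 120 points
```

A raw velocity of 10 points becomes 7 points per week, so the hypothetical 120-point backlog needs 18 weeks instead of the 12 that ideal days would suggest.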

Prioritization

Clients will always say that they want all the features they have envisioned, but there are many stories of apps where most users use only 20% of the features; the rest is a long tail with very little return for the effort involved. The product owner has to come up with the must-haves for an initial set of features. There are many popular techniques, and the MoSCoW prioritization method is one that is frequently used. If several people prioritize, each should come up with their own priorities, and the chief product owner makes the call on a common denominator. The explanation is simple, but most clients struggle to come up with a prioritized list because of the illusion that everything is a priority.

Sequencing

The stories are sequenced against a weekly timeline put up on the wall or a large table. They are picked in order of feasibility, with priority given to difficult and important stories. This step gives us lots of options: the number of parallel streams of work that can run, the dependencies between stories, the order in which features can be completed, and the team ramp-up and ramp-down plans. The kind of mix and match we can do depends on the granularity of the stories and the number of dependencies between them.
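What the wall exercise does by hand can be approximated in code. The Python sketch below is only an illustration under simplifying assumptions: the stories, priorities and dependencies are invented, every story is assumed to take one stream for one week, and the ordering uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical stories: name -> (priority, stories it depends on).
# A lower number means a higher priority.
stories = {
    "login": (1, set()),
    "profile page": (2, {"login"}),
    "search": (1, set()),
    "saved searches": (3, {"search", "login"}),
    "export report": (2, {"search"}),
}


def sequence(stories, streams=2):
    """Group stories into weekly batches that respect dependencies.

    At most `streams` stories run in parallel each week; among the stories
    whose dependencies are done, the highest priority ones go first.
    """
    sorter = TopologicalSorter({name: deps for name, (_, deps) in stories.items()})
    sorter.prepare()  # raises CycleError if the dependencies are circular
    weeks, ready = [], []
    while sorter.is_active():
        ready.extend(sorter.get_ready())
        ready.sort(key=lambda name: (stories[name][0], name))
        batch, ready = ready[:streams], ready[streams:]
        weeks.append(batch)
        sorter.done(*batch)
    return weeks


for week, batch in enumerate(sequence(stories), start=1):
    print(f"Week {week}: {', '.join(batch)}")
```

Changing `streams` shows how the plan stretches or shrinks with the number of parallel streams, which is the same question the wall helps answer when planning team ramp-up.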

After the last step has gone through a few iterations we will have an initial release plan to start with. This plan should remain the reference, and re-planning has to happen on a monthly or fortnightly basis to keep adapting to what we learn as we move along. Many people treat the initial release plan as a sacred, rigid plan, but it is just a guideline telling us how to approach development. If the initial release plan is delivered exactly as is, it was a waterfall project.

The lazy programmer

I was having a chat with an old-time ThoughtWorks developer, and the topic drifted towards voluminous work and long days at work. I interjected that developers should not be willing to work long, hard hours; instead they should be lazy, so that better tools and automation come out of them rather than checked boxes on long to-do lists for the day.

He thought for a while and then replied, “You are right, the CruiseControl continuous integration tool was born out of laziness”. He went on to explain that one of the developers found it too annoying to walk up to a computer, pull the code and run the build & tests, so that person put a build loop on the machine to do the task. Someone else put a web interface on it, and a new tool was born. It left a lasting legacy in the continuous integration space. (CruiseControl home page).
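To give a feel for how little it takes to get started, here is a minimal Python sketch of a polling build loop. It is not CruiseControl's implementation, which I have not looked at; the function `build_loop` and its parameters are hypothetical, and the version control and build steps are passed in as plain callables so that the loop can be tried without a real repository.

```python
import time
from typing import Callable, List, Optional, Tuple


def build_loop(
    latest_revision: Callable[[], str],
    run_build: Callable[[str], bool],
    poll_seconds: float = 60.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, bool]]:
    """Poll for new revisions and build each new one exactly once.

    `latest_revision` and `run_build` stand in for "pull the code" and
    "run the build & tests". With `max_polls=None` the loop runs forever.
    """
    results: List[Tuple[str, bool]] = []
    last_built: Optional[str] = None
    polls = 0
    while max_polls is None or polls < max_polls:
        revision = latest_revision()
        if revision != last_built:
            # Something changed since the last build, so build and record it.
            results.append((revision, run_build(revision)))
            last_built = revision
        polls += 1
        sleep(poll_seconds)
    return results


# Dry run with fake callables instead of a real repository and build script.
revisions = iter(["r1", "r1", "r2"])
history = build_loop(
    lambda: next(revisions),
    lambda rev: True,
    max_polls=3,
    sleep=lambda _: None,
)
print(history)
```

In the dry run the loop polls three times, sees two distinct revisions and builds each once. Everything else, such as the web interface, can be layered on top of a loop like this.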

Why do people relate long working hours to prosperity?

The industrial revolution required a great number of unskilled labourers who were given instructions and repetitive tasks to do. The more they did, the more money the company made. So overtime was rewarded with more money, and people tended to stay longer to get paid overtime. The inventive work of automating the repetitive tasks was left to someone else, and the notion that hard, long work is rewarding stayed on even though a new invention could take the entire category of job away.

After many decades of advancement in the industrial space, automation has taken over a majority of it. The places automation has not yet been able to get into are creative spaces and knowledge work. If a job requires more than a few simple steps, it goes beyond mechanical skills and involves cognitive skills. The moment even rudimentary cognitive skills are involved, incentives and hour-based pay no longer work. This is explained well in the video created from the work of Dan Pink, the author of the book ‘Drive’.

This wisdom gets passed on through generations to bosses and workers alike; people are conditioned from childhood to believe that hard, long work is the only way to prosper. When such a person becomes the boss, they demand the hours, and when that person is the worker, they oblige.

Programming goes a step further: it involves complex thinking, which requires us to bring deeper parts of our brain to work. Andy Hunt, in his book ‘Pragmatic Thinking and Learning’, talks about L-mode and R-mode, that is, using the linear mode or the rich/random mode of the brain. Though there is value in linear mode, programming benefits a great deal from R-mode. Staring out of the window, doodling, watching a fountain or taking a stroll can all be more productive than staring at the screen and furiously typing commands for long hours, because new ideas pop up when you are least expecting them.

A programmer has to be lazy: do not jump straight into the task at hand, but approach programming with a mindset of ‘No code is the best code’. Laziness makes us look for ways to get mundane, repetitive tasks out of the way, which also lets more creative ideas pop up through R-mode. Workplaces should also help by easing the norm of equating the number of hours in the seat with productivity.

Push the problem out of your foreground mind, and just “hold it lightly”. Then go for a walk, etc. That’s when insights and breakthroughs come to me.— Henri Poincaré

Hand me downs

Why not follow the boy scout rule when moving on?

The boy scout rule is well known in extreme programming: people are advised to leave the code in a better state than they found it. I observe that this mostly works well for programming but not elsewhere. I have always admired teachers, especially the ones who teach the basics. I made it a point that if I understood something after a good deal of effort, I would make it easier for the next person by simplifying it. I kept doing this at school and college, helping people learn tough parts of algebra, chemistry and physics through easy analogies.

When I graduated and got my job, I carried on simplifying the tough things I learnt. There was a sudden change in team composition and I had to take up the work of a sysadmin, which was difficult for someone who had been in technical support and testing for a year. I spent a lot of time learning to fit into the new role, and in the process made sure that the next person taking it on would need less of a transition. This is when I was in for a rude shock: one of the senior members of the team told me, ‘If you carry on simplifying things and sharing your knowledge this way, you will soon be out of a job’.

The intentional complexity was hard to accept, especially when the company was trying hard to reduce its dependency on individuals. The high complexity created many silos in the team, which made replacements harder and eventually slowed the team’s growth. It gave a false sense of security: people were called experts in their tasks but were not learning anything new because the learning curve was too steep.

Not working together with peers or communities leads to a phase called ‘Expert Beginner’, which prevents someone from becoming competent. There is a good write-up about the ‘Expert Beginner’. People take pride in the complexity of their work and put newcomers through the same phase, so that the learning curve stays steep and expert beginners keep their value.

I read an article from ‘British Bird Lovers’ about how red robins, which are territorial by nature, lost what they had learnt about opening sealed bottles. The birds that learnt the trick and kept the knowledge to themselves gained a lot, but their successors did not learn it. The article finishes well, saying ‘Birds that flock together appear to learn faster and increase their chances to evolve and survive’.

The general tendency is for people to pass on what was handed down to them as it is, especially if they spent a lot of effort making it work for them. If someone has a tough time getting on board at a workplace, s/he will tend to keep the on-boarding process the same, largely out of fear of being overtaken or replaced. In the process they create a culture of territory and stagnation instead of co-operation and shared growth.

Eric Schmidt mentions in his presentation ‘How Google Works’ that the only way to succeed consistently is to attract skilled people, work as a group and care for the workplace. We should hand down only the best and leave the place better than we found it; if we do that, we move ahead together.

Cost of testing last

At a fine dining restaurant, if I complain that I did not like the food, the chef, the waiter and the food taster immediately rush to the table to make sure that I am not disappointed and to offer me an alternative as soon as possible. A look behind the kitchen door shows that they take lots of precautions to make sure this never happens at the dining table. Chefs have a tough time keeping taste and quality up to the mark irrespective of the availability of ingredients and the limited time.

A typical restaurant is staffed with a chef, the chef’s cooking assistants, a butcher, waiters, a manager, janitors, food tasters and a lot of machines that speed up the process of cooking. The best use of a cook’s time is cooking the food, but does that mean the cook should not check whether the food is cooked properly and matches the customization the waiter described?

Let us take a scenario where the manager is able to hire more food tasters and shifts the burden of quality to them. Cooks will be able to turn out lots of dishes as per the chef’s recipe without giving a second thought to the balance of ingredients, the texture or the level of cooking required. The dish will be in a finished, ready-to-present state when it reaches the food taster. If any fault is found, the dish has to be redone from the beginning, which increases the cook’s workload further and reduces output considerably because rework is expensive; eventually it makes customers wait longer or increases dining table incidents.

What looked like a clever idea on paper backfires, because cooking for fine dining cannot be turned into a template with responsibilities split into silos; it is a complex system with a strong feedback loop at every stage. The scene in programming is similar in many cases: developers are encouraged to write only the code and leave the testing to QA, the perception being that the best use of a developer’s time is writing code. Seasoned programmers know better: unit tests are also code, and if you are not testing then you do not know what you have programmed.

Programming is an emergent practice, which matches the complex domain in the Cynefin framework by Dave Snowden. Each project is unique, and given a strong feedback loop a pattern that is efficient for the team emerges over time. Trying to separate out responsibilities like testing to one small group pushes the team into a cycle of code, test and fix with long feedback loops, eventually causing delays. Quality has to be assured at every stage of application development; responsiveness determines who is competitive in a complex environment.

Image courtesy: http://pixabay.com & http://www.wikiwand.com/en/Dave_Snowden