The Powerful Power Law
What do last week’s devastating events, the Boston bombing and the earthquake that hit Sichuan, China, have in common with the nature of COQ (cost of quality) at a manufacturing company? All of them are observed to follow a simple statistical rule called the power law. Simply put, plotting the logarithm of the probability of occurrence against the logarithm of the magnitude of the events gives a straight line with a negative slope. In the case of a terrorist event, the magnitude can be measured by the number of casualties, and a number of studies have shown that it obeys the power law. In the case of an earthquake, the relationship between magnitude and the probability of occurrence in a given time and region is described by the Gutenberg-Richter law, a type of power law distribution.
How are these related to COQ? This figure is an analysis of one year of warranty claim data from an automotive tier 1 supplier.
This data set indicates that the larger claims (above $10,000) follow the power law very well. The circled area contains smaller claims, which most likely indicates that many smaller defects have slipped through the system and hence appear less often than the power law predicts. Empirical earthquake data typically shows similar behavior, known as “roll-off”. Assuming these data are representative, they show that the power-law exponent is approximately -1: each tenfold increase in claim size makes such claims about ten times rarer. This means that claims above $100K occur about 100 times a year, claims above $1M about 10 times a year and claims above $10M about once a year.
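The arithmetic in the paragraph above can be written out as a short Python sketch. The function name and the anchor point (100 claims a year above $100K) are illustrative choices taken from the figures quoted here, not a model fitted to the actual claim data.

```python
def claims_per_year_above(threshold, ref_threshold=100_000,
                          ref_count=100.0, exponent=-1.0):
    """Expected yearly number of claims larger than `threshold`.

    Assumes a pure power law N(>x) = ref_count * (x / ref_threshold) ** exponent,
    anchored at the quoted figure of about 100 claims a year above $100K.
    """
    return ref_count * (threshold / ref_threshold) ** exponent


for threshold in (100_000, 1_000_000, 10_000_000):
    rate = claims_per_year_above(threshold)
    print(f"claims above ${threshold:,}: about {rate:g} per year")
```

Changing `exponent` shows how sensitive the tail is: a steeper (more negative) exponent makes very large claims far rarer than the -1 observed here.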
Studies of terror events all over the world have found a very similar relationship between casualties and the probability of occurrence. The power-law exponent for terrorism is found to be about -2.5. In other words, an event with 200 casualties, such as the Boston bombing, is approximately 10^2.5 = 316 times more likely than an event with 2,000 or more casualties, such as September 11.
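As a quick check of that ratio, here is a minimal Python sketch. The helper name is hypothetical, and it assumes the exponent applies to the probability of events at or above a given size.

```python
def times_more_likely(smaller, larger, exponent):
    """Ratio of how often events of size `smaller` occur versus size `larger`,
    assuming frequency is proportional to size ** exponent (exponent < 0)."""
    return (larger / smaller) ** (-exponent)


# Terrorism, exponent of about -2.5: 200 versus 2,000 casualties.
print(round(times_more_likely(200, 2000, -2.5)))
# Same tenfold size gap with the warranty-claim exponent of about -1.
print(round(times_more_likely(200, 2000, -1.0)))
```

The same tenfold jump in size costs a factor of about 316 in likelihood for terrorism but only a factor of 10 for warranty claims, which is why the warranty tail is so much heavier.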
Why do important quality events exhibit Power Law behavior?
There are two main reasons, and both result from the network nature of the manufacturing supply chain.
- Interdependency – Supply chain elements are highly interdependent. For example, early in my career as a storage media quality engineer, a small crack was discovered one day in the glass furnace at a remote factory in Japan. This turned out to be a devastating event because that furnace was the only one making glass substrates for the storage media used in multiple brands of magnetic disk drives. These drives were in turn supplied to makers of servers and PCs. That small crack hence stalled the entire server and PC supply chain for days, costing millions of dollars.
- Positive feedback – An example of how positive feedback works is Toyota’s “unintended acceleration” case, which ended up costing Toyota over a billion dollars. At first the incidents were considered isolated, but as more cases were suspected to be connected, Toyota identified floor mats from certain suppliers as a potential root cause. The number of reports increased as publicity grew, which in turn fed the suspicion that Toyota was hiding something. Toyota was called before Congress for hearings and later fined about $1.1B, even though no proof had related the unintended acceleration cases to any electronic or software defect. Each cycle of litigation and probes reinforced the public’s suspicion that something was wrong with Toyota, to the point of an avalanche, even though those investigations identified no major defects.
Six Sigma and the Power law
This power law behavior of COQ offers important insights into how quality executives should deal with important quality events. It is particularly counter-intuitive to many quality professionals who have gone through six sigma training or are themselves six sigma professionals. The foundation of six sigma is the normal distribution, or bell curve. COQ, however, follows a power law distribution, not the normal distribution. Here are some major differences.
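- There is no average – In other words, it is meaningless to talk about the average size of a warranty claim. Unlike the normal distribution, a power law distribution with an exponent like this one has no stable average: the sample average is dominated by the few largest claims and keeps shifting as more data arrive.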
- The most important data points are the outliers – In our data set, the top 10 claims out of a total of 412 contributed more than 50% of the total warranty cost. These large claims are the outliers that six sigma methodology typically ignores.
- Black swan events occur – The theory was developed by Nassim Nicholas Taleb to describe highly unlikely events that determine the course of human history. According to the above data set and the underlying power law, a warranty claim that costs over a billion dollars occurs about once every century. Such an event, though rare, can easily lead to the termination of the responsible executives or even the bankruptcy of the business.
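The contrast between the two distributions can be illustrated with simulated data. The sketch below does not use the warranty data set discussed above: it draws 412 synthetic claims from a Pareto distribution with tail exponent 1 and a $10,000 minimum, and 412 from a bell curve whose mean and spread are arbitrary assumptions, then compares how much of the total cost the ten largest claims carry.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable
N_CLAIMS = 412

# Heavy-tailed claims: Pareto with tail exponent 1, minimum claim $10,000.
power_law_claims = [10_000 * rng.paretovariate(1.0) for _ in range(N_CLAIMS)]
# Bell-curve claims for contrast (assumed mean $50,000, std dev $10,000).
normal_claims = [rng.gauss(50_000, 10_000) for _ in range(N_CLAIMS)]


def top_share(claims, k=10):
    """Fraction of the total cost contributed by the k largest claims."""
    ranked = sorted(claims, reverse=True)
    return sum(ranked[:k]) / sum(ranked)


print(f"power law: top 10 share = {top_share(power_law_claims):.0%}")
print(f"normal:    top 10 share = {top_share(normal_claims):.0%}")
```

Under the bell curve the ten largest claims carry only slightly more than their proportional share of 10/412, while under the power law a handful of outliers tends to dominate the total. Rerunning with different seeds also shows how much the power-law average moves from sample to sample.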
The Power law Strategy
Security gates alone cannot eliminate terrorist events, so government bodies also run drills and set up early warning systems to reduce the risk. A similar approach can be applied to catching quality defects.
To tackle the power law phenomenon, a strategy is needed that addresses its fundamental elements. This involves three major steps. The first step is to enable track and trace of the interdependencies in the supply chain. Once interdependency tracking is established, the second step is to conduct further analysis that enables early warning (for example, using Big Data technology) based on those interdependencies. Warning signals that are detected need to be tied to a series of actions involving PDCA cycles. The third step is a containment strategy to respond quickly to quality events before their effects are amplified by positive feedback. These measures will significantly lower the probability of isolated events escalating into catastrophic events through self-reinforcing cycles of positive feedback. It is worth noting that traditional ROI analysis based on average annual return can rarely be used to justify investment in such strategies and solutions. When dealing with the potentially catastrophic effect of the power law, an executive decision is required to set organizational direction. Seeking an average annual return on such investment just does not make sense in the world of Black Swan events.
The 2012 Nobel Prize in Physics went to Serge Haroche of France and David Wineland of the United States. They showed in the 1990s how to observe individual particles while preserving their bizarre quantum properties, something that scientists had struggled to do before. While this contribution may at first seem far-fetched and remote from the daily management challenges of a business executive, I am going to argue otherwise.
The Principle of Uncertainty
Let me first touch on the significance of this discovery. At the beginning of the last century, when quantum physics was born, physicists discovered that the classical laws of physics break down at the sub-atomic level. The everyday objects we are used to have deterministic states. For example, given the starting location and the velocity of a car, we can easily determine its location at any time. Tiny particles, on the other hand, behave differently. The foundation of quantum mechanics was first built on Heisenberg’s uncertainty principle, which describes the possibility of physical objects having multiple states. Hence, given the initial location and velocity of a particle, multiple locations described by probability functions are possible. This is what makes quantum mechanics such a bizarre subject for most people. Making things worse, it was not possible to observe this type of behavior. For example, observing a photon requires light to be absorbed by our eyes or by an image sensor, which alters the state of the photon itself. This observer effect and uncertainty relation have been captured in many ways in philosophical studies, such as those of Karl Popper and the concept of reflexivity. The latter has been mentioned by George Soros as the principle behind his investment strategy. Working around these monumental theoretical and philosophical hurdles is what the two Nobel laureates have achieved.
What Can Managers Learn From Quantum Physicists?
While the bizarre world of quantum mechanics may seem distant, the principle of uncertainty for tiny objects is very much alive in business management. For example, many companies have installed some type of ERP system to get a real-time view of the state of their business. There is a strong belief in the existence of a single version of the truth for financial data at company or division level, and day-to-day decisions are made based on this information. This is analogous to management by classical physics. However, when it comes to highly granular information such as events on critical machines, individual operator performance, inventory by SKU and bin location, or even OEE for machines, business executives tend to treat it as a world of tiny objects, like the bizarre world of quantum mechanics. It is not uncommon to have multiple truths in such manufacturing operations. The OEEs reported by different plants for the same type of machine can be based on very different measurement methods and subject to different degrees of human error. Different departments on the manufacturing shop floor have different views of the true state of their operations. The variable cost by product line by shift can be far from the aggregate cost captured in ERP. The inventory accuracy by SKU quantity can be well below the ERP inventory accuracy, which is based on total aggregated financial numbers. It is far too common for business executives to accept the principle of uncertainty and allow their manufacturing operations to run on multiple uncertain states.
Mastering the Quantum Bits of Your Business
It does not have to be that way. Just as the Nobel Prize winners have discovered, the technology to observe and measure the quantum bits of manufacturing information exists. Some companies have already tapped into the power of this technology and achieved significant improvements in profit margins and working capital. In an increasingly complex and turbulent world, a tiny quantum bit of information can explode into a perfect storm in a very short time. The capability of a business to leverage these quantum bits is already distinguishing the winners from the losers in the marketplace.
While the technology to observe the quantum bits of manufacturing information may be a far cry from getting its own Nobel Prize, the application of such technology should not be left as a subject of uncertainty anymore.
Any seasoned Lean manufacturing expert will tell you that implementing lean is not about JIT, Heijunka or any other tool. It is about implementing a lean culture of continuous improvement. In fact, Toyota considers its ultimate competitive advantage to be the “intoxication of improvement” felt by every employee from shop floor to top floor. Thousands of improvement ideas are created every day, even for the smallest mundane tasks. This is in sharp contrast to the “don’t fix what is not broken” mindset that prevails in most other organizations. Well, what they believe is one thing. Has any of this been scientifically proven? Can we simulate this kind of organizational behavior and measure its output? And if we can, what can we learn from it about managing thousands of ideas and distilling them into actions every day?
In this video, Dr. John Seely Brown, one of my favorite business writers, talks about the innovation dynamics within World of Warcraft (WoW), which also happens to be my favorite online video game. At the end, Brown said “This may be for the first time that we are able to prove exponential learning … and figure out how you can radically accelerate on what you’re learning”. Indeed, I have found that this game can cast an interesting light on the social dynamics of lean culture and how it will evolve in the future.
Guild structure and QC circles
“There is too much information changing too fast…The only way to get anything done seriously is to join a guild” said Brown. These guilds in WoW are groups of 20-200 people helping each other to process ideas. This greatly resembles the Quality Circle movement, in which employees are not just hired to perform a task but rather form small groups that constantly seek ways to self-improve. The difference between QC circles and these guilds may be the technology they use, as indicated below.
Everything is measured; everyone is critiqued by everyone else
In WoW, it is easy to record every action and measure performance. There are after-action reviews of every high-end raid, and everyone is critiqued by everyone. This resembles the typical PDCA (Plan-Do-Check-Act) process used by QC circles. The challenge in the manufacturing world, however, is that too much information is still recorded on paper or, if recorded electronically, spread across multiple segregated systems. This inhibits the sharing, retrieval and analysis of information that enables the rapid group self-improvement dynamics of WoW.
Personal dashboards are not pre-made, they are mashups
Another key lesson from WoW is that you need to craft your own dashboard to measure your own performance. Brown even said that the Obama administration is stealing the idea from WoW and trying to do the same. So much for the software companies that are trying to sell pre-packaged KPIs to measure corporate performance. Imagine a new manufacturing world in which every operator and supervisor has real-time feedback on his or her own performance, seeing how minute-by-minute idle time or over-production affects the bottom line and return on capital. The future of performance measurement technology is detailed, real-time and personalized.
The last slide in the video shows that learning speed increases exponentially as one moves up the levels in WoW. The high-performance guilds need to distill what they have learnt within their own guild and share it with other guilds throughout the network. Those who can do that effectively tend to move up the levels faster. In the manufacturing world, many companies are trying to share best practices across and within organizations. However, manufacturing executives may not realize that effective continuous improvement and best-practice sharing can lead to a state of exponential learning that constitutes an ultimate competitive advantage.
In a sense, the computer world of WoW is able to simulate the social dynamics of how individuals form groups to process and create ideas, how groups measure and improve themselves, and how groups interact with each other to accelerate the learning that results in high performance. These social dynamics also resemble those of the lean culture long promoted within companies like Toyota. Looking forward, the promise of manufacturing 2.0 lies in technologies that enable almost everything to be measured, allow information from individuals to flow freely within groups and empower groups to share best practices effectively. Such multi-tier collaboration from shop floor to top floor will bring about a new form of highly competitive organization that harnesses the power of exponential learning. On that note, the future evolution of lean culture may not be that much different from the present World of Warcraft.
I have just finished reading “Velocity: Combining Lean, Six Sigma and the Theory of Constraints to Achieve Breakthrough Performance – A Business Novel” on my Kindle. The author, Jeff Cox, is the co-author of “The Goal”. This time the story is about Amy, the newly named president of Hi-T Composites Company, who could not get any bottom-line improvement after implementing Lean Six Sigma (LSS) for a year. In the end, she convinced her team to combine TOC with the LSS approach in order to achieve and exceed the bottom-line goal.
A critical piece of the story is a dice game. It is this dice game that finally gets everyone on the same page and persuades even Wayne, the stubborn LSS guy, to change his approach. A key insight is to abandon the balanced line approach on which Wayne has been working. The team finally agrees to change to an unbalanced production line with everything synchronized to the bottleneck.
In the book, Amy bets her career on this dice game, both to convince her staff and to generate the same results in actual production. It worked out that way in the novel. But in practice, would you bet your career on a dice game? I cannot help but ask the following questions:
- How repeatable are the results of the dice game described in the novel? How sound are the statistics behind it?
- How closely does the game resemble the reality of a production line? What are the limitations? Under what conditions would the TOC approach (Drum-Buffer-Rope) work better or worse?
- Under what conditions does a balanced line with takt time work better or worse than an unbalanced line? How can the variability be quantified in order to determine which approach to use?
The book leaves these questions unanswered. That means these theories may or may not work in your reality. To better understand these questions, I intend to use simulation and analytic techniques to explore further. Stay tuned.
In Scenario 1, a balanced line is simulated in which everyone starts with a single die (the same capacity) and the same 4 pennies (the initial buffer size).
In this simulation, WIP increased from 20 to 26 by the 20th round and the total output was 62 pennies. This “throughput” number can be compared to 70 pennies, which is the average die roll (3.5) times 20 rounds. The output is in general less than 70 because throughput is lost as a result of variability.
To improve throughput, it was suggested to unbalance the line and create a constraint. Murphy is given only 1 die while everyone else is given 2 dice. The results look like the following:
This time WIP increased from the initial 20 to 42 by the 20th round and the total output was 81 pennies. This is a significant throughput improvement, but it comes with high WIP, especially at the bottleneck in front of Murphy.
To improve performance further, a DBR (Drum-Buffer-Rope) method is introduced. In this case, Amy’s dice are taken away and she only releases pennies to the line according to the signal given by Murphy, that is, what he rolls. In addition, Murphy is given a higher initial inventory buffer of 12 pennies.
This time WIP actually decreased from 28 to 23 by the 20th round and the total output was 91.
In the final case, the team discussed improving the yield at the bottleneck through Lean and Six Sigma. To simulate this, Murphy’s die roll is mapped to a number between 4 and 6.
The results indicated that WIP stayed low at 21 after 20 rounds and that throughput improved further to 110.
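For readers who want to experiment, the four scenarios above can be approximated with a short Python sketch. The layout is an assumption, since the exact rules are not spelled out here: a release gate (Amy) feeds five stations that each have a buffer in front of them, Murphy is placed at the third station, and a station can only move pennies that were in its buffer at the start of the round. The function and parameter names are hypothetical, and the numbers it prints will not match the figures quoted above exactly.

```python
import random

SIX_FACES = (1, 2, 3, 4, 5, 6)


def simulate_line(gate_dice, station_dice, buffers, bottleneck=None,
                  drum=False, bottleneck_faces=SIX_FACES, rounds=20, seed=1):
    """Run one penny game and return (output, final_wip, released).

    gate_dice        dice rolled by the release gate (unused when drum=True)
    station_dice     number of dice rolled by each station, in line order
    buffers          initial pennies waiting in front of each station
    bottleneck       index of the constraint station (required for drum=True)
    drum             if True, the gate releases exactly what the bottleneck rolls
    bottleneck_faces values one bottleneck die can show, e.g. (4, 5, 6)
    """
    rng = random.Random(seed)
    wip = list(buffers)
    output = released = 0
    for _ in range(rounds):
        # Roll this round's capacity for every station.
        rolls = []
        for i, n_dice in enumerate(station_dice):
            faces = bottleneck_faces if i == bottleneck else SIX_FACES
            rolls.append(sum(rng.choice(faces) for _ in range(n_dice)))
        if drum:
            release = rolls[bottleneck]  # the "rope" tied to the drum
        else:
            release = sum(rng.randint(1, 6) for _ in range(gate_dice))
        # A station moves at most what was in its buffer at the start of the round.
        moved = [min(roll, stock) for roll, stock in zip(rolls, wip)]
        for i, pennies in enumerate(moved):
            wip[i] -= pennies
            if i + 1 < len(wip):
                wip[i + 1] += pennies
            else:
                output += pennies  # the last station ships finished pennies
        wip[0] += release
        released += release
    return output, sum(wip), released


scenarios = {
    "1 balanced": dict(gate_dice=1, station_dice=[1] * 5, buffers=[4] * 5),
    "2 unbalanced": dict(gate_dice=2, station_dice=[2, 2, 1, 2, 2],
                         buffers=[4] * 5, bottleneck=2),
    "3 DBR": dict(gate_dice=0, station_dice=[2, 2, 1, 2, 2],
                  buffers=[4, 4, 12, 4, 4], bottleneck=2, drum=True),
    "4 DBR + improved bottleneck": dict(
        gate_dice=0, station_dice=[2, 2, 1, 2, 2], buffers=[4, 4, 12, 4, 4],
        bottleneck=2, drum=True, bottleneck_faces=(4, 5, 6)),
}

for name, config in scenarios.items():
    runs = [simulate_line(seed=s, **config) for s in range(200)]
    avg_output = sum(r[0] for r in runs) / len(runs)
    avg_wip = sum(r[1] for r in runs) / len(runs)
    print(f"{name}: average output {avg_output:.1f}, average final WIP {avg_wip:.1f}")
```

Averaging over 200 seeds rather than reading a single 20-round game is one way to judge how repeatable the pattern is, since any single run can be lucky or unlucky.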
This shows that the simulation described in the book is generally repeatable. The logic behind these calculations can be nicely summarized with a G/G/1 queue and solved with Markov chain analysis. Next time, we will discuss how practical these results are when applied to a real production line.
After taking lessons from several golf coaches, I noticed some very fundamental differences between their approaches. My current coach is very good at giving one-point advice based on my swing. Although one day I would like to swing like Ernie Els, right now I have settled for my ugly swing and am happy to see a notable score improvement after every lesson. That is quite different from the lessons my friend took. His coach basically asked him to forget everything he had learnt and tried to revolutionize his swing in order to take him to the next level. He is scared to go to the course now because he is stuck with a setback before he can get any better. He believes, however, that he is taking the necessary steps towards his goal of turning professional someday.
What are your long and short term goals and which approach is more suitable for you?
Lean:
You should focus on eliminating muda in your swing. Do not try to “push” the club head towards the ball but rather let a synchronized body turn naturally “pull” the club head in order to achieve a smooth flow in your swing. The game of golf is a process of relentless continuous improvement. We do not generally recommend investing too much energy in your tools, because dependence on them frequently undermines the development of the correct mindset. If you focus on improving every little piece, your efforts will eventually show up in your score and hence your handicap, which should not be your end but a means to the way of golf.
Six Sigma:
Golf is a game of consistency. You should hence focus on reducing the variability of your swing. We have a set of statistical tools to measure the defects in your swing as well as scientific instruments to monitor and track your progress. You need to certify your skills from green belt to black belt. By leveraging the right tools with scientific measurement and objective feedback, you will ultimately reduce your swing variability to a 6-sigma level.
TOC (Theory of Constraints):
You can maximize the return on your practice time by focusing on identifying and improving the bottleneck. At every stage of your skill development, there is a constraint that determines the throughput of your entire game. At one point it may be the grip or the address or the swing plane or the approach shot or the putt … but the point is that the bottleneck moves. By identifying the bottleneck and concentrating on it, you will be able to get a notable handicap reduction within the shortest time. While lean and six sigma can get you closer to the “perfect” swing, TOC focuses on optimizing what you have already got to quickly improve your score.
Whatever approach you pick to improve your golf game or to help transform your manufacturing operations, you can benefit from applying technology that automatically records your current swing (or process) and then gives you instant feedback on what to improve. In my opinion, there is no better example than golf to illustrate how far actual execution can stray from the best-intended plan.
I have frequently come across factories that keep a lot of on-hand stock while material shortage is still among the top reasons for unexpected production downtime. The same goes for retail operations that keep safety stock levels arbitrarily high. More often than not, shortage is not reduced by keeping more stock.
At first glance, this seems counter-intuitive. After all, textbook stock management models always indicate that the safety stock level and the probability of shortage, which directly affects service level, have an inverse relationship. Reduce stock and you increase the chance of shortage, which leads directly to lost sales opportunities. On the other hand, more stock means more working capital, more product obsolescence, more warehouse cost and so on, and hence serving customers is a financial tradeoff between shortage and stock levels.
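The textbook relationship can be made concrete with a small Python sketch. No specific formula is given here, so the sketch assumes the common simplification of normally distributed, independent daily demand over the replenishment lead time; the function name and the example numbers are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def shortage_probability(safety_stock, daily_demand_sigma, lead_time_days):
    """Chance of a stock-out in one replenishment cycle in the textbook model.

    Assumes demand over the lead time is normally distributed with standard
    deviation daily_demand_sigma * sqrt(lead_time_days).
    """
    sigma_lead_time = daily_demand_sigma * sqrt(lead_time_days)
    return 1.0 - NormalDist().cdf(safety_stock / sigma_lead_time)


# Illustrative numbers: demand std dev of 10 units/day, 4-day lead time.
for stock in (0, 20, 40, 60):
    p = shortage_probability(stock, daily_demand_sigma=10, lead_time_days=4)
    print(f"safety stock {stock:>2} units: shortage probability {p:.1%}")
```

In this model every extra unit of safety stock lowers the shortage probability, which is exactly the tradeoff described in the paragraph above. It leaves out everything about how the stock is managed in practice.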
In practice, this is not necessarily true. In fact, I argue that in order to reduce shortage, you need to reduce your stock level first.
The reason is that excess stock has many side effects that are not accounted for by the textbook model. The most notable ones are the following:
Inefficient use of procurement budget
Typically a fixed budget is allocated for procurement. A high stock level reduces the flexibility to allocate budget to purchasing what is really needed. This is very commonly seen not only in retail but also in large manufacturers’ raw material procurement and sales company operations.
Loose inventory management practice
Excess stock tends to create an unwarranted peace of mind for managers. When less attention is given to keeping stock levels low, turnover varies more widely among SKUs. Replenishment of the SKUs that operations really need is hence more likely to be forgotten.
Lack of continuous improvement incentives
This is the classical lean wisdom that when the stock level is excessively high, problems are hidden behind the stock. There is no pressure to improve supply chain responsiveness by reducing manufacturing lead time or improving overall material flow synchronization. Eventually, demand and competition will catch up with the limits of the supply chain’s responsiveness. For example, studies have shown that US automobile manufacturers tend to keep higher dealer stock than their Japanese counterparts. This has been one of the major differences in competitiveness between US and Japanese automotive companies.
The classical inventory model assumes that inventory decouples the supply and demand functions. In practice, the supply chain is complex and you cannot simply decouple supply and demand with stock. The key success factor in reducing shortage by reducing stock is to actively manage this complex relationship through better synchronization of manufacturing with customer demand.
Better synchronization can be achieved in two ways: more accurate forecasting and higher flexibility of execution. In today’s market of increasing demand volatility, there is a limit to how accurately you can predict the future by improving forecasting. However, lean methodology has taught us that there is almost no practical limit to improving execution.
Take the example of a manufacturing plant that I visited recently. They make products that are distributed across the US through a franchise network. Five years ago, their plant inventory alone stood at 90 days and they met only 70% of customer orders. Today, the total inventory of the plant and the downstream distribution centers is less than 80 days, while they are meeting more than 96% of orders. The key to this change is that the manufacturing plant is now accountable not only for the plant inventory but also for the downstream DC inventory and, soon, warehouse inventory. In this case, the overall stock reduction target has put the manufacturing operation under pressure to reduce lead time and improve flexibility. Any such improvement has an impact on downstream supply chain stock levels as well as on customer satisfaction. Such improvement cannot be achieved when manufacturing and supply chain stock are managed separately as silos. In order to achieve the next level of operational performance, they are evaluating a unified IT platform across manufacturing and supply chain.
By the same token, manufacturers are now using the latest information technology to synchronize better with their suppliers. An industrial equipment manufacturing plant that I visited has implemented an IT platform that allows it to see in real time the progress of WIP on its suppliers’ production lines. Such visibility allows the plant to synchronize the flow of key supplied parts with its in-house final assembly operations, resulting in an overall reduction in inventory and shortage.
Are you wondering why your operation keeps a high stock level but still cannot reduce shortage? Do you see your manufacturing operation driving supply chain stock management? What is keeping your manufacturing operation from better synchronization with supplier operations as well as market demand?