Measuring Success and Failure

I was halfway into a post about modeling value and why reliable tech skill is universally important, but several independent readers criticized me for not being concrete enough. In response, I'm going to address measuring the success or failure of your experiments.

How to Measure Success and Failure

The short answer is, I can't tell you. The more satisfying answer is that there's an easy way to find out for yourself. Success depends entirely on your own definition, which you are free to set before running tests. Here's an example of something I plan on testing tonight on stream:

Waste or Value Discovered: By watching videos of my play, I see that I consistently miss a significant number of punish opportunities on knockdown.

Root Cause Analysis: To ensure that I’m treating a root cause and not a symptom, I turn to the 5 Whys.

  1. I always hard-read tech in place or no tech
  2. I prioritize hard punishes over consistent punishes
  3. It’s extremely low effort because it’s now ingrained in muscle memory

Prescription: I have to update my strategy and commit to consistency over hard punishes. To counteract the muscle memory, I’m going to have to heavily invest in focusing up and not going on autopilot after knockdown hits. This is what I think the 5 whys are telling me to do:

  1. Not a root cause
  2. Root cause – adjust strategy to increase priority of consistency
  3. Root cause – invest significant time into breaking muscle memory

Hypothesis: This is a leap of faith that hasn't been proven yet. It will be either validated or disproved through experimentation. Mine is as follows:

                Increasing my reaction-based punishes will deliver more value.

Now I have to create metrics, establish a baseline, define my success criteria, and come up with experiments.

Metric Creation: What is important here? I can think of a few things.

  1. % of punish attempts that cover a legitimate option
  2. % of punish attempts that are reaction-based vs. prediction-based
  3. Success % of reaction attempts
  4. Success % of prediction attempts
  5. Influence of stage / matchup on outcomes.

Obviously if I’m covering no options, I have a big problem. Based on the distribution of reaction vs prediction attempts, I should be able to maximize value. Note that this experiment (unlike focus) will likely have a LOT to do with matchup. I should measure this too.

Establish a Baseline: By watching videos, I can measure my baseline accurately. Remember that more data points mean more accurate measurements.

  1. Approximately 60% of punish attempts cover a legitimate option
  2. Only about 10% of punish attempts are reaction-based; the other 90% are prediction-based
  3. Success % of reaction attempts – about 25%
  4. Success % of prediction attempts – about 10%
  5. Influence of stage / matchup on outcomes – no data yet
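To make the reaction-vs-prediction trade-off concrete, here's a back-of-the-envelope sketch in Python using the baseline percentages above. The per-punish payoff numbers are hypothetical placeholders, not measurements from my play:

```python
# Expected value per punish attempt, using the baseline percentages
# from the post. The payoff values are hypothetical placeholders --
# substitute your own estimates of average damage per punish.
reaction_rate = 0.10        # share of attempts that are reaction-based
prediction_rate = 0.90      # remainder are prediction-based (hard reads)

reaction_success = 0.25     # baseline success rate of reaction attempts
prediction_success = 0.10   # baseline success rate of prediction attempts

reaction_payoff = 30.0      # hypothetical avg damage from a reaction punish
prediction_payoff = 60.0    # hypothetical avg damage from a hard-read punish

ev = (reaction_rate * reaction_success * reaction_payoff
      + prediction_rate * prediction_success * prediction_payoff)
print(f"Expected damage per punish attempt: {ev:.2f}")
```

Swapping in your own payoff estimates lets you see how much a shift in the reaction/prediction mix would actually be worth before committing to the experiment.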

Success Criteria and Experiment Creation: Ultimately I want to increase the total value of punishes and follow-ups. But, on the road to getting there, I want to be able to think and make the right decisions based on the situation. Currently I’m just guessing in place, so from my perspective, any steps towards delivering on my goals are an objective improvement.

In this case, OKRs sound like a great tool. My objective is to increase consistency in punishes on knockdown. What are my Key Results?

Key Results: The point of OKRs is to set up sub-goals that, if completed, will demonstrate that I did everything I could to satisfy my objective of increasing consistency in punishes on knockdown. The 5 whys prescription is a great starting point and provides me with much-needed direction.

KR1: Follow up on reaction after ¾ of knockdowns (5 Whys Rx 3)
KR2: Successfully follow up on 25% of reactions
KR3: Increase hard-read success rate by 50%

Setting up strong KRs is my job, and my ability to create good experiments will determine whether I actually achieve my objective. The logic: currently on knockdown I'm attempting hard reads nearly 90% of the time, attacking tech-in-place or missed techs. I want to invest heavily in breaking that muscle memory first.
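One lightweight way to keep score against KRs like these is a tiny grading function. The session numbers below are invented for illustration; only the targets come from the KRs above:

```python
# Minimal OKR scorecard sketch. Targets mirror KR1 and KR2 above;
# the session numbers are invented for illustration.

def grade(achieved: float, target: float) -> float:
    """Return a 0.0-1.0 score against a target, capped at 1.0."""
    return min(achieved / target, 1.0)

# Hypothetical session: 40 knockdowns, 26 reaction follow-ups, 8 successes.
knockdowns, reaction_attempts, successes = 40, 26, 8

kr1 = grade(reaction_attempts / knockdowns, target=0.75)  # KR1: follow up after 3/4 of knockdowns
kr2 = grade(successes / reaction_attempts, target=0.25)   # KR2: succeed on 25% of reactions

print(f"KR1 score: {kr1:.2f}")
print(f"KR2 score: {kr2:.2f}")
```

Capping at 1.0 keeps one over-achieved KR from masking an under-achieved one when you average scores across an objective.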

Results
I’ll update this post tomorrow after measuring the data to determine the results. To find out sooner, join us live on the stream!

-Scar
Subscribe to @bobbyscar for update notifications
Join the discussion at The Lean Melee Smashboards Thread
Watch our stream on twitch.tv – next broadcast indicated at the top of the page
Check out the new home for NorCal Melee Videos on YouTube

Email questions to leanmelee@gmail.com


Lean as a Framework, and Small Batch Sizes

Since I’m just introducing Lean as a gaming framework, I want to throw out some clarifications about what Lean is and is not. Then, I’ll dive into the idea of “small batch sizes.”

First, Lean is not a way to start winning a few more games quickly. It’s also not merely a reminder that we should continue to learn and improve. The desire to improve is a requirement. A commitment to the philosophy of continuously achieving validated learning is the foundation.

Lean is a framework – a tool. By itself, it won't find what you should work on and how. Rather, its purpose is to enable you to efficiently find big pain points, and to tell you if what you're doing to fix them is working. Like all tools, Lean is only as powerful as its user. So the example in my last post, which ran from problem diagnosis to solution, is relevant to me specifically. Two players could be making precisely the same mistakes, like rushing in at inopportune times, messing up edgeguards, and recovering poorly, yet have very different root causes, like poor spacing as opposed to poor focus. Used properly, Lean should help uncover your real weakness.

Small Batch Sizes
On the topic of efficiency, a core part of Lean is achieving validated learning as quickly as possible. Top players are pushing hard to improve all the time. To stay competitive, you need to keep up. To be one of the best, you have to be even faster.

I received some pushback about whether breaking bad habits really comes down to not thinking, along with the suggestion that top players can beat bad habits out of you just as easily. As I mentioned above, I followed the Lean framework, and the 5 Whys led me to the conclusion that "not thinking" was the core problem, as opposed to bad spacing or any other root cause.

More important, though, is the claim that playing top players is the most efficient solution. I'm not sure that I stressed this enough: the entire example detailed in my last post took a total of about 6 hours. I was watching vids on the train home at about 6:30 and diagnosed the root cause, texted Tafokints and Darrell, whom I practiced with that night, and had achieved validated learning by about 10pm. I continued to practice and find tricks to improve my focus over the next 2 hours.

Further, I diagnosed my issues by watching my own videos and looking for waste, and examining errors for a common root cause. This can be done independently over the course of minutes, whenever time is available.

Compare my solution with the idea of finding top players and having them beat the bad habits out of you.

  1. Find a top player
  2. Find time and a place to play Melee together one-on-one
  3. Commute
  4. Ensure that he/she finds your biggest problems and punishes you for them
  5. Learn from it

For some people, finding top players might be extremely easy. You may have one on the same campus as you, and perhaps neither of you are terribly busy. Maybe this person is extremely good at finding your weaknesses, and maybe you’re extremely good at learning from being punished. If this is the case, then I submit that playing top players might be the best solution for you. In all other cases, though, arranging this could involve an investment of weeks or even months. There is probably a more efficient way to improve.

By the same logic, it’s generally not a good idea to “wait for tournament” to try out your new skills. Lean is about testing a new idea quickly and seeing if it creates real value. Large tournaments are by far the best environment to put a new idea to the test, but if one isn’t available, the worst thing you can do is get stuck waiting for one.

More Problems with “Playing Against Pros”
You may have noticed some added benefits of my solution. By relying on myself to diagnose my own problems, I ensured that I had a dependable mechanism for improvement. What if the "top player" didn't punish all of my mistakes? Or, what if the top player is great at punishing mistakes, but I came to the wrong conclusions about them? I might decide, for example, that not approaching at all is better than approaching poorly. I could also turn to gimmicks to see immediate improvement. Treating the symptoms could yield better match results, especially against the same practice partner, but unless your ultimate goal is to beat this specific person, this is only the illusion of improvement. Taking steps to eliminate the underlying problems that create waste (in my case, not focusing) leads to consistent results.

A second problem with relying on "top players": if I ever get better than my counterpart, I've lost my strategy for improvement entirely. I'll have to find a new top player, who almost by definition will be less accessible, further increasing the time each feedback loop takes.

The idea of keeping “batch sizes” small is central to pushing through Build-Measure-Learn feedback loops quickly. We have to keep diagnosing problems and keep testing new ideas to ensure that we continuously improve. If your goal is to be a top player, a good way to get there is to improve faster than everyone else.

-Scar

Bad Habits – Cut out the fat

Instead of using this second post to define terms or explain something that "will be useful in the future," I think it would be more helpful to dive right in with a "vertical slice" snapshot of what the Lean Gaming process looks like. This story starts before the problem was discovered and takes us all the way through the first steps toward fixing it, so this post is a bit on the long side. Without further ado, the problem: how do I break bad habits?

Diagnosis
It seems to take way too long for smashers to get rid of bad habits and mental blocks. In my experience, some of my own habits have been so deep-seated that my entire strategy would shift to compensate – it would actually be easier for me to change major parts of my game than to address a bad habit.

In the past month, though, things haven’t looked so bleak. After reading The Lean Startup and theorycrafting about Lean Gaming, I noticed that I began to look at my gameplay differently. Lean told me what to look for.

  1. What was I doing that caused waste?
  2. What delivered value?

Unfortunately I found a ton of waste, and not very much value. Most of my kills came from lucky hits that led into one-player mode (combocombocombo). I’d find myself throwing out moves that probably wouldn’t work, approaching predictably, recovering poorly, and most importantly, SDing way too much. Why was I doing things that ultimately weren’t helping me win? Were they at all related?

To answer this question, I turned to a powerful diagnostic tool called “The 5 Whys”. The technique in a nutshell is asking “Why” of a situation 5 times. As you dig deeper, often you’ll find that there is an underlying cause that’s completely within your power to fix. The more I asked why, the more I saw the same root cause: not thinking.

(As an aside, I wonder if it comes as much of a surprise to anyone to hear that I never really thought, or maybe more accurately, never thought while playing. A lot of my analysis comes before and after playing actual matches, and my evolution as a player took place on the boards or while watching my vids. But, sure enough, my growth was basically an upload of new information, which I would execute next time without really ever thinking mid-match. I don't think I'm alone here.)

Build-Measure-Learn
With Build-Measure-Learn, we typically operate in reverse order. Figure out what you need to learn, then determine how to measure progress. After that, you can build experiments and run them through the loop. I already figured out what I needed to learn, namely, how do I start thinking in-game? This is where the real work begins.
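As a sketch of how the loop fits together in code (the function names and the trivial stand-ins are mine, not anything from The Lean Startup):

```python
# Build-Measure-Learn sketched as a loop. Planning happens in reverse
# order (decide what to learn, pick metrics, then build experiments),
# but execution runs Build -> Measure -> Learn each iteration.
def lean_loop(objective, build, measure, learn, iterations=3):
    """Run an experiment through Build-Measure-Learn `iterations` times."""
    state = {"objective": objective, "history": []}
    for _ in range(iterations):
        experiment = build(state)             # build the smallest testable change
        data = measure(experiment)            # collect metrics against a baseline
        state["history"].append(learn(data))  # validate or reject the hypothesis
    return state

# Hypothetical usage with trivial stand-ins:
result = lean_loop(
    "start thinking in-game",
    build=lambda s: "track focus score per game",
    measure=lambda e: {"avg_focus": 3.5},
    learn=lambda d: d["avg_focus"] >= 3.0,
)
print(result["history"])  # [True, True, True]
```

The point of the structure is the `history` list: every pass through the loop must end in an explicit validated-or-not verdict, not a vague feeling of improvement.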

The first steps to determining progress are to define terms, and to establish a baseline. I chose to call my thinking “focus”, and created a bunch of silly metrics to help me organize my experiments. This process of ensuring accountability and objectivity in learning is called “Innovation Accounting.” Here is a copypasta of my personal notes:

  • Innovation Accounting
    • What is my baseline focus score?
    • Does my focus increase or decrease over time?
    • How can I organize my thoughts to maximize focus?
    • What is music’s impact on my focus?
  • Metrics
    • Focus: How much I’m thinking at a given moment
      • 1 – Autopilot, distracted, or otherwise not paying close attention.
      • 3 – In the moment, but not efficiently organizing and using thoughts.
      • 5 – In the moment, efficiently using brainpower, thinking creatively.
    • Focus Rate: Change in focus over time with all other things equal
      • Change in focus over time
      • Change in focus after an event
        • Focus in previous match
        • Match outcome (win loss)
        • Match difficulty (character / stage combo / opponent strength)
        • Match duration
    • MSF: Maximum sustained focus – max amount of time I can stay completely focused
  • Experiments
    • Establish a baseline focus score.
    • What behavior or event can increase or decrease focus between games?
    • What behavior or event can increase or decrease focus within a game?

Admittedly, the above three experiments aren't really experiments. I had no idea what could possibly affect my focus because I'd never paid attention to it before. So it became apparent that the first step was simply to observe.

I decided to track my focus score after each game in an Excel spreadsheet and to use the data to see what was causing my focus score to spike or fall. Here's a snapshot of the spreadsheet:

Date   | Matchup  | Stage | Focus Score | Notes
16-Feb | CF/Samus |       | 3           |
16-Feb | CF/Samus |       | 4           | OKR: 2 edgeguards – worked very well
16-Feb | CF/Samus |       | 4           | Was very strong in the beginning, did something sweet and lost focus, brought it back last stock
16-Feb | CF/Samus |       | 3           |
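The spreadsheet rows can just as easily live in a tiny Python log for quick analysis. The rows below mirror the snapshot's data, and the helper simply averages the 1-5 focus scores:

```python
# Sketch of the focus-score log as plain Python instead of a spreadsheet.
# Rows mirror the snapshot above; "focus" uses the 1-5 focus metric.
from statistics import mean

log = [
    {"date": "16-Feb", "matchup": "CF/Samus", "focus": 3, "notes": ""},
    {"date": "16-Feb", "matchup": "CF/Samus", "focus": 4,
     "notes": "OKR: 2 edgeguards - worked very well"},
    {"date": "16-Feb", "matchup": "CF/Samus", "focus": 4,
     "notes": "Strong start, lost focus after a sweet play, recovered last stock"},
    {"date": "16-Feb", "matchup": "CF/Samus", "focus": 3, "notes": ""},
]

baseline = mean(row["focus"] for row in log)
print(f"Tracked baseline focus: {baseline:.1f}")  # 3.5, matching the post
```

Once the log is structured like this, questions like "does focus trend up within a session?" become one-liners instead of eyeballing a spreadsheet.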

Here were my preliminary findings, which became clear after 1-2 hours of play.

  • My focus increased simply by tracking it (tracked baseline was 3.5, compared to an average of about 1.5 to 2 before tracking)
  • Matchup and stage have no effect on focus
  • Focus decreases if there’s a big lead on either side
  • Focus decreases after I do a really cool combo (did anyone see that? what is the stream saying? is this recording? what were those moves again?)
  • Focus increases after I lose a lead or catch up
  • I actually hit a score of 5 for a few seconds, but then did something awesome and started thinking about other things

Establishing the baseline shed a lot of light on the consequences of focus in-game. With this new information, I decided to try something new to improve my average of 3.5. For the next match, I set some OKRs, or “Objectives and Key Results”, as an experiment. Setting up OKRs is basically the practice of establishing concrete “Key Results” as a measure of whether you did what you could to achieve your less-tangible “Objective.” I started off with just one Key Result, but I plan to expand it as I get better at focusing.

Objective: Increase Focus
Key Result: Land 2 edgeguards (edgeguard defined as opponent offstage before kill percent and not allowed to return to the stage)

I hit my key result that match and continued to try to edgeguard at least twice every game, which led to an increase in focus whenever Samus was offstage, which in turn helped me create a strategy to make it difficult for her to recover. Implementing OKRs was my first successful experiment.

Summary
As a result of successful root-cause diagnosis using the 5 Whys, innovation accounting and feedback loops to objectively measure learning, and OKRs as an experiment to increase focus, I made more progress toward my goal of thinking while playing in 4 hours than I had in the 5 years beforehand.

  • Diagnosis
    • 5 Whys
  • Build-Measure-Learn and Innovation Accounting
    • State objective
    • Define terms and create metrics
    • Create and run experiments

This is the framework that I used, and I believe that everyone can benefit from lean principles. I hope to continuously discover my greatest weaknesses and deliver against them to create real value and to keep making real progress. Join the discussion and let me know what you think.

-Scar