Scoping
Scoping is the process of identifying, picking and planning the most valuable project.
Scoping process
- Brainstorm business problems (not AI problems), usually with business people rather than AI people.
  - For example, you’d ask “What are the top 3 things you wish were working better?”
  - In this step, you’re NOT looking for AI problems. You want to hear about business problems. It is then the AI person’s job to see whether there is an AI solution for them.
- Brainstorm AI solutions.
  - Keep identifying problems clearly separate from identifying solutions.
- Assess the feasibility and value of potential solutions: double-check that the AI solution is actually doable and worth doing (diligence).
- Determine milestones
- Budget for resources
Separating problem identification from solution
It’s worthwhile to first engage in divergent thinking, where you brainstorm a lot of possibilities, followed by convergent thinking, where you narrow them down to one or a small handful of the most promising projects to focus on.
| Problem (What to achieve?) | Solution (How to achieve it?) |
| --- | --- |
| Increase conversion | Search, recommendations |
| Reduce inventory | Demand prediction, marketing |
| Increase margin (profit per item) | Optimizing what to sell (e.g. merchandising), recommending bundles |
Diligence on feasibility and value
Feasibility: Is this project technically feasible?
- Use an external benchmark (literature, other companies, competitors)
Why use human-level performance (HLP) as a benchmark?
- We can use HLP to benchmark what might be doable for unstructured data, because people are very good at unstructured data tasks.
- The key criterion for assessing project feasibility is: can a human, given the same data, perform the task?
- Note: It is really important that the human evaluator sees only the same data that the learning algorithm will see.
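As a rough illustration of the HLP comparison described above, the sketch below scores a human labeler and a model against the same ground truth on the same examples. The function names `accuracy` and `gap_to_hlp` and the toy labels are hypothetical, made up for this example. Accuracy is assumed to be the metric of interest, which may not hold for every task.

```python
def accuracy(predictions, ground_truth):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("need at least one example")
    correct = sum(p == t for p, t in zip(predictions, ground_truth))
    return correct / len(ground_truth)


def gap_to_hlp(model_preds, human_preds, ground_truth):
    """Return (hlp, model_accuracy, gap), all measured on the same examples."""
    hlp = accuracy(human_preds, ground_truth)
    model_accuracy = accuracy(model_preds, ground_truth)
    return hlp, model_accuracy, hlp - model_accuracy


# Toy labels (made up). The human and the model saw exactly the same inputs.
ground_truth = ["ok", "defect", "ok", "ok", "defect", "ok", "defect", "ok"]
human_labels = ["ok", "defect", "ok", "ok", "defect", "ok", "ok", "ok"]
model_labels = ["ok", "defect", "defect", "ok", "ok", "ok", "ok", "ok"]

hlp, model_accuracy, gap = gap_to_hlp(model_labels, human_labels, ground_truth)
print(f"HLP: {hlp:.3f}, model: {model_accuracy:.3f}, gap: {gap:.3f}")
```

A large gap suggests there may still be room to improve. A model already close to HLP suggests that further gains could be hard to obtain.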
Do we have features that are predictive?
Looking at whether you have features that you believe are predictive is an important step of diligence for assessing technical feasibility of a project.
- Given past purchases, predict future purchases → seems feasible.
- Given weather, predict shopping mall foot traffic → seems feasible.
- Given DNA info, predict heart disease → not sure (there’s no exact mapping from genome to heart disease).
- Given social media chatter, predict demand for a clothing style → not sure.
- Given history of a stock’s price, predict future price of that stock → not sure (highly unlikely).
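One cheap, partial sanity check on whether a feature might be predictive is to look at its correlation with the target on historical data. The sketch below computes a Pearson correlation by hand. This is only one possible check, not a method the notes prescribe: it captures linear relationships only, and the temperature and foot-traffic numbers are invented for illustration.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented data: daily temperature (deg C) and mall foot traffic (visitors).
temperature = [5, 10, 15, 20, 25, 30]
foot_traffic = [1200, 1100, 1000, 950, 800, 750]

r = pearson_correlation(temperature, foot_traffic)
print(f"Pearson r = {r:.2f}")
```

A value of `r` near zero does not prove a feature is useless, since the relationship may be nonlinear, and a strong correlation on a small sample can be spurious. Treat it as one input to the diligence step.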
History of a project
The rate of previous improvement can be a surprisingly good predictor of the rate of future improvement.
- Therefore, you could estimate the rate of progress (e.g. some % relative to HLP) for every fixed period of time (e.g. every quarter). You can then project this rate of improvement into the future.
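The projection described above can be sketched as follows. This is a minimal illustration that assumes the gap between the model's error and the HLP error shrinks by a constant fraction each period (exponential decay). That modeling choice, the function names, and the quarterly error rates are all assumptions made for this example.

```python
def average_gap_reduction(errors, hlp_error):
    """Average fractional reduction per period of the gap (error - HLP error).

    Uses the geometric mean of the period-over-period gap ratios.
    """
    gaps = [e - hlp_error for e in errors]
    if len(gaps) < 2:
        raise ValueError("need at least two periods of history")
    if any(g <= 0 for g in gaps):
        raise ValueError("every historical error must be above the HLP error")
    periods = len(gaps) - 1
    mean_ratio = (gaps[-1] / gaps[0]) ** (1 / periods)
    return 1 - mean_ratio


def project_error(last_error, hlp_error, reduction, periods_ahead):
    """Project the error forward, assuming the gap keeps shrinking at `reduction`."""
    gap = last_error - hlp_error
    return hlp_error + gap * (1 - reduction) ** periods_ahead


# Invented history: error rate at the end of each quarter, with HLP error at 1%.
hlp_error = 0.01
quarterly_errors = [0.10, 0.073, 0.0541, 0.04087]

reduction = average_gap_reduction(quarterly_errors, hlp_error)
projected = project_error(quarterly_errors[-1], hlp_error, reduction, periods_ahead=2)
print(f"gap shrinks by about {reduction:.0%} per quarter")
print(f"projected error in 2 quarters: {projected:.4f}")
```

Such an extrapolation only holds if the past trend continues, so it is better read as a rough planning aid than as a forecast.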
Diligence on value
How do you estimate the value of an ML project? This is sometimes not easy, but there are some best practices that can help.
There are often two types of metrics being tracked:
- MLE metrics
- Business metrics
There’s often a gap between these two. For example, the metrics for speech recognition can be,
It’s good practice to have technical and business teams try to agree on metrics that both are comfortable with. This often requires some compromise from both teams. Some back-of-the-envelope estimation (Fermi estimates) of conversion rates can help a lot in picking a metric.
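A back-of-the-envelope estimate of this kind can be written down as a short chain of multiplications. The sketch below links a gain in an ML metric to a rough revenue figure. Every number and every parameter name is a hypothetical placeholder, and the chain itself (extra correct queries, then conversions, then revenue) is an assumed simplification, not a formula from the notes.

```python
def fermi_revenue_estimate(queries_per_year,
                           accuracy_gain,
                           conversion_rate,
                           revenue_per_conversion):
    """Rough yearly revenue impact of an accuracy gain.

    Assumes each additional correctly handled query converts at
    `conversion_rate` and each conversion is worth `revenue_per_conversion`.
    """
    extra_correct_queries = queries_per_year * accuracy_gain
    extra_conversions = extra_correct_queries * conversion_rate
    return extra_conversions * revenue_per_conversion


# Placeholder inputs for illustration only.
estimate = fermi_revenue_estimate(
    queries_per_year=10_000_000,   # assumed traffic
    accuracy_gain=0.01,            # assumed +1 percentage point of query accuracy
    conversion_rate=0.02,          # assumed share of those queries that convert
    revenue_per_conversion=5.0,    # assumed revenue per conversion
)
print(f"rough yearly revenue impact: {estimate:,.0f}")
```

The point of such an estimate is the order of magnitude, not the exact figure. Each factor is something the technical and business teams can argue about and refine separately.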
Milestones and resourcing
Key specifications:
- ML metrics (accuracy, precision/recall, etc.)
- Software metrics (latency, throughput, etc. given computer resources)
- Business metrics (revenue, etc.)
- Resources needed (data, personnel, help from other teams)
- Timeline
If unsure, consider benchmarking against other projects, or building a proof of concept (PoC).