Photo: Scott Page
“To be wise you must arrange your experiences on a lattice of models.” — Charlie Munger
Organizations are awash in data — from geocoded transactional data to real-time website traffic to semantic quantifications of corporate annual reports. All these data and data sources only add value if put to use. And that typically means that the data is incorporated into a model. By a model, I mean a formal mathematical representation that can be applied to or calibrated to fit data.
Some organizations use models without knowing it. For example, a yield curve, which compares bonds with the same risk profile but different maturity dates, can be considered a model. A hiring rubric is also a kind of model. When you write down the features that make a job candidate worth hiring, you’re creating a model that takes data about the candidate and turns it into a recommendation about whether or not to hire that person. Other organizations develop sophisticated models. Some of those models are structural and meant to capture reality. Other models mine data using tools from machine learning and artificial intelligence.
The most sophisticated organizations — from Alphabet to Berkshire Hathaway to the CIA — all use models. In fact, they do something even better: they use many models in combination.
Without models, making sense of data is hard. Data helps describe reality, albeit imperfectly. On its own, though, data can’t recommend one decision over another. If you notice that your best-performing teams are also your most diverse, that may be interesting...
The case for models
First, some background on models. A model formally represents some domain or process, often using variables and mathematical formulas. (In practice, many people construct more informal models in their heads, or in writing, but formalizing your models is often a helpful way of clarifying them and making them more useful.) For example, Point Nine Capital uses a linear model to sort potential startup opportunities based on variables representing the quality of the team and the technology.
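To make the idea of a linear model concrete, here is a minimal sketch of scoring and sorting opportunities on two variables. The weights, the 0-10 scales, and the startup names are all hypothetical; the article does not describe the actual model Point Nine Capital uses.

```python
from dataclasses import dataclass

# Hypothetical weights chosen for illustration; not taken from any real fund.
TEAM_WEIGHT = 0.6
TECH_WEIGHT = 0.4


@dataclass
class Startup:
    name: str
    team_quality: float  # assumed 0-10 scale
    tech_quality: float  # assumed 0-10 scale


def linear_score(startup: Startup) -> float:
    """A linear model: a weighted sum of the input variables."""
    return TEAM_WEIGHT * startup.team_quality + TECH_WEIGHT * startup.tech_quality


def rank(startups):
    """Sort opportunities from highest to lowest score."""
    return sorted(startups, key=linear_score, reverse=True)


candidates = [
    Startup("Alpha", team_quality=8, tech_quality=5),
    Startup("Beta", team_quality=6, tech_quality=9),
    Startup("Gamma", team_quality=4, tech_quality=6),
]
ranked = rank(candidates)
for s in ranked:
    print(f"{s.name}: {linear_score(s):.1f}")
```

Changing the weights changes the ranking, which is the point of writing the model down: the assumptions about what matters become explicit and open to debate.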
Leading universities, such as Princeton and Michigan, apply probabilistic models that represent applicants by grade point average, test scores, and other variables to determine their likelihood of graduating. Universities also use models to help students adopt successful behaviors...
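One common form of probabilistic model is a logistic model, which turns a weighted sum of variables into a probability between 0 and 1. The sketch below assumes that form and uses made-up coefficients; it is not the model any particular university uses.

```python
import math

# Hypothetical coefficients for illustration only. A real model would be
# calibrated against historical data on applicants and outcomes.
INTERCEPT = -6.0
GPA_COEF = 1.2
TEST_COEF = 0.003


def graduation_probability(gpa: float, test_score: float) -> float:
    """Logistic model: map a linear score to a probability in (0, 1)."""
    z = INTERCEPT + GPA_COEF * gpa + TEST_COEF * test_score
    return 1.0 / (1.0 + math.exp(-z))


p_strong = graduation_probability(gpa=3.8, test_score=1400)
p_weak = graduation_probability(gpa=2.5, test_score=1000)
print(f"stronger record: {p_strong:.2f}, weaker record: {p_weak:.2f}")
```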
The first guideline for building an ensemble is to look for models that focus attention on different parts of a problem or on different processes. By that I mean, your second model should include different variables. As mentioned above, models leave stuff out. Standard financial market models leave out fine-grained institutional details of how trades are executed. They abstract away from the ecology of beliefs and trading rules that generate price sequences. Therefore, a good second model would include those features.
The mathematician Doyne Farmer advocates agent-based models as a good second model. An agent-based model consists of rule-based “agents” that represent people and organizations. The model is then run on a computer. In the case of financial risk, agent-based models can be designed to include much of that micro-level detail. An agent-based model of a housing market can represent each household, assigning it an income and a mortgage or rental payment. It can also include behavioral rules that describe conditions when the home’s owners will refinance and when they will declare bankruptcy. Those behavioral rules may be difficult to get right, and as a result, the agent-based model may not be that accurate, at least at first. But Farmer and others would argue that over time, the models could become very accurate...
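A toy version of such a housing model can be sketched in a few lines. Everything here is an assumption made for illustration: the income range, the size of the monthly income shocks, and the single behavioral rule (a household declares bankruptcy once its payment exceeds half its income). Refinancing is left out to keep the sketch short.

```python
import random
from dataclasses import dataclass


@dataclass
class Household:
    income: float   # monthly income
    payment: float  # monthly mortgage or rental payment
    bankrupt: bool = False

    def step(self, income_shock: float, max_burden: float = 0.5) -> None:
        """Hypothetical rule: go bankrupt when payment exceeds max_burden of income."""
        if self.bankrupt:
            return
        self.income *= 1.0 + income_shock
        if self.payment > max_burden * self.income:
            self.bankrupt = True


def simulate(n_households: int = 1000, periods: int = 12, seed: int = 0):
    """Run the model and return the cumulative bankruptcy rate after each period."""
    rng = random.Random(seed)
    households = []
    for _ in range(n_households):
        income = rng.uniform(2000, 10000)
        payment = income * rng.uniform(0.15, 0.45)
        households.append(Household(income, payment))

    rates = []
    for _ in range(periods):
        for h in households:
            h.step(income_shock=rng.gauss(0.0, 0.05))
        rates.append(sum(h.bankrupt for h in households) / n_households)
    return rates


rates = simulate()
print([round(r, 3) for r in rates])
```

Because each household is represented individually, the rule can later be replaced with one estimated from data without changing the rest of the simulation.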
The second guideline borrows the concept of boosting, a technique from machine learning. Ensemble classification algorithms, such as random forest models, consist of a collection of simple decision trees. A decision tree classifying potential venture capital investments might say “if the market is large, invest.” Random forests are a technique to combine multiple decision trees.
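The basic idea of combining simple trees can be shown with a majority vote over a few hand-written, one-question rules. This is only a sketch: the feature names and thresholds are hypothetical, and it is not a trained random forest, which would learn many trees from data rather than have them written by hand.

```python
from collections import Counter


# Each "tree" asks a single question. Thresholds are hypothetical.
def market_tree(deal: dict) -> str:
    return "invest" if deal["market_size_musd"] >= 500 else "pass"


def team_tree(deal: dict) -> str:
    return "invest" if deal["founders_prior_exits"] >= 1 else "pass"


def traction_tree(deal: dict) -> str:
    return "invest" if deal["monthly_growth"] >= 0.10 else "pass"


# An odd number of trees avoids ties between the two possible answers.
TREES = [market_tree, team_tree, traction_tree]


def ensemble_vote(deal: dict) -> str:
    """Return the recommendation made by the majority of the trees."""
    votes = Counter(tree(deal) for tree in TREES)
    return votes.most_common(1)[0][0]


deal_a = {"market_size_musd": 800, "founders_prior_exits": 0, "monthly_growth": 0.15}
deal_b = {"market_size_musd": 200, "founders_prior_exits": 1, "monthly_growth": 0.02}
print(ensemble_vote(deal_a), ensemble_vote(deal_b))
```

No single rule decides the outcome: the first deal is recommended despite a team without prior exits, because the other two trees outvote that one.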
The Model Thinker: What You Need to Know to Make Data Work for You
Source: HBR.org Daily