Launch Lab · MLB Analytics

Franchise DNA

Identity, style, and era-by-era evolution across full franchise history

About & Reference

About
Glossary
FAQ
What is Franchise DNA?

Franchise DNA maps the complete statistical identity of an MLB franchise across its entire history — not just wins and losses, but how they played. Every season is scored across six dimensions and normalized to league average so a 1927 Yankees number is directly comparable to a 2019 Dodgers number despite decades of rule changes and era shifts.

The result is a fingerprint: you can see at a glance whether a franchise has always been a power-hitting club or reinvented itself, when pitching was the foundation and when it collapsed, and which eras were genuine dynasties vs. paper contenders.

Data Sources
Lahman DB
Primary source for all seasons 1871–2020. The Lahman Baseball Database is the most comprehensive open historical baseball dataset, maintained by Sean Lahman and updated annually. Provides team-level R, HR, SB, ERA, errors, fielding %, and salary data.
Baseball Ref
2021–2025 stats manually sourced from Baseball Reference season pages and converted to the same normalized scale using known league averages for each year.
Hardcoded
Era labels, manager names, milestone events, WS years, and DNA badge assignments are manually curated and editorial in nature.
Scoring methodology

Every metric uses the same normalization: (team stat ÷ league average for that season) × 100. For stats where lower is better (ERA, errors), the ratio is flipped to league ÷ team so that a higher score is always better. The result is an index where 100 is always exactly league average for that year. A score of 120 means 20% above average; 80 means 20% below. This makes cross-era comparison valid: a dead-ball era team and a juiced-ball era team can be compared on equal footing.
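As a minimal sketch of the index described above (the function name and the sample numbers are hypothetical, not taken from the tool's pipeline), the direct and flipped forms can be written as one helper:

```python
def league_index(team_value: float, league_value: float,
                 higher_is_better: bool = True) -> float:
    """Index a team stat against the same season's league average (100 = average)."""
    if league_value <= 0:
        raise ValueError("league average must be positive")
    if higher_is_better:
        return team_value / league_value * 100
    if team_value <= 0:
        raise ValueError("team value must be positive when the ratio is flipped")
    # Lower-is-better stats (ERA, errors) flip the ratio so higher still means better.
    return league_value / team_value * 100


# Hypothetical season in which the league averages 4.50 runs per game.
print(round(league_index(5.40, 4.50)))  # 120
print(round(league_index(3.60, 4.50)))  # 80
```

Because the divisor is always that season's own league average, the same call works for any year without an era-specific adjustment.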

After computing raw per-season values, a ±4 year rolling average is applied to the Lahman portion (1871–2020). This smooths out single-season noise — an injury-ravaged year or a fluke outlier — without obscuring genuine multi-year trends. The 2021–2025 data is presented as-is without smoothing.
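The smoothing step can be sketched as a centered rolling mean. This assumes the input is one value per consecutive season with no gaps, and that the window simply shrinks at the start and end of a franchise's history; the tool's actual edge handling is not documented here, so treat that part as a guess.

```python
def smooth(values: list[float], half_window: int = 4) -> list[float]:
    """Centered rolling mean over up to 2 * half_window + 1 consecutive seasons."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)                 # window shrinks at the edges
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical index values with one fluke season (140) in the middle.
raw = [100, 102, 98, 101, 140, 99, 100, 103, 97]
print([round(v, 1) for v in smooth(raw)])
```

In this made-up series the 140 spike is averaged with its eight neighbors and lands near 104, which is the intended effect: one outlier year nudges the line instead of dominating it.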

Known limitations
Defense
Uses errors per game inverted against league average. Errors capture fielding mistakes but not range, arm strength, or positioning. A team with great range but occasional errors (e.g. an aggressive shortstop) will score lower than a conservative team that never attempts difficult plays. DRS and OAA are more accurate but only exist for recent seasons (DRS from 2003, OAA from the Statcast era).
Efficiency
Salary data in Lahman covers 1985–2016 only. Before 1985 the Efficiency line is flat at 100 (neutral) and should be ignored. 2017–2025 efficiency numbers are estimated from publicly reported Opening Day payrolls.
Speed
Stolen base strategy has changed dramatically across eras. League-wide steal rates fell sharply between roughly 1990 and 2022. Because scores are league-relative, a low Speed score during that window means a team ran even less than its cautious contemporaries, which often reflects strategy rather than a lack of athleticism.
Pre-1920
Statistics from the dead-ball era (pre-1920) are real but context differs significantly. Power scores near 100 in 1910 mean "average for 1910"; they are not comparable in raw output to 1960s or 1990s power numbers.
Six metrics
Offense Team R/G ÷ Lg R/G × 100
Runs scored per game relative to league average. Purely measures run production — no park adjustment. 120 = scored 20% more runs than a league-average team that season.
Pitching Lg ERA ÷ Team ERA × 100
ERA+ style: inverted so higher = better. League ERA divided by team ERA. A team with a 3.00 ERA in a 4.50 ERA league scores 150. A team with a 5.40 ERA in that same league scores 83.
Power Team HR/G ÷ Lg HR/G × 100
Home runs per game vs. era average. Era adjustment is critical here: league HR rates in 1910 were near zero, while the juiced-ball era of the 1990s–2000s saw far higher rates. 100 always means average for that year.
Defense Lg E/G ÷ Team E/G × 100
Errors per game inverted against league average. Fewer errors than the league = score above 100. This metric replaced fielding percentage, which clustered between 99 and 101 for all modern teams and showed no real signal. The errors metric has a standard deviation of ~15 points and correctly identifies historically elite defensive teams (1970s Orioles, Big Red Machine) vs. error-prone ones.
Speed Team SB/G ÷ Lg SB/G × 100
Stolen bases per game vs. era average. Reflects baserunning identity. Highly era-sensitive: the stolen base went out of fashion after ~1990 and returned after the 2023 rule changes. The Whiteyball Cardinals and Big Red Machine spike hard on this line.
Efficiency (Win% ÷ .500) ÷ (Payroll ÷ Lg Avg Payroll) × 100
Win percentage per payroll dollar relative to league average. A high score means more winning per dollar spent. Oakland's Moneyball era (2000–2006) produces a pronounced spike. Only meaningful where salary data exists (1985–2016 from Lahman; 2017–2025 estimated).
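The six definitions above can be sketched together as follows. The `SeasonLine` container, function names, and sample numbers are hypothetical. The Efficiency helper assumes win percentage is divided by the league's .500 average so that a league-average team on a league-average payroll lands on 100; without that step the ratio would center on 50.

```python
from dataclasses import dataclass


@dataclass
class SeasonLine:
    """Per-game rates and ERA for one team, or for the league average (hypothetical)."""
    runs_per_game: float
    hr_per_game: float
    sb_per_game: float
    errors_per_game: float
    era: float


def on_field_scores(team: SeasonLine, league: SeasonLine) -> dict[str, float]:
    """Index the five on-field metrics against the same season's league line."""
    return {
        "Offense": team.runs_per_game / league.runs_per_game * 100,
        "Power": team.hr_per_game / league.hr_per_game * 100,
        "Speed": team.sb_per_game / league.sb_per_game * 100,
        # Lower is better for ERA and errors, so these two ratios are flipped.
        "Pitching": league.era / team.era * 100,
        "Defense": league.errors_per_game / team.errors_per_game * 100,
    }


def efficiency(win_pct: float, payroll: float, league_avg_payroll: float,
               league_win_pct: float = 0.5) -> float:
    """Relative winning per relative payroll dollar (assumes win% is indexed to .500)."""
    return (win_pct / league_win_pct) / (payroll / league_avg_payroll) * 100


# Hypothetical team and league lines for one season.
team = SeasonLine(5.0, 1.2, 0.5, 0.5, 3.00)
league = SeasonLine(4.5, 1.0, 0.6, 0.6, 4.50)
for name, score in on_field_scores(team, league).items():
    print(f"{name}: {score:.0f}")
print(f"Efficiency: {efficiency(0.600, 80.0, 100.0):.0f}")
```

With these inputs the 3.00 ERA in a 4.50 ERA league reproduces the Pitching score of 150 from the description above, and a .600 team spending 80% of the average payroll comes out at 150 on Efficiency.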
Chart symbols
World Series Champion
Gold dashed vertical line with ★ at the bottom axis. Click any star to zoom the chart to that season's ±10 year window. Also accessible via the Championships tab in the sidebar.
League average baseline
Faint dashed horizontal line at 100. Everything above is above-average for that era; below is below-average. The baseline is fixed at 100 regardless of era.
Era bands
Tinted vertical columns marking each franchise era. Click a band to zoom the chart to that period. Era label appears at top-left of each band when wide enough. Corresponds to the Era Cards in the sidebar.
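The ±10 year zoom around a championship season could be computed along these lines. This is only a sketch: the function name is hypothetical, and clamping the window to the franchise's first and last seasons is an assumption about how the chart behaves at the edges of its history.

```python
def zoom_window(year: int, first_season: int, last_season: int,
                half_width: int = 10) -> tuple[int, int]:
    """Return the (start, end) seasons to show when zooming around one year."""
    start = max(first_season, year - half_width)  # do not scroll before the founding year
    end = min(last_season, year + half_width)     # do not scroll past the latest season
    return start, end


# Hypothetical franchise whose data runs from 1903 to 2025.
print(zoom_window(1927, 1903, 2025))  # (1917, 1937)
print(zoom_window(2020, 1903, 2025))  # (2010, 2025)
```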
DNA badges
Power
Home run–driven offense is the franchise's historical calling card across multiple eras.
Pitching
Identity defined by starting rotation or bullpen excellence: the team wins by suppressing runs.
Offense
High-scoring lineups built around contact, walks, and run creation rather than power alone.
Speed
Basestealing and aggressive baserunning define the team's style across significant stretches.
Defense
Franchise historically values glove work and is consistently above league average in fielding metrics.
Efficient
Analytics-driven roster construction: the franchise consistently outperforms relative to payroll spent.
Era grade scale
A+
A+ / A / A− — Championship-caliber eras. Multiple pennants or WS titles. Historically dominant and memorable.
B
B+ / B / B− — Competitive, playoff-contending eras. Solid performance without championship breakthrough.
C
C+ / C / C− — Below-average eras. Around .500 ball or inconsistent results season to season.
D
D / D− — Losing eras. Extended droughts, 90+ loss seasons, franchise lows.
Using the tool
Why does every metric baseline at 100?
Because baseball's offensive environment has changed dramatically across eras. A team scoring 5 runs per game in 1968 (the Year of the Pitcher) was elite. The same output in 2000 (the Steroid Era) was below average. Normalizing to league average for each individual season makes all eras directly comparable on the same chart.
Why does the 1927 Yankees power line hit 200?
Because Murderers' Row genuinely hit home runs at twice the league-average rate per game. Babe Ruth alone hit 60, more than any other American League team hit that season. The normalization isn't broken; it's accurately reflecting how historically extreme that team was.
Why does the Defense line barely move?
Modern baseball teams are all very good at not making errors, so any error-based proxy is a blunt instrument. An earlier version used fielding percentage, which clustered between 99 and 101 for most post-1980 teams. The current proxy, errors per game inverted against league average, spreads teams out more, but the line can still look quiet next to the others. Larger deviations are meaningful; small ones reflect noise or single-season anomalies more than genuine defensive identity differences.
What does clicking an era band or Era Card do?
It zooms the chart's x-axis to that era's time window, so you can see the metric lines at higher resolution. Click again (or click the active era card) to zoom back out to full franchise history.
Why is the Speed line so low for most post-2000 teams?
League-wide stolen base rates fell sharply between roughly 1990 and 2022. Teams collectively decided that the risk of caught stealing outweighed the reward, especially as on-base percentage and home run strategies became dominant. Because Speed is indexed to each season's league average, a score well below 100 in that window means a team ran even less than its cautious peers. The 2023 rule changes (pitch clock, larger bases, disengagement limits) brought stealing back, and you'll see recent upticks on several teams.
Why is the Efficiency line flat before 1985, and what changes after 2016?
The Lahman Database salary table only covers 1985–2016. Before 1985 there's no clean historical payroll data available in the dataset, so Efficiency defaults to 100 (neutral). The 2017–2025 efficiency values are estimated from publicly reported Opening Day payroll figures rather than computed from the pipeline.
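The coverage rule in this answer can be sketched as a small lookup: Lahman salaries for 1985–2016, estimated payrolls for 2017–2025, and a neutral 100 everywhere else. The constant and function names are hypothetical, and falling back to 100 when a covered season has no computed value is an assumption.

```python
from typing import Optional

NEUTRAL = 100.0
LAHMAN_SALARY_YEARS = range(1985, 2017)      # 1985-2016 inclusive
ESTIMATED_PAYROLL_YEARS = range(2017, 2026)  # 2017-2025 inclusive


def payroll_source(year: int) -> str:
    """Say where payroll data for a season would come from, if anywhere."""
    if year in LAHMAN_SALARY_YEARS:
        return "lahman"
    if year in ESTIMATED_PAYROLL_YEARS:
        return "estimated"
    return "none"


def efficiency_or_neutral(year: int, computed: Optional[float]) -> float:
    """Use the computed score only where payroll data exists; otherwise stay at 100."""
    if payroll_source(year) == "none" or computed is None:
        return NEUTRAL
    return computed


print(payroll_source(1975), efficiency_or_neutral(1975, None))   # none 100.0
print(payroll_source(2002), efficiency_or_neutral(2002, 131.5))  # lahman 131.5
```

Keeping the source label next to the score makes it easy to flag estimated seasons differently from pipeline-computed ones in a tooltip or legend.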
Data & methodology
Where does the data come from?
The Lahman Baseball Database for 1871–2020 — the most comprehensive open historical baseball dataset, used in academic research and by professional analysts. Seasons 2021–2025 are sourced manually from Baseball Reference season summary pages and converted to the same normalized scale.
Why is a rolling average applied to the data?
A ±4 year rolling average smooths out single-season noise — an injury year, a statistical fluke, or a 60-game pandemic season — without obscuring genuine multi-year trends. Without it, the chart lines are jagged and make it harder to see franchise-level identity arcs. The 2021–2025 data is shown unsmoothed.
How is Pitching scored — isn't ERA park-adjusted?
Our Pitching metric is raw ERA inverted against league ERA — not park-adjusted. True ERA+ factors in ballpark, which we don't do here. Teams in extreme hitter's parks (Coors Field, old Fenway) will show slightly lower Pitching scores than their true talent, while pitcher's park teams will show slightly higher. It's a known limitation of using ERA directly.
How accurate is data from before 1920?
Statistically real but contextually different. The dead-ball era (pre-1920) was played under dramatically different conditions: spitballs were legal, and balls stayed in play until they were soft. Stolen bases were used far more as a primary offensive strategy. The numbers are accurate for what they are; comparing them to modern values requires understanding the different game context.
How often is this updated?
The Lahman-sourced historical data (1871–2020) is static. The 2021–2025 values are manually maintained and updated at the end of each season. The next full update will add 2026 data after that season concludes.
How to use this tool
1. Select a franchise from the dropdown to load its full history going back to founding year.
2. Toggle metric pills (or click legend items) to isolate specific stat lines on the chart.
3. Hover anywhere on the chart to see exact normalized scores for that season in the tooltip.
4. Click an Era Card in the sidebar or an era band on the chart to zoom that period.
5. Click a WS year badge in the Championships tab to jump the chart to that title season.