Scaling is About Unscaling Technical Debt

Speaker: Jeff Szczepanski – COO, Stack Exchange (@inscitekjeff)

Large Scale

  • 100,000 people on the site during a weekday lunch. 30 million visitors a month for Stack Overflow
  • Over 100 million visitors a month for Stack Exchange

Technical Debt Warning Signs

  • Regular schedule slips
  • Frequent need to refactor, more often than it feels like you should
  • Irregular or inconsistent output of new features
  • Adding new developers doesn’t speed things up, or even slows things down
  • New features break old features
  • Ongoing performance or stability problems
  • Slow turnarounds for your bug fixes
  • Bug fixes for your bug fixes

Solution: Rewrite!

  • The most brute-force way to handle the problem

Another Kind of Technical Debt

  • How much code are you tossing?
  • How much tossing was really unavoidable?

Iceberg Analogy

  • Above the water features: the end features customers are interacting with
  • Below the water infrastructure: where the core value of your application is

Definition of Technical Debt

Misalignment of what is easy to do with your codebase and data structures versus what you need to do for your product to be more successful in the market.

Evolving Bar of Quality

When you only have a small number of customers, 80/20 works fine for tradeoff between feature completeness / reliability and shipping. As you scale, 80/20 no longer cuts it. Software needs to become more reliable to scale.

  • 80/20 Rule
  • 90/10 Rule
  • 95/05 Rule
  • 99/01 Rule

Features and whole product quality must continue to increase over time.
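
To make the progression concrete, here is a minimal arithmetic sketch. It assumes, purely for illustration, that the second number in each rule is the percentage of users who hit an unfinished or unreliable case; the talk does not define the rules that precisely. The 100,000 figure is the weekday-lunch number quoted above, and the function names are hypothetical.

```python
# Assumption (not from the talk): in "80/20", the 20 is read as the
# percentage of users who run into a rough edge.
RULES = {"80/20": 20, "90/10": 10, "95/05": 5, "99/01": 1}


def affected_users(users: int, miss_percent: int) -> int:
    """Number of users who hit a rough edge at the given miss percentage."""
    return users * miss_percent // 100


def report(users: int) -> dict:
    """Affected-user count under each rule for a given audience size."""
    return {rule: affected_users(users, pct) for rule, pct in RULES.items()}


if __name__ == "__main__":
    for users in (1_000, 100_000):
        print(users, report(users))
```

Under this reading, the miss rate that is tolerable for a thousand early users touches tens of thousands of people at Stack Overflow's scale, which is why the bar has to keep moving.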

Customer Compelled vs Market Focused

Debt comes not just from misalignment when you are customer compelled in defining features, but also from misalignment when you are customer compelled in the architecture of your system.

Key to scaling your business is bringing your customers and their data with you along your product roadmap.

Good Properties of Agile

  • Admits that requirements evolve
  • Encourages regular releases
    • Which allows validation of progress more often
  • Sprints are excellent for driving cadence
  • Closes the loop relative to quality
    • Quality in all forms: retrospective, etc.

Pitfalls of Agile

  • Scheduling beyond a few sprints seems rare
  • Encourages feature-centric thinking
    • At the expense of good system design
  • Incremental nature encourages shortcuts
    • Lighter specs, lighter documentation, etc.

Good Properties of Waterfall

  • Formal specifications rule
    • Helps you think through the problem more
  • Emphasizes detailed planning
    • And how you are going to progress through the plan
  • System design is specifically a thing
  • Highly efficient, if requirements are solid

Pitfalls of Waterfall

  • Requirements are never perfect
  • Encourages long, serial release cycles
    • Spec everything, then design everything, then build everything, then test everything
  • Problems discovered late
  • Cost of errors high

Better Approach: The Agile-Waterfall Continuum

  • Strengths of agile are pitfalls of waterfall
  • Strengths of waterfall are pitfalls of agile
  • Two sides of same coin: do something in the middle

Key Variables to Consider

  • Length of release cycles
    • Waterfall can have short release cycles
  • Clarity/confidence of customer requirements
    • Changes over time as you understand customer requirements more and move away from an early MVP
  • Depth of system complexity
    • More complexity requires more planning
  • Severity of defects
    • The more catastrophic a defect would be, the more appropriate waterfall is
  • Size of software team / customer base
    • As either grows, a more waterfall-like approach is needed

Bias Toward Agile When Focused On

  • Requirements discovery/validation
  • Incremental feature delivery
  • Simple systems close to the UI
  • Progression of “dot” releases

Bias Toward Waterfall When Focused On

  • Architecture phases
  • Development of core services
  • Complex modules far from the UI
  • Things that deliver on your differentiation and positioning

Joel Test: Additions for Teams At Scale

  • Do you document your services & major modules?
    • Concise descriptions of role & scope
    • Full API and all the interface data types
    • Key design assumptions & dependencies
    • Single person on the team that owns each
  • Do you document all your data?
    • Text for role of every table
    • Text on purpose and invariants for every column
    • Owner that reviews every field addition/deletion
  • Do you enforce a coding standard?
  • Do you do code reviews?
  • Do you track bugs back to their source?
    • Bugs cluster
    • Spend time to check for other potential bugs from the same source
    • Closes the loop on quality
    • Looking for brittle modules
    • Looking for programmers in need of mentoring
  • Do you define deadlines and hit them?
    • Can you make predictable deadlines and hit them?
    • Schedules are critical
      • Sales, marketing & customer support care when things ship
      • Good roadmap decisions depend on valid cost/benefit tradeoffs
    • How else do you deterministically evaluate the performance of your developers or your development team?
    • Schedule needs to be developed by developers and rolled up from the bottom
    • Predictability is a symptom of high quality software development
    • Good Schedule = Good Business
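
The "bugs cluster" point can be checked with very little tooling. The following is a minimal sketch, assuming each fixed bug is tagged with the module where its root cause lived. The record fields, the threshold, and the sample data are all hypothetical and only illustrate the idea.

```python
from collections import Counter
from typing import NamedTuple


class BugRecord(NamedTuple):
    bug_id: int
    source_module: str  # hypothetical field: module where the root cause lived


def cluster_by_source(bugs):
    """Count bugs per source module, most frequent first."""
    return Counter(b.source_module for b in bugs).most_common()


def brittle_modules(bugs, threshold):
    """Modules whose root-cause count meets the threshold."""
    return [module for module, count in cluster_by_source(bugs) if count >= threshold]


# Made-up sample data for illustration only.
SAMPLE_BUGS = [
    BugRecord(1, "billing"),
    BugRecord(2, "billing"),
    BugRecord(3, "search"),
    BugRecord(4, "billing"),
]

if __name__ == "__main__":
    print(cluster_by_source(SAMPLE_BUGS))
    print(brittle_modules(SAMPLE_BUGS, threshold=2))
```

A module that keeps showing up at the top of this list is a candidate for a focused review. The same grouping could be done by author to spot where mentoring would help.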

More on Schedule & Cadence

  • Schedule slips are a learning opportunity
    • Estimation is a specific skill, so develop it in people
  • Better to do structured slips than cramming
    • Reschedule, don’t try to force release ASAP
  • Each team should have an anchor and a rover
    • Anchor: Probably the team lead. Ultimately responsible for everything being done.
    • Rover: Someone who isn’t scheduled for 100% who can take up the slack when unforeseen things happen.
  • Best way to deal with contingency isn’t to pad the schedule, but to ensure developers aren’t 100% committed.
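
The difference between padding the schedule and leaving people partly uncommitted can be shown with a little arithmetic. This is a minimal sketch under stated assumptions: estimates are in working days, each developer supplies their own number (the bottom-up roll-up), and contingency is the unscheduled share of the rover's time. All names and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Developer:
    name: str
    estimated_days: int            # the developer's own bottom-up estimate
    allocation_percent: int = 100  # share of the sprint that is scheduled


def rolled_up_estimate(team):
    """Team estimate is the sum of individual estimates, with no padding added."""
    return sum(d.estimated_days for d in team)


def slack_days(team, sprint_days):
    """Unscheduled capacity available to absorb surprises."""
    return sum(sprint_days * (100 - d.allocation_percent) / 100 for d in team)


def can_absorb(team, sprint_days, slip_days):
    """True if the slip fits within the team's uncommitted time."""
    return slip_days <= slack_days(team, sprint_days)


# Made-up team: a fully scheduled anchor, one developer, and a half-scheduled rover.
TEAM = [
    Developer("anchor", estimated_days=10),
    Developer("dev", estimated_days=10),
    Developer("rover", estimated_days=5, allocation_percent=50),
]

if __name__ == "__main__":
    print(rolled_up_estimate(TEAM))
    print(slack_days(TEAM, sprint_days=10))
    print(can_absorb(TEAM, sprint_days=10, slip_days=3))
```

In this sketch the individual estimates stay honest, and the slack is visible as capacity rather than hidden inside inflated numbers. A slip larger than the slack is the case for a structured reschedule.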

Team Development

  • Performance != Results
  • Bunches of separable skills to develop
  • Aim for a steady cadence & continuous output
  • Pay attention to motivation and morale
  • Seek to empower & enable, not manage

Three Key Roles

  • Developers: coding team members
  • Team Leaders: As players & coaches
  • Engineering Managers: Help skills development

Team Leaders

  • Ideal developer
  • Natural leader that enjoys mentoring
  • Drives cadence & morale of the team
  • Responsible for team meeting deadlines
  • In charge of quality, documentation
  • Eyes & ears for engineering management

Engineering Manager

  • Main responsibility is skills development
  • Line manager to all developers
    • Covering 15-20 developers
  • Removes operational barriers
    • Helps define common tools & infrastructure
  • Works with a longer term horizon
  • Thinks about building the capabilities of the organization
  • Role specializes as company grows
    • Splits into operational aspects & skills development aspects

On Team Morale & Cadence

  • Bottom up estimates critical
  • Strive for continuous and steady output
    • Create ownership of goals, but no death marches
  • Leverage peer and social pressure effects
    • Set cultural norms & expectations
  • Merit-based, not tenure-based advancement
