Collaborating within a Codebase

Sometimes a deadline approaches and a handful of dedicated engineers step in to rescue the project with extraordinary effort. While admirable, heroics in software development must be examined with a healthy dose of cynicism. An exceptional night shift due to a one-off unplanned complication should not cause concern. However, when releasing software through heroics becomes the norm, it's a symptom of an inadequate development process. Sustained heroics burden our team until it ultimately erodes through turnover.

One typical catalyst for frantic work is treating software development and software releases as two separate entities. A concrete example: different development teams built outstanding improvements in isolation, and each team has been running its version of the software without any issues. However, when we consolidate the teams' changes to prepare a build for public release, some changes turn out to be incompatible.

We dedicate subject matter experts from each team to get the application working. Our engineers employ all their metaphorical duct-tape, spit, and prayers to consolidate assumptions, align dependencies, and coordinate requirements. The amount of effort required for this time-sensitive build arouses the suspicion of our quality assurance team. Naturally, they find flaws in our deliverable.

With tensions running high, we decide which flaws we can live with and which ones we cannot. Our subject matter experts apply their craft once again to fix the designated bugs and hand their result back to QA. Our quality team gives the thumbs-up and we ship on time thanks to the efforts of all involved. To avoid any regressions in future releases, we integrate our heroic bug fixes back into the development teams' code base. Turns out, our teams have been hard at work for the next release and some changes turn out to be incompatible…

Working continuously

Rather than releasing a single high-effort impeccable version, we build software on two virtues: frequent iteration and continuous improvement. By refining a product over time, we may better respond to changing requirements, evolving technology, and approaching deadlines. To ship high-quality updates, we optimize our development process for frequent releases, and should we find flaws during the release procedure, our development teams provide improvements quickly and without introducing new (or old) errors.

A solid development and release process allows us to deliver our product with confidence. The more efficiently we can address unwanted behavior, the earlier we may ship our software to gather vital user insights. If we're forced to keep software behind closed doors until we're "done", we build an entire product lifecycle on (potentially misguided) assumptions. User feedback reveals what works, what doesn’t, and what was never needed in the first place. The sooner our software sees use, the sooner it becomes useful.

Beyond missing out on customer insights, teams that don't release software don't generate revenue. Every paying customer funds the further development of our product. The more paying customers we gain at an early stage, the faster we can grow the team, scale the product, and expand our market share. More importantly, the sooner our software makes money, the less pressure on our depleting bank accounts.

To make "ship fast, ship frequently" sound more business savvy, we refer to this strategy as CI/CD. Since any serious professional movement requires an acronym to be considered industry standard, we use this counter-intuitively conflated initialism, which stands for three continuous practices:

Continuous Integration
Continuous Deployment
Continuous Delivery

CI/CD implies automation. We build an environment that encourages developers to integrate code frequently, and have our system automatically build, test, and deploy code. CI/CD shifts the responsibility for mindless repetitive tasks to computers: running tests, packaging build artifacts, signing binaries. The more processes we automate, the more time our engineers can spend on solving domain problems and writing business-critical code.
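The idea of handing repetitive steps to a machine can be illustrated with a minimal sketch. The `Stage` type, the `run_pipeline` function, and the stage names below are hypothetical and not taken from any particular CI product; real stages would call a compiler, a test runner, or a signing tool instead of returning a fixed value.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One automated step of a (hypothetical) pipeline."""
    name: str
    run: Callable[[], bool]  # returns True on success, False on failure


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for stage in stages:
        if not stage.run():
            print(f"stage '{stage.name}' failed, stopping pipeline")
            break
        completed.append(stage.name)
    return completed


# Placeholder stages that always succeed; swap in real commands as needed.
stages = [
    Stage("build", lambda: True),
    Stage("test", lambda: True),
    Stage("package", lambda: True),
    Stage("sign", lambda: True),
]

print(run_pipeline(stages))
```

Stopping at the first failing stage is a deliberate choice in this sketch: a build that fails its tests never reaches packaging or signing.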

Part I of Engineering Collaboration - Collaborating within a Codebase - focuses on the workflows for a complete CI/CD cycle. Each chapter covers a topic within that cycle to help teams ship code fast, and ship frequently.

Continuous Integration

The first problem we encountered during the introductory example was our development teams working on different aspects of the software in their own isolated worlds. While every team produced (presumably) stable software in isolation, we broke the build after consolidating all the new changes. The more people involved and the more work done, the further the different worlds diverge, and the more error-prone, tedious, and time-intensive the task of creating a stable release candidate becomes.

If an organization has a Continuous Integration (CI) process in place, all teams and team members integrate their work into the shared source code. The state of the source code reflects the state of the software to be released. Automated tests verify the integration of the source code and detect errors as quickly as possible. The fundamental goal of CI is to automatically catch problematic changes as early as possible by continuously assembling and testing complex and rapidly evolving ecosystems.

The more frequently engineers integrate changes, the greater the benefits of CI. CI offers verifiable and timely proof that our code changes can progress to the next stage. We don't need to hope that all contributors are careful, responsible, and thorough; we can verify the working state of our application at various points in time. By frequently checking small batches of revisions, we reduce the overhead of integrating changes into the product.

We'll discuss continuous integration topics across the chapters Merge Requests, Code Reviews, and Testing.

Continuous Deployment

After integrating our code changes, we remotely build the application and run additional tests against the new binary. While CI verifies the changes to our source code, Continuous Deployment (CD) confirms the expected behavior of our compiled machine code.

Because continuous integration focuses purely on developer workflows, we tend to anchor the entirety of CI/CD to engineering. Thus, when talking about continuous deployment, the conversation shifts towards automated testing with self-service infrastructure in pseudo production environments. While important, deploying to a managed test environment is only the first step. Most software requires an automated and a manual thumbs-up before we ship it.

Engineers don't build tech in a silo. We talk to customers who reported an error. We prioritize product features with product owners. We design the user inputs and UI feedback flows with our UX colleagues. Before, during, and after writing code, we share our progress with all stakeholders and collectively build the software. A CI/CD mindset (and toolset) supports all aspects of collaborating iteratively on our products. The more we automate this process, the faster our feedback cycle.

Humans struggle to deliver consistent results across repetitive tasks. Building a software package manually requires us to follow a strict checklist of compiling, copying, and renaming artifacts. Every so often, we miss a step and send our colleague a build that does not run due to misconfigured permissions or missing files. Computers, however, excel at performing the same steps over and over again. Automatically building, storing, and distributing software internally allows us to exchange updates with a high degree of velocity.
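The packaging checklist described above can be encoded once and then executed identically every time. The following is a minimal sketch using only the Python standard library; the file names in `REQUIRED_FILES` and the `package_build` helper are hypothetical and stand in for whatever a real project ships.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path

# Hypothetical checklist: the files every internal build must contain.
REQUIRED_FILES = ["app.bin", "LICENSE", "config.json"]


def package_build(build_dir: Path, output: Path) -> str:
    """Check the build against the checklist, archive it, return a SHA-256 digest."""
    missing = [name for name in REQUIRED_FILES if not (build_dir / name).is_file()]
    if missing:
        # Fail loudly instead of sending an incomplete build to a colleague.
        raise FileNotFoundError(f"build is incomplete, missing: {missing}")

    # Set the executable bit so the binary does not fail on permissions.
    (build_dir / "app.bin").chmod(0o755)

    with tarfile.open(output, "w:gz") as archive:
        for name in REQUIRED_FILES:
            archive.add(build_dir / name, arcname=name)

    return hashlib.sha256(output.read_bytes()).hexdigest()


if __name__ == "__main__":
    # Demonstration with placeholder files in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        build_dir = Path(tmp) / "build"
        build_dir.mkdir()
        for name in REQUIRED_FILES:
            (build_dir / name).write_text("placeholder")
        digest = package_build(build_dir, Path(tmp) / "app.tar.gz")
        print(f"packaged build, sha256={digest}")
```

Returning a digest alongside the archive lets the recipient confirm they received exactly the artifact that was built, which is one way to make internal distribution less error-prone.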

We discuss common strategies related to continuous deployment in the chapters Testing, Release Mechanisms, and Source and Artifact Management.

Continuous Delivery

Once we're happy with our changes, Continuous Delivery (CD) moves our tested package into production. Improving our software delivery effectiveness will improve our ability to work in small batches and incorporate customer feedback more rapidly.

As with the other CD (deployment), most conversations about delivery focus on sophisticated ways to replace code running in production. While CD (delivery) can cover that, replacing live code is considered the pinnacle of CI/CD, and a pinnacle not appropriate for all software projects. It requires a high level of automation and testing, as well as a robust rollback process in case of issues. Hot loading to production is most suitable for internal deliveries or projects with a high degree of stability and a low risk of breaking changes.

Continuous delivery reduces the risk of the release process itself. Every public software update sets a lot of moving parts in motion: we run various release scripts to upload, sign, and publish artifacts, update our website with new download links, and serve new documentation. Every step we manage to automate reduces our overhead of releasing software. Less overhead allows us to release more often and increases our confidence in the process. Practice makes perfect.

Sometimes we discover errors that require an immediate fix. Whether we missed an edge case that deletes customer data, or learned of a critical security vulnerability in one of our libraries, we need to act fast and with competence. Robust continuous delivery allows us to be effective during emergencies. Unlike people, software automation is not affected by high-running emotions and raised voices. It also removes hard dependencies on key personnel during time-sensitive events.

A release involves more than code. We write customer-facing release notes, create marketing material, and take screenshots, ideally automatically. Documentation, marketing, and sales work all has to happen, but it cannot be a blocker. Our CI/CD process needs to enable these activities and provide the information they depend on (release notes, screenshots, videos, tutorials, etc.) so that distribution and delivery stay immediate and low in overhead.

The terms Continuous Deployment and Continuous Delivery have been used interchangeably to such a degree that there's a good chance different teams use them the other way around. We are not partial to either order. In this book, we deploy to test (stage) and deliver to production (client).
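To make this book's terminology concrete, here is a minimal sketch of a release gate that distinguishes the two targets. The `Target`, `Build`, and `can_release` names are hypothetical. The rule it encodes, automated checks for every target plus a manual sign-off before production, is one possible reading of the "automated and a manual thumbs-up" mentioned earlier, not a prescribed policy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Target(Enum):
    STAGE = "stage"            # test environment: we "deploy" here
    PRODUCTION = "production"  # client-facing: we "deliver" here


@dataclass
class Build:
    version: str
    tests_passed: bool = False
    approved_by: Optional[str] = None  # name of the human who signed off, if any


def can_release(build: Build, target: Target) -> bool:
    """Return True if the build may be released to the given target."""
    if not build.tests_passed:
        return False  # the automated thumbs-up is required everywhere
    if target is Target.PRODUCTION:
        return build.approved_by is not None  # manual thumbs-up for delivery
    return True


candidate = Build(version="1.4.0", tests_passed=True)
print(can_release(candidate, Target.STAGE))
print(can_release(candidate, Target.PRODUCTION))
```

A team that swaps the two terms only needs to relabel the enum members; the gate logic stays the same.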

The chapters Source and Artifact Management, Documentation, and Release Strategies discuss various topics of continuous delivery. We discuss how we verify our changes and detect errors quickly in the chapter Monitoring and Observability.

Shift Left


"Shifting left is about paying attention to the low fuel warning light on an automobile versus running out of gas on the highway."
- Unknown, Misquoted

The term originates from a visual representation of a typical development cycle. We map the development steps in chronological order from left to right - left being the conception of a product, right being the release of the final product. The mindset of "shifting left" encourages us to detect and resolve problems in the earliest (leftmost) possible stage.

The cost needed to fix an error grows exponentially the further right we detect the error. With every step we involve more people and generate additional communication overhead. An engineer can often fix a regression discovered by a failing test by themselves in minutes while they're working on the implementation. If a customer discovers the same regression in production code, that customer reaches out to our support staff. The support engineer verifies the problem with the customer and relays the issue upstream. The people in charge of our product's documentation update the "Known Issues" category for the latest build.

The development team investigates the cause of the regression - the magnitude of the task depending on release frequency and source code management - and assigns time to fix it. Equipped with customer logs and our support's case report, an engineer tries to debug the issue, but might have to reach out to the customer for clarification. After integrating a fix, we verify the new (old) behavior and restart our release process to ship a patch.

The above example describes a scenario with two measurable costs: fixing a regression when a test catches it, and fixing it after a customer reports it. Every software leader may calculate the offset between investing in a CI/CD shift-left environment and the cost of not doing so.
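A back-of-the-envelope version of that calculation might look like the sketch below. Every figure in it is a hypothetical placeholder chosen for illustration, not data from a real project; the point is the shape of the comparison, and the numbers should be replaced with a team's own estimates.

```python
import math

# All figures are hypothetical placeholders; substitute your own estimates.
HOURLY_RATE = 100.0  # blended cost per person-hour

# Person-hours to resolve one regression caught by a failing test.
HOURS_EARLY = 0.5

# Person-hours to resolve one regression reported by a customer.
HOURS_LATE = (
    2.0    # support verifies the problem and relays it upstream
    + 1.0  # documentation updates the "Known Issues" category
    + 8.0  # development investigates and debugs the regression
    + 4.0  # fix is verified and a patch release is shipped
)


def break_even_regressions(investment: float) -> int:
    """Regressions that must be caught early before the investment pays off."""
    saving_per_regression = (HOURS_LATE - HOURS_EARLY) * HOURLY_RATE
    if saving_per_regression <= 0:
        raise ValueError("catching regressions early must be cheaper than late")
    return math.ceil(investment / saving_per_regression)


# Assumed one-time cost of building the CI/CD environment.
print(break_even_regressions(40_000.0))  # prints 28
```

With these assumed inputs, each regression caught early saves 14.5 person-hours, so the investment is offset after 28 regressions. The sketch ignores softer costs such as customer trust, which would move the break-even point further in favor of shifting left.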