Ataraxia through Epoché: [CPPCON 2021][notes] Debugging/Developing techniques

Reference:
https://www.youtube.com/watch?v=M7fV-eQwxrY

Define bugs

System is subject to a set of requirements.
A software defect is a non-conformity to requirements.
For example, a pre/curr/post condition is violated.
A non-conformity is a failure to meet one or more requirements.
A defect is incorrect program data that causes a non-conformity.
A symptom is observable evidence of a defect.
A deterministic defect is a defect that does not change its symptoms under a well-defined set of conditions.
In contrast, a non-deterministic defect is a defect that changes its symptoms from run-to-run under a well-defined set of conditions.

Terminology

A context is the totality of the environment in which a program that exhibits symptoms is running
A problem report describes one or more symptoms in some context
An analogous context is a replica of the original context
The lab is a setting in which we have total control over the context
The field is a setting in which we have minimal or no control over the context

Relationship

Problem report -> Symptoms <-> Defects

Challenges

Problem report can be unhelpful (feedback from the user)
Problem report may not indicate actual problem
Collecting program state data may be difficult (log/setting/dump)
Symptoms may not indicate the cause
Defects and symptoms change as repair progresses
Fixing one defect may introduce new defects (messy design/quick fix)
Symptoms can be difficult to reproduce

Debugging process

We tend to think of debugging as a linear process, i.e.

Characterize and reproduce
Locate
Classify
Understand
Repair

In reality

Review problem report
Characterize and reproduce problem
Clone if possible
Reproduce problem (loop)

Understand problem
Locate problem
Classify problem
Gain insight
Attempt to repair

Problem fixed; deliver

In detail

Characterizing

Determining the context in which symptoms were observed
Version number, platform, resources allocated, external interfaces, configuration data, etc.
Information that allows you to instantiate an analogous context
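Part of this context can be recorded by the program itself. The following is a minimal sketch, assuming a hypothetical BuildContext struct and a version string supplied by the build system; the field names and output format are illustrative and not from the talk.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record of the build-time part of the context.
struct BuildContext {
    std::string version;  // application version, supplied by the caller
    long cpp_standard;    // value of the __cplusplus macro
    bool assertions_on;   // true when NDEBUG is not defined
};

// Collects what the compiler can tell us about this build.
inline BuildContext capture_build_context(const std::string& app_version) {
    assert(!app_version.empty());  // precondition: a version must be given
    BuildContext ctx;
    ctx.version = app_version;
    ctx.cpp_standard = __cplusplus;
#ifdef NDEBUG
    ctx.assertions_on = false;
#else
    ctx.assertions_on = true;
#endif
    return ctx;
}

// Formats the context as one line suitable for a log or a problem report.
inline std::string describe(const BuildContext& ctx) {
    std::ostringstream out;
    out << "version=" << ctx.version
        << " cplusplus=" << ctx.cpp_standard
        << " assertions=" << (ctx.assertions_on ? "on" : "off");
    return out.str();
}
```

Writing describe(capture_build_context("1.2.3")) to the log at startup would put the version and build mode into every problem report. Runtime details such as configuration data and external interfaces would have to be added separately.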

Reproducing

Instantiating an analogous context, in the lab, or in the field
Running enough of the program/system to observe the reported symptoms
Developing new/updating existing test assets to demonstrate the failure
Make sure you are looking at the correct source code.
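A test asset that demonstrates a reported failure can be as small as a table of inputs and expected results. The sketch below assumes a hypothetical count_words function and a hypothetical problem report about words separated by several spaces; neither comes from the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical function under test: counts space-separated words.
std::size_t count_words(const std::string& line) {
    std::size_t count = 0;
    bool in_word = false;
    for (char c : line) {
        if (c == ' ') {
            in_word = false;
        } else if (!in_word) {
            in_word = true;  // first character of a new word
            ++count;
        }
    }
    return count;
}

struct TestCase {
    std::string input;
    std::size_t expected;
};

// Runs every case and returns the number of failures.
std::size_t run_word_count_tests() {
    const std::vector<TestCase> cases = {
        {"", 0},
        {"one", 1},
        {"two words", 2},
        {"two   words", 2},  // input from the (hypothetical) problem report
        {"  leading", 1},
    };
    std::size_t failures = 0;
    for (const TestCase& tc : cases) {
        if (count_words(tc.input) != tc.expected) {
            ++failures;
        }
    }
    return failures;
}
```

The case taken from the problem report sits next to cases that already passed, so the same table shows both the correct and the incorrect behavior, and it stays in the suite as a regression test after the repair.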

Characterizing and reproducing a problem is vital to the debugging process.

Understanding

Gaining ENOUGH knowledge about a problem and the surrounding code that you believe you can make the changes needed to carry out a repair.

At a minimum

Locate the incorrect lines of code
Understand why the code is incorrect (the root cause)
Check the proposed classification
Formulate a set of proposed changes
Determine how the proposed changes could affect the runtime state

Inspect and verify the associated test assets

The test cases or harnesses may be broken
Test data should demonstrate correct and incorrect behavior

The defect may not be where you expect it

Keep an open mind and be ready to question all parts of the program

Ask yourself where the defect is not

Trying to prove the absence of a defect can reveal the defect

Explain to people why there is a defect, and why your proposed fix will resolve the defect

A local guru or bobblehead could be helpful - reach out for help if necessary

Locate the problem

Employ good development practices at the outset

Practice iterative, incremental, bottom-up development
Add functionality in small sections of code
Create test assets for each new increment of functionality
Verify that new code doesn't cause previous test cases to fail
Verify that new code passes its own test cases
Practice defensive programming

Alas

Well-written and extensive test assets
Preferably the whole product does this; at a minimum, your fixes should
Adds runtime overhead, which can hinder the search for non-deterministic problems

Use trace logging

Generating output describing the program state during execution
In simpler cases, instrument code with print statements
In more complex systems, take advantage of existing logging facilities
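For the simple case, a small tracing helper can stand in for scattered print statements. This is a minimal sketch; the Tracer class, the TRACE macro, and the scaled_sum function are hypothetical names chosen for illustration, and a real system would more likely use its existing logging facility.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical tracer that writes one line per trace point to a stream.
class Tracer {
public:
    explicit Tracer(std::ostream& sink) : sink_(sink) {}

    void trace(const char* file, int line, const std::string& message) {
        assert(file != nullptr);
        sink_ << file << ':' << line << ": " << message << '\n';
    }

private:
    std::ostream& sink_;
};

// Builds the message with stream syntax and records the call site.
#define TRACE(tracer, expr)                                         \
    do {                                                            \
        std::ostringstream trace_msg_;                              \
        trace_msg_ << expr;                                         \
        (tracer).trace(__FILE__, __LINE__, trace_msg_.str());       \
    } while (false)

// Hypothetical function instrumented with trace points.
int scaled_sum(Tracer& tracer, int a, int b, int scale) {
    TRACE(tracer, "enter a=" << a << " b=" << b << " scale=" << scale);
    const int sum = a + b;
    TRACE(tracer, "sum=" << sum);
    const int result = sum * scale;
    TRACE(tracer, "result=" << result);
    return result;
}
```

Passing std::cerr as the sink gives the print-statement behavior, while passing a std::ostringstream lets a test inspect the recorded program state.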

Alas

Great way to stay 'on the path' when developing new code
An easy first step in narrowing down a problem's scope

Use debugging and analysis tools

Compiler warnings
Static code analysis tools (cppcheck, etc.)
Interactive debuggers (gdb, lldb, udb, etc.)
Time-travel debuggers (gdb, rr, udb, etc.)
Sanitizers (asan, tsan, ubsan, etc.)
Dynamic program analyzers (valgrind, etc.)
Tracers (strace, wireshark, etc.)

Alas

Well suited to deterministic problems
Not always useful for non-deterministic problems

Enable and/or add assertions

Verify the pre/curr/post conditions of a function call
Verify the expected program state
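As a sketch of what such checks can look like with the standard assert macro, the hypothetical index_of_max function below verifies a condition before, during, and after its computation. The function itself is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical function: returns the index of the largest element.
std::size_t index_of_max(const std::vector<int>& values) {
    // Precondition: there must be at least one element.
    assert(!values.empty());

    std::size_t best = 0;
    for (std::size_t i = 1; i < values.size(); ++i) {
        if (values[i] > values[best]) {
            best = i;
        }
        // Condition during the computation: the candidate has been visited.
        assert(best <= i);
    }

    // Postcondition: no element is larger than the selected one.
    for (std::size_t i = 0; i < values.size(); ++i) {
        assert(values[i] <= values[best]);
    }
    return best;
}
```

The checks are compiled out when NDEBUG is defined, so they only fire in builds where assertions are enabled.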

Alas

Assertions usually have little effect on execution speed

Use backtracking

Try to understand the program state at each backward step

Alas

Good for very simple programs or small search spaces with deterministic problems

Divide and conquer (binary search)

Pick section of code to examine
Place an assertion or set a breakpoint.
Repeat until the search reveals the defect
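The search itself can be sketched as an ordinary binary search over the steps of a run. The example below assumes a hypothetical predicate state_ok_after that stands in for the assertion or breakpoint check, and it assumes that once the program state goes bad it stays bad.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the first step in [0, step_count) after which the state is bad,
// or step_count if the state is good after every step.
// Assumption: once the state is bad, it is bad after all later steps too.
std::size_t first_bad_step(
    std::size_t step_count,
    const std::function<bool(std::size_t)>& state_ok_after) {
    std::size_t lo = 0;           // the first bad step is never before lo
    std::size_t hi = step_count;  // the first bad step is never after hi
    while (lo < hi) {
        assert(hi <= step_count);
        const std::size_t mid = lo + (hi - lo) / 2;
        if (state_ok_after(mid)) {
            lo = mid + 1;  // still good here, so the defect shows up later
        } else {
            hi = mid;      // already bad here, so look at mid or earlier
        }
    }
    return lo;
}
```

Each check halves the remaining section of code, so a run of 1000 steps needs about 10 checks instead of 1000.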

Problem simplification

Gradually and strategically remove/comment out sections of irrelevant code
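The same remove-and-retest loop can be illustrated on input data rather than source code, which is a swapped-in but closely related technique. This is a minimal sketch, assuming a hypothetical still_fails predicate that reruns the program and reports whether the symptoms persist.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Greedily drops one element at a time while the failure still reproduces.
// 'still_fails' is a hypothetical check supplied by the caller.
std::vector<int> simplify(
    std::vector<int> input,
    const std::function<bool(const std::vector<int>&)>& still_fails) {
    // Precondition: the original input must reproduce the problem.
    assert(still_fails(input));

    std::size_t i = 0;
    while (i < input.size()) {
        std::vector<int> candidate = input;
        candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(i));
        if (still_fails(candidate)) {
            input = std::move(candidate);  // element was irrelevant, drop it
        } else {
            ++i;  // element is needed to reproduce the problem, keep it
        }
    }
    return input;
}
```

When the loop ends, removing any single remaining element makes the symptoms disappear, which narrows down where to look.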

Alas

Useful for debugging crashes of release builds
Work backwards from the end of the section

Make the problem worse

Magnifies the problem signal.

Alas

Helpful as a first step in finding and understanding the problem

Scientific method

Form a hypothesis consistent with observations
Implement tests to refute the hypothesis
If refuted, form a new hypothesis with new tests

Alas

Time consuming, especially for a code base you are unfamiliar with
Effective for all kinds of problems

Problem Types

Deterministic problems

Review the logs
Add assertions
Use interactive debugger

Non-deterministic problems

Review the logs
Create a debug build and see if it also exhibits the same symptoms
Add assertions where needed to verify invariants
Add assertions; comment out code, divide-and-conquer
Make the problem WORSE to magnify the problem
Try low-overhead debugging tools, e.g. an optimized build that keeps debug symbols: "$ gcc -g -O2"

Steps

Classifying

Determining a defect's category
Useful in formulating a repair strategy
Important information in subsequent reviews when considering preventive actions

Syntax errors
Syntax warnings
Implementation errors
Logic errors
Configuration errors

Repairing the problem

Implement the appropriate fix.
Passing the tests
Tests should be well written
Minimize changes to the system - keep changes small and localized
Verify repairs against test assets

All new/updated tests should pass
All other tests should pass

Delivery

Practice good version control
Don't include fixes for more than one problem in one commit
Don't include extraneous changes (e.g. new features) in fix commits
Include new/updated test assets in the fix commits
Write clear and concise commit messages

Verify tests again

Double check all new/updated tests pass
Double check all other tests pass

Create documentation for posterity

How the defect was noticed
The conditions under which the defect occurred - the context
Steps necessary to reproduce the defect - the analogous context
Techniques and tools used to localize the defect
Defect's category
Underlying root cause of the defect
Latent defects precluded by fixing this defect
Possible latent defects left unaddressed
Mistakes made and recommendations for preventive actions

Developing new features

Practice defensive programming

Assume the worst case could happen at any time.

Employ an appropriate iterative and incremental development process

Decide what needs to be achieved
Formulate a plan for the achievement
Understand the invariants, requirements, and context, then design the solution.
Implement the solution in small, discrete, testable chunks
Write code to verify invariants and pre/curr/post conditions, and to self-test complex components
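A component can carry its own invariant check and call it after every state change. The sketch below assumes a hypothetical BoundedStack class; it also shows the defensive choice of reporting failure instead of assuming the caller never overflows or underflows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stack with a fixed capacity and a self-checked invariant.
class BoundedStack {
public:
    explicit BoundedStack(std::size_t capacity) : capacity_(capacity) {
        assert(invariant_holds());
    }

    // Returns false instead of growing past the capacity.
    bool push(int value) {
        if (items_.size() == capacity_) {
            return false;
        }
        items_.push_back(value);
        assert(invariant_holds());
        return true;
    }

    // Returns false instead of reading from an empty stack.
    bool pop(int& out) {
        if (items_.empty()) {
            return false;
        }
        out = items_.back();
        items_.pop_back();
        assert(invariant_holds());
        return true;
    }

    std::size_t size() const { return items_.size(); }

    // Invariant: the stack never holds more items than its capacity.
    bool invariant_holds() const { return items_.size() <= capacity_; }

private:
    std::size_t capacity_;
    std::vector<int> items_;
};
```

Because invariant_holds is public, test assets can call it directly as well, so the same check serves both the running program and its tests.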

Consider employing the principles of test-driven design
Employ good configuration management practice EVERYWHERE.

May 7, 2022
