Inside the C++ Object Model
A must-read for understanding how compilers generate code and why they do it that way. I have the Traditional Chinese (Taiwan, ROC) edition, translated by JJHou (侯傑), who also fixed lots of typos from the English edition, which makes it more comfortable to read.
Effective C++ in an Embedded Environment
Major omissions:
- Templates (hence the STL)
- Exceptions
- Runtime type identification (RTTI)
- Multiple and virtual inheritance
- Namespaces
Have faith:
- C++ was designed to be competitive in performance with C.
- Generally speaking, you don't pay for what you don't use.
Size penalties:
- Vptr makes each object larger
- Alignment restrictions could force padding
- Reordering data members often eliminates the problem
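As a rough sketch of the padding point, the two structs below hold the same members in a different order. The struct names are hypothetical, and exact sizes are implementation-defined; on many 32- and 64-bit targets the first is 12 bytes and the second 8, but the only thing checked here is that reordering does not make things worse.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical structs for illustration only.
struct Padded {
    std::uint8_t  flag;   // 1 byte, usually followed by padding
    std::uint32_t count;  // usually wants 4-byte alignment
    std::uint8_t  mode;   // 1 byte, usually followed by tail padding
};

struct Reordered {
    std::uint32_t count;  // most strictly aligned member first
    std::uint8_t  flag;   // the two small members can now sit
    std::uint8_t  mode;   // next to each other
};

// On mainstream ABIs the reordered layout is never larger.
void demoPadding() {
    assert(sizeof(Reordered) <= sizeof(Padded));
}
```

Printing sizeof for both structs on the actual target is the reliable way to see whether the reordering pays off there.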
Speed penalties:
- Call through vtbl slower than direct call:
- But usually only by a few instructions
- Inlining usually impossible:
- This is often inherent in a virtual call
But compared to C alternatives:
- Faster and smaller than if/then/else or switch-based techniques
- Guaranteed to be right
Null pointers never get an offset. At runtime, a pointer nullness test must be performed
before applying an offset.
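A minimal sketch of where that null test comes from, assuming a hypothetical class with two base classes: converting a derived pointer to its second base normally adjusts the address, but a null pointer has to stay null.

```cpp
#include <cassert>

// Hypothetical hierarchy for illustration only.
struct Left  { int l = 1; virtual ~Left()  = default; };
struct Right { int r = 2; virtual ~Right() = default; };
struct Both : Left, Right { };

// The implicit derived-to-base conversion may add the offset of the
// Right subobject, so the compiler first has to check for null.
Right* toRight(Both* p) { return p; }

void demoNullOffset() {
    Both b;
    assert(toRight(&b) == static_cast<Right*>(&b)); // adjusted pointer
    assert(toRight(nullptr) == nullptr);            // null stays null
}
```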
“No-Cost” C++ Features:
- All the C stuff: structs, pointers, free functions, etc.
- Classes
- Namespaces
- Static functions and data
- Nonvirtual member functions
- Function and operator overloading
- Default parameters:
- Note that they are always passed. Poor design can thus be costly:
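void doThat(const std::string& name = "Unnamed");     // Bad
const std::string defaultName = "Unnamed";
void doThat(const std::string& name = defaultName);   // Better
(annotation) The "Bad" version constructs a temporary std::string from the literal on every call that relies on the default; the "Better" version binds the reference to an object that already exists.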
- Constructors and destructors:
- They contain code for mandatory initialization and finalization.
- However, they may yield chains of calls up the hierarchy.
- Single inheritance
- Virtual functions
- Abstract classes with no virtual function implementations (i.e.,“Interfaces”) may still generate vtbls
- Some compilers offer ways to prevent this.
- Virtual functions that are never called are still linked in.
- Virtual inheritance
Further reading:
C++ Q&A: ATL Virtual Functions and vtables
Does using __declspec(novtable) on abstract base classes affect RTTI in any way?
Is there a g++ equivalent to Visual Studio's __declspec(novtable)?
--------------
- new and delete:
- By default, new = malloc + constructor(s) and
- delete = destructor(s) + free
- Note that error-handling behavior via exceptions is built in.
- Important: new is useful even in systems where all memory is statically allocated.
- Placement new allows objects to be constructed at particular locations:
- E.g., in statically allocated memory.
- E.g., at memory-mapped addresses.
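The statically allocated case can be sketched roughly as follows. The Sensor class and the helper names are hypothetical; the memory-mapped case would use a platform-specific fixed address instead of the buffer and is not shown.

```cpp
#include <cassert>
#include <new>      // placement new

// Hypothetical class for illustration only.
struct Sensor {
    int id;
    explicit Sensor(int i) : id(i) {}
};

// Statically allocated, suitably aligned raw storage: no heap involved.
alignas(Sensor) static unsigned char sensorStorage[sizeof(Sensor)];

Sensor* makeSensor(int id) {
    return new (sensorStorage) Sensor(id);  // construct in place
}

void destroySensor(Sensor* s) {
    s->~Sensor();   // explicit destructor call; nothing is freed
}

void demoPlacementNew() {
    Sensor* s = makeSensor(7);
    assert(s->id == 7);
    assert(static_cast<void*>(s) == static_cast<void*>(sensorStorage));
    destroySensor(s);
}
```

Objects built this way are never passed to delete; the destructor is called explicitly and the storage simply stays where it is.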
“Low-Cost” C++ Features:
- Exceptions: a small speed and/or size penalty (code)
- When evaluating the cost of exceptions, be sure to do a fair comparison.
- Error handling costs you something, no matter how it is implemented.
- E.g., Saks reports object code increases of 15-40% for error handling based on return values.
Exceptions and Dynamically Allocated Memory:
- Some compilers try to use heap memory for exception objects.
- This can be unacceptable in some embedded systems.
- Implementations reserve some non-heap memory for exception objects.
- They have to be able to propagate std::bad_alloc exceptions!
- Platforms with no heap should still be able to use exceptions.
More features you may pay for, even if you don't use them:
- Multiple inheritance: a small size penalty (vtbls that store Δs)
- dynamic_cast and other RTTI features: a small size penalty (vtbls)
- Each use of dynamic_cast may be linear in the number of base classes (direct and indirect) of the object being cast.
- Each use may involve a call to strcmp for each class in the hierarchy.
- Quality of implementation (QoI) varies. The Technical Report on C++ Performance provides details.
- Temporary objects
- Templates
Why are simple “hello world” programs in C++ so big compared to C?
- iostream vs. stdio
- C++ isn't C, and C programmers aren't C++ programmers
- C++ from good C++ developers as good as C from good C developers
Efficiency Beyond C:
- C++ feature implementation often better than C approximations
- Abstraction + encapsulation ⇒ flexibility to improve implementations
- std::strings often outperform char*-based strings
- May use reference counting
- May employ “the small string optimization”
- STL-proven techniques have revolutionized library design
- Shift work from runtime to compile-time
- Template metaprogramming (TMP), e.g., “traits”
- Inlined operator()s
- Sample success story: C++’s sort is faster than C’s qsort.
- C++ designed to be competitive with C in size and speed
- Compiler-generated data structures generally better than hand-coded C equivalents
- You generally don’t pay for what you don’t use
- C++ is successfully used in many embedded systems
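To make the sort versus qsort item above concrete, here is a small sketch of both styles. The function names are made up for illustration, and no timing claim is built in; the point is only where the comparison lives and why the compiler can usually inline it in the C++ version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// C style: every comparison goes through a function pointer,
// which usually prevents inlining.
int compareInts(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

void sortWithQsort(std::vector<int>& v) {
    if (!v.empty())
        std::qsort(v.data(), v.size(), sizeof(int), compareInts);
}

// C++ style: the comparison is a function object whose operator()
// is visible inside the instantiated std::sort and can be inlined.
struct Less {
    bool operator()(int a, int b) const { return a < b; }
};

void sortWithStdSort(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), Less());
}

void demoSort() {
    std::vector<int> a{3, 1, 2};
    std::vector<int> b = a;
    sortWithQsort(a);
    sortWithStdSort(b);
    assert(a == b);   // both approaches produce the same order
}
```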
The Pros and Cons of Inlining:
Advantages of inlining:
- Function call overhead is eliminated:
- For very small functions, overall code size may shrink!
- Essential for decent performance in layered systems
- Allows modular source code with branch-free object code.
- Function calls in source code yield straight-line object code.
- Often allows for better object code optimization by compilers
Disadvantages of inlining:
- Debuggers can’t cope
- Overall system code size typically increases
- This can decrease cache hit rate or increase paging.
- Constrains binary compatibility for upgrade releases.
- Some “small” functions may result in a lot of code being generated
- Overhead to support EH may be significant
- Constructors may set vptrs, call base class constructors, etc
- In constructor:
- If on heap, call operator new
- Call Base::Base
- Make vptr point to Derived vtbl
- Call constructor for z ((annotation) only if z has a constructor; otherwise z is zero-initialized when the object is global and holds junk when the object is on the stack)
- Compilers rarely inline virtual function calls:
- Inlining occurs at build-time, but virtuals are resolved at runtime
- Optimizations are sometimes possible:
- Virtuals invoked on objects (not pointers or references).
- Explicitly qualified calls (e.g., ClassName::virtualFunctionName()). ((annotation) A call made with a qualified name suppresses virtual dispatch.)
- Compilers often ignore inline for “complex” functions, e.g., those containing loops
- Compilers must ignore inline when they need a pointer to the function, e.g., constructors and destructors for arrays of objects
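The two cases where a virtual call can be resolved at build time can be sketched like this. The Shape and Triangle classes are hypothetical, and whether a given compiler actually inlines these calls is up to that compiler.

```cpp
#include <cassert>

// Hypothetical classes for illustration only.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const { return 0; }
};
struct Triangle : Shape {
    int sides() const override { return 3; }
};

// Through a reference: resolved at runtime via the vtbl.
int viaReference(const Shape& s) { return s.sides(); }

// On an object: the dynamic type is known, so a direct
// (and possibly inlined) call is allowed.
int viaObject() { Triangle t; return t.sides(); }

// Qualified name: the virtual mechanism is bypassed entirely.
int viaQualifiedName(const Triangle& t) { return t.Shape::sides(); }

void demoVirtualCalls() {
    Triangle t;
    assert(viaReference(t) == 3);
    assert(viaObject() == 3);
    assert(viaQualifiedName(t) == 0);  // Shape's version, not Triangle's
}
```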
Minimize inlining if binary upgradeability is important
Code Bloat in C++:
- Support for exceptions.
- Support for generalized customizable iostreams.
- I.e., streams of other than char or wchar_t.
- Disable exceptions during compilation.
- Practical only if you know that no code (including libraries, plug-ins, etc.) throws.
- Use stdio instead of iostreams.
Most problems with “template code bloat” arise from:
- Misunderstandings of template rules.
- Improper use of templates.
Avoiding Code Duplication:
Templates and inlines can reduce source code duplication, but can lead to object code duplication.
Avoiding code bloat with templates fundamentally calls for disciplined commonality and variability analysis:
- The parts of a template that don’t depend on the template parameters (the common parts) should be moved out of the template.
- The remaining parts (the variable parts) should stay in the template.
- Features common to multiple classes should be moved out of the classes.
- Maybe to a base class.
- Maybe to a class template.
- Features common to multiple functions should be moved out of the functions:
- Maybe to a new function.
- Maybe to a function template.
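One way to picture the commonality and variability split is a fixed-capacity buffer whose logic does not depend on the capacity. This is only a sketch with hypothetical names: the size-independent code lives in a non-template base class and is compiled once, while the template keeps just the storage.

```cpp
#include <cassert>
#include <cstddef>

// Common part: independent of N, so it is compiled only once.
class BufferBase {
public:
    std::size_t size() const { return size_; }
    bool push(int value) {
        if (size_ == capacity_) return false;  // full
        data_[size_++] = value;
        return true;
    }
protected:
    BufferBase(int* data, std::size_t capacity)
        : data_(data), capacity_(capacity), size_(0) {}
private:
    int* data_;
    std::size_t capacity_;
    std::size_t size_;
};

// Variable part: only the storage depends on N.
template<std::size_t N>
class Buffer : public BufferBase {
public:
    Buffer() : BufferBase(storage_, N) {}
private:
    int storage_[N];
};

void demoBuffer() {
    Buffer<2> b;
    assert(b.push(1));
    assert(b.push(2));
    assert(!b.push(3));   // capacity reached
    assert(b.size() == 2);
}
```

Buffer<2> and Buffer<64> then share one copy of push and size instead of getting one each.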
Data Bloat:
- Some classes have a vtbl, so unnecessary classes ⇒ unnecessary vtbls.
- Such unnecessary classes could come from templates.
- Functions must behave properly when exceptions are thrown, so unnecessary non-inline functions ⇒ unnecessary EH tables.
- Such unnecessary functions could come from templates.
- This applies only to the Table Approach to EH.
- Contain only inline functions.
- Hence no extra EH tables.
- Contain no virtual functions.
- Hence no extra vtbls.
Runtime Polymorphism:
- The “normal” meaning of interface-based programming.
- In much OO literature, the only meaning.
- Unnecessarily restrictive for C++.
- The most flexible.
- Can take advantage of information known only at runtime.
- The most expensive.
- Based on vptrs, vtbls, non-inline function calls.
Link-Time Polymorphism:
- Useful when information known during linking, but not during compilation.
- No need for virtual functions.
- Typically disallows inlining.
- Most inlining is done during compilation.
Link-Time Polymorphism Example:
Approach:
- One class definition for both drivers.
- Different component-dependent implementations.
- Implementations selected during linking.
- This is “C” polymorphism.
- Deployment platform unknown at compilation, known during linking.
- No need for flexibility or expense of runtime polymorphism.
- No vtbls.
- No indirection through vtbls.
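The approach above might look roughly like the following. It is squeezed into one listing for brevity, with comments marking what would be separate files in a real build; the file names, the board name, and the member functions are all assumptions made for illustration.

```cpp
#include <cassert>

// --- driver.h: one class definition shared by every platform ---
class Driver {
public:
    void reset();
    int  status() const;
private:
    int state_ = -1;
};

// --- driver_boardA.cpp: linked in only for (hypothetical) board A.
// A driver_boardB.cpp would define the same members differently,
// and the build would link exactly one of the two object files.
void Driver::reset()        { state_ = 0; }
int  Driver::status() const { return state_; }

// --- client code: compiled once, unaware of which file gets linked ---
int resetAndReport(Driver& d) {
    d.reset();            // ordinary non-virtual call, no vtbl
    return d.status();
}

void demoLinkTime() {
    Driver d;
    assert(resetAndReport(d) == 0);
}
```

Selecting the implementation is then a matter of which object file the linker is handed, not of anything in the source.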
Compile-Time Polymorphism:
- Useful when:
- Implementation determinable during compilation.
- Want to write mostly implementation-independent code.
- No need for virtual functions.
- Allows inlining.
- Based on implicit interfaces.
- Other forms of polymorphism based on explicit interfaces.
Goal:
- Device class to use determined by platform’s #bits/pointer.
- This is known during compilation.
- Create 2 or more classes with “compatible” interfaces.
- I.e., support the same implicit interface.
- E.g., must offer a reset function callable with 0 arguments.
- Use compile-time information to determine which class to use.
- Define a typedef for this class.
- Program in terms of the typedef.
Compile-time polymorphism is reasonable when:
- Device type can be determined during compilation.
- No need for flexibility or expense of runtime polymorphism.
- No need to configure linker behavior or give up inlining.
#include <climits>                       // CHAR_BIT

template<int PtrBitsVs32> struct DriverChoice;
template<> struct DriverChoice<-1> {     // When bits/ptr < 32
    typedef SASDevice type;
};
template<> struct DriverChoice<0> {      // When bits/ptr == 32
    typedef NASDevice type;
};
template<> struct DriverChoice<1> {      // When bits/ptr > 32
    typedef BASDevice type;
};
struct Driver {
    enum { bitsPerVoidPtr = CHAR_BIT * sizeof(void*) };
    enum { ptrBitsVs32 = bitsPerVoidPtr > 32 ? 1 :
                         bitsPerVoidPtr == 32 ? 0 : -1 };
    typedef DriverChoice<ptrBitsVs32>::type type;
};
This can’t be done with the preprocessor, because you can’t use sizeof in a preprocessor expression.
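Here is a sketch of what "program in terms of the typedef" could look like in practice. The three device classes are given made-up bodies (a width member set by reset) purely so the example is complete; only their shared implicit interface, a reset function callable with no arguments, comes from the notes.

```cpp
#include <cassert>
#include <climits>

// Hypothetical device classes sharing one implicit interface.
struct SASDevice { int width = 0; void reset() { width = 16; } };
struct NASDevice { int width = 0; void reset() { width = 32; } };
struct BASDevice { int width = 0; void reset() { width = 64; } };

template<int PtrBitsVs32> struct DriverChoice;
template<> struct DriverChoice<-1> { typedef SASDevice type; };
template<> struct DriverChoice<0>  { typedef NASDevice type; };
template<> struct DriverChoice<1>  { typedef BASDevice type; };

struct Driver {
    enum { bitsPerVoidPtr = CHAR_BIT * sizeof(void*) };
    enum { ptrBitsVs32 = bitsPerVoidPtr > 32 ? 1 :
                         bitsPerVoidPtr == 32 ? 0 : -1 };
    typedef DriverChoice<ptrBitsVs32>::type type;
};

// Client code names only the typedef, never a concrete device class.
int resetAndGetWidth() {
    Driver::type device;
    device.reset();       // direct, inlinable call; no vtbl involved
    return device.width;
}

void demoCompileTime() {
    const int w = resetAndGetWidth();
    assert(w == 16 || w == 32 || w == 64);
}
```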
Fully Static Allocation:
No heap. Objects are either:
- On the stack: Local to a function.
- Of static storage duration:
- At global scope.
- At namespace scope.
- static at file, function, or class scope.
- Speed: essentially infinite; deterministic.
- External Fragmentation: impossible.
- Memory leaks: impossible.
- Memory exhaustion: impossible.
“The heap” has two common meanings:
- Dynamic allocation outside the runtime stack.
- Irregular dynamic allocation outside the runtime stack.
- Unpredictable numbers of objects.
- Unpredictable object sizes.
- Unpredictable object lifetimes.
Why no size_t to ::operator delete?
Arithmetic types, enumeration types, pointer types, and pointer to member types are POD.
A cv-qualified version of a POD type is itself a POD type.
An array of POD is itself POD.
A struct or union, all of whose non-static data members are POD, is itself POD if it has no:
- Base classes
- Virtual functions
- Protected or private non-static data members
- Non-static data members of reference type
- User-defined constructors, destructor, or copy assignment operator
- Non-static data members of non-POD types
- But note that non-virtual member functions are allowed.
- Static data and static member functions are allowed, too.
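The rules above can be spot-checked with type traits, with one caveat: the list follows the older (C++98/03) POD definition, while the C++11 traits used here (trivial plus standard-layout) are only an approximation of it and disagree in some cases, e.g. around base classes and private members. The struct names and the isPodLike helper are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

struct PlainPoint {
    int x, y;
    int sum() const { return x + y; }  // non-virtual member function: allowed
    static int count;                  // static data: allowed
};
int PlainPoint::count = 0;

struct WithVirtual { int x; virtual void f() {} };  // virtual function
struct WithDtor    { int x; ~WithDtor() {} };       // user-defined destructor

// Hypothetical helper: the C++11 notion closest to "POD".
template<typename T>
constexpr bool isPodLike() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

void demoPod() {
    assert(isPodLike<int>());               // arithmetic type
    assert(isPodLike<PlainPoint>());
    assert(isPodLike<const PlainPoint>());  // cv-qualified POD
    assert(isPodLike<PlainPoint[4]>());     // array of POD
    assert(!isPodLike<WithVirtual>());
    assert(!isPodLike<WithDtor>());
}
```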
C++ and ROM
- Program instructions can always be ROMed.
- Data in a C++ program can be ROMed if it meets two criteria:
- Its value is known before runtime.
- I.e., either the compiler or the linker knows it or can compute it.
- It can’t be modified at runtime.
For non-integral scalar constants, consts are safer than #defines and may be more efficient:
Floating point values can rarely be turned into immediate operands:
#define pi 3.14159           // ROMable, but subject to macro drawbacks
const double pi = 3.14159;   // ROMable, but not subject to macro drawbacks
- They’re ROMable in both forms above.
- With a bad compiler, the macro form might result in multiple copies of pi in an object file.
- This shouldn’t happen with the const.
- It should never yield more than one copy in an object file.
Objects may be ROMed if the following are true:
- They are declared const at their point of definition.
- They contain no mutable data members.
- They are initialized with values known during compilation.
- Such “knowledge” might come from dataflow analysis, etc.
struct Point { int x, y; };
const Point origin = { 0, 0 };   // origin is ROMable

// All Widgets can be bitwise initialized from a ROMed Widget
// initialized with { 7, "xyzzy" }.
struct Widget {
    int a;
    const char *p;
    Widget(): a(7), p("xyzzy") { }
};
const Widget w;                  // w is ROMable (even though it’s a non-POD
                                 // requiring dynamic initialization)
C++ in embedded systems: Myth and reality
Some compiler generated data structures can usually be ROMed:
- Virtual function tables
- RTTI tables and type_info objects
- Tables supporting exception handling
What’s not ROMable? Objects that may be modified at runtime:
- Objects with nontrivial constructors or destructors.
- Objects with mutable members
- Objects not defined to be const.
int x = 14;                  // x isn’t const, hence not ROMable
std::string s = "xyzzy";     // s isn’t const, hence not ROMable
- Of course, 14 and "xyzzy" can still be ROMed.
That's it for now. Damn, embedded C++ is hard~~~ and there is still so much left to read. Exhausting and fun at the same time XD...
Reference Sites:
The Embedded C++ Web Site
Embedded C++: An overview
Program In Embedded C++ For Smaller And Faster Code