Mar 28, 2012

[C++][NOTE] into the embedded world.

A MUST read before delving into the embedded C++ world:
Inside the C++ Object Model

It explains how the compiler generates code and why it does so. I have the Traditional Chinese (Taiwan, ROC) edition, translated by JJHou (侯傑), who also fixed lots of typos from the English edition, which makes it more comfortable to read.

Below are excerpts from Scott Meyers' training course:
Effective C++ in an Embedded Environment



Major omissions of the Embedded C++ (EC++) subset, relative to full C++:
  •  Templates (hence the STL)
  •  Exceptions
  •  Runtime type identification (RTTI)
  •  Multiple and virtual inheritance
  •  Namespaces

Have faith:
  • C++ was designed to be competitive in performance with C.
  • Generally speaking, you don't pay for what you don't use.

Size penalties:
  • Vptr makes each object larger
  • Alignment restrictions could force padding
    • Reordering data members often eliminates the problem
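
A minimal sketch of both size penalties, using hypothetical types that are not from the course material. Actual sizes are implementation-defined, so the code only compares them rather than assuming specific numbers:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types, for illustration only.
struct Plain       { int x; };
struct WithVirtual { int x; virtual ~WithVirtual() {} };  // object carries a vptr

// Same three members, two orderings.
struct Padded    { char a; int b; char c; };  // padding is likely after a and after c
struct Reordered { int b; char a; char c; };  // the two chars can share one padded tail

struct Sizes { std::size_t plain, withVirtual, padded, reordered; };

Sizes measure() {
    Sizes s = { sizeof(Plain), sizeof(WithVirtual), sizeof(Padded), sizeof(Reordered) };
    return s;
}

void demo() {
    const Sizes s = measure();
    assert(s.withVirtual > s.plain);   // the vptr costs space on typical implementations
    assert(s.reordered <= s.padded);   // reordering never makes this layout larger
}
```

On a common 32- or 64-bit ABI, Padded is often 12 bytes and Reordered 8, but that is an observation about typical compilers, not a guarantee.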


Speed penalties:
  • Call through vtbl slower than direct call:
    • But usually only by a few instructions
  • Inlining usually impossible:
    • This is often inherent in a virtual call

But compared to C alternatives:
  • Faster and smaller than if/then/else or switch-based techniques
  • Guaranteed to be right
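
To make the comparison concrete, here is a small sketch with made-up shape types (assumptions for illustration, not from the course). The C-style version keeps a type code and a switch that the programmer must maintain by hand; the C++ version lets the compiler build and maintain the dispatch table:

```cpp
#include <cassert>

// C-style: a type code plus a switch. Every switch in the program
// has to be updated by hand when a new shape is added.
enum ShapeKind { kSquare, kRect };
struct CShape { ShapeKind kind; int w, h; };

int areaC(const CShape& s) {
    switch (s.kind) {
        case kSquare: return s.w * s.w;
        case kRect:   return s.w * s.h;
    }
    return 0;  // not reached for valid kinds
}

// C++: the compiler generates the vtbl and the dispatch code.
struct Shape {
    virtual ~Shape() {}
    virtual int area() const = 0;
};
struct Square : Shape {
    int w;
    explicit Square(int w_) : w(w_) {}
    int area() const { return w * w; }
};
struct Rect : Shape {
    int w, h;
    Rect(int w_, int h_) : w(w_), h(h_) {}
    int area() const { return w * h; }
};

int areaCpp(const Shape& s) { return s.area(); }

void demo() {
    const CShape c = { kRect, 2, 3 };
    const Rect r(2, 3);
    assert(areaC(c) == 6);
    assert(areaCpp(r) == 6);
}
```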

Null pointers never get an offset. When a pointer conversion needs an offset (e.g., derived-to-base under multiple inheritance), a nullness test must be performed at runtime before applying the offset.
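
A small sketch of where that test shows up, assuming a hypothetical class with two non-empty bases. Converting Derived* to the second base normally adjusts the address, yet a null pointer has to stay null:

```cpp
#include <cassert>

// Hypothetical hierarchy, for illustration only.
struct Base1 { int a; };
struct Base2 { int b; };
struct Derived : Base1, Base2 { int c; };

// The implicit conversion behaves roughly like:
//   d ? reinterpret_cast<Base2*>(reinterpret_cast<char*>(d) + offset) : 0
Base2* toBase2(Derived* d) {
    return d;
}

void demo() {
    Derived* none = 0;
    assert(toBase2(none) == 0);          // null stays null: no offset applied

    Derived d;
    Base2* b2 = toBase2(&d);
    assert(b2 != 0);
    assert(static_cast<Derived*>(b2) == &d);  // the adjustment is reversible
}
```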

“No-Cost” C++ Features:

  • All the C stuff: structs, pointers, free functions, etc.
  • Classes
  • Namespaces
  • Static functions and data
  • Nonvirtual member functions
  • Function and operator overloading
  • Default parameters:
    • Note that they are always passed. Poor design can thus be costly:
      void doThat(const std::string& name = "Unnamed"); // Bad
      const std::string defaultName = "Unnamed";
      void doThat(const std::string& name = defaultName); // Better
      (Annotation: the "Bad" form constructs a temporary std::string on every call that omits the argument.)
    • Overloading is typically a cheaper alternative.
  • Constructors and destructors:
    •   They contain code for mandatory initialization and finalization.
    •   However, they may yield chains of calls up the hierarchy. 
  • Single inheritance
  • Virtual functions
    • Abstract classes with no virtual function implementations (i.e., “Interfaces”) may still generate vtbls
      • Some compilers offer ways to prevent this.
    • Virtual functions that are never called are still linked in.
  • Virtual inheritance
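
The overloading alternative mentioned under default parameters could look like the following sketch. The function name greet is made up for the example; defaultName follows the snippet in the notes:

```cpp
#include <cassert>
#include <string>

// One shared default object, constructed once at startup.
const std::string defaultName("Unnamed");

std::string greet(const std::string& name) {
    return "Hello, " + name;
}

// Overload for the no-argument case: callers pass nothing at all,
// and no temporary std::string is built at the call site.
std::string greet() {
    return greet(defaultName);
}

void demo() {
    assert(greet() == "Hello, Unnamed");
    assert(greet("Ann") == "Hello, Ann");
}
```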
------------
Further reading:
C++ Q&A: ATL Virtual Functions and vtables
Does using __declspec(novtable) on abstract base classes affect RTTI in any way?
Is there a g++ equivalent to Visual Studio's __declspec(novtable)?

--------------

  • new and delete:
    • By default, new = malloc + constructor(s) and
      delete = destructor(s) + free
    • Note that error-handling behavior via exceptions is built in.
  • Important: new is useful even in systems where all memory is statically allocated.
  • Placement new allows objects to be constructed at particular locations:
    • E.g., in statically allocated memory.
    • E.g., at memory-mapped addresses.
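
As a rough illustration of placement new into statically allocated memory, here is a sketch with a hypothetical Sensor class. It assumes a C++11 compiler for alignas; older toolchains would need a compiler-specific alignment attribute instead:

```cpp
#include <cassert>
#include <new>   // placement new

// Hypothetical class, for illustration only.
struct Sensor {
    int id;
    explicit Sensor(int i) : id(i) {}
    ~Sensor() {}
};

// Raw, suitably aligned storage with static storage duration: no heap involved.
alignas(Sensor) unsigned char sensorStorage[sizeof(Sensor)];

Sensor* makeSensor(int id) {
    return new (sensorStorage) Sensor(id);  // runs the constructor in place
}

void destroySensor(Sensor* s) {
    s->~Sensor();  // explicit destructor call; nothing to free
}

void demo() {
    Sensor* s = makeSensor(7);
    assert(s->id == 7);
    assert(static_cast<void*>(s) == static_cast<void*>(sensorStorage));
    destroySensor(s);
}
```

For a memory-mapped device the same syntax applies, with the hardware address in place of sensorStorage; that address is platform-specific and is deliberately not shown here.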


“Low-Cost” C++ Features:
  • Exceptions: a small speed and/or size penalty (code)
    • When evaluating the cost of exceptions, be sure to do a fair
      comparison.
    • Error handling costs you something, no matter how it is implemented.
      • E.g., Saks reports object code increases of 15-40% for error handling based on return values.

Exceptions and Dynamically Allocated Memory:
  • Some compilers try to use heap memory for exception objects.
    • This can be unacceptable in some embedded systems.
  • Implementations reserve some non-heap memory for exception objects.
    • They have to be able to propagate std::bad_alloc exceptions!
    • Platforms with no heap should still be able to use exceptions.
One platform that uses heap memory for exceptions (when it can) is g++.

More features you may pay for, even if you don't use them:
  • Multiple inheritance: a small size penalty (vtbls that store Δs)
  • dynamic_cast and other RTTI features: a small size penalty (vtbls)
    • Each use of dynamic_cast may be linear in the number of base classes (direct and indirect) of the object being cast.
    • Each use may involve a call to strcmp for each class in the hierarchy.
    • Quality of implementation (QOI) varies. The Technical Report on C++ Performance provides details.
C++ Features that can Surprise Inexperienced C++ Programmers ( bozos, used by STL :-) ):
  • Temporary objects
  • Templates
Common Questions:
Why are simple “hello world” programs in C++ so big compared to C?
  • iostream vs. stdio
Why do C developers moving to C++ often find their code is big and slow?
  • C++ isn't C, and C programmers aren't C++ programmers
  • C++ from good C++ developers as good as C from good C developers

Efficiency Beyond C:
  • C++ feature implementation often better than C approximations
  • Abstraction + encapsulation ⇒ flexibility to improve implementations
    • std::strings often outperform char*-based strings
      • May use reference counting
      • May employ “the small string optimization”
  • STL-proven techniques have revolutionized library design
    • Shift work from runtime to compile-time
    • Template metaprogramming (TMP), e.g., “traits”
    • Inlined operator()s
    • Sample success story: C++’s sort is faster than C’s qsort.
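
The sort versus qsort point can be sketched as follows. The helper names are made up; the code only shows why the compiler has more to work with in the C++ version (a typed functor whose operator() can be inlined) and does not measure anything:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// qsort: elements arrive as void*, and the comparison goes through
// a function pointer, which is hard for the compiler to inline.
static int compareInts(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

// std::sort: the comparison is a type, so operator() can be inlined
// into the instantiated sorting code.
struct Less {
    bool operator()(int a, int b) const { return a < b; }
};

void sortC(int* data, std::size_t n)   { std::qsort(data, n, sizeof(int), compareInts); }
void sortCpp(int* data, std::size_t n) { std::sort(data, data + n, Less()); }

void demo() {
    int a[] = { 3, 1, 2 };
    int b[] = { 3, 1, 2 };
    sortC(a, 3);
    sortCpp(b, 3);
    assert(std::equal(a, a + 3, b));  // same result either way
}
```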
C++ Implementation Summary:
  • C++ designed to be competitive with C in size and speed
  • Compiler-generated data structures generally better than hand-coded C equivalents
  • You generally don’t pay for what you don’t use

  • C++ is successfully used in many embedded systems

The Pros and Cons of Inlining:

Advantages of inlining:
  • Function call overhead is eliminated:
    • For very small functions, overall code size may shrink!
    • Essential for decent performance in layered systems
  • Allows modular source code with branch-free object code.
    • Function calls in source code yield straight-line object code.
  • Often allows for better object code optimization by compilers
Disadvantages:
  • Debuggers can’t cope
  • Overall system code size typically increases
    • This can decrease cache hit rate or increase paging.
  • Constrains binary compatibility for upgrade releases.
  • Some “small” functions may result in a lot of code being generated
    • Overhead to support EH may be significant
    • Constructors may set vptrs, call base class constructors, etc 
  • In a constructor of a class Derived (derived from Base, with a data member z):
    • If on heap, call operator new
    • Call Base::Base
    • Make vptr point to Derived vtbl
    • Call constructor for z (annotation: only if z has a constructor; otherwise a global z is zero-initialized and a stack z holds junk)
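
The hidden work in a "small" constructor can be made visible with a trace. This is a sketch with hypothetical classes named after the bullets above; the trace string exists only for the demonstration:

```cpp
#include <cassert>
#include <string>

static std::string trace;  // records the order of constructor calls

struct Z    { Z()    { trace += "Z "; } };
struct Base { Base() { trace += "Base "; } virtual ~Base() {} };

struct Derived : Base {
    Z z;
    // The body is one statement, but before it runs the compiler has
    // inserted a call to Base::Base, the vptr setup, and a call to Z::Z.
    Derived() { trace += "Derived"; }
};

void demo() {
    trace.clear();
    Derived d;
    assert(trace == "Base Z Derived");
}
```

Inlining Derived::Derived therefore copies all of that generated code to every place a Derived is created.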
inline is only a request — compilers are free to ignore it:
  • Compilers rarely inline virtual function calls:
    • Inlining occurs at build-time, but virtuals are resolved at runtime
    • Optimizations are sometimes possible:
      • Virtuals invoked on objects (not pointers or references).
      • Explicitly qualified calls (e.g., ClassName::virtualFunctionName()). (Annotation: a qualified call suppresses virtual dispatch.)
  • Compilers often ignore inline for “complex” functions, e.g., those containing loops
  • Compilers must ignore inline when they need a pointer to the function, e.g., constructors and destructors for arrays of objects
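
A short sketch of the two cases where a virtual call can be bound statically (and so becomes a candidate for inlining). The classes are hypothetical:

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};
struct Derived : Base {
    int id() const { return 2; }
};

// Ordinary virtual call: resolved at runtime through the vtbl.
int viaReference(const Base& b) { return b.id(); }

// Explicitly qualified call: bound to Base::id at compile time,
// whatever the dynamic type of b is.
int viaQualified(const Base& b) { return b.Base::id(); }

// Call on an object (not a pointer or reference): the dynamic type is
// known, so the compiler may bind the call statically.
int viaObject() {
    Derived d;
    return d.id();
}

void demo() {
    Derived d;
    assert(viaReference(d) == 2);
    assert(viaQualified(d) == 1);
    assert(viaObject() == 2);
}
```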
Link-Time Inlining:
Inlining summary:
  • Minimize inlining if binary upgradeability is important.


Code Bloat in C++:


  • Support for exceptions.
  • Support for generalized customizable iostreams.
    • I.e., streams of other than char or wchar_t.
Possible workarounds:


  • Disable exceptions during compilation.
    • Practical only if you know that no code (including libraries, plug-ins, etc.) throws.
  • Use stdio instead of iostreams.
The feature most associated with bloat is templates.

Most problems with “template code bloat” arise from:
  • Misunderstandings of template rules.
  • Improper use of templates.
Avoiding Code Duplication:
Avoiding code bloat with templates fundamentally calls for disciplined commonality and variability analysis:

  • The parts of a template that don’t depend on the template parameters (the common parts) should be moved out of the template.
  •   The remaining parts (the variable parts) should stay in the template.
This kind of analysis is critical to avoiding code duplication in any guise:
  • Features common to multiple classes should be moved out of the classes.
    • Maybe to a base class.
    • Maybe to a class template.
  • Features common to multiple functions should be moved out of the functions:
    • Maybe to a new function.
    • Maybe to a function template.
Need to distinguish here between source code duplication and object code duplication.
Templates and inlines can reduce source code duplication, but can lead to object code duplication.
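
One way to apply the commonality/variability split is to hoist the parameter-independent bookkeeping into a non-template base class. The sketch below uses a made-up fixed-capacity stack; the names StackBase and FixedStack are assumptions for the example:

```cpp
#include <cassert>
#include <cstddef>

// Common part: does not depend on T or N, so its code exists once.
class StackBase {
public:
    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }
protected:
    explicit StackBase(std::size_t capacity) : capacity_(capacity), size_(0) {}
    bool reserveSlot() {
        if (size_ == capacity_) return false;
        ++size_;
        return true;
    }
    void releaseSlot() { --size_; }
private:
    std::size_t capacity_;
    std::size_t size_;
};

// Variable part: only what really depends on T and N stays in the template.
template <typename T, std::size_t N>
class FixedStack : public StackBase {
public:
    FixedStack() : StackBase(N) {}
    bool push(const T& v) {
        if (!reserveSlot()) return false;
        data_[size() - 1] = v;
        return true;
    }
    bool pop(T& out) {
        if (empty()) return false;
        out = data_[size() - 1];
        releaseSlot();
        return true;
    }
private:
    T data_[N];
};

void demo() {
    FixedStack<int, 2> s;
    assert(s.push(1) && s.push(2));
    assert(!s.push(3));          // full
    int v = 0;
    assert(s.pop(v) && v == 2);
}
```

Each FixedStack<T, N> instantiation still gets its own push and pop, but the bookkeeping in StackBase is shared by all of them.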

Data Bloat:
  • Some classes have a vtbl, so unnecessary classes ⇒ unnecessary vtbls.
    • Such unnecessary classes could come from templates.
  • Functions must behave properly when exceptions are thrown, so unnecessary non-inline functions ⇒ unnecessary EH tables.
    • Such unnecessary functions could come from templates.
    • This applies only to the Table Approach to EH.
An important exception to these issues is class templates that:
  • Contain only inline functions.
    • Hence no extra EH tables.
  • Contain no virtual functions.
    • Hence no extra vtbls.
Runtime Polymorphism:
  •  The “normal” meaning of interface-based programming.
    • In much OO literature, the only meaning.
      • Unnecessarily restrictive for C++.
  • The most flexible.
    • Can take advantage of information known only at runtime.
  • The most expensive.
    • Based on vptrs, vtbls, non-inline function calls.
Link-Time Polymorphism
  • Useful when information known during linking, but not during compilation.
  • No need for virtual functions.
  • Typically disallows inlining.
    • Most inlining is done during compilation.
Link-Time Polymorphism Example:
  Approach:
  • One class definition for both drivers.
  • Different component-dependent implementations.
  • Implementations selected during linking.
    • This is “C” polymorphism.
 Link-time polymorphism is reasonable here:
  •  Deployment platform unknown at compilation, known during linking.
    • No need for flexibility or expense of runtime polymorphism.
      • No vtbls.
      • No indirection through vtbls.
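
A single-file sketch of the idea, with hypothetical file names shown as comments. In a real project each implementation would live in its own .cpp file and the build would link exactly one of them; here the second one is commented out so the example stays self-contained:

```cpp
#include <cassert>

// ---- driver.h: one class definition shared by every build ----
class Driver {
public:
    int reset();   // nonvirtual: no vtbl, no indirection
};

// ---- driver_a.cpp (hypothetical): linked only for component A ----
int Driver::reset() { return 1; }   // stands in for A's reset sequence

// ---- driver_b.cpp (hypothetical): linked only for component B ----
// int Driver::reset() { return 2; }   // stands in for B's reset sequence

// ---- client code: identical no matter which file was linked ----
int resetDevice(Driver& d) { return d.reset(); }

void demo() {
    Driver d;
    assert(resetDevice(d) == 1);   // this build "linked" implementation A
}
```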

Compile-Time Polymorphism:
  • Useful when:
    • Implementation determinable during compilation.
    • Want to write mostly implementation-independent code.
  • No need for virtual functions.
  • Allows inlining.
  • Based on implicit interfaces.
    • Other forms of polymorphism based on explicit interfaces.
TIPS:
Goal:
  • Device class to use determined by platform’s #bits/pointer.
    • This is known during compilation.
Approach:
  • Create 2 or more classes with “compatible” interfaces.
    • I.e., support the same implicit interface.
      • E.g., must offer a reset function callable with 0 arguments.
  • Use compile-time information to determine which class to use.
  • Define a typedef for this class.
  • Program in terms of the typedef.
Compile-time polymorphism is reasonable when:
  • Device type can be determined during compilation.
    • No need for flexibility or expense of runtime polymorphism.
    • No need to configure linker behavior or give up inlining.
Compile-Time Polymorphism Example:


#include <climits>   // CHAR_BIT

// SASDevice, NASDevice and BASDevice are device classes defined elsewhere.
template<int PtrBitsVs32> struct DriverChoice;
template<> struct DriverChoice<-1> {    // When bits/ptr < 32
    typedef SASDevice type;
};
template<> struct DriverChoice<0> {     // When bits/ptr == 32
    typedef NASDevice type;
};
template<> struct DriverChoice<1> {     // When bits/ptr > 32
    typedef BASDevice type;
};
struct Driver {
    enum { bitsPerVoidPtr = CHAR_BIT * sizeof(void*) };
    enum { ptrBitsVs32 = bitsPerVoidPtr > 32 ? 1 :
                         bitsPerVoidPtr == 32 ? 0 :
                         -1
    };
    typedef DriverChoice<ptrBitsVs32>::type type;
};

This can’t be done with the preprocessor, because sizeof can’t be used in a preprocessor expression.
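
For a version that compiles on its own, the device classes need definitions. The sketch below adds stub devices (their reset return values are invented so the selection is observable) and a client function that programs only against the typedef:

```cpp
#include <cassert>
#include <climits>

// Stub device classes, hypothetical. They share an implicit interface:
// a reset function callable with 0 arguments.
struct SASDevice { int reset() { return 16; } };
struct NASDevice { int reset() { return 32; } };
struct BASDevice { int reset() { return 64; } };

template<int PtrBitsVs32> struct DriverChoice;
template<> struct DriverChoice<-1> { typedef SASDevice type; };  // bits/ptr < 32
template<> struct DriverChoice<0>  { typedef NASDevice type; };  // bits/ptr == 32
template<> struct DriverChoice<1>  { typedef BASDevice type; };  // bits/ptr > 32

struct Driver {
    enum { bitsPerVoidPtr = CHAR_BIT * sizeof(void*) };
    enum { ptrBitsVs32 = bitsPerVoidPtr > 32 ? 1 :
                         bitsPerVoidPtr == 32 ? 0 : -1 };
    typedef DriverChoice<ptrBitsVs32>::type type;
};

// Client code never names a concrete device class.
int resetDevice() {
    Driver::type device;
    return device.reset();   // nonvirtual, so it can be inlined
}

void demo() {
    const unsigned bits = CHAR_BIT * sizeof(void*);
    const int expected = bits > 32 ? 64 : bits == 32 ? 32 : 16;
    assert(resetDevice() == expected);
}
```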


Fully Static Allocation:
No heap. Objects are either:
  • On the stack: Local to a function.
  • Of static storage duration:
    • At global scope.
    • At namespace scope.
    • static at file, function, or class scope.
“Allocation” occurs at build time. Hence:
  • Speed: essentially infinite; deterministic.
  • External Fragmentation: impossible.
  • Memory leaks: impossible.
  • Memory exhaustion: impossible.
“Heap Allocation”:
Two common meanings:
  • Dynamic allocation outside the runtime stack.
  • Irregular dynamic allocation outside the runtime stack.
    • Unpredictable numbers of objects.
    • Unpredictable object sizes.
    • Unpredictable object lifetimes.

Why no size_t to ::operator delete?


Arithmetic types, enumeration types, pointer types, and pointer to member types are POD.
A cv-qualified version of a POD type is itself a POD type.
An array of POD is itself POD. A struct or union, all of whose non-static data members are POD, is itself POD if it has no:

  • Base classes
  • Virtual functions
  • Protected or private non-static data members
  • Non-static data members of reference type
  • User-defined constructors, destructor, or copy assignment operator
  • Non-static data members of non-POD types
Essentially, a C++ class or struct is a POD type if it “looks and acts like C.”
  • But note that non-virtual member functions are allowed.
  • Static data and static member functions are allowed, too.
Part of “acting like C” is being memcpyable, which explains why POD types can’t have private or protected data members.
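
A small sketch of the "memcpyable" property, using a hypothetical Packet struct. Note the assumption: the type traits used here are the C++11 ones (trivial plus standard-layout), which are somewhat more permissive than the C++03 POD rules listed above:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical POD-style struct.
struct Packet {
    int  id;
    char tag[4];
    int checksum() const { return id + tag[0]; }  // nonvirtual member function: allowed
    static int count;                             // static data: allowed
};
int Packet::count = 0;

// C++11 check that Packet still "looks and acts like C".
static_assert(std::is_trivial<Packet>::value &&
              std::is_standard_layout<Packet>::value,
              "Packet is expected to be POD-like");

// Copying the raw bytes yields a valid, equal object.
Packet copyBytes(const Packet& src) {
    Packet dst;
    std::memcpy(&dst, &src, sizeof(Packet));
    return dst;
}

void demo() {
    const Packet p = { 42, { 'a', 'b', 'c', 'd' } };
    const Packet q = copyBytes(p);
    assert(q.id == 42);
    assert(q.tag[3] == 'd');
}
```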



C++ and ROM
  • Program instructions can always be ROMed.
  • Data in a C++ program can be ROMed if it meets two criteria:
    •  Its value is known before runtime.
      • I.e., either the compiler or the linker knows it or can compute it.
    • It can’t be modified at runtime.

For non-integral scalar constants, consts are safer than #defines and may be more efficient:

Floating point values can rarely be turned into immediate operands:

#define pi 3.14159          // ROMable, but subject to macro drawbacks
const double pi = 3.14159;  // ROMable, but not subject to macro drawbacks
  • They’re ROMable in both forms above.
  • With a bad compiler, the macro form might result in multiple copies of pi in an object file.
    • This shouldn’t happen with the const.
      • It should never yield more than one copy in an object file.

Objects may be ROMed if the following are true:
  • They are declared const at their point of definition.
  • They contain no mutable data members.
  • They are initialized with values known during compilation.
    • Such “knowledge” might come from dataflow analysis, etc.
    struct Point {
        int x, y;
    };
    const Point origin = { 0, 0 };   // origin is ROMable

    // All Widgets can be bitwise initialized from a ROMed Widget
    // initialized with { 7, "xyzzy" }.
    struct Widget {
        int a;
        const char *p;
        Widget(): a(7), p("xyzzy") { }
    };
    const Widget w;                  // w is ROMable (even though it’s a non-POD
                                     // requiring dynamic initialization)
    
    
    C++ in embedded systems: Myth and reality

    Some compiler generated data structures can usually be ROMed:
    • Virtual function tables
    • RTTI tables and type_info objects
    • Tables supporting exception handling
    ROMing these objects may be impossible if they are dynamically linked from shared libraries.

    What’s not ROMable? Objects that may be modified at runtime:

    • Objects with nontrivial constructors or destructors.
    • Objects with mutable members
    • Objects not defined to be const.
    int x = 14; // x isn’t const, hence not ROMable
    std::string s = "xyzzy"; // s isn’t const, hence not ROMable
      •  Of course, 14 and "xyzzy" can still be ROMed.


    That's all for now. Damn, embedded C++ is hard~~~ and there is still so much left to read. Exhausting/fun XD...

    Reference Sites:
    The Embedded C++ Web Site
    Embedded C++: An overview
    Program In Embedded C++ For Smaller And Faster Code
