System Modelling - visualize the structure of a system (its abstractions and the relations among them) to aid design and implementation.
The Unified Modeling Language, known as UML for short, is a visual modeling language that lets people design and document a software system.
A class model is a diagram that visually represents a group of classes, and the relationships between them.
The class model is a key component of how we describe design patterns, since understanding the classes involved and their relationships will be key to understanding how to apply the patterns.
Basic class structure and example UMLs for Vec class:
This is the basic structure for a class. The general form of a class has three boxes.
- The first box contains the class name.
  - The class name must be unique among all of the classes in that particular scope; otherwise, it is qualified with a scope operator, e.g. Package-name::Class-name
  - The class name must be capitalized, centered or left-justified, and in bold. Note that, in general, class names are not pluralized unless the class is intended to be a container for multiple objects of the type
  - The class name may be qualified by an optional stereotype keyword, centered in the regular typeface, placed above the class name, and within guillemets "≪≫" (special Unicode single characters 226A and 226B) and/or an optional stereotype icon (in the upper-right corner)
- The second box is optional, and contains the class attributes.
- The third box is also optional, and contains the class operations.
The class name may never be omitted.
A comment can be added to the class diagram as a rectangle with a bent upper-right corner, attached to the item being annotated by a dashed line. The comment is a text string and has no effect on the model, though it may describe a constraint on the annotated item.
For example:
Attributes are left-justified and written in the regular typeface. They are generally shown when needed, though a full description must be provided at least once. They describe the information held by an instance of the class, and may be replaced by association ends.
The general syntax for an attribute is:
≪stereotype≫ visibility / name : type multiplicity initial-value {property}
Where every element except the name is optional. The name generally starts with a lower-case letter.
Visibility : Describes the visibility of the attribute as a punctuation mark, though it can instead be described in the property string.
+ public
- private
# protected
~ package
type : A string that describes the attribute's type. Usually written to be as language-independent as possible (e.g., Boolean rather than bool, Integer instead of int).
/ : indicates that this attribute is derived, i.e., its value can be computed from other information in the model
multiplicity : Describes the multiplicity on the attribute (i.e., how many of these values will be held). Enclosed in square brackets.
initial-value : Specifies the default initial value as an equals sign followed by the value, e.g. = 0
property : List of comma-separated strings surrounded by {}. Defaults to {changeable}, but can be specified as {readOnly} to indicate that it's a constant.
If the attribute is static, the name and type strings are underlined.
Some examples of attribute declarations:
- colours : Saturation [3]
# points : Point [2..*] {ordered, set}
size : Area = (100, 100)
+ name : String [0..1]
+ name : String {readOnly}
Operations are left-justified and written in the regular typeface. They are generally shown when needed, though a full description must be provided at least once.
The general syntax for an operation is :
≪stereotype≫ visibility name ( parameter-list ) : return-type multiplicity initial-value property
Every argument except the name and parameter list is optional.
Visibility : same as for the attribute
return-type : A string containing a comma-separated list of names that describes the operation's return type.
parameter-list : Comma-separated list of parameters, enclosed in parentheses. Each parameter is of the form direction name : type multiplicity = default-value
- direction : specifies direction of information flow and is optional
  - in : input parameter, passed by value (default)
  - out : output parameter with no input value; final value is available to caller
  - inout : input parameter that may be modified and whose final value is available to caller
  - return : return value of a call (equivalent to out but available for use inline)
property : List of comma-separated strings surrounded by {}
An abstract operation is italicized. This applies to C++ pure virtual methods.
If the operation is static, the name and type strings are underlined.
Some examples of operation declarations:
- display () : Location
+ hide ()
≪constructor≫ + create ()
- attachXWindow( xwin : XWindow* )
# Matrix::transform (in distance: Vector, in angle: Real = 0) : Matrix
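As a rough illustration of how such declarations might map onto C++, here is a minimal sketch. The class `Widget` and all of its members are hypothetical and are not part of the notes; each member carries a comment with the UML declaration it is meant to correspond to.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical class: each member is annotated with the UML it sketches.
class Widget {
    std::array<int, 3> colours{};            // - colours : Integer [3]
    static int count;                        // - count : Integer   (underlined: static)
  protected:
    int size = 100;                          // # size : Integer = 100
  public:
    const std::string name;                  // + name : String {readOnly}
    explicit Widget(const std::string &name) : name{name} { ++count; }
    int getColour(int i) const { return colours.at(i); }  // + getColour(in i : Integer) : Integer
    int getSize() const { return size; }     // + getSize() : Integer
    static int getCount() { return count; }  // + getCount() : Integer (underlined: static)
};

int Widget::count = 0;

// Small self-check exercising the members above.
void widgetDemo() {
    Widget w{"button"};
    assert(w.name == "button");
    assert(w.getSize() == 100);
    assert(w.getColour(0) == 0);
    assert(Widget::getCount() >= 1);
}
```

Calling `widgetDemo()` from `main` runs the checks. The mapping of `{readOnly}` to a `const` member is one possible choice, not the only one.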
If a class A has a dependency upon another class B, it receives an instance of B as a parameter, or returns an instance of B, or has some sort of other temporary relationship. We draw the line between classes A and B as a dashed line.
Static attributes are underlined in the UML class model. Abstract elements are italicized; where italics are not practical, you might need to tag them with the {abstract} property.
Visibility
- private
+ public
There are four main types of class relationships: association, aggregation, composition, and generalization.
Example of Vec class and Basis class:
class Vec {
    int x, y;
  public:
    Vec(int x, int y) : x{x}, y{y} {}
};
class Basis { // the default ctor for Vec doesn't exist,
              // must specify how to construct v1 and v2 in the default ctor for Basis
    Vec v1, v2;
};
Basis b; // Error: can't initialize v1, v2
We make the following changes:
class Basis {
    Vec v1, v2;
  public:
    Basis() : v1{0, 1}, v2{1, 0} {} // specifies how to construct v1 and v2
};
Basis b; // OK now!
When a "part" is joined to a "whole", it may not be shared with any other object. The whole is also responsible for destroying all of its component parts when it is destroyed. It may also be responsible for creating its components. Whether the components exist independently beforehand or are created by the owner is to be specified in the design.
Embedding an object within another (e.g., v1 and v2 within Basis objects) is called composition.
Relationship: a Basis object "owns-a" Vec object (actually it owns 2, v1 and v2)
If A "owns-a" B, then typically:
- B has no identity outside of A (no independent existence)
- If A is destroyed, then B is also destroyed
- If A is copied, then B is also copied (perform a deep copy)
Example: A body owns two kidneys; a kidney is a part of a body. If you destroy the body, you destroy the kidneys. If you copy a person, you also copy their kidneys.
Implementation of composition is usually a composition of classes.
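The "owns-a" properties listed above can be checked with a minimal sketch. It reuses the Vec and Basis classes from the notes; the accessors `getX`, `getY`, `setX`, and `first` are hypothetical additions made only so the effect of copying can be observed.

```cpp
#include <cassert>

class Vec {
    int x, y;
  public:
    Vec(int x, int y) : x{x}, y{y} {}
    int getX() const { return x; }   // hypothetical accessor
    int getY() const { return y; }   // hypothetical accessor
    void setX(int nx) { x = nx; }    // hypothetical mutator
};

class Basis {
    Vec v1, v2;                      // embedded by value: Basis owns both Vecs
  public:
    Basis() : v1{0, 1}, v2{1, 0} {}
    Vec &first() { return v1; }      // hypothetical accessor
};

void compositionDemo() {
    Basis a;
    Basis b = a;                     // compiler-generated copy ctor copies v1 and v2
    b.first().setX(42);              // change b's part only
    assert(a.first().getX() == 0);   // a's Vec is unaffected: the parts were copied
    assert(b.first().getX() == 42);
}   // a and b are destroyed here, and their Vecs with them
```

Because the Vec objects are stored by value, the compiler-generated copy constructor and destructor already give the copy and destroy behaviour described for composition.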
Modelling:
The owner has an implicit multiplicity of 1, so we don't bother to specify it. Note that the "owner" end of the composition is marked with a solid, black diamond.
Another example: a Point to a Polygon or a Circle. An individual Point object may be part of an object of either type, but not part of both simultaneously.
Relationship: Aggregation
Compare the situation of kidneys in a body to car parts in a catalogue. The catalogue contains parts, but the parts have independent existence.
This is a "has-a" relationship ("aggregation")
Aggregation is a form of association that describes a "whole-part" relationship where one class, the aggregate or whole, is made up of the constituent part class. The end of the aggregation is marked with a hollow, white diamond. It is possible that the aggregation is shared i.e. a part is shared between one or more aggregates. These parts may also exist independently of the aggregate. An aggregate may or may not be responsible for destroying the parts.
If A "has-a" B, then typically:
- B exists apart from its association with A (B exists independently outside of A)
- If A is destroyed, B lives on
- If A is copied, B is not (perform a shallow copy). Copies of A share the same B
Modelling:
Implementation is usually a non-owning pointer
class Catalogue {
    parts *p[size]; // an array of (non-owning) pointers
    ...
};
Another Example: a student may belong to zero or more clubs and exists independently of the clubs. A club needs at least four students to be a part of it.
This is usually implemented via pointers or reference fields.
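A minimal sketch of the "has-a" behaviour follows. It swaps the raw array from the notes for a `std::vector` of non-owning pointers to keep the example short; `Part`, `add`, `size`, and `at` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Part {
    std::string name;
};

class Catalogue {
    std::vector<const Part *> parts;   // non-owning pointers to parts
  public:
    void add(const Part &p) { parts.push_back(&p); }
    std::size_t size() const { return parts.size(); }
    const Part *at(std::size_t i) const { return parts.at(i); }
    // No destructor: the catalogue never deletes the parts it refers to.
};

void aggregationDemo() {
    Part wheel{"wheel"};
    {
        Catalogue c1;
        c1.add(wheel);
        Catalogue c2 = c1;               // shallow copy: only the pointer is copied
        assert(c1.at(0) == c2.at(0));    // both catalogues share the same Part
        assert(c2.size() == 1);
    }                                    // both catalogues are destroyed here
    assert(wheel.name == "wheel");       // the part lives on
}
```

The important design point is that the Catalogue must not outlive-check or delete its parts; whoever created the parts remains responsible for them.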
The specialization association between the parent/superclass and the child/subclass is indicated by putting a triangular arrowhead on the association end that joins the parent. By definition, there is no multiplicity or navigation arrowhead, though constraints may be added.
There may be separate lines from the parent class to each child, or the lines may be drawn as a tree structure.
In classic UML notation, the name of an abstract base class is italicized, and we italicize only the pure virtual methods. In this course, the name of the abstract base class is italicized, and all virtual methods are italicized.
This is often described as an "Is-A" relationship, where A is a B. It is implemented through inheritance.
Relationship: Specialization (inheritance)
Suppose you want to track your collections of books:
class Book {
    string title, author;
    int length;
  public:
    Book(...) : ... {}
};
Suppose there are special types of books. For textbooks, we want to know the topic:
class Text {
    string title, author;
    int length;
    string topic;
  public:
    Text(...) : ... {}
};
Maybe there are also comic books, for which we want to know the name of the hero:
class Comic {
    string title, author;
    int length;
    string hero;
  public:
    Comic(...) : ... {}
};
This is OK, but it has limitations, and it doesn't really capture the relationship among Books, Texts, and Comics.
Ideally, we would like to keep the books all in a single array (or other collection type) so that we could iterate/traverse the entire collection in one loop, without having to worry about the underlying types.
There are three possible techniques:
- C union
- C void pointers
- C++ inheritance
We could
- use a union. A union is defined to contain multiple data fields of potentially different sizes. However, only one data field will be available at any given moment. An instance of the union is allocated the number of bytes required by the largest data field in the union.
  union BookTypes { Book *b; Text *t; Comic *c; };
  BookTypes myBooks[20];
- use an array of void pointers (void*), which can point at anything.
Neither of these solutions is good: they subvert the type system.
Rather: Observe that Texts and Comics are kinds of Books, but with extra features. To model this in C++, we use what's called inheritance.
// version 1
class Book {
    string title, author;
    int length;
  public:
    ...
}; // called a Base class (superclass)
class Text : public Book { // a Text is a public Book, this is called a Derived class (subclass)
    string topic;
  public:
    Text(...) : ... {}
};
class Comic : public Book { // also a derived class from Book
    string hero;
  public:
    Comic(...) : ... {}
};
A derived class inherits the fields and methods of its base class. So in this case, Comic and Text get title, author, and length fields. Additionally (quite important), any method that can be called on a Book can be called on a Comic or Text.
Note that we do NOT repeat the data fields that we inherit from the base class! Doing so hides the parent's information, and is almost always an error.
The corresponding UML diagram now looks like this:
Question: Who can access the fields of Book? They are private, so outsiders can't see them, but they are also part of Comics and Texts. So, can Comics and Texts see (access) them?
Answer: NO! Not even subclasses can see private fields. Only the methods of Book itself can access the fields of Book.
How do we initialize Text objects? We need a title, author, and length, which are the big book components, and a topic.
class Text : public Book {
    string topic;
  public:
    Text(string title, string author, int length, string topic) : title{title}, author{author},
        length{length}, topic{topic} {}
    // DOESN'T WORK! Text methods can't access title, author, and length; only Book can.
};
First reason: the data fields author, title, and length are private to the Book class by default and thus not directly accessible to a Text or Comic object.
There is also a second reason: when an object is constructed, there are 4 steps:
1. Space is allocated.
2. The superclass component is constructed (new step: invoke the superclass constructor to build the superclass portion of the object).
3. Fields are constructed.
4. The constructor body runs.
And step 2 doesn't work because Book doesn't have a default constructor.
We must specify how to construct our base class components, so to fix:
class Text : public Book { // specify how to construct the base class in the ctor
    string topic;
  public:
    Text(string title, string author, int length, string topic) :
        Book{title, author, length}, topic{topic} {}
    //  |-------> step 2 <-------|  |-> step 3 <-|
    // replace the initializations of author, title, and length with a call to the Book ctor instead
};
If the superclass has no default ctor, the subclass MUST invoke the superclass constructor (specify how to construct it) in the MIL.
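The order of the construction steps can be observed with a small sketch. The global `events` log and the helper type `Topic` are hypothetical devices used only to record when each step happens; the simplified Book here takes just a length.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> events;   // hypothetical log of construction steps

struct Topic {                     // stand-in field type that logs its construction
    Topic() { events.push_back("Topic field"); }
};

class Book {
    int length;
  public:
    explicit Book(int length) : length{length} { events.push_back("Book ctor"); }
    int getLength() const { return length; }
};

class Text : public Book {
    Topic topic;
  public:
    // Step 2: Book{length}; step 3: topic{}; step 4: the body.
    explicit Text(int length) : Book{length}, topic{} { events.push_back("Text body"); }
};

void constructionOrderDemo() {
    events.clear();
    Text t{300};
    assert((events == std::vector<std::string>{"Book ctor", "Topic field", "Text body"}));
    assert(t.getLength() == 300);
}
```

The superclass part is built first, then the subclass fields, and only then does the subclass constructor body run.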
There are good reasons to keep superclass fields inaccessible to subclasses. If it is absolutely necessary, you can give the subclasses access, by using the protected visibility.
// version 2
class Book {
protected:
string title, author;
int length;
public:
Book(...) : ... {}
};
Now Text can access the fields of Book:
class Text : public Book {
    string topic;
  public:
    void addAuthor(const string &s) {
        author += s; // concatenate the string onto author
        // OK, because author is only protected, not private
    }
};
It is not a good idea to give subclasses unfettered access to superclass fields. It would be better to keep the fields private, and provide protected getters and setters.
The better solution:
class Book {
string title, author;
int length;
protected: // the subclasses can call these
string getAuthor() const { return author; }
void setAuthor(string s) { author = s; }
public:
Book(...);
bool isHeavy() const;
};
The relationship between Text, Comic, and Book is called an "is-a" relationship, as in: a Text is a Book, and a Comic is a Book.
In UML:
We implement the is-a relationship by public inheritance
The public keyword means everything public in Book is also public in Text. With private inheritance instead, anything public in Book becomes private in Text.
Let's now determine: an ordinary Book is heavy if it is >200 pages, a Text book is heavy if it is >500 pages, a Comic is heavy if it is >30 pages.
class Book {
...
public:
bool isHeavy() const { return length > 200; }
};
class Text : public Book {
...
public:
bool isHeavy() const { return getLength() > 500; }
};
class Comic : public Book {
...
public:
bool isHeavy() const { return getLength() > 30; }
};
Book b{"A small book", "Papa Smarf", 50};
Comic c{"A Big Comic", "A Comic Guy", 40, "Heroperson"};
cout << b.isHeavy(); // false (prints 0)
cout << c.isHeavy(); // true (prints 1)
But! Since public inheritance is an "is-a" relationship, a Comic is a Book, so we can write the following:
Book b = Comic{"a Big Comic", "A Comic Guy", 40, "Heroperson"}; // we can write this because a comic is a book
The Big Question: Is b heavy? i.e. Does b.isHeavy() produce true or false? Which isHeavy runs: Book::isHeavy or Comic::isHeavy?
Answer: Book::isHeavy runs, and b is not heavy. The compiler sees b.isHeavy(), knows only that the type of b is Book, and so calls Book::isHeavy. Why?
We try to fit a Comic object where there is only allocated space for a Book object.
What happens? The Comic is sliced: the hero field is chopped off, and the Comic is coerced into a Book.
So Book b = Comic{...} creates a Book, and Book::isHeavy() runs.
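Slicing can be reproduced in a few lines. The sketch below uses trimmed-down versions of Book and Comic (only a length and a hero, which is a simplification of the classes in the notes) and the accessor `getHero` is a hypothetical addition.

```cpp
#include <cassert>
#include <string>

class Book {
    int length;
  public:
    explicit Book(int length) : length{length} {}
    int getLength() const { return length; }
    bool isHeavy() const { return length > 200; }
};

class Comic : public Book {
    std::string hero;
  public:
    Comic(int length, const std::string &hero) : Book{length}, hero{hero} {}
    const std::string &getHero() const { return hero; }   // hypothetical accessor
    bool isHeavy() const { return getLength() > 30; }
};

void slicingDemo() {
    Comic c{40, "Heroperson"};
    Book b = c;                    // copies only the Book part; hero is sliced away
    assert(c.isHeavy());           // Comic::isHeavy: 40 > 30
    assert(!b.isHeavy());          // Book::isHeavy: 40 > 200 is false
    assert(b.getLength() == 40);   // the Book fields did survive the copy
}
```

Note that `b` has no hero at all: it is a genuine Book object, so there is nothing Comic-like left for any method to act on.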
When accessing objects through pointers, slicing is unnecessary and doesn't happen.
Comic c {__, __, 40, __};
Book *pb = &c;
Comic *pc = &c;
cout << pb->isHeavy(); // false (prints 0)
cout << pc->isHeavy(); // true (prints 1)
... and it is still Book::isHeavy() that runs when we call pb->isHeavy().
The compiler uses the type of the pointer (or reference) to decide which isHeavy to run. It does not consider the actual type of the object.
The same object behaves differently depending on what type of pointer accesses it.
How do we make a Comic act like a Comic, even when pointed to by a Book ptr?
Solution: Declare the method virtual
// version 3
class Book {
string title, author;
protected:
int length;
public:
Book(...);
virtual bool isHeavy() const { return length > 200; }
};
class Comic : public Book {
public:
bool isHeavy() const override { return length > 30; }
};
pb->isHeavy(); // true, Comic::isHeavy() runs
Virtual methods choose which class's method to run based on the actual type of the object at run time.
override says: make sure I'm actually overriding something, and give me an error if I'm not.
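A minimal sketch of virtual dispatch, again using a simplified Book with only a length. The virtual destructor is added as general good practice for a base class and is an addition to what this part of the notes shows.

```cpp
#include <cassert>

class Book {
    int length;
  public:
    explicit Book(int length) : length{length} {}
    virtual ~Book() = default;
    int getLength() const { return length; }
    virtual bool isHeavy() const { return length > 200; }
};

class Comic : public Book {
  public:
    explicit Comic(int length) : Book{length} {}
    bool isHeavy() const override { return getLength() > 30; }
};

void virtualDemo() {
    Comic c{40};
    Book *pb = &c;
    Book &rb = c;
    assert(pb->isHeavy());       // Comic::isHeavy is chosen at run time
    assert(rb.isHeavy());        // references dispatch virtually too
    Book sliced = c;             // a sliced copy really is a Book
    assert(!sliced.isHeavy());   // so Book::isHeavy runs
}
```

The last two lines show that virtual does not undo slicing: dynamic dispatch only helps when the object is reached through a pointer or reference.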
The respective class header files are then changed to look like the following:
class Book {
string title, author;
int length;
protected:
int getLength() const;
string getAuthor() const { return author; }
void setAuthor(string s) { author = s; }
public:
Book( const string &title, const string &author, int length );
string getTitle() const;
virtual bool isHeavy() const;
};
class Text : public Book {
string topic;
public:
Text( const string &title, const string &author, int length, const string &topic );
bool isHeavy() const override;
string getTopic() const;
};
class Comic : public Book {
string hero;
public:
Comic( const string &title, const string &author, int length, const string &hero );
bool isHeavy() const override;
string getHero() const;
};
Note that we leave the keywords virtual and override off of the implementations:
bool Book::isHeavy() const { return length > 200; }
bool Text::isHeavy() const { return getLength() > 500; }
bool Comic::isHeavy() const { return getLength() > 30; }
Example: My Book Collection
Book *myBooks[20];
for (int i = 0; i < 20; ++i) {
cout << myBooks[i]->isHeavy() << endl;
}
This uses Book::isHeavy() for Books, Text::isHeavy() for Texts, and Comic::isHeavy() for Comics.
Accommodating multiple types under one abstraction: **polymorphism** ("many forms")
Example: a function void f(istream &in); can be passed an ifstream by reference instead of an istream, because ifstream is a subclass of istream.
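The same idea can be tried without a file on disk. This sketch uses `std::istringstream`, another subclass of `std::istream`, in place of the ifstream mentioned above; the function `countWords` is a hypothetical example.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Counts whitespace-separated words from any kind of input stream.
int countWords(std::istream &in) {
    int n = 0;
    std::string word;
    while (in >> word) ++n;
    return n;
}

void streamDemo() {
    std::istringstream iss{"to be or not to be"};
    assert(countWords(iss) == 6);   // an istringstream is accepted as an istream
}
```

`countWords` is written once against `std::istream` and works unchanged for `std::cin`, an `std::ifstream`, or a string stream.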
We add another virtual method to our Books. It returns true if the item is one of our favourites. What makes an item a favourite is different for each class.
// My favourite books are short books.
bool Book::favourite() const { return length < 100; }
// My favourite textbooks are C++ books
bool Text::favourite() const { return topic == "C++"; }
// My favourite comics are Superman comics
bool Comic::favourite() const { return hero == "Superman"; }
The header files:
class Book {
string title, author;
int length;
protected:
int getLength() const;
public:
Book(const string &title, const string &author, int length);
string getTitle() const;
string getAuthor() const;
virtual bool isHeavy() const;
virtual bool favourite() const;
};
class Text : public Book {
string topic;
public:
Text(const string &title, const string &author, int length, const string &topic);
bool isHeavy() const override;
string getTopic() const;
bool favourite() const override;
};
class Comic : public Book {
string hero;
public:
Comic(const string &title, const string &author, int length, const string &hero);
bool isHeavy() const override;
bool favourite() const override;
};
Our main routine creates an array of Book pointers (Book *), initialized to various books, texts, and comics. It then calls the printMyFavourites function, which iterates over the array and calls favourite on each object, printing the title of each item for which favourite returns true. Finally, all of the dynamically allocated memory is freed.
// main.cc
// Polymorphism in action
void printMyFavourites(Book *myBooks[], int numBooks) {
for (int i = 0; i < numBooks; ++i) {
if (myBooks[i]->favourite()) cout << myBooks[i]->getTitle() << endl;
}
}
int main() {
Book* collection[] {
new Book{"War and Peace", "Tolstoy", 5000},
new Book{"Peter Rabbit", "Potter", 50},
new Text{"Programming for Beginners", "??", 200, "BASIC"},
new Text{"Programming for Big Kids", "??", 200, "C++"},
new Comic{"Aquaman Swims Again", "??", 20, "Aquaman"},
new Comic{"Clark Kent Loses His Glasses", "??", 20, "Superman"}
};
printMyFavourites(collection, 6);
for (int i = 0; i < 6; ++i) delete collection[i];
}
If you want to use polymorphism in combination with arrays, it only works correctly if the array holds pointers.
DANGER:
class One {
int x, y;
public:
One(int x = 0, int y = 0) : x{x}, y{y} {}
int getX() const { return x; }
int getY() const { return y; }
};
class Two : public One {
int z;
public:
Two(int x = 0, int y = 0, int z = 0) : One{x, y}, z{z} {}
int getZ() const { return z; }
};
ostream &operator<<(ostream &out, const Two &obj) {
    out << "(" << obj.getX() << "," << obj.getY() << "," << obj.getZ() << ")";
    return out;
}
void f(One *a) {
    a[0] = One{6, 7};
    a[1] = One{8, 9};
}
int main() {
    Two myArray[2] = { Two{1, 2, 3}, Two{4, 5, 6} };
    for (int i = 0; i < 2; ++i) cout << myArray[i] << endl;
    // a Two is a One, so it is completely legal to do this:
    f( myArray ); // this is a BIG problem -- misaligned
    for (int i = 0; i < 2; ++i) cout << myArray[i] << endl;
}
// The output is
(1,2,3)
(4,5,6)
(6,7,8)
(9,5,6)
When f overwrote the contents of what it thought were two objects of type One, it overwrote all of the first Two object and only part of the second. Our data is misaligned!
Never use arrays of objects polymorphically!
If you want a polymorphic array, use an array of pointers.
Using polymorphism in combination with dynamic memory allocation poses a special problem.
Destructor Revisited:
class X {
int *x;
public:
X(int n) : x{new int[n]} {}
~X() { delete []x; }
};
Let's make a subclass
class Y : public X {
int *y;
public:
Y(int n, int m) : X{n}, y{new int[m]} {}
~Y() { delete []y; }
};
Note that ~Y() does not delete x itself: ~X() runs automatically when ~Y() is done.
// Run with valgrind
int main() {
X x{5};
Y y{5, 10};
X *xp = new Y{5, 10};
delete xp;
}
// LEAK SUMMARY:
// ==41844== definitely lost: 40 bytes in 1 blocks
// ==41844== indirectly lost: 0 bytes in 0 blocks
// ==41844== possibly lost: 0 bytes in 0 blocks
// ==41844== still reachable: 0 bytes in 0 blocks
// ==41844== suppressed: 0 bytes in 0 blocks
Because X's destructor isn't virtual, deleting through an X pointer runs only X's dtor, not Y's.
X *myX = new Y{10, 20};
delete myX; // this calls ~X, not ~Y
- so only x, but not y, is deleted
How can we ensure that deletion through a pointer to the superclass will call the subclass destructor?
- make the destructor virtual !
class X {
public:
virtual ~X() { delete []x; }
};
ALWAYS, make the destructor virtual in classes that are meant to have subclasses.
- even if the dtor would do nothing, still make it virtual
- the whole point of virtual dtors is to make sure the subclass dtors run
On the other hand, if a class is not meant to have subclasses, declare it final
class Y final : public X {
...
};
This will prevent anyone from making a subclass of Y.
Always make your destructors virtual, even if they do nothing!
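The effect of the virtual destructor can be checked with a sketch that records which destructors run. The global `dtorLog` is a hypothetical logging device; X and Y otherwise follow the classes above.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> dtorLog;   // hypothetical log of destructor calls

class X {
    int *x;
  public:
    explicit X(int n) : x{new int[n]} {}
    virtual ~X() { delete[] x; dtorLog.push_back("~X"); }
};

class Y : public X {
    int *y;
  public:
    Y(int n, int m) : X{n}, y{new int[m]} {}
    ~Y() override { delete[] y; dtorLog.push_back("~Y"); }
};

void virtualDtorDemo() {
    dtorLog.clear();
    X *xp = new Y{5, 10};
    delete xp;   // virtual dtor: ~Y runs first, then ~X
    assert((dtorLog == std::vector<std::string>{"~Y", "~X"}));
}
```

With the virtual destructor in place, both arrays are freed, so a run under valgrind should no longer report the 40 lost bytes.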
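Recall our Student class: it has a field called final. That is still legal, because final is a contextual keyword in C++ and only has a special meaning in positions like the one above.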
Sometimes, we don't have anything to write in the implementation of a virtual method in a base class.
Pure virtual methods of abstract classes.
class Student {
...
public:
virtual int fees() const;
};
There are 2 kinds of student: regular and coop.
class Regular : public Student {
public:
int fees() const override; // reg student fees
};
class Coop : public Student {
public:
int fees() const override; // coop student fees
};
I know how to calculate Regular and Coop student fees. What I don't know is what to put for Student::fees.
I am not sure because every student should be regular or coop. So, we should never create objects of the class Student; all objects must be created from the classes Regular or Coop.
I can explicitly give Student::fees NO implementation.
We can make Student an abstract class. An abstract class cannot be instantiated and has at least one method that is not implemented. Its purpose is to organize subclasses.
class Student {
public:
virtual int fees() const = 0;
// this says the method has no (*) implementation
// this is called a pure virtual method
};
We create abstract classes by leaving methods without implementation, so we can explicitly give Student::fees no implementation. This is done in C++ by adding = 0 to the end of the declaration of a virtual method.
The method fees is called a pure virtual method. A class with a pure virtual method cannot be instantiated.
Student s; // ERROR!
new Student(); // also ERROR!
Subclasses of an abstract class are also abstract unless they implement all pure virtual methods.
Non-abstract classes are called concrete:
class Regular : public Student { // concrete class
public:
int fees() const override { return 700 * numCourses; }
};
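Putting the pieces together, here is a minimal sketch of the abstract Student hierarchy. The constructors, the protected `numCourses` field, and the co-op surcharge of 1300 are assumptions made for illustration; only the 700-per-course figure comes from the example above.

```cpp
#include <cassert>
#include <type_traits>

class Student {
  protected:
    int numCourses;                       // assumed field, used by fees()
  public:
    explicit Student(int numCourses) : numCourses{numCourses} {}
    virtual ~Student() = default;
    virtual int fees() const = 0;         // pure virtual: Student is abstract
};

class Regular : public Student {          // concrete class
  public:
    explicit Regular(int n) : Student{n} {}
    int fees() const override { return 700 * numCourses; }
};

class Coop : public Student {             // concrete class
  public:
    explicit Coop(int n) : Student{n} {}
    int fees() const override { return 700 * numCourses + 1300; }  // hypothetical co-op fee
};

// The compiler can confirm which classes are abstract.
static_assert(std::is_abstract<Student>::value, "Student has a pure virtual method");
static_assert(!std::is_abstract<Regular>::value, "Regular implements fees()");

void feesDemo() {
    Regular r{3};
    Coop c{3};
    const Student *students[] = {&r, &c};     // pointers to the abstract base are fine
    assert(students[0]->fees() == 2100);      // 700 * 3
    assert(students[1]->fees() == 3400);      // 700 * 3 + 1300
}
```

Although a Student object can never be created, pointers and references to Student remain perfectly usable, which is what makes the abstract class useful for organizing its subclasses.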
In UML, represent virtual and pure virtual methods using italics. Represent abstract classes by italicizing the class name.
This part is not tested.
In other languages, methods with no implementation are just called abstract methods. The keyword abstract is used to declare abstract classes and methods. For example, in Java:
public abstract class Student {
    public abstract int fees();
}
Note that the keyword virtual does not exist in Java.
In C++, you do not need to declare a class as abstract; it is inferred automatically by the compiler if the class has at least one pure virtual method.
What happens with the objects' copy and move operations when you use inheritance?
class Book {
protected:
string title, author;
int length;
public:
Book(const string &title, const string &author, int length);
Book(const Book &b); // define the copy ctor
Book(Book &&b); // define the move ctor
Book &operator=(const Book &rhs);
Book &operator=(Book &&b);
// other public methods
...
};
class Text : public Book {
string topic;
public:
Text(const string &title, const string &author, int length, const string &topic);
// Does not define copy/move ctors and operators
...
// other public methods
};
In main:
Text t{"Algorithms", "CLRS", 500, "CS"};
Text t2 = t; // No copy constructor in Text, what happens?
// calls Book's copy ctor
// then goes field by field (i.e. default behaviour) for the Text part
This copy initialization (Text t2 = t;) calls Book's copy constructor and then goes field-by-field (i.e. default behaviour) for the Text part. The same is true for other compiler-provided methods.
However, you can also write your own implementation of the constructors and assignment operators. You do this by calling one of Book's constructors/operators first, and then continuing with your implementation:
// Copy ctor:
Text::Text(const Text &other) : Book{other}, topic{other.topic} {}
// Copy Assignment:
Text &Text::operator=(const Text &other) {
Book::operator=(other);
topic = other.topic;
return *this;
}
// Move ctor:
Text::Text(Text &&other) : Book{std::move(other)}, topic{std::move(other.topic)} {} // std::move() is in <utility>
// Move Assignment:
Text &Text::operator=(Text &&other) {
Book::operator=(std::move(other));
topic = std::move(other.topic);
return *this;
}
Note: Even though other refers to an rvalue, other itself is the named parameter of a function, and so is an lvalue (so is other.topic). So to invoke the move operation we must tell the compiler to treat it like an rvalue by calling std::move() on it. The function std::move() forces an lvalue x to be treated as an rvalue, so that the "move" versions of these operators run.
This is important in the implementation of the move constructor and assignment operator; otherwise, the copy versions of the operations would run.
The operations given above are equivalent to the default behaviour (i.e., they do the same as the compiler would do as the default behaviour in the initial example when we did not create specific implementations of the operations for class Text). You can specialize those behaviours if your class needs to do anything different.
What happens if you use pointers to the base class to assign one object to another via copy or move? Now consider:
Text t1{"Programming for Beginners", "Nick", 200, "Pascal"};
Text t2{"Programming for Big Kids", "Ben", 300, "C++"};
Book *pb1 = &t1;
Book *pb2 = &t2;
// What if we do the following?
*pb2 = *pb1;
Book::operator= is called and the result is partial assignment. Only the Book fields are assigned (copied), but Text's fields (i.e. topic) are not copied.
How to fix this? Make our assignment operators virtual?
Before the assignment
t1 | t2 |
---|---|
Programming for Beginners | Programming for Big Kids |
Nick | Ben |
200 | 300 |
Pascal | C++ |
After the assignment (*pb2 = *pb1)
t1 | t2 |
---|---|
Programming for Beginners | Programming for Beginners |
Nick | Nick |
200 | 200 |
Pascal | C++ |
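The state shown in the two tables can be reproduced with a short sketch. Book is reduced to a title here, and the getters `getTitle` and `getTopic` are hypothetical additions used to inspect the result; both classes rely on the compiler-generated assignment operators.

```cpp
#include <cassert>
#include <string>

class Book {
    std::string title;
  public:
    explicit Book(const std::string &title) : title{title} {}
    const std::string &getTitle() const { return title; }
};

class Text : public Book {
    std::string topic;
  public:
    Text(const std::string &title, const std::string &topic) : Book{title}, topic{topic} {}
    const std::string &getTopic() const { return topic; }
};

void partialAssignmentDemo() {
    Text t1{"Programming for Beginners", "Pascal"};
    Text t2{"Programming for Big Kids", "C++"};
    Book *pb1 = &t1;
    Book *pb2 = &t2;
    *pb2 = *pb1;                                            // Book::operator= runs
    assert(t2.getTitle() == "Programming for Beginners");   // Book part was copied
    assert(t2.getTopic() == "C++");                         // Text part was not
}
```

After the assignment, t2 is a book about beginners' programming whose topic still claims to be C++, which is exactly the inconsistent state partial assignment produces.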
Solution 1: Virtual operations
Partial assignment is not desirable: if only some of the fields are copied when you assign one object to another, this can lead to bugs in your program.
One potential solution to the problem of partial assignment is making operator= virtual:
class Book {
...
public:
virtual Book &operator=(const Book &other);
virtual Book &operator=(Book &&other);
};
class Text : public Book {
...
public:
Text &operator=(const Book &other) override { // BAD parameter
topic = other.topic; // doesn't compile: Book has no topic field
return *this;
}
Text &operator=(Book &&other) override;
};
Note: Text::operator= is permitted to return (by reference) a subtype object, but the parameter types must be the same, or it is not an override (and won't compile).
However, we created a new problem
By the "is-a" principle, if a Book can be assigned from another Book, then a Text can be assigned from another Book. Therefore, assignment of a Book object to a Text variable would be allowed, which is called mixed assignment.
Text t{...};
Book b{...};
Text *pt = &t;
*pt = b; // call virtual operator= through pointer; subclass version runs
// uses a Book to assign a Text: BAD (but it would compile)
Also, it is now possible to use a Comic object to assign a Text variable (or vice-versa)
Text t{...};
Comic c{...};
t = c; // Use Comic object to assign Text object. REALLY BAD
In summary, if operator= is non-virtual, then we get partial assignment when assigning through base class pointers/references. If it is virtual, then the compiler will allow mixed assignment.
Solution 2: Abstract Superclasses
It is not a good idea to implement a class hierarchy with non-virtual assignment/move operations or you will create a problem of partial assignment. However, it is also not a good idea to just make them virtual or you will create a problem of mixed assignment.
Although it is possible to implement those solutions to avoid compile errors or run-time crashes, letting programmers do strange operations like trying to assign a Comic book into a Text book or vice-versa can lead to logical errors and bugs that are hard to debug.
Recommendation: All base classes should be Abstract! (don't create it as object)
To implement this, we rewrite the class hierarchy, and italicize the base class.
// abstractBook.h
class AbstractBook {
string title, author;
int length;
protected:
AbstractBook &operator=(const AbstractBook &other); // copy assignment now protected
AbstractBook &operator=(AbstractBook &&other); // move assignment now protected
public:
AbstractBook(...);
virtual ~AbstractBook() = 0; // need at least one pure virtual method
// If you don't have one, use the dtor
// Note: a destructor always needs a definition, so we can declare it PV as above
// and that does make this class abstract, but we also need to define it in the
// implementation file (even if the implementation is empty)
};
Remember in C++, we need to have at least one pure virtual method to make the class abstract.
In this case, we make the destructor pure virtual. But note that this is only because we want to make the class abstract, we still need to implement the destructor because it will be called by the subclasses when the objects are destroyed. (you can try removing the implementation of the destructor in the example and you will see that the code will not link).
// in abstractBook.cc
AbstractBook::~AbstractBook() {}
(Note: making a method pure virtual therefore does not necessarily mean that there is no implementation. It means that the method must be overridden by concrete subclasses. If the base class does provide an implementation, the overriding method in the subclass is free to call up to it for default behaviour.)
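The note above can be illustrated with a minimal sketch. The classes Shape and Circle and the helper describeVia are hypothetical names chosen for illustration; they are not part of the book example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical classes for illustration only.
class Shape {
public:
    virtual ~Shape() = default;
    // Pure virtual: Shape is abstract and concrete subclasses must override.
    virtual std::string describe() const = 0;
};

// A pure virtual method may still have a definition, written outside the class.
std::string Shape::describe() const { return "shape"; }

class Circle : public Shape {
public:
    // The override explicitly calls up to the base class implementation.
    std::string describe() const override {
        return "round " + Shape::describe();
    }
};

static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");

// Calls describe() through a base class reference (dynamic dispatch).
std::string describeVia(const Shape &s) { return s.describe(); }

// Small self-check of the behaviour described above.
void checkDescribe() {
    Circle c;
    assert(describeVia(c) == "round shape");
}
```

The base implementation is never reached by ordinary virtual dispatch; it only runs when a subclass names it explicitly with Shape::describe().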
The protected AbstractBook::operator= disallows assigning through base class pointers but the implementation is still available for subclass implementations to invoke.
Now we can create a new concrete class so that we can instantiate normal books. We can just call AbstractBook::operator= as needed:
class NormalBook : public AbstractBook {
public:
NormalBook(...);
~NormalBook();
NormalBook &operator=(const NormalBook &other) {
AbstractBook::operator=(other);
return *this;
}
NormalBook &operator=(NormalBook &&other) {
AbstractBook::operator=(std::move(other));
return *this;
}
};
And we can implement the concrete classes Text and Comic in the same way, i.e., calling AbstractBook::operator= as needed and just copying/moving the specific fields of the subclasses.
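As a rough sketch of what such a subclass might look like, the single-file example below gives Text one extra field. The field name topic, the constructor parameters, and the accessors getTitle and getTopic are assumptions made for illustration, and the base class assignment operators are simply defaulted here.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

class AbstractBook {
    std::string title, author;
    int length;
protected:
    // Protected: only subclasses may invoke assignment of the base part.
    AbstractBook &operator=(const AbstractBook &other) = default;
    AbstractBook &operator=(AbstractBook &&other) = default;
public:
    AbstractBook(const std::string &title, const std::string &author, int length)
        : title{title}, author{author}, length{length} {}
    virtual ~AbstractBook() = 0;   // pure virtual, defined below
    const std::string &getTitle() const { return title; }
};
AbstractBook::~AbstractBook() {}

class Text : public AbstractBook {
    std::string topic;             // subclass-specific field (assumed)
public:
    Text(const std::string &title, const std::string &author, int length,
         const std::string &topic)
        : AbstractBook{title, author, length}, topic{topic} {}
    Text &operator=(const Text &other) {
        if (this == &other) return *this;
        AbstractBook::operator=(other);   // copy the base part
        topic = other.topic;              // then the Text-specific part
        return *this;
    }
    Text &operator=(Text &&other) {
        if (this == &other) return *this;
        AbstractBook::operator=(std::move(other));  // moves only the base part
        topic = std::move(other.topic);
        return *this;
    }
    const std::string &getTopic() const { return topic; }
};

// Compile-time checks of the properties claimed in the text.
static_assert(std::is_abstract<AbstractBook>::value, "base class is abstract");
static_assert(!std::is_copy_assignable<AbstractBook>::value,
              "no assignment through the base class from outside code");
static_assert(std::is_copy_assignable<Text>::value, "Text = Text is allowed");
```

Comic would follow the same pattern with its own fields. Because the base operators are not virtual and not public, neither partial nor mixed assignment can be written by client code.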
The design prevents partial and mixed assignment because copy/move assignment will not be allowed using base class pointers, but the implementation is still available for subclass implementations to invoke.
Text t1(...);
Text t2(...);
// Assignment through base class pointers will not compile,
// because AbstractBook::operator= is protected and cannot be called here
AbstractBook *pb1 = &t1;
AbstractBook *pb2 = &t2;
*pb2 = *pb1; // compile error: AbstractBook::operator= is protected
// However, it is possible to assign a Text object to another:
t2 = t1; // this is fine
// Or using pointers:
Text *pt1 = &t1;
Text *pt2 = &t2;
*pt2 = *pt1; // this is fine
Summary
There are three different options to implement the copy and move operations together with inheritance:
- Public non-virtual operations do not restrict what the programmer can do but allow partial assignment;
- Public virtual operations do not restrict what the programmer can do but allow mixed assignment;
- Protected operations in an abstract superclass, called by the public operations in the concrete subclasses, prevent partial and mixed assignment but also prevent assignment through base class pointers.
Unfortunately, none of these solutions is perfect, as each has a weakness. Generally, the third option is recommended, with abstract superclasses containing protected assignment operations, because it prevents partial and mixed assignment and thus avoids logical errors in the program.
However, it creates a limitation: it is not possible to do an assignment through base class pointers. For some programs, this limitation may be relevant. In that case, one of the other two solutions may be more appropriate, together with measures to minimize problems of partial or mixed assignment.
We are trying to keep our header file as separate from our implementation files as possible.
We need to examine under what circumstances we absolutely must include one header file in another, and under what circumstances we can simply use a forward declaration in the header file and include the other class's header file only in the implementation file. (The latter is necessary in order to break include cycles, where, for example, some file x.h includes file y.h, which in turn includes x.h.)
Consider some class A, defined in the file a.h. There are five possible ways that A can be used by another class.
class B : public A {
...
};
- Must include a.h, since the compiler needs to know exactly how large class A is in order to determine the size of class B
class C {
A myA;
};
- Must include a.h, since the compiler needs to know exactly how large class A is in order to determine the size of class C
class D {
A *myAptr;
};
- All pointers are the same size, so a forward declaration in the header file for class D is sufficient, though the implementation file of D will need to include a.h
class E {
A f(A x);
};
- Despite the fact that the method E::f passes a parameter of type A by value and returns an instance of A by value, the method signature is only used for type checking by the compiler. There is thus no true compilation dependency, and a forward declaration is sufficient, though the implementation file of E will need to include a.h
class F {
void f() {
A x;
...
x.someMethod();
...
}
};
- Because class F wrote the implementation of method F::f inline, it is using a method that belongs to class A. Therefore, it must include the header file for A so that the compiler knows what methods A has available; however, if we moved the implementation of F::f to the implementation file of F, then we could use a forward declaration here instead.
- This is why we discourage you from writing your methods inline
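The pointer and method-signature cases above can be checked in a single file, as in the sketch below. The member functions set, read, and get, and the body given to E::f, are assumptions added for illustration. The point where A is defined plays the role of including a.h in the implementation files.

```cpp
#include <cassert>

class A;                 // forward declaration: only says a type named A exists

// Case D: a pointer member needs only the forward declaration.
class D {
    A *myAptr = nullptr;
public:
    void set(A *p) { myAptr = p; }
    int read() const;    // defined below, where A is a complete type
};

// Case E: declaring a method that takes and returns A by value
// also needs only the forward declaration.
class E {
public:
    A f(A x);            // declaration only; the body needs the full type
};

// From here on, this plays the role of a.h being included in d.cc and e.cc.
class A {
    int value;
public:
    explicit A(int v) : value{v} {}
    int get() const { return value; }
};

// These bodies use members of A, so they need A's full definition.
int D::read() const { return myAptr->get(); }
A E::f(A x) { return A{x.get() + 1}; }

// Small self-check of the two cases.
void checkForwardDeclarations() {
    A a{41};
    D d;
    d.set(&a);
    assert(d.read() == 41);
    E e;
    assert(e.f(a).get() == 42);
}
```

Moving the definition of class A above class D changes nothing about D or E, which is one way to see that their declarations never depended on it.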
Example: the program has a stack and a queue class, each implemented using the Node class.
// stack.h
#ifndef STACK_H
#define STACK_H
struct Node; // forward declaration
class Stack {
Node *ptr;
public:
Stack();
~Stack();
bool isEmpty();
int top();
void pop();
void push(int value);
};
#endif
// queue.h
#ifndef QUEUE_H
#define QUEUE_H
class Node; // forward declaration
class Queue {
Node * frontPtr, *backPtr;
public:
Queue();
~Queue();
bool isEmpty();
int front();
void dequeue();
void enqueue(int value);
};
#endif
Note that we have removed the #include "node.h" from both header files. Also, even though Node is actually defined as a struct, Queue has forward-declared it as a class. This is perfectly legal, since a forward declaration just states that "such a type exists".
Now look at the implementation files:
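The file node.h is not shown in these notes. The sketch below gives one plausible definition, inferred from how the implementation files use Node (fields data and next, in that order); treat it as an assumption rather than the actual file. It also shows a class forward declaration followed by a struct definition compiling together. The helper chainLength is hypothetical.

```cpp
#include <cassert>

class Node;              // forward-declared with the class keyword, as in queue.h

// Assumed contents of node.h, inferred from stack.cc and queue.cc.
struct Node {
    int data;
    Node *next;
};

// Hypothetical helper: counts the nodes reachable from n.
int chainLength(const Node *n) {
    int count = 0;
    for (; n != nullptr; n = n->next) ++count;
    return count;
}

// Small self-check using the same aggregate initialization as stack.cc.
void checkNode() {
    Node *second = new Node{2, nullptr};
    Node *first = new Node{1, second};
    assert(chainLength(first) == 2);
    assert(first->next->data == 2);
    delete first;
    delete second;
}
```

Some compilers emit a warning about the mismatched class and struct keywords, but the program is still well-formed.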
// stack.cc
#include "stack.h"
#include "node.h"
Stack::Stack() : ptr{nullptr} {}
Stack::~Stack() { while (!isEmpty()) pop(); }
bool Stack::isEmpty() { return ptr == nullptr; }
int Stack::top() { return ptr->data; }
void Stack::pop() {
Node *tmp = ptr;
ptr = ptr->next;
delete tmp;
}
void Stack::push(int value) {
Node *tmp = new Node{value, ptr};
ptr = tmp;
}
// queue.cc
#include "queue.h"
#include "node.h"
Queue::Queue() : frontPtr{nullptr}, backPtr{nullptr} {}
Queue::~Queue() { while (!isEmpty()) dequeue(); }
bool Queue::isEmpty() { return (frontPtr == backPtr && frontPtr == nullptr); }
int Queue::front() { return frontPtr->data; }
void Queue::dequeue() {
Node *tmp = frontPtr;
frontPtr = frontPtr->next;
if (frontPtr == nullptr) backPtr = nullptr;
delete tmp;
}
void Queue::enqueue(int value) {
Node *tmp = new Node{value, nullptr};
if (frontPtr == backPtr && frontPtr == nullptr) frontPtr = tmp;
else backPtr->next = tmp;
backPtr = tmp;
}
Example: Case of an include cycle due to inheritance
// a.h
#ifndef A_H
#define A_H
#include "b.h"
class A : public B {
...
};
#endif
// b.h
#ifndef B_H
#define B_H
#include "a.h"
class B : public A {
...
};
#endif
Conceptually, you can't have some class A inherit from some class B and have class B also inherit from class A. That just doesn't make any sense! However, in many cases, we can replace the "is a" relationship with a "has a" relationship, i.e. use object composition instead of inheritance. (In fact, we generally recommend composition over inheritance, since it provides flexibility at run-time.) With objects as data fields, however, the two headers still include each other:
// a.h
#ifndef A_H
#define A_H
#include "b.h"
class A {
B myB;
...
};
#endif
// b.h
#ifndef B_H
#define B_H
#include "a.h"
class B {
A myA;
...
};
#endif
In the case of a data field being an object, the fix is to make it either a reference or a pointer to the object. Remember, a reference is really just a constant pointer, and all pointers are the same size, so we just need a forward declaration to make the type name known in the header file.
// a.h
#ifndef A_H
#define A_H
class B;
class A{
B *myB;
...
};
#endif
// b.h
#ifndef B_H
#define B_H
class A;
class B {
A &myA;
...
};
#endif
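To see how the implementation files fit with these headers, here is a single-file sketch. The methods attach, askB, askA, and id, and the constructor of B, are hypothetical additions for illustration. The definitions at the bottom stand in for a.cc and b.cc, each of which would include both a.h and b.h.

```cpp
#include <cassert>

// Plays the role of a.h
class B;                     // forward declaration is enough for a pointer
class A {
    B *myB = nullptr;
public:
    void attach(B *b) { myB = b; }
    int askB() const;        // uses B's members, so it is defined later
    int id() const { return 1; }
};

// Plays the role of b.h (which would forward-declare class A itself)
class B {
    A &myA;                  // a reference member must be bound in the constructor
public:
    explicit B(A &a) : myA(a) {}
    int askA() const;
    int id() const { return 2; }
};

// Plays the role of a.cc and b.cc: both class definitions are visible here.
int A::askB() const { return myB->id(); }
int B::askA() const { return myA.id(); }

// Small self-check: each object can reach the other.
void checkCycle() {
    A a;
    B b{a};
    a.attach(&b);
    assert(a.askB() == 2);
    assert(b.askA() == 1);
}
```

One design consequence is worth noting: a reference member cannot be rebound, so B loses its default copy assignment, whereas the pointer member in A can be reassigned freely.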
General rule: if there is no compilation dependency necessitated by the code, don't introduce one with extraneous #include statements; instead, use forward declarations wherever possible and include the necessary headers in the implementation files.