Coding Guideline
In the project root directory, there are three major places where you will put code:
- src -- This is where the bulk of the code lives. Anything you expect to be compiled into the release should be here.
- test -- This is where unit tests, benchmarks, and utility code for them live. src should not depend on anything in test.
- script -- This is where scripts that support development and testing live (e.g. the Python formatting script, dependency installation).
Almost never will you need to create new directories outside of these.
There can be at most two levels of directories under src. The first level holds general system components (e.g. storage, execution, network, sql, common), and the second level holds either a class of similar files or a self-contained sub-component.
Translated into coding guidelines, you should rarely need to create a new first-level subdirectory, and should probably consult Andy if you believe you do. To create a new second-level directory, make sure you meet the following criteria:
- There are more than two files (that is, at least three) that you need to put into this folder.
- Each file is stand-alone, i.e. either the contents don't make sense living in a single file, or putting them in a single file would make that file large and difficult to navigate. (This is open to interpretation, but if, for example, you have 3 files containing 10-line class definitions, maybe they should not be spread out that much.)
And one of the following two:
- The subdirectory is a self-contained sub-component. This probably means that the folder only has one outward facing API. A good rule of thumb is when outside code files only need to include one header from this folder, where said API is defined.
- The subdirectory contains a logical grouping of files, and there are enough of them that leaving them ungrouped makes the upper level hard to navigate. (e.g. all the plans, all the common data structures, etc.)
A good rule of thumb is that if you have a subdirectory As, you should be able to say with a straight face that everything under As is an A (e.g. everything under containers is a container).
TBD
TBD
Ask us how we know, but Singletons, and hard-coded dependencies in general, do wonders to make a codebase hard to understand, hard to test, and hard to maintain.
The solution is to use a specific style of coding to avoid having to deal with this. Dependency Injection is a widely used paradigm in industry for this problem, and it is easier than it sounds.
Although we don't have the need for a full DI framework yet, we should strive to write code in a way that is amenable to future changes. Read the article linked to above if you are interested, but otherwise, here is a quick example of how these things work. Suppose we want to write a LinkedList whose nodes are reused through an object pool:
struct LinkedListNode {
  // ...
};
class LinkedList {
 public:
  // ...
  template <typename T>
  void Add(T content) {
    // ...
    // hard-coded dependency on the global singleton
    LinkedListNodeObjectPool::GetInstance().New(content);
  }
};
At this point, LinkedList hides the fact that it mutates the state of the global singleton LinkedListNodeObjectPool and, because it calls the singleton explicitly, it prevents modular testing. Now consider a scenario where, for some reason, LinkedListNodeObjectPool hands out large chunks of memory (say, 1 MB per node), and we want to scale-test the operations of LinkedList (e.g. concurrent Insert and Delete). Our test will run slowly, because there is no way to change the memory behavior of LinkedListNodeObjectPool without changing the code. In an ideal world, we know that the contents of each LinkedListNode do not matter in this test, and we could get away with a fake object that has only the pointer field.
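The hidden mutation described above can be made visible with a small self-contained sketch. The allocation counter, the `Allocations()` accessor, the `int` payload, and the helper function are hypothetical additions for illustration only; they are not part of the project's code.

```cpp
#include <cstddef>

// Hypothetical stand-in for the singleton pool discussed above. The
// allocation counter is an assumption added so the shared state is observable.
class LinkedListNodeObjectPool {
 public:
  static LinkedListNodeObjectPool &GetInstance() {
    static LinkedListNodeObjectPool instance;  // one instance per process
    return instance;
  }

  void New(int /*content*/) { ++allocations_; }
  std::size_t Allocations() const { return allocations_; }

 private:
  LinkedListNodeObjectPool() = default;
  std::size_t allocations_ = 0;
};

class LinkedList {
 public:
  void Add(int content) {
    // Hidden dependency: nothing in the signature reveals this global mutation.
    LinkedListNodeObjectPool::GetInstance().New(content);
  }
};

// Returns how many allocations the global pool recorded while two unrelated
// lists each added one element.
std::size_t AllocationsAfterTwoIndependentLists() {
  const std::size_t before = LinkedListNodeObjectPool::GetInstance().Allocations();
  LinkedList first;
  LinkedList second;
  first.Add(1);
  second.Add(2);
  return LinkedListNodeObjectPool::GetInstance().Allocations() - before;
}
```

Both lists end up changing the same pool, so a test of one list cannot be isolated from whatever any other list (or any other test in the same process) has done.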
Suppose we have written the code instead in this way:
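class LinkedList {
 public:
  // the pool is supplied by the caller instead of being looked up globally
  explicit LinkedList(LinkedListNodeObjectPool &pool) : pool_(pool) {}
  // ...
  template <typename T>
  void Add(T content) {
    // ...
    pool_.New(content);
  }
  // ...
 private:
  LinkedListNodeObjectPool &pool_;
};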
Everything still works. But now, to write the test described above, we simply do:
// fake implementation with low memory overhead
// (the methods it replaces must be virtual in LinkedListNodeObjectPool)
class FakeLinkedListNodeObjectPool : public LinkedListNodeObjectPool {
  // ...
};
TEST(LinkedListTests, LargeTest) {
  FakeLinkedListNodeObjectPool fake_pool;
  LinkedList tested(fake_pool);
  // test on tested
}
When we execute the real program, presumably in main.cpp:
int main() {
  // ...
  LinkedListNodeObjectPool real_pool;
  LinkedList tested(real_pool);
  // ...
}
To sum up, dependency injection requires that object creation be separated from the logic that uses the object. This allows us to change the object without changing the logic. Of course, in practice, there is no need to be this strict; it might make sense, for example, for an object to create and own another object when the owned object is essential to its functionality (e.g. we rarely want to change the std::vector implementation, so creating a vector member in the object is fine). However, if the object in question is logically a different component, and it might make sense to test the logic apart from that object, write the code in this way.
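The fragments above can be put together into one compilable sketch. It makes a few assumptions that are not in the original snippets: C++ member templates cannot be virtual, so `New()` is a non-template virtual function here; `kPayloadBytes`, `PayloadBytes()`, `Track()`, and `Size()` are hypothetical helpers added so the difference between the real and the fake pool can be observed.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Assumed size of the large chunk the real pool reserves per node (1 MB).
constexpr std::size_t kPayloadBytes = std::size_t{1} << 20;

struct LinkedListNode {
  LinkedListNode *next = nullptr;
};

// "Real" pool: every node comes with a large payload buffer.
class LinkedListNodeObjectPool {
 public:
  virtual ~LinkedListNodeObjectPool() = default;

  // Virtual so that a test can substitute a cheaper implementation.
  virtual LinkedListNode *New() {
    payloads_.emplace_back(kPayloadBytes);  // allocates a 1 MB buffer
    return Track(std::make_unique<LinkedListNode>());
  }

  std::size_t PayloadBytes() const { return payloads_.size() * kPayloadBytes; }

 protected:
  // Takes ownership of the node and returns a non-owning pointer to it.
  LinkedListNode *Track(std::unique_ptr<LinkedListNode> node) {
    nodes_.push_back(std::move(node));
    return nodes_.back().get();
  }

 private:
  std::vector<std::vector<char>> payloads_;
  std::vector<std::unique_ptr<LinkedListNode>> nodes_;
};

// Fake pool with low memory overhead: a node is just its pointer field.
class FakeLinkedListNodeObjectPool : public LinkedListNodeObjectPool {
 public:
  LinkedListNode *New() override {
    return Track(std::make_unique<LinkedListNode>());
  }
};

class LinkedList {
 public:
  // The pool is injected; LinkedList neither creates nor looks it up.
  explicit LinkedList(LinkedListNodeObjectPool &pool) : pool_(pool) {}

  // Pushes a new node onto the front of the list.
  void Add() {
    LinkedListNode *node = pool_.New();
    node->next = head_;
    head_ = node;
    ++size_;
  }

  std::size_t Size() const { return size_; }

 private:
  LinkedListNodeObjectPool &pool_;
  LinkedListNode *head_ = nullptr;
  std::size_t size_ = 0;
};
```

A test would construct a `FakeLinkedListNodeObjectPool`, pass it to `LinkedList`, and add as many nodes as the scale test needs without reserving any payload memory, while the production code passes in a plain `LinkedListNodeObjectPool`. The pool must outlive every list that refers to it, since `LinkedList` holds only a reference.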