Tips
A collection of useful tips, tricks and traps. Feel free to contribute by sending an email or making an Ed post. In addition to the below, please refer to this CPMSoc blog post.
A useful Makefile
Create a file called Makefile
in your directory (or just home directory for exams) with these contents (copied verbatim)
CXX = g++
CPPFLAGS = -Wall -Wextra -Wshadow -Wformat=2 -Wfloat-equal -Wconversion -Wlogical-op \
-D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC -D_FORTIFY_SOURCE=2 \
-fsanitize=address -fsanitize=undefined -fno-sanitize-recover -fstack-protector \
-O2 --std=c++17 -g
These extra flags do a mix of static analysis and extra run time checks that catch some common mistakes in your program. Many instances of undefined behavior now at least trigger a warning when you run the program, instead of (worst case) silently working.
This is not a panacea though. It won’t catch anywhere near all bugs, but it catches some particularly simple ones.
One extra warning: these flags slow down execution speed dramatically. So switch to something more lightweight if you need to run a program for a while or you want to get a gauge of the running time of a program.
A brute force comparison script
For many problems, a useful way of debugging is to write a brute force and a random data generator and repeatedly generate data and compare the brute force’s output to your program’s output until there’s a mismatch. This is particularly useful for problems with subtasks where you can use the subtask solution as your brute force.
For this, I will assume you have:
- Your solution is an executable named
soln
. - You have a brute force executable named
brute_force
. - A random data generator titled
gen
. I usually write these in Python.
Then to generate random data until there’s a mismatch, just run
./gen > random.in; while diff <(./soln < random.in) <(./brute_force < random.in); do echo "all good"; ./gen > random.in; done
If this ever stops, a mismatch has been found and the input responsible is stored in random.in
.
Speeding up IO
Iostream
Iostream and cin/cout by default are quite a bit slower than scanf/printf. This occasionally matters for problems with a lot of input. To speed up input, add the following lines directly below main:
int main() {
cin.tie(nullptr);
cin.sync_with_stdio(false);
The former stops cout from flushing every time cin tries to read (having cout flush is desirable for interactive programs but it introduces unnecessary flushing in our case).
The latter unsyncs iostream from cstdio. Calling this speeds up input even if you aren’t using cstdio but you should not then use C style input in your program.
Endl
There are 2 common ways to print a new line when using cin, either << '\n';
or << endl;
The difference is the latter one flushes output.
Again this is desirable for interactivity but frequent flushing is extremely slow.
For competitions, you should generally use '\n'
not endl
.
Common Traps
Overflows
- Whenever you find yourself needing to multiply ints, check if you should use long longs instead.
- If you are working mod a large prime (usually
1000000007
), check you mod at each step. This is easiest if you define a helper function.
Long data type
- In C++, the
long
data type is only guaranteed to be at least 32 bits. - Never use
long
. Always use eitherint
for a 32 bit integer orlong long
for a 64 bit integer.
Mod negative numbers
- C++
%
is a bit weird,-5 % 3 == -2
, not1
as you might expect. If you need to do arithmetic modulom
, you might want to use helper functions along the lines of
long long madd(long long a, long long b) { return ((a + b) % m + m) % m; }
long long msub(long long a, long long b) { return ((a - b) % m + m) % m; }
Nested loops
Iterating through a 2D array row-by-row
for (int i = 0; i < m; i++)
for (int j = 0; j < n; j++)
total += a[i][j];
is significantly faster than column-by-column
for (int j = 0; j < n; j++)
for (int i = 0; i < m; i++)
total += a[i][j];
as the entries of a row are stored contiguously in memory. This can be the difference between a correct solution and one that times out!
Printing floating-point numbers
The simplest way to print floating-point numbers to a desired number of decimal places is to use C-style
output: from the <cstdio>
header, simply printf("%.9lf\n", x);
.
There are a couple of common pitfalls using C++-style output. If x
is a double, cout << x << '\n'
prints it with six digits of precision by default. This is often
insufficient, and in particular if x
exceeds 1,000,000, then it will be printed in scientific notation (yes, really).
- To insist on fixed-point notation (i.e. not scientific notation), use the manipulator
std::fixed
from the<ios>
header (note: this is included within<iostream>
). - To change the number of decimal places, use the manipulator
std::setprecision()
from the<iomanip>
header.
So to print x
to 9 decimal places using C++-style output, we can write
#include <iostream>
#include <iomanip>
using namespace std;
int main (void) {
/*
...
*/
cout << fixed << setprecision(9) << x << '\n';
}
Binary search on red-black trees
Recall the lower_bound()
function from <algorithm>
.
If a
is a std::vector
whose elements are in increasing order, then
lower_bound(a.begin(), a.end(), x)
finds an iterator to the first element which compares >= x
in logarithmic time.
However, the same function runs in linear time on a std::set
! This is because set iterators do not support random access, which is necessary for the binary search.
Instead, std::set
includes a member function lower_bound()
which navigates the underlying binary search tree in logarithmic time. Therefore, for a set b
, we should instead call b.lower_bound(x)
.
The other functions in <algorithm>
making use of binary search translate similarly.
Vector | Set |
---|---|
binary_search(a.begin(), a.end(), x) |
b.find(x) != b.end() |
lower_bound(a.begin(), a.end(), x) |
b.lower_bound(x) |
upper_bound(a.begin(), a.end(), x) |
b.upper_bound(x) |
equal_range(a.begin(), a.end(), x) |
b.equal_range(x) |
Similar functions are available for std::multiset
, std::map
and std::multimap
.
Unordered data structures have member functions find()
and equal_range()
which run in expected constant but worst case linear time.