Range Minimum Queries Project
The project is to implement the following RMQ algorithms:
- compute on the fly
- precompute all queries
- sparse array
- block based approach (with varying block size)
- Cartesian trees (optional) (with varying block size)
Optimizing the code is encouraged but optional.
Then, write a report comparing the methods via 3 plots:
- how does query time scale with \(n\)?
- how does space usage scale with \(n\)?
- what is the space-time trade-off for \(n=10^8\)?
Input data Link to heading
Input data is provided in the following file format:
| |
For example:
| |
The provided test cases contain both random data and queries, as well as edge cases that your code should work on (e.g. increasing/decreasing data, only very long query ranges, only very short query ranges, …).
Note that for the final evaluation, we will build, test, and run your code against ‘private’ test data to verify its correctness and to get speed measurements for all submissions on the same hardware.
We provide input files for \(n\) up to \(10^8\), each with \(10^6\) queries, but you can also test with your own randomly generated queries.
Benchmarks Link to heading
Your code should read the input, build the data structure, and only then time how long it takes to answer all queries. Then it should write the results (the average time per query for each \(n\)) to a csv file.
Deliverables Link to heading
A abcd12.zip file named after your student ID containing, in the root:
report.pdf: a report (in English) containing:- Introduction: Briefly explains the files you submitted;
- Methods: Briefly explains the algorithms you implemented.
State the space and query time complexity of each method.
- In case you did more than the minimum: explain any optimizations you did;
- Results: This is the focus of the report.
- Contains the (at least) three plots.
- Analyse the results: do the results match your expectations? If so, how closely? If not, why do they not match?
- Which algorithms are Pareto-optimal? Which algorithms are best for small space? For speed? And what would you choose as general-purpose balanced tradeoff? Motivate your choice.
- What parameters/block sizes are best?
- How fast is the fastest method in absolute ns/query? Can you explain this number? Do you think this could be optimized further? What (hardware?) limit is the bottleneck?
- A
makefile(orjustfile):make buildshould build the code.make testshould test the code against all input files.make runshould run all the benchmarks and write todata/{time,space,tradeoff}.csv.make plotshould readdata.csvand writeplots/{time,space,tradeoff}.png.makedoes the 4 above, in order.
Please use this provided skeleton as a starting point: TODO ZIP.
AI policy Link to heading
Using AI is allowed (but discouraged) for the code. Using AI is not allowed for the writing of the report.