Data Structures and Algorithms Tutorial Problems: Part 3

1. ### B+-tree Design

You are constructing a database for the Tax Office of a medium-sized Pacific nation. The primary key for records is a 9-digit tax file number. (This Tax Office is so pre-occupied with a new tax scheme that their Prime Minister is touting as the answer to all their economic woes, that it hasn't learnt about binary representations for numbers yet and still uses one byte per decimal digit. They get really uptight when anyone mentions the year 2000 problem!)

The records for this database will be stored on discs which have an average access time of 2.5ms to read a block of 4Kbytes. Each disc uses a 32-bit integer to address blocks within the disc. The database spreads over multiple discs, so a 16-bit disc identifier has to be added to each block address. It takes 150ns to read a word from the computer's main memory.

The population is 17x106 taxpayers and 0.78 x 103 politicians (1.8x103 millionaires have never needed to submit a tax return.). There are also records for 1.2 x 106 foreign residents who pay some tax and 2.8x106 companies, trusts and other tax avoidance structures.

1. How much space will you use to store the indices for this database? Work out the minimum and maximum values for the space - in either disc blocks or bytes. (Don't worry about the poor taxpayers' records, which are humungous as they include details such as the cost of the beer consumed by every travelling salesman in every bar in the country - in order to calculate FBT.)
2. How long will it take to find a taxpayer's record?
3. How much disc space has been wasted in the indices?
4. Many compilers align fields on to word boundaries for efficiency. For example, a 9-byte field is padded out to 12-bytes (the next multiple of the 4-byte word length) so that the next field lies on a word boundary. Would this be a good thing to do in this case? Would it be worth going to some effort to prevent the compiler from padding out short (<4byte) fields?

2. ### Stable Sorting

A stable sorting algorithm is one which leaves equal keys in the same order as they appeared in the original collection. Work out which of the algorithms that you have studied so far will produce stable sorts.

3. ### A Numerical Puzzle

You are given an array containing n integers. You are asked to determine if the list contains two numbers that sum to some arbitrary integer, k. For example, if the list is 8, 4, 1 and 6 and k = 10, the answer is "yes", because 4+6 = 10.
1. Find a way to solve this problem that is O(n2).
2. Can you do better? What is the time complexity of the better method?
Hint: Consider sorting the array first.

4. ### Dynamically Balanced Trees

Show that any AVL tree can be coloured as a red-black tree.

5. ### Dynamic Memory Allocation

Modern object-oriented software makes extensive use of the malloc and free operations. Unfortunately, they are generally quite expensive (in time) and thus efficient routines are important.

free places freed blocks into free lists. In order to re-use memory as much as possible, malloc will generally search the free lists for a block before requesting another one from the operating system (a very expensive exercise!). A best-fit algorithm searches for the smallest free block into which the requested block will fit.

Suggest a number of ways in which free could organise the free lists and give the time complexities for

1. free to add a block to the free lists
2. malloc to find a "best-fit" block from these lists.
Describe any drawbacks that any method might have.

6. ### Equivalence Classes

You are required to test some sorting routines:
1. an insertion sort,
2. a heap sort and
3. a quick sort.

1. Propose a simple specification for a general purpose sorting function.
2. Using the specification, derive a set of tests that you would perform to verify the function.
3. Would your knowledge of the algorithms used by each sorting method cause you to extend the test set?
For the purpose of drawing up examples, use records whose keys are integers.

### Key terms

stable sort
A In a stable sort, the order of equal keys in the original is retained in the output.