πŸ”₯ FibyTree vs. 🐍 Set and SortedSet

Maxim Zaks
Published in AI Mind
9 min read · Aug 24, 2023

Couple of days ago, I wrote an article about FibyTree, a data structure I designed and implemented in Mojo. Today I would like to reveal some benchmarks.

A few words of caution: these benchmarks are preliminary. At the current point in time Mojo code runs only in a Jupyter Notebook, so I ran both the Mojo and the Python code in a Jupyter Notebook. This is also the first performance evaluation I have done on FibyTree. I designed the data structure with performance in mind, but I have not yet had the time to think about and try out optimisations. BTW, if you have ideas how I can improve FibyTree, please let me know!

For the benchmark I chose seven set operations:

  • Insert an element to the set
  • Check if an element is in the set
  • Delete an element from the set
  • Build a union of two sets
  • Build an intersection of two sets
  • Build a difference of two sets
  • Build a symmetric difference of two sets

I chose ten different set sizes (10, 100, 300, 500, 1K, 3K, 9K, 15K, 30K and 50K) in order to understand how size correlates with performance; this is especially important for a tree-based data structure.
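The benchmark harness itself is not shown in the article; a hypothetical Python driver for sweeping these sizes could look like this (SIZES mirrors the sizes listed above, and perf_fn stands for any of the perf_test_* functions shown below):

```python
# Hypothetical harness sketch; SIZES mirrors the sizes listed in the article,
# perf_fn stands for any of the perf_test_* functions.
SIZES = [10, 100, 300, 500, 1_000, 3_000, 9_000, 15_000, 30_000, 50_000]

def run(perf_fn):
    # Maps each set size to the average nanoseconds per operation.
    return {size: perf_fn(size) for size in SIZES}
```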

The time unit throughout the benchmarks is nanoseconds.

Insert an element to the set

Here is the Python script I use to populate the Set/SortedSet:

def perf_test_random_add_p(size, min=-30000, max=30000, sorted=True):
    total = 0

    s = SortedSet() if sorted else set()
    for i in range(size):
        v = random.randint(min, max)
        tik = time.time_ns()
        s.add(v)
        tok = time.time_ns()
        total += (tok - tik)

    return total / size

def perf_test_ordered_add_p(size, sorted=True):
    total = 0
    s = SortedSet() if sorted else set()
    for i in range(size):
        tik = time.time_ns()
        s.add(i)
        tok = time.time_ns()
        total += (tok - tik)

    return total / size

As you can see, I am measuring only the time it takes to add an element to a set. I am also considering two scenarios:

  • Adding random values to a set
  • Adding elements in ascending order

I made this distinction mainly because adding sorted elements to a binary search tree is the worst-case scenario, but I was also surprised to see that the Python Set implementation has quite a preference as well.
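To see why sorted input is the worst case for a plain binary search tree, consider this small Python sketch (a naive, non-balancing BST, not the FibyTree implementation): ascending inserts produce a degenerate, list-like tree of depth n, while random inserts stay close to logarithmic depth on average.

```python
import random

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

def insert(root, key):
    # Plain BST insert with no rebalancing; duplicates are ignored.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def depth(node):
    if node is None:
        return 0
    return 1 + max(depth(node.left), depth(node.right))

n = 500
sorted_root = None
for i in range(n):
    sorted_root = insert(sorted_root, i)  # ascending: every node gets only a right child

random_root = None
for _ in range(n):
    random_root = insert(random_root, random.randint(-30000, 30000))

print(depth(sorted_root))  # exactly n: the tree degenerated into a linked list
print(depth(random_root))  # much smaller, roughly logarithmic on average
```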

Here is the corresponding Mojo code:

fn perf_test_random_add(size: Int, min: Int = -30000, max: Int = 30000) -> Float64:
    var total = 0

    var tik = now()
    var tok = now()
    var f = fiby()

    for _ in range(size):
        let i = random_si64(min, max).to_int()
        tik = now()
        f.add(i)
        tok = now()
        total += (tok - tik)

    return total / size

fn perf_test_ordered_add(size: Int) -> Float64:
    var total = 0
    var tik = now()
    var f = fiby()
    var tok = now()
    total += tok - tik
    for i in range(size):
        tik = now()
        f.add(i)
        tok = now()
        total += (tok - tik)

    tik = now()
    f.balance()
    tok = now()
    total += (tok - tik)

    return total / size

As I mentioned before, it was very surprising to me that adding elements in ascending order to a Python Set is so much faster than adding random values. Generally, adding random elements to a FibyTree shows very nice performance characteristics. Thanks to the randomness, balancing does not have to kick in that often, and the average time per insert grows only by about 1.5x between 100 elements and 50K elements.

As expected, the cost of adding elements in ascending order to a FibyTree grows super-linearly, simply because we need to balance the tree so often.

Adding random elements to a FibyTree of up to 10K elements seems to be about 10x more efficient than adding to a Python Set or SortedSet.

Check if an element is in the set

Here is the Python code, where we measure the time it takes to call the __contains__ method:

def perf_test_contains_p(size, sorted=True):
    total = 0

    s = SortedSet() if sorted else set()
    for i in range(size):
        s.add(random.randint(-size, size))

    for i in range(size):
        tik = time.time_ns()
        r = s.__contains__(i)
        tok = time.time_ns()
        total += (tok - tik)

    return total / size

And its Mojo counterpart:

fn perf_test_contains(size: Int, inout found: Int) -> Float64:
    var f = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f.add(i)

    var total = 0

    var tik = now()
    var tok = now()

    var res = DynamicVector[Bool](size)
    for i in range(size):
        tik = now()
        let r = f.__contains__(i)
        tok = now()
        res.push_back(r)
        total += (tok - tik)

    var count = 0
    for i in range(len(res)):
        if res[i]:
            count += 1
    found = count

    return total / size

There is also a perf_test_contains_on_balanced function, where we call f.balance() after all elements have been added.

As we can see, search in a FibyTree gets steadily slower as the set grows. That said, finding elements in a balanced FibyTree is ~5x faster than in a SortedSet and about 3x faster than in a Set.
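For intuition: a lookup in a balanced search tree costs O(log n) comparisons, the same bound binary search gives over any sorted sequence. A small Python illustration using the standard bisect module (not the FibyTree code):

```python
import bisect

def contains_sorted(arr, x):
    # Binary search over a sorted list: O(log n) comparisons,
    # the same asymptotic cost as a lookup in a balanced search tree.
    i = bisect.bisect_left(arr, x)
    return i < len(arr) and arr[i] == x

print(contains_sorted([1, 4, 9, 16], 9))   # True
print(contains_sorted([1, 4, 9, 16], 10))  # False
```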

Delete an element from the set

The operation is quite similar to contains, as we first need to find the element and then delete it.

Below you can find the corresponding Python and Mojo code:

def perf_test_delete_p(size, sorted=True):
    total = 0

    s = SortedSet() if sorted else set()
    for i in range(size):
        s.add(random.randint(-size, size))

    for i in range(size):
        tik = time.time_ns()
        r = s.discard(i)
        tok = time.time_ns()
        total += (tok - tik)

    return total / size

fn perf_test_delete(size: Int, inout found: Int) -> Float64:
    var f = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f.add(i)

    var total = 0

    var tik = now()
    var tok = now()

    var res = DynamicVector[Bool](size)
    for i in range(size):
        tik = now()
        let r = f.delete(i)
        tok = now()
        res.push_back(r)
        total += (tok - tik)

    var count = 0
    for i in range(len(res)):
        if res[i]:
            count += 1
    found = count

    return total / size

A balanced FibyTree is about 4x faster than Set and a whopping 20–30x faster than SortedSet. It is also interesting to notice that the delete operation in FibyTree is on average faster than contains, which kind of makes sense, as the tree gets smaller through the process. This is, however, not the case for Python Set.

Build a union of two sets

For this operation we create two random sets and apply the union operation. The measured execution time is divided by the set size.

Python code:

def perf_test_union_p(size, sorted=True):
    total = 0

    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))

    tik = time.time_ns()
    s3 = s1.union(s2)
    tok = time.time_ns()

    return (tok - tik) / size

Mojo code:

fn perf_test_union(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)

    let tik = now()
    f1.union_inplace(f2)
    let tok = now()

    return (tok - tik) / Float64(size)

In Mojo we use the union_inplace method, which does not create a new instance but mutates the tree in place, because of a bug I stumbled upon and reported in the Mojo compiler. From a performance point of view this should not make a big difference, though.

The growth of execution time for FibyTree seems to be roughly linear in the set size, whereas Set and SortedSet degrade somewhat. The results also show that Set is 1–7x slower than FibyTree and SortedSet 6–12x slower, with a spike of 26x for the 10-element set.
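That roughly linear behaviour is what one would expect from merging two sorted sequences. The following Python sketch (an illustration of the principle, not the FibyTree implementation) computes the union of two sorted, duplicate-free sequences in a single O(n + m) pass:

```python
def sorted_union(a, b):
    # Linear merge of two sorted, duplicate-free sequences:
    # advance whichever side has the smaller head, emit shared keys once.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif a[i] > b[j]:
            out.append(b[j])
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

print(sorted_union([1, 3, 5], [2, 3, 6]))  # [1, 2, 3, 5, 6]
```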

Build an intersection of two sets

Similar procedure as for union, Python and Mojo code below:

def perf_test_intersection_p(size, sorted=True):
    total = 0

    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))

    tik = time.time_ns()
    s3 = s1.intersection(s2)
    tok = time.time_ns()

    return (tok - tik) / size

fn perf_test_intersection(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)

    let tik = now()
    f1.intersection_inplace(f2)
    let tok = now()

    return (tok - tik) / Float64(size)

It seems like Python Set has a very efficient way to compute intersections, at least for smaller sets; the larger sets are still ~2x slower than FibyTree. SortedSet is 3–5x slower, with a spike of 17x for the 10-element set. I am not sure what is happening there, but the spike was consistent throughout multiple runs.
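Python's built-in set most likely benefits here from probing one set's elements against the other's hash table, while a sorted structure relies on a linear two-pointer walk. A minimal Python sketch of the latter (again an illustration, not the FibyTree code):

```python
def sorted_intersection(a, b):
    # Two-pointer intersection of two sorted, duplicate-free sequences,
    # in O(len(a) + len(b)) time: skip the smaller head, keep matches.
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            out.append(a[i])
            i += 1
            j += 1
    return out

print(sorted_intersection([1, 3, 5], [2, 3, 6]))  # [3]
```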

Build a difference of two sets

Similar procedure, Python and Mojo code below:

def perf_test_difference_p(size, sorted=True):
    total = 0

    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))

    tik = time.time_ns()
    s3 = s1.difference(s2)
    tok = time.time_ns()

    return (tok - tik) / size

fn perf_test_difference(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)

    let tik = now()
    f1.difference_inplace(f2)
    let tok = now()

    return (tok - tik) / Float64(size)

We observe Set being 1–5x slower and SortedSet 5–9x slower, with a 19x spike for the 10-element set.

Build a symmetric difference of two sets

Similar procedure, Python and Mojo code below:

def perf_test_symmetric_difference_p(size, sorted=True):
    total = 0

    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))

    tik = time.time_ns()
    s3 = s1.symmetric_difference(s2)
    tok = time.time_ns()

    return (tok - tik) / size

fn perf_test_symmetric_difference(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)

    let tik = now()
    f1.symmetric_difference_inplace(f2)
    let tok = now()

    return (tok - tik) / Float64(size)

We observe Set being 1–6x slower and SortedSet 7–11x slower, with a 20x spike for the 10-element set.

All in all, I think FibyTree is a viable data structure worth investigating further. As I mentioned at the beginning of the article, these benchmarks can be seen as preliminary, but they already establish some baselines for what we can expect from the data structure. As next steps, I will consider whether there are any vectorisation and parallelisation techniques I could utilise to make the operations run faster, and maybe reduce some memory management overhead by utilising stack allocations and direct heap allocations (dropping DynamicVector in favour of direct memory allocation and manual capacity management).

Benchmark results summarised

Insert an element to the set:

  • Set: 4–13x
  • SortedSet: 6–13x

Check if an element is in the set:

  • Set: 3–4x, 9x spike for 10 elements set
  • SortedSet: 4–6x, 10x spike for 10 elements set

Delete an element from the set:

  • Set: 4x
  • SortedSet: 14–36x

Build a union of two sets:

  • Set: 1–7x
  • SortedSet: 6–12x, 26x spike for 10 elements set

Build an intersection of two sets:

  • Set: 1–2x
  • SortedSet: 3–5x, 17x spike for 10 elements set

Build a difference of two sets:

  • Set: 1–5x
  • SortedSet: 5–9x, 19x spike for 10 elements set

Build a symmetric difference of two sets:

  • Set: 1–6x
  • SortedSet: 7–11x, 20x spike for 10 elements set
