FibyTree vs. Set and SortedSet
A couple of days ago, I wrote an article about FibyTree, a data structure I designed and implemented in Mojo. Today I would like to share some benchmarks.
A word of caution: these benchmarks are preliminary. At the time of writing, Mojo code runs only in a Jupyter notebook, so I ran both the Mojo and the Python code there. This is also the first performance evaluation I have done on FibyTree. I designed the data structure with performance in mind, but I have not yet had the time to think about and try out optimisations. By the way, if you have ideas for how I can improve FibyTree, please let me know!
For the benchmark I chose seven set operations:
- Insert an element to the set
- Check if an element is in the set
- Delete an element from the set
- Build a union of two sets
- Build an intersection of two sets
- Build a difference of two sets
- Build a symmetric difference of two sets
I chose ten different sizes for the sets (10, 100, 300, 500, 1K, 3K, 9K, 15K, 30K and 50K) in order to understand how size correlates with performance; this is especially important for a tree-based data structure.
The time unit throughout the benchmarks is nanoseconds.
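All of the benchmarks below share the same measurement pattern: wrap each individual call in time.time_ns() and divide the accumulated total by the set size. As an illustrative sketch (not code from the benchmark suite; ns_per_op is a hypothetical helper name), the pattern can be factored out like this:

```python
import time

def ns_per_op(ops):
    """Time each zero-argument callable in `ops` and return the
    average wall-clock nanoseconds per operation."""
    total = 0
    for op in ops:
        tik = time.time_ns()
        op()
        tok = time.time_ns()
        total += tok - tik
    return total / len(ops)

# Example: average cost of adding 1000 elements to a built-in set.
s = set()
avg_ns = ns_per_op([lambda v=v: s.add(v) for v in range(1000)])
```

Measuring each call individually, rather than timing the whole loop, keeps the random-number generation out of the measured time.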
Insert an element to the set
Here is the Python script I use to populate the Set/SortedSet:
import random
import time

from sortedcontainers import SortedSet  # SortedSet comes from the sortedcontainers package

def perf_test_random_add_p(size, min=-30000, max=30000, sorted=True):
    total = 0
    s = SortedSet() if sorted else set()
    for i in range(size):
        v = random.randint(min, max)
        tik = time.time_ns()
        s.add(v)
        tok = time.time_ns()
        total += (tok - tik)
    return total / size
def perf_test_ordered_add_p(size, sorted=True):
    total = 0
    s = SortedSet() if sorted else set()
    for i in range(size):
        tik = time.time_ns()
        s.add(i)
        tok = time.time_ns()
        total += (tok - tik)
    return total / size
As you can see, I am measuring only the time it takes to add an element to the set. I am also considering two possibilities:
- Adding random values to a set
- Adding elements in ascending order
I made this distinction mainly because inserting sorted elements into a binary search tree is the worst-case scenario, but I was also surprised to see that the Python Set implementation has quite a strong preference of its own.
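To see why ascending insertion is the worst case for a plain binary search tree, consider this minimal sketch of a BST with no rebalancing: with sorted input every new key goes to the right, so the tree degenerates into a linked list of depth n, while random input stays close to the optimal logarithmic depth. (This is an illustration of the general BST behaviour, not FibyTree's actual implementation.)

```python
import random

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    # Plain BST insert with no rebalancing (iterative to avoid recursion limits).
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def depth(root):
    # Iterative depth computation, safe for the degenerate (list-shaped) tree.
    d, stack = 0, [(root, 1)]
    while stack:
        node, level = stack.pop()
        if node is None:
            continue
        d = max(d, level)
        stack.append((node.left, level + 1))
        stack.append((node.right, level + 1))
    return d

n = 1000
sorted_root = None
for k in range(n):                    # ascending order: worst case
    sorted_root = insert(sorted_root, k)

random_root = None
for k in random.sample(range(n), n):  # same keys, random order
    random_root = insert(random_root, k)

print(depth(sorted_root))  # n: the tree is effectively a linked list
print(depth(random_root))  # roughly logarithmic in n on average
```

This is exactly why FibyTree has to balance so aggressively under ordered insertion, as the numbers below show.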
Here is the corresponding Mojo code:
from random import random_si64
from time import now

fn perf_test_random_add(size: Int, min: Int = -30000, max: Int = 30000) -> Float64:
    var total = 0
    var tik = now()
    var tok = now()
    var f = fiby()
    for _ in range(size):
        let i = random_si64(min, max).to_int()
        tik = now()
        f.add(i)
        tok = now()
        total += (tok - tik)
    return total / size
fn perf_test_ordered_add(size: Int) -> Float64:
    var total = 0
    # Include the construction time of the tree in the measurement.
    var tik = now()
    var f = fiby()
    var tok = now()
    total += tok - tik
    for i in range(size):
        tik = now()
        f.add(i)
        tok = now()
        total += (tok - tik)
    # Include the cost of the final balancing pass.
    tik = now()
    f.balance()
    tok = now()
    total += (tok - tik)
    return total / size
As I mentioned before, it was very surprising to me to see that adding elements in ascending order to a Python Set is so much faster than adding random values. In general, adding random elements to a FibyTree shows very nice performance characteristics: thanks to the randomness, the balancing does not have to kick in that often, and the average time per insert grows only about 1.5x between 100 elements and 50K elements.
As expected, adding elements in ascending order to a FibyTree shows exponential growth, simply because we need to balance the tree so often.
Adding random elements to a FibyTree up to a size of 10K seems to be about 10x more efficient than adding to a Python Set or SortedSet.
Check if an element is in the set
Here is the Python code, where we measure the time it takes to call the __contains__ method:
def perf_test_contains_p(size, sorted=True):
    total = 0
    s = SortedSet() if sorted else set()
    for i in range(size):
        s.add(random.randint(-size, size))
    for i in range(size):
        tik = time.time_ns()
        r = s.__contains__(i)
        tok = time.time_ns()
        total += (tok - tik)
    return total / size
And its Mojo counterpart:
fn perf_test_contains(size: Int, inout found: Int) -> Float64:
    var f = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f.add(i)
    var total = 0
    var tik = now()
    var tok = now()
    var res = DynamicVector[Bool](size)
    for i in range(size):
        tik = now()
        let r = f.__contains__(i)
        tok = now()
        res.push_back(r)
        total += (tok - tik)
    var count = 0
    for i in range(len(res)):
        if res[i]:
            count += 1
    found = count
    return total / size
There is also a perf_test_contains_on_balanced function, where we call f.balance() after all elements are added.
As we can see, the search in a FibyTree gets steadily slower as the set grows. That said, finding elements in a balanced FibyTree is ~5x faster than in a SortedSet and about 3x faster than in a Set.
Delete an element from the set
The operation is quite similar to contains, as we first need to find the element and then delete it.
Below you can find the corresponding Python and Mojo code:
def perf_test_delete_p(size, sorted=True):
    total = 0
    s = SortedSet() if sorted else set()
    for i in range(size):
        s.add(random.randint(-size, size))
    for i in range(size):
        tik = time.time_ns()
        r = s.discard(i)
        tok = time.time_ns()
        total += (tok - tik)
    return total / size
fn perf_test_delete(size: Int, inout found: Int) -> Float64:
    var f = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f.add(i)
    var total = 0
    var tik = now()
    var tok = now()
    var res = DynamicVector[Bool](size)
    for i in range(size):
        tik = now()
        let r = f.delete(i)
        tok = now()
        res.push_back(r)
        total += (tok - tik)
    var count = 0
    for i in range(len(res)):
        if res[i]:
            count += 1
    found = count
    return total / size
A balanced FibyTree is about 4x faster than Set and a whopping 20–30x faster than SortedSet. It is also interesting to notice that the delete operation in a FibyTree is on average faster than contains, which makes sense, as the tree gets smaller over the course of the run. This is, however, not the case for the Python Set.
Build a union of two sets
For this operation we create two random sets and apply the union operation. The measured execution time is divided by the set size.
Python code:
def perf_test_union_p(size, sorted=True):
    total = 0
    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))
    tik = time.time_ns()
    s3 = s1.union(s2)
    tok = time.time_ns()
    return (tok - tik) / size
Mojo code:
fn perf_test_union(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)
    let tik = now()
    f1.union_inplace(f2)
    let tok = now()
    return (tok - tik) / Float64(size)
We use the union_inplace method in Mojo, which does not create a new instance but mutates the tree in place, because of a bug I stumbled upon and reported in the Mojo compiler. From a performance point of view, this should not make a big difference though.
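Python's built-in set exposes the same in-place versus copying distinction, in case you want to mirror the union_inplace behaviour on the Python side; a quick illustration:

```python
s1 = {1, 2, 3}
s2 = {3, 4, 5}

s3 = s1 | s2  # union(): allocates and returns a new set, s1 is unchanged
s1 |= s2      # update(): mutates s1 in place, analogous to union_inplace

print(s3)  # {1, 2, 3, 4, 5}
print(s1)  # {1, 2, 3, 4, 5}
```

The in-place variant avoids allocating a result container, which is exactly why the compiler-bug workaround should not distort the comparison much.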
The growth of the execution time for FibyTree seems to be quite linear in the set size, whereas Set and SortedSet show somewhat degrading performance. It also shows that Set is 1–7x slower than FibyTree and SortedSet 6–12x slower, with a spike of 26x for the 10-element set.
Build an intersection of two sets
Similar procedure as for union, Python and Mojo code below:
def perf_test_intersection_p(size, sorted=True):
    total = 0
    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))
    tik = time.time_ns()
    s3 = s1.intersection(s2)
    tok = time.time_ns()
    return (tok - tik) / size
fn perf_test_intersection(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)
    let tik = now()
    f1.intersection_inplace(f2)
    let tok = now()
    return (tok - tik) / Float64(size)
It seems like the Python Set has a very efficient way to compute intersections, at least for smaller sets; the larger sets are still ~2x slower than FibyTree. SortedSet is 3–5x slower, with a spike of 17x for the 10-element set. I am not sure what is happening there, but the spike was consistent across multiple runs.
Build a difference of two sets
Similar procedure, Python and Mojo code below:
def perf_test_difference_p(size, sorted=True):
    total = 0
    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))
    tik = time.time_ns()
    s3 = s1.difference(s2)
    tok = time.time_ns()
    return (tok - tik) / size
fn perf_test_difference(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)
    let tik = now()
    f1.difference_inplace(f2)
    let tok = now()
    return (tok - tik) / Float64(size)
We observe Set being 1–5x slower and SortedSet 5–9x slower, with a 19x spike for the 10-element set.
Build a symmetric difference of two sets
Similar procedure, Python and Mojo code below:
def perf_test_symmetric_difference_p(size, sorted=True):
    total = 0
    s1 = SortedSet() if sorted else set()
    s2 = SortedSet() if sorted else set()
    for i in range(size):
        s1.add(random.randint(-size, size))
        s2.add(random.randint(-size, size))
    tik = time.time_ns()
    s3 = s1.symmetric_difference(s2)
    tok = time.time_ns()
    return (tok - tik) / size
fn perf_test_symmetric_difference(size: Int) -> Float64:
    var f1 = fiby()
    var f2 = fiby()
    for _ in range(size):
        let i = random_si64(-size, size).to_int()
        f1.add(i)
        f2.add(i)
    let tik = now()
    f1.symmetric_difference_inplace(f2)
    let tok = now()
    return (tok - tik) / Float64(size)
We observe Set being 1–6x slower and SortedSet 7–11x slower, with a 20x spike for the 10-element set.
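As a quick sanity check tying the last three operations together: the symmetric difference is exactly the union minus the intersection (equivalently, the union of the two one-sided differences). In Python:

```python
import random

a = {random.randint(-100, 100) for _ in range(50)}
b = {random.randint(-100, 100) for _ in range(50)}

# A symmetric-difference B == (A union B) - (A intersection B)
assert a.symmetric_difference(b) == a.union(b) - a.intersection(b)
# ... and also == (A - B) union (B - A)
assert a ^ b == (a - b) | (b - a)
```

The same identities hold for FibyTree, which makes them useful invariants for cross-checking the in-place Mojo implementations against each other.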
All in all, I think FibyTree is a viable data structure to investigate further. As I mentioned at the beginning of the article, these benchmarks should be seen as preliminary, but they already establish some baselines for what we can expect from the data structure. As next steps, I will consider whether there are any vectorisation and parallelisation techniques I could use to make the operations run faster, and maybe reduce some memory management overhead by using stack allocations and direct heap allocations (dropping DynamicVector in favour of direct memory allocation and manual capacity management).
Benchmark results summarised
Insert an element to the set:
- Set: 4–13x
- SortedSet: 6–13x
Check if an element is in the set:
- Set: 3–4x, 9x spike for the 10-element set
- SortedSet: 4–6x, 10x spike for the 10-element set
Delete an element from the set:
- Set: 4x
- SortedSet: 14–36x
Build a union of two sets:
- Set: 1–7x
- SortedSet: 6–12x, 26x spike for the 10-element set
Build an intersection of two sets:
- Set: 1–2x
- SortedSet: 3–5x, 17x spike for the 10-element set
Build a difference of two sets:
- Set: 1–5x
- SortedSet: 5–9x, 19x spike for the 10-element set
Build a symmetric difference of two sets:
- Set: 1–6x
- SortedSet: 7–11x, 20x spike for the 10-element set