C - Binary Tree

C - Binary Tree

Hey,

I am currently trying to program a Binary Tree for the first time, and I have hit a bit of a stumbling block and hope that I can get some insight from you guys about to how to continue.

I have most of the functionality in place at the moment (building the tree, inserting and finding nodes), however the part that I am perplexed by is the deletion of nodes from said tree.

I understand the theory behind node deletion, which is as follows (please correct me if I am wrong):

- If the node has no children, delete it outright.
- If the node only has a left or right child, replace the node to be deleted with it's child and delete the child
- If the node has has two children, find the logical successor to the node to be deleted (my understanding of this is, if possible, move right, and then as far left as you can go, thus finding the next largest node to replace the one being removed).

My question is this:

When I find the node that I want to replace my node that is to be deleted with, should I simply be swapping their values? Or, would it make more sense to re-assign the pointer of the parent node to the logical successor and do a bit of clean up thereafter?

Here is a pastebin of my code thus far (yes, I haven't put in my deleteNode(), as soon as I know the right approach, I will):

http://pastebin.com/BtsbmEZP

If you guys could help I would much appreciate it, and also, if you have any input as to how I could make my current code more efficient, please tell me.

Some typos - 20110517
nullseifer
New User

Posts: 20
Joined: Wed Dec 29, 2010 8:39 am
Blog: View Blog (0)

Re: C - Binary Tree

It's more typical and makes more sense to just swap the pointers then to swap the value. This is especially true when you implement generic data structures where the value type may vary in size per BST.

Other then the missing deletion routine, good job.

You should look into creating a self balancing binary tree next. A self balancing tree is a BST that will shift it's elements around as they are inserted and deleted so that the depth is full before creating arbitrary depths, this enforces that lookups, deletes and insertions remain O(log n) at all times. A typical BST, like yours can become unbalanced when elements are inserted in sequence and the lookup, deletion and insertion performance becomes equivalent to a linked list. Typical self balancing trees are red-black trees and my favorite being the AVL tree because it's more rigidly balanced and funner to implement.
"If art interprets our dreams, the computer executes them in the guise of programs!" - SICP

“If at first, the idea is not absurd, then there is no hope for it” - Albert Einstein

thetan
Contributor

Posts: 653
Joined: Thu Dec 17, 2009 6:58 pm
Location: Various Bay Area Cities, California
Blog: View Blog (0)

Re: C - Binary Tree

I'll get the delete method finished tonight, unfortunately, moving house had to take priority over this. And I can't just code myself a new girlfriend.

Creating a self-balancing tree was actually next on my 'to-do' list, but I'll be honest and say that an RB tree was the only one I had considered.
After looking into AVL trees (a little), they appear to be pretty similar to RB trees, especially in their lookup, insertion and deletion times. (Although Wiki does say that AVL trees have faster lookups, with RB trees winning on insertion and removal).

My next question would be this:

If trees like a RB tree or an AVL tree are self balancing, maintain O(log n) and have (seemingly) no probability that they will end up being the equivilent of a linked list - why use Binary trees at all? Am I being short sighted here?
nullseifer
New User

Posts: 20
Joined: Wed Dec 29, 2010 8:39 am
Blog: View Blog (0)

Re: C - Binary Tree

In general, there are very few real world applications for a vanilla binary search tree. They're typically used as educational steps to get people used to advanced data structures. So they're kind of like the training wheels for computer science.

You will use unbalanced trees a good majority in implementing advanced data abstractions but they won't really be binary search trees. A perfect example is a tree representation of a parsed XML document. Compilers also parse all operations into a nested structure called an Abstract Syntax Tree (AST) that it can easily walk and apply optimization routines too and transform back into machine code. Interpreters will convert text into an Abstract Syntax Tree and and walk it as it executes the code and this is a very natural representation of code as a data structure.

Mostly all databases and more and more filesystems use a data structure called a B-tree or a B+tree, neither of which should be confused with a binary tree. They specialize in minimizing disk seeks and node rebalancing overhead.

Also, most higher level languages use what's known as a hash table for key value stores (perl, python and php for example) in liu of tree type lookups. Notable exceptions beeing C++ (they just don't have a native hash table implementation) and haskell (immutable data types makes trees more favorable)

Heres a good white paper on a ton of data structures http://www-old.oberon.ethz.ch/WirthPubl/AD.pdf
"If art interprets our dreams, the computer executes them in the guise of programs!" - SICP

“If at first, the idea is not absurd, then there is no hope for it” - Albert Einstein

thetan
Contributor

Posts: 653
Joined: Thu Dec 17, 2009 6:58 pm
Location: Various Bay Area Cities, California
Blog: View Blog (0)