Skip to main content

Implementation of TRIE data structure in Python


What is Trie Data Structure?

Trie is a kind of tree data structure used to store that data in an ordered and efficient way. We use Trie to store string. A single node in the Trie data structure has a character and 26 references to keep 26 different alphabets. Using the Trie data structure we can optimize the search complexity to O(l), the length of the string.


Node Class

Represents a single node of the Trie data structure 

It has 3 instance variables-

1) char - to store the character of this node.
2) is_terminal - A boolean that represents whether it is the end of any word.
3) children - A dictionary that can store this node's 26 references


Trie Class

It has a single instance variable 
1) root - root node of Trie, it does not store any character 

It has 2 main methods 
1) insert() - to insert a word in Trie.
2) search() - to search a word i Trie.

Algorithm

1) Insert 

We are using recursion to insert words. 
a) If the word has a length of 0 -
Check if the word that we are inserting have character length 0, means all character of the word got inserted, as the word ends here we have to mark our current node as a terminal node, is_terminal = True.
 
b)If the word has a length >  0
If the length of characters of the word is greater than 0, get the character at 0 index(new_char) of the word, check in the children dictionary if there is any value with this key (new_char), 

b1) If present, no need to insert, set the new current node(node) having the value of new_char key in children dictionary 

b2) If not present, create a new node (new_node), and in the children dictionary, set new_char as key and new_node as value. Set the new current node(node) having the value of new_node.

c) Call the method on the rest characters of the word 


2)Search

We are using recursion to search words. 
a) If the length of characters of the word we are searching is 0

a1) If the current node(node) is terminal node(is_terminal==True), we found the word, return True.    
   
a2) If the current node(node) is not terminal, the word is not present, return False.

b) If the word has the length > 0 

Get the character at index 0(current_char), 
b1)If current nodes(node) children dictionary has any value whose key is current_char, means this character found in the path, we have to call our method on next the character of the word.
b2) If current nodes(node) children dictionary has not any value whose key is current_char, means this character is not present in Trie and hence the word is also not present, return False.

class Node:
    """ A node in the trie structure """

    def __init__(self, char) -> None:
        # the character stored in the node 
        self.char = char

        # terminal_node, weather this can be end of a word
        self.is_terminal = False

        #a dictionary of child nodes
        #keys are character and values are Node
        self.children = {}

class Trie:
    """The Trie object"""

    def __init__(self) -> None:
        """
        The Trie has atleast the root node 
        The root node does not store any character
        """
        self.root = Node("")
    
    def search(self, word):
        node = self.root
        if self._search_util(node, word):
            print("found the word", word)
        else:
            print(word, "word not found")

    def _search_util(self, node, word):
        if len(word) == 0:
            if node.is_terminal:
                return True
            else:
                return False

        current_char = word[0]

        #if found current char 
        if node.children.get(current_char):
            node = node.children.get(current_char)
            return self._search_util(node, word[1:])
        else:
            #if not found current char 
            return False

            
    def insert(self, word):
        """Insert a word in Trie"""
        node = self.root
        self._insert_util(node, word)

    def _insert_util(self, node, word):

        #Check if len of word is 0, means all character of word got inserted
        if len(word) == 0:
            #as word ends here declare this node as terminal_node 
            node.is_terminal = True 
            return

        #get the characte to be inserted at first position
        new_char = word[0]

        if node.children.get(new_char):
            # if node already present in path
            node = node.get(new_char)
        else:
            # if node not present in path 
            new_node = Node(new_char)
            node.children[new_char] = new_node
            node = new_node
        self._insert_util(node, word[1:])

trie_object = Trie()
trie_object.insert("abhishek")
trie_object.search("abhishek")
trie_object.search("abhishe")

Problems on Trie data structure

1) Implement Trie-
https://www.codingninjas.com/codestudio/problems/implement-trie_631356

2) Longest common prefix-
https://www.codingninjas.com/codestudio/problems/longest-common-prefix_2090383

Comments

Popular posts from this blog

Implementation of LRU Cache in Python

     How to implement LRU cache in python? What is LRU cache? Least-recently-used (LRU) cache replacement is algorithms that removes least recent used data to make room for newdata. Terminology Used in post DLL -> Doubly Linked List Front of DLL -> node just after head node of DLL Last node of DLL-> Node just before tail node of DLL LRU CLASS We have implemented LRU using a Doubly Linked list and Dictionary data structure. There are two nodes at the beginning and end of DLL, (HEAD and TAIL) that both have Key = -1 and Val = -1.  We have used dictionary to store key as key of item and value as node object DLL of the LRU class. We are saving item in dictionary having key as key of LRU cache(Node) item and value as Node object of LRU cache item, so at any instance if we have Key of item we can get the item by getting value of key which is Node object. Instance variable "size" is defined in order to define max size of LRU cache. ALGORITHM put(key, valu...