Friday, November 15, 2013

Knuth Morris Pratt (KMP) Algorithm

Objective:
Given a (smaller) pattern string P, find all its occurrences in a (larger) text string T.

Learning Resources:
http://en.wikipedia.org/wiki/Knuth-Morris-Pratt_algorithm
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching 

Good Intuition and worked examples:
http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/

Short Explanation:
1) Use the pattern string in order to build the Failure Function array. 
2) Use the failure function to skip certain characters in the text and print each occurrence of the pattern in the text.

Complexity: O(length_of_pattern + length_of_text)

Implementation:
// P - pattern, F - failure function, m - P.size()
void FailureFunction(string &P, vector<int> &F, int m) {
    int i = 1, j = 0;
    F[0] = 0;
    while(i < m) {
        if(P[i] == P[j])
            {F[i] = j+1; i++; j++;}
        else if(j > 0)
            j = F[j-1];
        else
            {F[i] = 0; i++;}
    }
}

// P - pattern, T - text
void KMP(string &P, string &T) {
    int m=P.size(), n=T.size(), i=0, j=0;
    vector<int> F(m+1);
    FailureFunction(P,F,m);
    while(i < n) {
        while(j && T[i] != P[j]) j = F[j-1]; 
        if(T[i] == P[j]) j++;
        if(j == m) printf("%d\n", i-j+1);
        i++;
    }
}

Tuesday, November 12, 2013

CPSIG Week 4: String Algorithms and Range Querying

Learning Material

Basic Track:

String Algorithms:

Regular Expressions:
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=regularExpressions


Knuth Morris Pratt (KMP) Algorithm:
Concept:
http://jakeboxer.com/blog/2009/12/13/the-knuth-morris-pratt-algorithm-in-my-own-words/
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=stringSearching

Implementation: 
http://machlearner.blogspot.in/2013/11/knuth-morris-pratt-kmp-algorithm.html


Range Querying:

Sparse Table (ST) Algorithm:
http://www.topcoder.com/tc?d1=tutorials&d2=lowestCommonAncestor&module=Static

Segment Tree:
http://www.topcoder.com/tc?d1=tutorials&d2=lowestCommonAncestor&module=Static
http://letuskode.blogspot.in/2013/01/segtrees.html

Binary Indexed Tree:
http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=binaryIndexedTrees

Advanced Track:

String Algorithms:

Suffix Array: 

O(n*(logn)^2) - Easier to implement in a contest scenario.
Explanation of the Algorithm:
http://discuss.codechef.com/questions/21385/a-tutorial-on-suffix-arrays

Good Examples:
http://www.stanford.edu/class/cs97si/suffix-array.pdf

O(n*logn) - Manber and Myers Algorithm
http://apps.topcoder.com/forums/?module=RevisionHistory&messageID=1171511

O(n) - DC3 Algorithm
http://algo2.iti.kit.edu/documents/jacm05-revised.pdf


A resource where you can find a compilation of resources and problems:
http://qr.ae/GTA78

Practice Problems

Basic Track:

KMP: NHAY, PERIOD, EPALIN, FINDSR, VPALIN
Segment Tree: BRCKTS, LITE, GSS1, GSS3, HORRIBLE, ORDERSET
BIT: INVCNT, MATSUM, YODANESS

Advanced Track:
SARRAY