There are two different algorithms for solving the substring finding problem. The first one is to build a suffix tree and the second is to use dynamic programming.
Using a suffix tree has the benefit of having the complexity O(n + m), but is hard to parallelize. Has someone tried to parallelize this algorithm? How good does it scale? I haven't tried it, because I assume that it wont scale well due to data congestion.
Using dynamic programming has the benefit of good scalability, but has a complexity of O(n * m). It will loose against O(n + m) if n and m are both big. We focused on improving the dynamic programming algorithm.
I am curious what you are using. Are you confident enough in your solution that you are willing to share some of your knowledge?