The updated 2003 version of the SPARE Parts toolkit is currently available for download (zip). It has been successfully compiled with both GCC (version 3.3.1) and Microsoft Visual C++ .NET 2003, but should work with any recent C++ compiler that comes with a decent STL implementation.
- SPARE Parts 2003 includes prefix- and suffix-based algorithms for single and multiple keyword pattern matching.
- The toolkit is described in the following article:
Bruce W. Watson and Loek Cleophas. SPARE Parts: a C++ toolkit for string pattern recognition. Software: Practice and Experience, 34(7):697-710, John Wiley & Sons, June 2004.
For more information, see Publications -> Journal Articles
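To give a feel for what "suffix-based" means for single keyword matching, here is a minimal sketch of Horspool's algorithm, a well-known member of that family. It is an illustration only: the function name `horspool_find_all` and its interface are hypothetical and are not taken from the SPARE Parts API, which is described in the article above. The sketch assumes 8-bit characters.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT part of SPARE Parts: returns the start positions of
// all (possibly overlapping) occurrences of `keyword` in `text`.
std::vector<std::size_t> horspool_find_all(const std::string& text,
                                           const std::string& keyword) {
    std::vector<std::size_t> hits;
    const std::size_t m = keyword.size();
    const std::size_t n = text.size();
    if (m == 0 || m > n) return hits;  // nothing sensible to report

    // shift[c]: how far the window may move when its last character is c.
    // Characters absent from keyword[0..m-2] allow a full shift of m.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t i = 0; i + 1 < m; ++i)
        shift[static_cast<unsigned char>(keyword[i])] = m - 1 - i;

    std::size_t pos = 0;  // left end of the current window in text
    while (pos + m <= n) {
        // Suffix-based: compare the window against the keyword right to left.
        std::size_t j = m;
        while (j > 0 && text[pos + j - 1] == keyword[j - 1]) --j;
        if (j == 0) hits.push_back(pos);
        // Shift according to the text character under the window's last cell.
        pos += shift[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return hits;
}
```

For example, calling `horspool_find_all("abracadabra", "abra")` reports matches at positions 0 and 7. A prefix-based algorithm would instead compare each window left to right; the toolkit's own algorithms are derived from a taxonomy and may differ in detail from this sketch.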
SPARE Time is the newest in our series of toolkits for pattern matching. Like its predecessor, SPARE Parts, it is based on a taxonomy of formally derived algorithms and written in C++. SPARE Time has been successfully compiled using both GCC and Microsoft Visual C++ .NET, and will soon be available for noncommercial use through this website.
- It contains prefix-, suffix-, factor- and factor oracle-based algorithms for both single and multiple keyword pattern matching.
- We are currently benchmarking the toolkit on both English text and DNA sequences. The results will probably lead to some code tuning.
- SPARE Fuel, a domain-specific language (DSL) for SPARE Time, is being developed. It will use the benchmarking results to make it easy to select the best algorithm for a given situation. SPARE Fuel will be generated by feeding an XML specification to Fuel, a generic DSL Parser & Generator written in Ruby.
- Future work will extend SPARE Time's applicability to extended patterns, regular expressions, and approximate and multi-dimensional pattern matching.