Here are some sources of data that can be used with machine learning or data mining algorithms.
If you want to make statistically valid comparisons between algorithms, you need to be aware of the potential pitfalls. For a good introduction to these, see the paper "On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach" ( http://www.cs.jhu.edu/~salzberg/critique.ps ) by Steven Salzberg ( http://www.cs.jhu.edu/~salzberg/ ).
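As a rough illustration of the kind of care involved, here is a minimal sketch of one common way to compare two classifiers on the same test set: an exact sign (binomial) test on the examples where they disagree, with a Bonferroni-adjusted threshold when several comparisons are run. This is an assumption about what a careful comparison might look like, not a transcription of the paper's procedure; the function names and the counts in the usage note are hypothetical.

```python
from math import comb


def sign_test_p_value(a_only_correct: int, b_only_correct: int) -> float:
    """Two-sided exact binomial (sign) test.

    a_only_correct: test examples classifier A got right and B got wrong.
    b_only_correct: test examples classifier B got right and A got wrong.
    Examples where both agree carry no information and are ignored.
    """
    n = a_only_correct + b_only_correct
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(a_only_correct, b_only_correct)
    # Probability of a split at least this lopsided if each disagreement
    # were equally likely to favour either classifier (p = 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def bonferroni_threshold(alpha: float, num_comparisons: int) -> float:
    """Per-comparison significance level when running several tests."""
    return alpha / num_comparisons
```

For example, with hypothetical counts, `sign_test_p_value(10, 0)` would be compared against `bonferroni_threshold(0.05, 10)` if ten such comparisons were being made across different data sets.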