01. Machine Learning
Paste an email and the model classifies it as spam or ham (not spam) using a Naive Bayes classifier trained on the Enron corpus. All inference runs in your browser; no server is involved.
Ctrl+Enter to classify
The Naive Bayes classifier was trained on 10,000 emails from the Enron spam dataset. For each email, the model estimates how likely the message is to be spam versus ham by summing per-class log-likelihoods over the words present in the message.
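The scoring step can be sketched in a few lines. This is a minimal Python illustration, not the demo's browser code: the two-word tables, the equal class priors, the tokenizer regex, and the names `class_scores` and `spam_probability` are all assumptions made for the example, and skipping words missing from the tables is just one possible choice.

```python
import math
import re

# Hypothetical toy parameters; the demo's real values come from its training run.
LOG_PRIOR = {"spam": math.log(0.5), "ham": math.log(0.5)}
LOG_LIKELIHOOD = {
    "spam": {"free": math.log(0.30), "meeting": math.log(0.05)},
    "ham": {"free": math.log(0.05), "meeting": math.log(0.30)},
}


def tokenize(text):
    """Lowercase the text and return the set of distinct words in it."""
    return set(re.findall(r"[a-z']+", text.lower()))


def class_scores(text):
    """Log prior plus the sum of log-likelihoods of the words present."""
    words = tokenize(text)
    scores = {}
    for label, table in LOG_LIKELIHOOD.items():
        # Assumption: words missing from the table are skipped for both classes.
        scores[label] = LOG_PRIOR[label] + sum(table[w] for w in words if w in table)
    return scores


def spam_probability(text):
    """Turn the two log scores into a normalized spam probability."""
    scores = class_scores(text)
    top = max(scores.values())  # subtract the max before exp() to avoid underflow
    total = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores["spam"] - top) / total
```

Working in log space turns a product of many small probabilities into a sum, which avoids floating-point underflow on long emails. With these toy numbers, a message whose only known word is "free" gets a spam probability of 0.30 / (0.30 + 0.05), about 0.86.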
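Binary bag-of-words representation: each word is counted once per email regardless of frequency. Laplace (add-one) smoothing assigns a small nonzero probability to words not seen during training, so an unfamiliar word cannot drive a class score to negative infinity. Model weights are pre-computed and loaded as JSON at startup.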
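A rough sketch of how such weights could be pre-computed and serialized follows. It is not the original Spark pipeline: the smoothing form `(df + 1) / (n + 2)` over per-class document counts, the `unseen` fallback entry, the JSON layout, and the four toy emails are all assumptions chosen for illustration.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Binary bag of words: a set keeps each word once per email."""
    return set(re.findall(r"[a-z']+", text.lower()))


def train(emails):
    """emails: iterable of (text, label) pairs, label is 'spam' or 'ham'.

    Assumes every class has at least one training email.
    """
    doc_counts = Counter()
    word_doc_counts = {"spam": Counter(), "ham": Counter()}
    for text, label in emails:
        doc_counts[label] += 1
        # Each email adds at most 1 per word, however often the word repeats.
        word_doc_counts[label].update(tokenize(text))
    vocab = sorted(set(word_doc_counts["spam"]) | set(word_doc_counts["ham"]))
    total = sum(doc_counts.values())
    model = {}
    for label in ("spam", "ham"):
        n = doc_counts[label]
        model[label] = {
            "log_prior": math.log(n / total),
            # Laplace smoothing (assumed form): add 1 to the count, 2 to the total.
            "log_likelihood": {
                w: math.log((word_doc_counts[label][w] + 1) / (n + 2)) for w in vocab
            },
            # Fallback for words that appear in no training email at all.
            "unseen": math.log(1 / (n + 2)),
        }
    return model


# Hypothetical toy data, not taken from the Enron corpus.
TOY_EMAILS = [
    ("Win free money now", "spam"),
    ("Free prize inside", "spam"),
    ("Meeting at noon", "ham"),
    ("free free free lunch meeting", "ham"),
]
MODEL = train(TOY_EMAILS)
PAYLOAD = json.dumps(MODEL)  # the kind of string a page could load at startup
```

In the toy data, "free" appears three times in one ham email but still counts once for that email, so its smoothed ham likelihood is (1 + 1) / (2 + 2) = 0.5, against (2 + 1) / (2 + 2) = 0.75 for spam.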
Training data source: Enron email corpus spam subset. The original classifier used Apache Spark for distributed processing; this demo runs the scoring step entirely in the browser.