Custom classifiers are text classifiers that you can fine-tune to classify whatever you want. They show up the same as all other classifiers in your feeds.
You create them by clicking New Classifier in the Classifiers section.
Methods for creating a custom classifier
You can choose from three methods for creating a custom classifier: Create an ensemble, Tag similar sentences, or Train a model.
If you don't have the time to train a classifier, start with the Create an ensemble method. Ensembles in Caravel are the easiest way to create a custom classifier because they require zero training.
In machine learning, an ensemble is a combination of models that act as a single model to get better predictions than any one model can provide. In Caravel, ensembles operate similarly but allow you to use more than just models.
Using ensembles in Caravel, you can combine zero-shot classification, keyword search, and prebuilt models to classify your text.
They are flexible and quick to set up.
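To make the idea concrete, here is a minimal sketch of how an ensemble could combine several signals into one score. It is not Caravel's implementation: the `Scorer` interface, `keyword_scorer`, `ensemble_score`, and the placeholder `fake_zero_shot` function are all hypothetical names, and taking the maximum member score is an assumed combining rule.

```python
from typing import Callable, Sequence

# A scorer maps a sentence to a score on a 0-100 scale (hypothetical interface).
Scorer = Callable[[str], float]


def keyword_scorer(keywords: Sequence[str]) -> Scorer:
    """Return a scorer that gives 100 if any keyword appears in the sentence, else 0."""
    lowered = [k.lower() for k in keywords]

    def score(sentence: str) -> float:
        text = sentence.lower()
        return 100.0 if any(k in text for k in lowered) else 0.0

    return score


def ensemble_score(sentence: str, scorers: Sequence[Scorer]) -> float:
    """Combine member scores by taking the maximum (an assumed rule)."""
    return max(scorer(sentence) for scorer in scorers)


def fake_zero_shot(sentence: str) -> float:
    # Placeholder for a zero-shot model: a fixed score stands in for real inference.
    return 40.0


refund_ensemble = [keyword_scorer(["refund", "money back"]), fake_zero_shot]
print(ensemble_score("I want a refund for this order.", refund_ensemble))  # 100.0
print(ensemble_score("The app keeps crashing.", refund_ensemble))  # 40.0
```

Because every member shares the same interface, a keyword search and a model can sit side by side in the same ensemble, which is the property the text describes.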
If you need to give your classifier feedback to continuously improve its accuracy, start with the Tag similar sentences method.
With the tagging method, you can tag sentences from your sources and train your classifier to predict similar statements. You can start with a few sentences and increase the confidence threshold of these classifiers as you add more samples to make them more precise.
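As a rough illustration of predicting "similar statements" from a handful of tagged sentences, the sketch below scores a new sentence by its closest tagged sample. It assumes a simple bag-of-words cosine similarity, which is a stand-in technique and not necessarily how Caravel measures similarity; the function names, sample sentences, and threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Lowercase the sentence and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_score(sentence: str, tagged_samples: list[str]) -> float:
    """Score a sentence 0-100 by its closest tagged sample."""
    vec = bag_of_words(sentence)
    return 100.0 * max((cosine(vec, bag_of_words(s)) for s in tagged_samples), default=0.0)


tagged = ["The checkout page is too slow", "Checkout keeps timing out"]
threshold = 50  # hypothetical confidence threshold on the 0-100 scale

for sentence in ["The checkout page is slow today", "I love the new logo"]:
    score = similarity_score(sentence, tagged)
    print(sentence, round(score), score > threshold)
```

Adding more tagged samples gives new sentences more chances to land close to a known example, which is why the threshold can be raised as the sample set grows.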
If you need to classify a complex topic that is more general in nature, the Train a model method, which builds a traditional model, may be a better fit.
At the time of this writing, model training is accessible by request only.
When using the Ensemble method, you have the option to input a Zero-shot topic. This lets you use inference to predict custom topics with zero training. Caravel has a built-in inference model that has been trained on billions of text samples from across the web. Whatever you enter as a Zero-shot topic, the model draws on what it learned from those samples to understand the topic and predict mentions of it within your text. To learn more about zero-shot classification and how it works, see here.
All label predictions in Caravel are probabilistic: Caravel reports how likely every label is to be a match on a scale of 0 to 100. When you apply a confidence threshold, you tell Caravel to count a label as a match only when its probability is above that value. The higher the threshold, the more scrutiny your classifier applies.
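The thresholding rule can be sketched in a few lines. The `matching_labels` function and the example scores are hypothetical; the only behavior taken from the text is that a label matches when its 0 to 100 score is above the threshold.

```python
def matching_labels(scores: dict[str, float], threshold: float) -> list[str]:
    """Return labels whose 0-100 score is above the threshold, highest first."""
    matches = [label for label, score in scores.items() if score > threshold]
    return sorted(matches, key=lambda label: scores[label], reverse=True)


# Hypothetical scores for one sentence.
scores = {"Pricing": 91.0, "Support": 62.5, "Shipping": 8.0}

print(matching_labels(scores, threshold=50))  # ['Pricing', 'Support']
print(matching_labels(scores, threshold=80))  # ['Pricing']
```

Raising the threshold from 50 to 80 drops the weaker "Support" match, so fewer labels are returned but each one is more likely to be correct.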
With trainable classifiers, such as those that use the Sentence Tagging method, you can raise the confidence threshold as you add samples to improve the classifier's precision, or lower it to get predictions with fewer samples.
When cleaning and parsing your sources of text for classification, Caravel automatically breaks your text into sentences. Each sentence is then passed into your classifiers to be labeled.
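The sentence-by-sentence flow might look like the sketch below. The splitter is a deliberately naive regular expression, not Caravel's parser, and `split_sentences`, `label_message`, and `toy_classifier` are hypothetical names used only for illustration.

```python
import re
from typing import Callable


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def label_message(text: str, classify: Callable[[str], list[str]]) -> list[tuple[str, list[str]]]:
    """Run the classifier on each sentence and pair every sentence with its labels."""
    return [(sentence, classify(sentence)) for sentence in split_sentences(text)]


def toy_classifier(sentence: str) -> list[str]:
    # Hypothetical stand-in for a real classifier.
    return ["Pricing"] if "price" in sentence.lower() else []


message = "The price went up again. Support was helpful though!"
for sentence, labels in label_message(message, toy_classifier):
    print(labels, sentence)
```

A real splitter would also need to handle abbreviations, decimals, and quoted text, which this pattern does not attempt.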
When providing samples to train your classifiers, you can tag sentences using our Highlighting UI. This enables you to quickly provide many samples to your classifiers. It also enables Caravel to identify multiple labels from the same classifier within a single message.