When you work with streams of data and content, you often want to apply some form of machine learning along the way to help make sense of the information arriving in real time. There are a growing number of machine learning platforms available today, and one of our favorites is Algorithmia. They provide a marketplace of text analysis, computer vision, and deep learning algorithms, available as simple-to-use, affordable APIs. To help our customers make sense of their data and content streams more quickly, we plan to regularly profile machine learning APIs that can augment and enrich the data you deliver using Streamdata.io. Here are 30+ of our favorite text analysis machine learning APIs from Algorithmia.
– Summarizer (Summarizer) – Summarize English text
– Sentiment Analysis (SentimentAnalysis) – Determine positive or negative sentiment from text
– AutoTag (AutoTag) – Automatically extract tags from text
– Auto-Tag URL (AutoTagURL) – Automatically generate keyword tags for a URL.
– Social Sentiment Analysis (SocialSentimentAnalysis) – Gives the positive, negative and neutral sentiment of an English sentence.
– Parsey McParseface (deeplearning/Parsey) – Parse sentences with ease.
– Named Entity Recognition (NamedEntityRecognition) – Retrieves recognized entities from a body of text using the StanfordNLP library. Currently, it identifies named noun type entities such as PERSON, LOCATION, ORGANIZATION, MISC and numerical MONEY, NUMBER, DATE, TIME, DURATION, SET types.
– Programming Language Identification (ProgrammingLanguageIdentification) – Detect the programming language of source code
– Keywords For Document Set (KeywordsForDocumentSet) – Compute relevant keywords for a set of documents
– GetNGramFrequencies (GetNGramFrequencies) – Gets lists of N-grams from an input text.
– Sentiment By Term (SentimentByTerm) – Find the sentiment associated with particular words in a document
– Sentiment Analysis (SentimentAnalysis) – Sentiment analysis (also known as opinion mining) using natural language processing
– Word2Vec (Word2Vec) – Get similar words by vector arithmetic
– Analyze Twitter User (AnalyzeTwitterUser) – Analyze any Twitter User
– Profanity Detection (ProfanityDetection) – Detect profanity in text automatically
– Smart Text Extraction (SmartTextExtraction) – Extract text the smart way
– Named Entity Recognition (NamedEntityRecognition) – Recognizes and returns entities in a given sentence
– Keyword Extraction (KeywordExtraction) – Keyword/KeyPhrase Extraction from Sentence
– Text Similarity (TextSimilarity) – Find the most similar text files within a collection of documents
– Google Translate (GoogleTranslate) – Translation between languages
– Lemmatizer (Lemmatizer) – Maps all words to their canonical forms for easier analysis.
– Language Detector (LanguageDetector) – Detects language of a given text
– Address Extraction From Text (AddressExtractionFromText) – Extracts addresses, contact data, company names and other information from text.
– DateExtractor (DateExtractor) – Extracts dates from raw text
– Word Frequency Counter (WordFrequencyCounter) – Takes in a string and returns a Map of [word, frequency].
– Social Media Image Recommender (SocialMediaImageRecommender) – Get image recommendations based on text content
– Text 2 Emoji (Text2Emoji) – Translate English text to emojis
– ExtractLocation (ExtractLocation) – Extracts location from raw text
– Firstname Detection (FirstnameDetection) – Detect first names in texts
– Email Extractor/Parser (EmailExtractor) – Extracts emails from text with regular expression
– NumberExtractor (NumberExtractor) – Extracts numbers (positive, negative and floats) from raw text.
– Phone Number Extraction (PhoneNumberExtraction) – Extracts phone numbers from a piece of text
That represents a nice cross-section of the types of machine learning models you will want to apply to data and content, helping make sense of and enrich the data moving through our data pipes. It is common for our customers to perform sentiment analysis, enrich with tags, and extract names, dates, emails, and other relevant information from streams as they arrive, or as they are delivered to other destinations. Adding tags, meaning, and other metadata makes it easier to connect and aggregate data across real-time streams, and to transform existing streams into richer topical feeds.
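To make the enrichment idea concrete, here is a minimal Python sketch of attaching extracted metadata to events as they pass through a stream. Everything in it is an assumption for illustration: the `enrich_stream` function, the `text` field name, and the two local stand-in enrichers (a regular-expression email extractor and a word frequency counter) are hypothetical and are not Algorithmia's or Streamdata.io's implementations. In a real pipeline, each enricher would likely wrap a call to a hosted machine learning API instead.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical local stand-ins for hosted algorithms. They only illustrate
# the shape of an enricher: text in, metadata out.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text: str) -> List[str]:
    """Return email-like strings found in the text, in order of appearance."""
    return EMAIL_PATTERN.findall(text)


def word_frequencies(text: str) -> Dict[str, int]:
    """Return a mapping of lowercase word to the number of times it occurs."""
    words = re.findall(r"[a-z']+", text.lower())
    return dict(Counter(words))


def enrich_stream(
    events: Iterable[dict],
    enrichers: Dict[str, Callable[[str], object]],
    text_field: str = "text",  # assumed name of the field holding the text
) -> Iterator[dict]:
    """Yield a copy of each event with an added 'enrichments' dictionary."""
    for event in events:
        text = event.get(text_field, "")
        enriched = dict(event)  # shallow copy, so the original event is untouched
        enriched["enrichments"] = {
            name: enricher(text) for name, enricher in enrichers.items()
        }
        yield enriched


if __name__ == "__main__":
    incoming = [{"id": 1, "text": "Write to sales@example.com about the stream"}]
    enrichers = {"emails": extract_emails, "word_frequencies": word_frequencies}
    for item in enrich_stream(incoming, enrichers):
        print(item)
```

Because `enrich_stream` is a generator, events are enriched one at a time as they arrive rather than being collected first, and new enrichers can be added by extending the dictionary passed in.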
We are working on profiling not just Algorithmia, but a number of other machine learning APIs. As we establish interesting collections of text analysis, deep learning, and other algorithms that can be applied to Streamdata.io streams, we’ll publish them here on the blog. If you have specific data and content, or a machine learning model, that you’d like to have delivered as part of your real-time infrastructure, let us know. We are happy to prioritize specific types of data or profile more relevant machine learning API providers to help expedite your work. We are ramping up our efforts to profile relevant machine learning models as demand from our customers increases, hoping to satisfy their demand for machine learning intelligence as they continue to optimize streams of data across their organizations.