We are benchmarking more of the APIs we have published to the Streamdata.io API Gallery, looking for the ones with the highest StreamRank, meaning the ones that change most frequently. We’ve been spending a lot of time with the Stack Exchange API, because it offers a lot of interesting possibilities across its large network of QA sites. To get to know each API a little better, we poll each individual API path so that we can benchmark how much it changes over a 24-hour period.
After profiling all of the Stack Exchange APIs, and benchmarking many of them, we found that the following API paths have the highest StreamRank, hovering around 50% and reaching as high as 75%. They provide the best opportunity for turning into real time streams, as well as topical streams based upon the top tags for each of the QA sites:
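To make the polling idea concrete, here is a minimal sketch of how a change rate could be computed from a series of polled responses. It assumes the rate is simply the fraction of polls whose response body differs from the previous poll; that definition, the `change_rate` function, and the sample payloads are illustrative assumptions, not the actual StreamRank calculation.

```python
import hashlib
from typing import Iterable


def change_rate(snapshots: Iterable[bytes]) -> float:
    """Return the fraction of consecutive polls whose body changed.

    Assumption: "change" means the raw response body differs from the
    body returned by the previous poll. This is a stand-in definition,
    not the published StreamRank formula.
    """
    # Hash each body so only small fingerprints need to be compared.
    digests = [hashlib.sha256(body).hexdigest() for body in snapshots]
    if len(digests) < 2:
        return 0.0  # Nothing to compare against.
    changed = sum(1 for prev, cur in zip(digests, digests[1:]) if prev != cur)
    return changed / (len(digests) - 1)


# Hypothetical bodies collected from five polls of one API path.
polls = [
    b'{"items": [1]}',
    b'{"items": [1]}',
    b'{"items": [1, 2]}',
    b'{"items": [1, 2]}',
    b'{"items": [1, 2, 3]}',
]
print(f"{change_rate(polls):.0%}")
```

In this sample, two of the four poll-to-poll comparisons differ. A real benchmark would likely need to ignore volatile response fields, such as quota counters, before hashing, otherwise every poll would register as a change.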
– Get Answers – Returns all the undeleted answers in the system
– Get Badges – Returns all the badges in the system
– Get Comments – Gets all the comments on the site
– Get Events – Returns a stream of events that have occurred on the site
– Get Posts – Fetches all posts (questions and answers) on the site
– Search – Searches a site for any questions which fit the given criteria
– Get Tags – Returns the tags found on a site
We will keep benchmarking the Stack Exchange APIs to see whether the StreamRank increases over time, since we suspect there are busier days than the one we polled on. The Get Events API path is the most active, with a 76% change rate, compared to about 50% for most of the rest. This allows us to prioritize which APIs to target when it comes to mining data across the Stack Exchange APIs.
StreamRank is all about benchmarking the APIs that change the most, because those will make for the most active streams. The next step in this research is to look at the parameters available for these APIs, and to further understand the many other dimensions that exist across the Stack Exchange API platform. The answers, badges, comments, events, posts, and tags are accessible across a number of very active QA sites, providing a wealth of opportunities to deliver real time streams of data to interested subscribers.
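As a starting point for exploring those parameters, the sketch below builds request URLs for the paths listed above, per site and per tag. The base URL and version (`2.3`), the `site` and `tagged` query parameters, and the helper names `build_url` and `topical_search_url` are assumptions for illustration; check them against the Stack Exchange API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumption: public Stack Exchange API base URL and version.
BASE_URL = "https://api.stackexchange.com/2.3"

# Paths that showed the highest change rates in the benchmark above.
PATHS = ["answers", "badges", "comments", "events", "posts", "tags"]


def build_url(path: str, site: str, **params: str) -> str:
    """Build a request URL for one API path on one QA site."""
    query = urlencode({"site": site, **params})
    return f"{BASE_URL}/{path}?{query}"


def topical_search_url(site: str, tag: str) -> str:
    """Build a search URL limited to one tag (assumed `tagged` parameter)."""
    return build_url("search", site, tagged=tag)


# One URL per path for a single site, plus one topical search.
for path in PATHS:
    print(build_url(path, "stackoverflow"))
print(topical_search_url("stackoverflow", "python"))
```

Each generated URL could then be polled on its own schedule, so a topical stream is just the same benchmark applied to a narrower query. Some paths may require authentication, which this sketch does not handle.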