Reddit allows API searches without authentication, but if you authenticate using OAuth, they’ll let you make more API calls, which allows for longer-running streams. Additionally, if you authenticate with GitHub, you can save the topical stream to a repository within your GitHub account. This feature requires repository-level access to your GitHub account so that data can be saved to the repository, but it opens up an entirely new way of developing data lakes that can then be put to work using Git or the GitHub API. The idea is to stream in data from a variety of third-party API sources and store it within private GitHub repositories, for use in a variety of applications and for training machine learning models.
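To make the flow above a little more concrete, here is a minimal sketch in Python of the two requests involved: an unauthenticated Reddit search and a save to a GitHub repository. It is not the prototype's actual code. The function names, the file path layout, and the commit message are assumptions made for illustration, and the Streamdata.io step that turns the search into a stream is left out entirely. The sketch only builds the requests and sends nothing over the network.

```python
import base64
import json
from urllib.parse import quote, urlencode


def reddit_search_url(topic, limit=25):
    """Build a URL for Reddit's public JSON search listing (no OAuth needed)."""
    query = urlencode({"q": topic, "sort": "new", "limit": limit})
    return "https://www.reddit.com/search.json?" + query


def github_save_request(owner, repo, path, posts, token):
    """Describe the GitHub "create file contents" call that would store posts.

    Hypothetical helper: the token must carry repository-level access,
    and the path layout is an assumption, not the prototype's own.
    """
    body = json.dumps(posts, indent=2).encode("utf-8")
    return {
        "method": "PUT",
        "url": f"https://api.github.com/repos/{owner}/{repo}/contents/{quote(path)}",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        "json": {
            # The contents API expects a commit message and base64 content.
            "message": f"Add {len(posts)} streamed Reddit posts",
            "content": base64.b64encode(body).decode("ascii"),
        },
    }
```

The dictionary returned by `github_save_request` could be handed to an HTTP client of your choice to perform the save. Note that this shape only creates a new file; updating an existing file through the same GitHub endpoint also requires that file's current `sha`.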
Right now the prototype will only stream one topic at a time, but it should demonstrate the potential for streaming from Reddit using their API, Streamdata.io, and GitHub. If you have any questions about the prototype, feel free to submit an issue on the GitHub repository. We are working on deploying Stack Exchange, Twitter, and GitHub search versions of the same prototype. For now we’ll keep publishing collections of topics and streams of data within targeted areas, and we’ll eventually open this up to wider search capabilities in future versions. This streaming Reddit topic subscription prototype is not production ready and is just meant to demonstrate the potential of streaming APIs on GitHub. If you are looking for a specific implementation or would like to obtain a more stable version of this micro application, please let us know.