This application requires you to authenticate with GitHub, both to fetch the search results from the GitHub API and to save those results to the repository through the same API. This opens up an entirely new way of developing data lakes, which can then be put to work using Git or the GitHub API. You can stream in data from a variety of 3rd party API sources and store it within private GitHub repositories, for use in a variety of applications and in the training of machine learning models. This transforms GitHub into not just a data lake, but the source and the destination of real time streams of valuable data, changing the way we look at how we move data around.
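To make the fetch-then-save loop a little more concrete, here is a minimal Python sketch of the two authenticated calls involved. It is not the prototype's actual code: it assumes the topic search goes through GitHub's repository search endpoint and that results are written with the repository contents endpoint, and the function names, the `topic-stream-sketch` user agent, and the example file path are all hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def auth_headers(token):
    """Headers sent with every call; the token is a GitHub access token."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
        "User-Agent": "topic-stream-sketch",  # hypothetical client name
    }


def build_search_request(topic, token):
    """Build a GET request that searches repositories tagged with a topic."""
    query = urllib.parse.urlencode({"q": f"topic:{topic}", "sort": "updated"})
    url = f"{API_ROOT}/search/repositories?{query}"
    return urllib.request.Request(url, headers=auth_headers(token))


def build_save_request(owner, repo, path, results, token, sha=None):
    """Build a PUT request that writes the results as a JSON file in a repo.

    The contents endpoint expects the file body base64 encoded, plus the
    current blob sha when an existing file is being overwritten.
    """
    encoded = base64.b64encode(json.dumps(results, indent=2).encode("utf-8"))
    body = {"message": f"Update {path}", "content": encoded.decode("ascii")}
    if sha is not None:
        body["sha"] = sha
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{urllib.parse.quote(path)}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=auth_headers(token),
        method="PUT",
    )


def send(request):
    """Send a prepared request and decode the JSON response (needs network)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid token, something like `send(build_search_request("machine-learning", token))` would fetch the current results, and passing them to `build_save_request` followed by `send` would commit them to the repository. Keeping the request building separate from the sending makes it easy to inspect what would go over the wire before pointing it at a real repository.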
Right now the prototype will only stream one topic at a time, but it should demonstrate the potential for streaming from GitHub using the GitHub API and Streamdata.io. If you have any questions about the prototype, feel free to submit an issue on the GitHub repository. We have been working on deploying Stack Exchange, Twitter, and Reddit search versions of the same prototype. We will keep publishing collections of topics and streams of data within targeted areas, which we'll eventually open up to wider search capabilities in future versions. This streaming GitHub topic subscription prototype is not production ready and is just meant to demonstrate the potential of streaming APIs on GitHub. If you are looking for a specific implementation or would like to obtain a more stable version of this micro application, please let us know.