http://magma.ua4vjlx72wv5crhkificaeysp62hizhazipfshsdvs6jxqvhtkpllcad.onion/guide/osint-sources.html
Internet-Wide Scan Data Repository - A public archive of research datasets that describe the hosts and sites on the Internet.
Common Crawl - A corpus of web crawl data composed of over 25 billion web pages.
GDELT - The Global Database of Events, Language and Tone, a project that "monitors the world's broadcast, print, and web news from nearly every corner of every country in over 100 languages and identifies the people, locations, organizations, counts,...