eRSA services in use: Nectar Cloud
The Cooperative Research Centres (CRC) programme is an Australian Government initiative that brings researchers and industry together to improve outcomes. The Data to Decisions CRC (D2D CRC) is Australia’s leading provider of Big Data capability for defence and national security.
Established in July 2014 with a grant of $25 million, the D2D CRC is tasked with using Big Data technology to extract useful intelligence and insight from otherwise unmanageable data, helping to provide a safer and more secure Australia.
David Blockow, Software Architect for D2D CRC, has been working on two major projects since the CRC was established: one for the predictive analysis of data and one for retrospective analysis.
“D2D is slightly different in that we’re not just an administrative body – we facilitate research like other CRCs but we also have a core engineering team, responsible for building software platforms for our end-users to support their research and industry projects,” David said.
After undertaking a Future Study to better understand the needs of their end-users, the D2D CRC found that open source intelligence was a major requirement, particularly among security and intelligence organisations.
“Open source intelligence is about extracting information from blogs, news sites and social media platforms, any information that can be freely accessed online,” he said.
“There’s way too much information out there for an individual or group of people to comprehend, so we have created automated tools to allow our users to extract the data they need.”
David works on two major platforms, a predictive tool called Beat the News and a retrospective tool called Apostle, and said D2D’s partnership with eRSA has been invaluable in allowing the team to build, store and share its data and software.
“As a CRC partner, eRSA provide us with compute (CPU cores) and storage infrastructure to host our platforms – we’re running a Hadoop cluster and Apache Spark as our processing framework through eRSA,” David said.
“The nice thing about Hadoop and Apache Spark is that they process all of the data locally. The tools allow researchers to perform processing in the Cloud without having to download or move any data.
“All of the infrastructure is hosted in the Nectar Cloud through eRSA, allowing us to build indexing and search functionality into our platforms as well as research-specific functionality, so users can quickly filter for a particular data set from their desktop.”
Used by researchers and government organisations, Beat the News is an analysis tool that aims to predict future events by gauging sentiment on social media platforms such as Twitter.
“When leading up to an election for instance, the tool can be used to assess which political parties are being mentioned on Twitter and the feeling towards those parties – allowing our users to make predictions about who might win,” David said.
“We’ve already used the platform to do this successfully, and have results that are more accurate than traditional polling.”
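To make the idea concrete, the kind of analysis David describes (counting party mentions and gauging the feeling towards each party) can be sketched in a few lines of Python. This is only an illustrative toy, not the Beat the News implementation: the party names, the word lists and the `score_posts` function are all assumptions invented for this example, and a real system would use far more sophisticated sentiment models running over much larger data sets.

```python
import re
from collections import defaultdict

# Hypothetical party names and sentiment word lists, for illustration only.
PARTIES = ["red party", "blue party"]
POSITIVE = {"great", "good", "support", "love"}
NEGATIVE = {"bad", "terrible", "oppose", "hate"}


def score_posts(posts):
    """Return {party: (mention_count, net_sentiment)} for a list of post strings.

    Each post's polarity is (positive words - negative words), and that
    polarity is credited to every party the post mentions.
    """
    mentions = defaultdict(int)
    sentiment = defaultdict(int)
    for post in posts:
        text = post.lower()
        words = re.findall(r"[a-z']+", text)
        polarity = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        for party in PARTIES:
            if party in text:
                mentions[party] += 1
                sentiment[party] += polarity
    return {party: (mentions[party], sentiment[party]) for party in PARTIES}


# Made-up sample posts, not real data.
sample_posts = [
    "I love the Red Party, great policies",
    "The Blue Party is terrible",
    "Red Party and Blue Party both bad",
]
result = score_posts(sample_posts)
```

A party with many mentions and a high net score would be read as having favourable sentiment. Crediting a whole post's polarity to every party it mentions is a deliberate simplification here, and it misattributes feeling in posts that compare parties.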
The platform was built for experimentation, for making predictions, running analytics and ingesting data, but it also provides researchers with an opportunity for collaboration.
“It’s a sandpit area for researchers to work – they can connect remotely, run their algorithms, test things out, bounce ideas off of others, then when they’re happy they can promote their algorithm to the production system so others can use it to make predictions,” David said.
“Apostle is more retrospective, and allows users to explore an event that has taken place in the past, by looking at all of the open data that has been ingested into the system over time, and analysing it after the fact.
“It’s a fast-growing field, and with eRSA we have access to the latest technology allowing us to continue to provide the best platforms to our end-users.”