‘Big Data’ Today and Tomorrow

This article reviews Big Data’s growth and its influence on the companies of tomorrow, and outlines five trendy open source technologies to look out for.

According to Tim Gasper, Product Manager at Infochimps, companies will have spent $4.3 billion on Big Data technologies by the end of this year. But these initial investments will trigger a domino effect of upgrades and new initiatives, estimated at $34 billion in 2013 and adding up to roughly $232 billion over the next five years. Evidently, this is just the tip of the Big Data iceberg.

Big Data is closely associated with Hadoop and “NoSQL” technologies. It is now possible to stream real-time analytics with ease, and spinning a cluster up or down is a cinch that takes 20 minutes or less.

Innovation is rapidly changing the face of the current landscape. There are about 250,000 viable open source technologies in the market today, and the ecosystem becomes increasingly complex with each addition.

What’s on our radar, and what’s coming down the pike for Fortune 2000 companies? Which new projects are the most viable candidates for production-grade usage? Which deserve your undivided attention?

Let’s look at five new technologies that are shaking things up in Big Data. Here is the newest class of tools that you can’t afford to overlook, coming soon to an enterprise near you.

Storm and Kafka

  • The future of stream processing.
  • Process data at linear scale, in real time, and with strong reliability.
  • A superior approach to ETL (extract, transform, load) and data integration; a minimal ingestion sketch follows this list.
  • In use at high-profile companies including Groupon, Alibaba, and The Weather Channel.
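
To make the ingestion side concrete, here is a minimal Python sketch using the kafka-python client; the broker address, topic name, and payload are assumptions for illustration, not part of any particular deployment.

    # Minimal Kafka producer sketch (assumes the kafka-python package and a
    # broker listening on localhost:9092; topic and payload are illustrative).
    import json
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda event: json.dumps(event).encode("utf-8"),
    )

    # Publish a click event; a Storm topology (or any other consumer) can read
    # the "clickstream" topic and apply the transform/load steps downstream.
    producer.send("clickstream", {"user": "u123", "page": "/pricing"})
    producer.flush()

On the processing side, a Storm topology would typically subscribe to the same topic through a Kafka spout and run the transformation and loading steps as bolts.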

Drill and Dremel

  • Ad-hoc querying of data at low latency and very large scale.
  • Enables scanning of petabytes of data in seconds, to answer ad-hoc queries and power compelling visualizations.
  • Allows for comparing and contrasting, and for zooming in and out, to create fundamentally new insights; a query sketch follows this list.
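
To give a flavour of what such ad-hoc querying looks like, here is a minimal Python sketch that submits an ANSI-style SQL query to Apache Drill over its REST interface; the endpoint, file path, and column names are assumptions for illustration.

    # Minimal ad-hoc query sketch against Apache Drill's REST interface
    # (assumes a Drill instance on localhost:8047; the queried file path and
    # column names are illustrative).
    import requests

    query = """
        SELECT page, COUNT(*) AS hits
        FROM dfs.`/data/logs/clicks.json`
        GROUP BY page
        ORDER BY hits DESC
        LIMIT 10
    """

    response = requests.post(
        "http://localhost:8047/query.json",
        json={"queryType": "SQL", "query": query},
    )
    for row in response.json()["rows"]:
        print(row)

The point is less the client code than the interaction style: the same query can be thrown at raw files, without a lengthy load-and-index step, and still come back in interactive time.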

R

  • A powerful statistical programming language.
  • Introduced in 1997 as a modern implementation of the S language, and currently in use by over 2 million analysts.
  • R libraries are available for practically everything, so most kinds of analysis are possible without writing new code; see the sketch after this list.
  • Constantly innovating, with thousands of features introduced in the last few months.
  • The best way for companies to future-proof their Big Data programs.
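
Because so much already exists as an R library, a common pattern is to call R from the rest of a data pipeline rather than reimplement the statistics. Here is a minimal sketch using the rpy2 bridge from Python; the rpy2 package and the choice of R’s bundled mtcars dataset are assumptions for illustration.

    # Minimal sketch: driving R's built-in modelling functions from Python via
    # the rpy2 bridge (assumes both R and the rpy2 package are installed).
    import rpy2.robjects as ro

    # Fit a linear model on R's bundled mtcars dataset (mpg as a function of
    # weight) using R's lm(), with no new statistical code written.
    ro.r("fit <- lm(mpg ~ wt, data = mtcars)")
    print(ro.r("summary(fit)$coefficients"))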

Gremlin and Giraph

  • Empower graph analysis and traversal; a traversal sketch follows this list.
  • Model computer networks, social networks, map links, geographic pathways, and any other form of linked data.
  • A prime illustration of how Big Data does not rely on just one database or programming framework.
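
To give a feel for Gremlin-style traversal, here is a minimal Python sketch using the TinkerPop project’s gremlinpython client; the server address, vertex labels, property names, and edge labels are assumptions for illustration.

    # Minimal graph traversal sketch with the TinkerPop gremlinpython client
    # (assumes a Gremlin Server at localhost:8182; labels and property names
    # are illustrative).
    from gremlin_python.process.anonymous_traversal import traversal
    from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

    g = traversal().withRemote(
        DriverRemoteConnection("ws://localhost:8182/gremlin", "g")
    )

    # Find who Alice's friends know: two hops along 'knows' edges.
    names = (
        g.V().has("person", "name", "alice")
        .out("knows").out("knows")
        .values("name").dedup().toList()
    )
    print(names)

Gremlin handles interactive traversals like this one, while Giraph follows a Pregel-style bulk synchronous model better suited to batch analytics over the whole graph, such as PageRank or connected components.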

SAP Hana

  • In-memory analytics platform.
  • Features an in-memory database and a suite of tools and software for creating analytical processes and for moving data in and out in the right formats (a minimal query sketch follows this list).
  • Powerful and free for development use.
  • Benefits applications with unusually fast processing needs, such as financial modeling and decision support, website personalization, and fraud detection.
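
As a minimal sketch of how an application talks to it, here is a Python example using SAP’s hdbcli driver; the host, port, credentials, and the transactions table are assumptions for illustration.

    # Minimal SAP HANA query sketch using SAP's hdbcli Python driver (assumes
    # the hdbcli package, a reachable HANA instance, and illustrative
    # credentials, table, and column names).
    from hdbcli import dbapi

    conn = dbapi.connect(
        address="hana.example.com",
        port=39015,
        user="ANALYST",
        password="secret",
    )
    cursor = conn.cursor()

    # An in-memory aggregation of the kind used for fraud detection: flag
    # accounts whose transaction volume exceeds a threshold.
    cursor.execute(
        "SELECT account_id, SUM(amount) FROM transactions "
        "GROUP BY account_id HAVING SUM(amount) > 10000"
    )
    for account_id, total in cursor.fetchall():
        print(account_id, total)
    conn.close()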

Honorable mention: D3

  • D3 is still evolving.
  • A JavaScript document visualization library that makes data fully interactive (for example, generating an HTML table from an array of numbers, or using the same numbers to build an interactive bar chart).
