Overview of Hadoop's characteristics and the solution pioneered by Google

  • Apache Hadoop provides a reliable shared storage system (HDFS) and an analytics system (MapReduce).
  • It is highly scalable: unlike relational databases, Apache Hadoop scales linearly, so a cluster can grow to tens, hundreds, or even thousands of servers.
  • Moreover, it is very cost-effective because it runs on commodity hardware and does not require costly high-end servers.
  • It is highly flexible, being able to process both structured and unstructured data.
  • It has built-in fault tolerance. Data is replicated across multiple nodes (the replication factor is configurable), so if a node goes down, the required data can still be read from another node that holds a copy. Hadoop also preserves the replication factor by re-replicating the affected data to other available nodes when a node fails.
  • It follows a write-once, read-many model.
  • It is designed for large and very large data sets. A small amount of data, such as 10 MB, typically takes longer to process on Apache Hadoop than on conventional systems because of the per-job overhead.
  Common use cases of Apache Hadoop include:

  • Analytics
  • Search
  • Data retention
  • Log processing (records of data creation)
  • Text, image, audio, and video content analysis
  • Recommendation systems, such as those on e-commerce websites
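The configurable replication factor mentioned in the fault-tolerance point is set cluster-wide with the `dfs.replication` property in `hdfs-site.xml` (the default is 3):

```xml
<!-- hdfs-site.xml: default number of copies HDFS keeps of each block -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

It can also be changed per file after the fact with `hdfs dfs -setrep -w 3 /path/to/file`, where `-w` waits for the replication to complete.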
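To make the MapReduce analytics model mentioned above concrete, here is a minimal sketch of its three phases in plain Python (no Hadoop installation required). It runs the canonical word-count example; the function names `map_phase`, `shuffle_phase`, and `reduce_phase` are illustrative, not part of the Hadoop API, where a real job would instead subclass `Mapper` and `Reducer` in Java.

```python
from collections import defaultdict

def map_phase(records):
    # Map: emit a (word, 1) pair for every word in every input record.
    for record in records:
        for word in record.split():
            yield (word.lower(), 1)

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key, as the framework
    # does between the map and reduce phases.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    # Reduce: aggregate the values for each key (here, sum the counts).
    return {key: sum(values) for key, values in grouped.items()}

records = ["Hadoop stores data", "Hadoop processes data"]
counts = reduce_phase(shuffle_phase(map_phase(records)))
print(counts)  # {'hadoop': 2, 'stores': 1, 'data': 2, 'processes': 1}
```

In a real cluster, the map tasks run in parallel on the nodes that hold the data blocks, and the shuffle moves the grouped pairs over the network to the reduce tasks; this sketch only shows the data flow, not the distribution.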
