What is Our View on Big Data?

“Big Data” is a popular phrase these days as large companies have demonstrated the value of analyzing the huge data streams generated by their clients. Passing the data through a suitable analysis environment creates more business opportunities and fosters growth. The definition of “Big Data” is flexible, depending on the topic of the article you [...]

October 8th, 2014 | Big Data, Data Intensive Computing

What is Hadoop?

Hadoop is an open source Apache project, hadoop.apache.org (downloads are available through mirror sites listed on its home page). The current tool for Big Data analysis is Hadoop, a software framework that supports the distribution of computational work to many nodes in a computer cluster. Hadoop’s design makes it particularly effective at processing [...]

September 5th, 2014 | Big Data, Data Intensive Computing, Hadoop

Why Do Parallel Computation?

The picture of how data gets analysed on a single computer is very simple: a program is written to read the data, do some manipulation and write out the results. This is something that every student of programming learns to do. The figure below shows a whimsical interpretation of software processing a data file. The [...]

July 16th, 2014 | Data Intensive Computing, HPC

Cluster Interconnect using PCI Express

A new interconnect technology based on the PCI Express bus is under development and promises another jump in communication speed inside a cluster. Versions of the technology are being developed independently by two companies, PLX Technology and A3Cube. Interconnect technology has continued to evolve since the first cluster appeared on the Top500 list. The interconnect allows the machines [...]

March 19th, 2014 | Data Intensive Computing, HPC, Interconnect