Big or Large data?

In the early 2010s the term Big Data became a very popular buzzword.

There was even a widespread joke by Dan Ariely:

Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it…

While companies like Facebook and telecom operators naturally process tremendous amounts of information, in everyday life you usually face the problem of large data.

In short, large data can be described as:

  • anything MS Excel cannot handle
  • too big to fit into the memory (RAM) of a single computer

While I personally love Excel, it can process only up to ~1 million rows of data.

Another example would be working with a .csv file of 100-200 MB in size. Possible, but rather fragile.
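To make the Excel limit concrete, here is a minimal sketch in Python that streams a CSV record by record and checks whether it would fit in a single worksheet. It assumes the 1,048,576-row limit of modern Excel (.xlsx). The helper names count_rows and fits_in_excel are made up for illustration, and the sample data is invented.

```python
import csv
import io

# Assumption: worksheet row limit of modern Excel (.xlsx), i.e. 2**20 rows.
EXCEL_MAX_ROWS = 1_048_576

def count_rows(lines):
    """Count CSV records one at a time, without loading the whole file."""
    return sum(1 for _ in csv.reader(lines))

def fits_in_excel(lines):
    """Return True if all records (header included) fit in one worksheet."""
    return count_rows(lines) <= EXCEL_MAX_ROWS

# Tiny in-memory sample standing in for a real file (hypothetical data).
sample = io.StringIO("id,value\n1,10\n2,20\n3,30\n")
print(count_rows(sample))  # 4 records, header included
```

For a real file you would pass an open file handle instead, for example open("data.csv", newline="") with a file name of your own. Because the rows are read one at a time, memory use stays small even when the file is far larger than RAM.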

What if your task requires data that is 10, 100 or even 1000 times bigger?

The answer is short:

  • Definitely needs special treatment (not suitable for a single computer)
  • Surely not Excel's domain
  • Honestly, still large data