Big Data for Freshers

Hi, I am an engineering student and want to know how to start working in the field of Big Data, because I want to learn more about it before stepping into the industry. I have a great interest in Big Data technologies.

The simplest way of describing “big data” is to say “lots and lots and lots of data” – because that’s what the term means. When people talk about big data, though, they also generally mean the process for making sense of all that data – filtering it to draw out meaningful patterns, predictions and conclusions.

What is all this data, then, and where is it coming from? Well, it’s everywhere. Think about your working day and you’ll realise you interact digitally with organisations hundreds, if not thousands, of times: you check your emails on your phone, send tweets, order something online, interact with websites, buy your lunch from the local supermarket, and so on. All these interactions create data points that can be captured.

As it happens, every day we are generating 15 petabytes of data (a petabyte is 1,000 to the power of 5 bytes) and 12 terabytes of tweets worldwide. We create 350 billion meter readings per annum, and 500 million call data records. And those examples are just the tip of an enormous data iceberg.

Of course, for any one business, you won’t be handling 15 petabytes of data – but you could well be handling hundreds of thousands of data points; if, that is, you have systems in place to capture the way your customers interact with you on their phones, your website, in store, at point of sale, and so on. Each interaction can be captured, giving you reams of useful data that can give you invaluable insights into individual and group customer behaviour – if only you could make sense of it.

That’s where data analytics comes in. Data analytics is the process of making sense of all that data and drawing out useful patterns and insights.

That's a great insight into the field of Data Analytics, yaswanth k. Can you also elaborate on the tools and technologies I should equip myself with to be able to work in the field of Big Data?

Here is my advice...

Start programming! For example, in C or C++, write a simple program: create an array of 1 giga-elements of random single-precision values (4 gigabytes of memory in total) and sort it with a sorting algorithm such as merge sort, heapsort or quicksort.
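A minimal sketch of this exercise in C++, assuming merge sort as the chosen algorithm. The function names (`make_random_floats`, `merge_sort_range`, `merge_sort`) and the fixed seed are illustrative choices, not part of the advice above. Note that this merge sort needs a scratch buffer as large as the input, so sorting 1 giga-element (4 gigabytes) would take about 8 gigabytes in total; it is worth trying a smaller count first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Fill a vector with n pseudo-random single-precision values between 0 and 1.
// The fixed default seed (an arbitrary choice) makes runs repeatable.
std::vector<float> make_random_floats(std::size_t n, unsigned seed = 12345u) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> data(n);
    for (float& x : data) {
        x = dist(gen);
    }
    return data;
}

// Recursive top-down merge sort of the half-open range [lo, hi).
// `scratch` must point to a buffer at least as large as the array `a`.
void merge_sort_range(float* a, float* scratch, std::size_t lo, std::size_t hi) {
    assert(lo <= hi);
    if (hi - lo < 2) {
        return;  // zero or one element is already sorted
    }
    const std::size_t mid = lo + (hi - lo) / 2;
    merge_sort_range(a, scratch, lo, mid);
    merge_sort_range(a, scratch, mid, hi);
    // Merge the two sorted halves into scratch, then copy the result back.
    std::merge(a + lo, a + mid, a + mid, a + hi, scratch + lo);
    std::copy(scratch + lo, scratch + hi, a + lo);
}

// Sort the whole vector in ascending order.
void merge_sort(std::vector<float>& a) {
    std::vector<float> scratch(a.size());
    merge_sort_range(a.data(), scratch.data(), 0, a.size());
}
```

To try it, call `make_random_floats(1000000)` from `main`, pass the result to `merge_sort`, and check it with `std::is_sorted`. Once that works, scale the count up and time it.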

Relational database management systems and desktop statistics and visualization packages often have difficulty handling big data. The work instead requires "massively parallel software running on tens, hundreds, or even thousands of servers". What is considered "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make Big Data a moving target. Thus, what is considered "big" one year becomes ordinary later. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."

Best Reply

Some of the skills you may require for "Big Data":

1. Apache Hadoop

Sure, it’s entering its second decade now, but there’s no denying that Hadoop had a monstrous year in 2014 and is positioned for an even bigger 2015 as test clusters are moved into production and software vendors increasingly target the distributed storage and processing architecture. While the big data platform is powerful, Hadoop can be a fussy beast and requires care and feeding by proficient technicians. Those who know their way around the core components of the Hadoop stack–such as HDFS, MapReduce, Flume, Oozie, Hive, Pig, HBase, and YARN–will be in high demand.

2. Apache Spark

If Hadoop is a known quantity in the big data world, then Spark is a dark horse candidate that has the raw potential to eclipse its elephantine cousin. The rapid rise of the in-memory stack is being proffered as a faster and simpler alternative to MapReduce-style analytics, either within a Hadoop framework or outside it. Best positioned as one of the components in a big data pipeline, Spark still requires technical expertise to program and run, thereby providing job opportunities for those in the know.

3. NoSQL

On the operational side of the big data house, distributed, scale-out NoSQL databases like MongoDB and Couchbase are taking over jobs previously handled by monolithic SQL databases like Oracle and IBM DB2. On the Web and with mobile apps, NoSQL databases are often the source of data crunched in Hadoop, as well as the destination for application changes put in place after insight is gleaned from Hadoop. In the world of big data, Hadoop and NoSQL occupy opposite sides of a virtuous cycle.

4. Machine Learning and Data Mining

People have been mining for data as long as they’ve been collecting it. But in today’s big data world, data mining has reached a whole new level. One of the hottest fields in big data last year was machine learning, which is poised for a breakout year in 2015. Big data pros who can harness machine learning technology to build and train predictive analytic apps such as classification, recommendation, and personalization systems are in super high demand, and can command top dollar in the job market.

5. Statistical and Quantitative Analysis

This is what big data is all about. If you have a background in quantitative reasoning and a degree in a field like mathematics or statistics, you’re already halfway there. Add in expertise with a statistical tool like R, SAS, Matlab, SPSS, or Stata, and you’ve got this category locked down. In the past, most quants went to work on Wall Street, but thanks to the big data boom, companies in all sorts of industries across the country are in need of geeks with quantitative backgrounds.

6. SQL

The data-centric language is more than 40 years old, but the old grandpa still has a lot of life yet in today’s big data age. While it won’t be used with all big data challenges (see: NoSQL above), the simplicity of Structured Query Language makes it a no-brainer for many of them. And thanks to initiatives like Cloudera’s Impala, SQL is seeing new life as the lingua franca for the next generation of Hadoop-scale data warehouses.

7. Data Visualization

Big data can be tough to comprehend, but in some circumstances there’s no replacement for actually getting your eyeballs onto data. You can do multivariate or logistic regression analysis on your data until the cows come home, but sometimes exploring just a sample of your data in a tool like Tableau or QlikView can tell you the shape of your data, and even reveal hidden details that change how you proceed. And if you want to be a data artist when you grow up, being well-versed in one or more visualization tools is practically a requirement.

8. General Purpose Programming Languages

Having experience programming applications in general-purpose languages like Java, C, Python, or Scala could give you the edge over other candidates whose skill sets are confined to analytics. According to Wanted Analytics, there was a 337 percent increase in the number of job postings for “computer programmers” that required background in data analytics. Those who are comfortable at the intersection of traditional app dev and emerging analytics will be able to write their own tickets and move freely between end-user companies and big data startups.

9. Creativity and Problem Solving

No matter how many advanced analytic tools and techniques you have under your belt, nothing can replace the ability to think your way through a situation. The implements of big data will inevitably evolve and new technologies will replace the ones listed here. But if you’re equipped with a natural desire to know and a bulldog-like determination to find solutions, then you’ll always have a job offer waiting somewhere.
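For anyone wondering what the MapReduce model listed under Apache Hadoop above looks like in practice, here is a rough word-count sketch in plain C++. This is a single-process toy, not Hadoop's API: the names `map_line`, `shuffle`, `reduce_counts` and `word_count` are hypothetical, and a real cluster would run the map and reduce steps on many machines in parallel.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, int>;

// Map step: emit a (word, 1) pair for every whitespace-separated word in a line.
std::vector<KeyValue> map_line(const std::string& line) {
    std::vector<KeyValue> out;
    std::istringstream words(line);
    std::string word;
    while (words >> word) {
        out.emplace_back(word, 1);
    }
    return out;
}

// Shuffle step: group all emitted values by their key.
std::map<std::string, std::vector<int>> shuffle(const std::vector<KeyValue>& pairs) {
    std::map<std::string, std::vector<int>> groups;
    for (const KeyValue& kv : pairs) {
        groups[kv.first].push_back(kv.second);
    }
    return groups;
}

// Reduce step: sum the values collected for one key.
int reduce_counts(const std::vector<int>& values) {
    assert(!values.empty());  // shuffle never creates an empty group
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

// Run map, shuffle and reduce in sequence over all input lines.
std::map<std::string, int> word_count(const std::vector<std::string>& lines) {
    std::vector<KeyValue> emitted;
    for (const std::string& line : lines) {
        const std::vector<KeyValue> kvs = map_line(line);
        emitted.insert(emitted.end(), kvs.begin(), kvs.end());
    }
    const std::map<std::string, std::vector<int>> groups = shuffle(emitted);
    std::map<std::string, int> result;
    for (const auto& group : groups) {
        result[group.first] = reduce_counts(group.second);
    }
    return result;
}
```

Calling `word_count` on a couple of short lines and printing the resulting map is enough to see the three steps at work. The point of the split is that each `map_line` call and each `reduce_counts` call is independent, which is what lets a framework spread them across servers.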

Thanks, Mr. Sergey Kostrov and Mr. yaswanth k., for taking the time to help me! I am going to start right away with everything you have suggested. Kudos!

Have a great start and a better future! :-)

Sure, yaswanth k., sir! Thank you!

Thanks for your time, nancy a, but I am not a resident of Chennai, so the links are of no use to me.

Hey Ashish A!

I am one of the fans and followers of Big Data. I consider it the future, and whoever is in this industry is the winner!

It's nice to see that you are interested in this topic too. There are so many tools now to understand and use big data correctly.

If you are that interested in big data, I suggest visiting this website to find out more about the newest big data tools.

DataPlay is an integrated suite of applications, which fully meets your analysis, visualization and presentation needs. It gives integrated project management, complete data management, better and faster analysis, as well as automated rich visualization.

You are welcome to ask any questions you have!

John Rosenberg

Awesome comments


Good to hear that John R. and thanks a ton for your suggestion.

Thanks for the detailed explanation, Ashisha :)


You can go for training in Hadoop and big data analytics. I took the same from an online training institute, and you can join it too. For more information write at:

Thanks for the posts; they were very helpful. I am a student who is also studying big data, and this forum is very helpful.

Our pleasure Aulia R. :)

I have learned new technical tricks to recover big data with our latest ideas.

Thank you, yaswanth k., for your post about some of the skills you may require for "Big Data". It was very helpful.

I'm trying to learn Python for machine learning. As I go deeper and deeper, I come across more tools and SDKs, and I have no idea what to do. I'm confused by things like Theano, NumPy, etc. I hope that I'll understand all of these sometime.

I have looked through this post for data retrieval techniques and learned about lots of the latest technology.

Hi luke l., I hope you received all the necessary information from this post. You can post your queries, if any, for the experts in the community to help you out.

Very nice post, and thanks for it. I always like the super content of these posts. Excellent, very cool idea and great content with different kinds of valuable information.

Thank you for the post! Useful for us.

Big Training courses,


Here I have shared some references about big data and Hadoop training. I hope they will be useful to you.

Big data is a term used to refer to data sets that are too large or complex for traditional data-processing application software to adequately deal with. Data with many cases (rows) offer greater statistical power, while data with higher complexity (more attributes or columns) may lead to a higher false discovery rate.



LIVEWIRE Velachery

NO-01, 2nd floor, railway station service road, Annai Indira Nagar,

Velachery, Chennai-600042

How is Big Data faring among students these days? Is there still a lot of interest?

I am a student and I use the Internet. The Internet is a global network of interconnected networks. If we need any information, we search for it on the Internet and collect it. Many students depend on the Internet to complete their work. A custom essay writing service provides good thesis papers for an academic career, and it also gives useful guidelines that reduce students' doubts and offers more ideas and information for writing thesis papers, so students can write thesis papers with their own ideas.
