Data Analytics & Data Science

The Sexiest Job of the 21st Century

The world is producing more and more data every year. Scientists estimated that the total amount of data stored as of 2011 was 295 exabytes. David Reinsel estimates that the amount of data produced will reach 163 zettabytes by 2025, and in 2016 IBM reported that about 90% of the world’s data had been created within the previous two years. With all of this data, companies will need someone to make sense of it.

IBM Predicts Demand For Data Scientists Will Soar 28% By 2020

What Is Data Science?

Data science is a mixture of statistics, data analysis, machine learning, computer science, and knowledge of the data and the business, and it aims to provide insights and understanding from data. In fact, Peter Naur once used the term as a substitute for computer science. Data science has recently become a buzzword after the Harvard Business Review called data scientist “the sexiest job of the 21st century”. C. F. Jeff Wu, in his lecture for the H. C. Carver Professorship, characterized statistical work as a trilogy of collecting data, modeling and analyzing the data, and then making decisions based on the data.

Data Scientist Salary

Besides being in high demand, one of the things that makes data science so appealing is its high salary. According to Glassdoor.com, the average base pay for a data scientist is $120,931 per year as of this writing, with a low of about $87,000 and a high of about $158,000.

What Is Data Analytics?

Data analysts are sometimes called “junior data scientists” or “data scientists in training.” A data analyst is someone who inspects, cleanses, transforms, and models data with the goal of discovering useful information, informing conclusions, and supporting decision making. Plenty of companies don’t make a clear distinction between a data analyst and a data scientist.

Depending on their level of expertise, data analysts may:

  • Work with IT teams, management and/or data scientists to determine organizational goals
  • Mine data from primary and secondary sources
  • Clean and prune data to discard irrelevant information
  • Analyze and interpret results using standard statistical tools and techniques
  • Pinpoint trends, correlations and patterns in complicated data sets
  • Identify new opportunities for process improvement
  • Provide concise data reports and clear data visualizations for management
  • Design, create and maintain relational databases and data systems
  • Triage code problems and data-related issues
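
To make the cleaning and summarizing tasks in the list above a little more concrete, here is a minimal sketch that uses only Python’s standard library. The records, the field names, and the rule for what counts as a bad entry are all made up for illustration; a real analyst would adapt them to the data at hand.

```python
from statistics import mean, median

# Hypothetical raw records; None and negative values stand in for bad entries.
raw_orders = [
    {"customer": "A", "amount": 120.0},
    {"customer": "B", "amount": None},   # missing value
    {"customer": "C", "amount": 80.0},
    {"customer": "D", "amount": -5.0},   # invalid: negative amount
    {"customer": "E", "amount": 100.0},
]


def clean(records):
    """Keep only records whose amount is present and non-negative."""
    return [r for r in records if r["amount"] is not None and r["amount"] >= 0]


def summarize(records):
    """Return basic descriptive statistics for the cleaned amounts."""
    amounts = [r["amount"] for r in records]
    return {"count": len(amounts), "mean": mean(amounts), "median": median(amounts)}


cleaned = clean(raw_orders)
summary = summarize(cleaned)
print(summary)
```

In practice this kind of work is usually done with a library such as pandas, but the idea is the same: throw out what you cannot trust, then describe what is left.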

Data Analyst Salary

Although data science and data analytics are very similar, the pay scale does not reflect that: according to Glassdoor.com, data analysts are paid less than data scientists. No wonder so many universities and people have changed their curriculum or title from data analyst to data scientist. As of Sep 23, 2018, Glassdoor.com shows an average base pay of $83,878 per year for a data analyst, with a low of about $54,000 and a high of about $107,000.

https://www.glassdoor.com/Salaries/data-analyst-salary-SRCH_KO0,12.htm

Why Learn Data Science/Analytics

If the work you will be doing and the salary haven’t already convinced you that this is a great field, maybe this paragraph will. Statistics, a huge part of data science and analytics, is used everywhere to find patterns and make predictions. In medicine it is used to figure out how effective a drug is. In sports it helps players play better, stay healthier, and play longer. In finance it is used to try to predict the stock market. It also shows up in engineering, politics, and psychology. Basically, if you have the skill set of a data analyst or data scientist, you can work in pretty much any industry.

Many people consider machine learning a superpower, so yes, data scientists who do machine learning are pretty much superheroes. They are so influential, in fact, that they could help sway a whole U.S. presidential election. I’m looking at you, Cambridge Analytica.

Where Is Data Science / Analytics Being Used?

The NBA:

One of my favorite sports to play and watch is basketball, and I have become even more interested not just in the game itself or the players but in sports statistics in general. Dean Oliver, who has a Ph.D. in engineering from North Carolina and later went on to work for the Seattle SuperSonics to put his theories into practice, believed that the four most important keys to team success in basketball, with their relative weights, are:

  • Shoot a high field goal percentage (10).
  • Do not commit turnovers (5–6).
  • Get offensive rebounds (4–5).
  • Get to the foul line frequently (2–3).
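
If you want to play with these four factors yourself, here is a rough sketch of how they are often computed from box-score totals. The formulas below are the commonly published definitions, which I am assuming here since the list above only names the factors; the function name, the argument names, and the sample game totals are hypothetical.

```python
def four_factors(fg, fg3, fga, ft, fta, tov, orb, opp_drb):
    """Compute the four factors from a team's box-score totals.

    Assumed formulas (commonly published definitions, not from this article).
    """
    return {
        # Effective field goal percentage: a made three counts as 1.5 makes.
        "efg_pct": (fg + 0.5 * fg3) / fga,
        # Turnovers per estimated possession; 0.44 approximates the share of
        # free throw attempts that end a possession.
        "tov_pct": tov / (fga + 0.44 * fta + tov),
        # Share of available offensive rebounds the team collected.
        "orb_pct": orb / (orb + opp_drb),
        # Free throws made per field goal attempt.
        "ft_rate": ft / fga,
    }


# Hypothetical single-game totals, for illustration only.
game = four_factors(fg=40, fg3=10, fga=85, ft=20, fta=25, tov=14, orb=11, opp_drb=33)
for name, value in game.items():
    print(f"{name}: {value:.3f}")
```

Comparing a team’s numbers with its opponent’s on each factor is one simple way to see where a game was won or lost.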

You can read more in his book “Basketball on Paper” (Brassey’s, Inc., 2004). Oliver has called his book the “Moneyball” of basketball. In fact, the NBA likes data so much that it holds a hackathon every year for data analysts and data scientists.

Healthcare:

Our health is probably one of the most important things in life, and healthcare is another field that uses data science and data analytics to improve. A good example is the application of machine learning to identify and diagnose diseases. According to a 2015 report, more than 800 medicines and vaccines to treat cancer were in trial from machine learning research.

Disney World:

Disney has combined data analytics with entertainment to create a more personal, magical experience for guests at Walt Disney World. Disney collects a lot of data, and mining it allows Disney to understand past behavior and make personalized offers using predictive analytics. One of the main projects Disney used to gather data was the MyMagic+ initiative, which combined FastPass+, Magic Bands, and My Disney Experience and helped Disney accommodate 3,000 additional daily visitors.

In 2013, Disney introduced the Magic Bands at Walt Disney World in Orlando, FL. These wristbands are waterproof and use short-range radio-frequency identification (RFID) technology and a 2.4 GHz transmitter that tracks your location while you are inside the park. They are fully customizable and give users a personal experience. Oh, and did I mention that they allow Disney to gather all of this data about the individual? Of course, they keep your data private. The information that can be collected from a family that uses the bands during a vacation is priceless.

OKCupid:

OKCupid is a dating website created by a group of Harvard mathematicians. Using the data they collected on OKCupid users, they found three questions you could ask on a first date that help determine whether a couple has the potential to go the distance. According to these mathematicians, if you can find someone who answers all three questions the same way you do, the two of you are a perfect match. There are many other important questions, of course, like “Do you want kids?” and “Do you believe in God?”, but those are pretty heavy for a first date, so the mathematicians looked for lighter questions that still have statistical significance. The three questions you can ask on a first date are below.

1. Do you like horror/scary movies?
2. Have you ever traveled to another country alone?
3. Would you like to ditch it all and go live on a sailboat?

Other fun first date questions that have underlying meanings.

If you want to know if the person you are on a date with will have sex with you on the first date ask:

Do you like the taste of beer?

If the answer is yes, it is likely they will have sex with you on the first date.

If you want to know if the person you are on a date with has the same political views ask:

Do you prefer the people in your life to be simple or complex?

If the person chooses simple, they are most likely conservative, and if they choose complex, they are most likely liberal.

If you want to know if the person you are on a date with has the same religious beliefs ask:

Do spelling and grammar mistakes annoy you?

If your date answers “No”, meaning that grammar mistakes do not annoy them, there is a good chance that your date is at least moderately religious.

Fitbit:

Fitbit tracks the user’s activities throughout the day, like exercise, sleep, and calorie intake. Its devices help users monitor their eating and activity habits so they can make better lifestyle choices. Fitbit is also using its users’ data to create algorithms that diagnose sleep apnea and atrial fibrillation (AFib).

Those are just a few of the organizations and fields, expected or not, that employ data scientists and data analysts. Of course, I didn’t mention the big tech companies like Facebook, Google, Apple, Microsoft, and Amazon, which all collect data on their users.

Awesome Data Science / Analytic Projects To Get Started:

1. Iris Data Set (Classification)

Problem: Predict the class (Iris Setosa, Iris Versicolour, Iris Virginica) of the flower based on the attributes available in the data set which is composed of 4 columns (sepal length, sepal width, petal length, petal width) all in centimeters and 150 rows.
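
To give a feel for what a classifier does with this problem, here is a minimal sketch of a 1-nearest-neighbor predictor written with only the standard library. It uses six rows as they are commonly listed for the Iris data set rather than all 150, so treat it as a toy: the function name is my own, and a real project would load the full data and hold out a test set.

```python
import math

# Six rows from the Iris data set, two per class:
# (sepal length, sepal width, petal length, petal width), all in cm.
TRAINING_ROWS = [
    ((5.1, 3.5, 1.4, 0.2), "Iris Setosa"),
    ((4.9, 3.0, 1.4, 0.2), "Iris Setosa"),
    ((7.0, 3.2, 4.7, 1.4), "Iris Versicolour"),
    ((6.4, 3.2, 4.5, 1.5), "Iris Versicolour"),
    ((6.3, 3.3, 6.0, 2.5), "Iris Virginica"),
    ((5.8, 2.7, 5.1, 1.9), "Iris Virginica"),
]


def predict_species(measurements):
    """Return the label of the closest training row (1-nearest-neighbor)."""
    _, label = min(TRAINING_ROWS, key=lambda row: math.dist(row[0], measurements))
    return label


# Two hypothetical flowers: one with short petals, one with long petals.
print(predict_species((5.0, 3.4, 1.5, 0.2)))
print(predict_species((6.5, 3.0, 5.8, 2.2)))
```

The classifier simply measures the straight-line distance between the new flower and every known flower and copies the label of the closest one.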

2. Loan Prediction Data Set (Classification)

Problem: Predict whether a loan will be approved. This is another classification problem, and it lets you work with an insurance company’s data set. The data has 13 columns and 615 rows.

3. Bigmart Sales Data Set (Regression)

Problem: Predict the sales of a store. The data set for this problem contains 12 columns and 8,523 rows and consists of a store’s transaction records.

4. Movie Lens Data Set

Problem: Make movie suggestions for users. This data set will allow you to build your own movie recommendation engine. It contains 100,000 ratings and 1,300 tags applied to 9,000 movies by 700 users.
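
One simple way to approach a recommendation engine is user-based collaborative filtering: find users whose ratings look like yours and suggest what they liked. The sketch below is a bare-bones version of that idea using only the standard library. The users, movies, and ratings are invented for illustration and are not taken from the MovieLens data, and the scoring rule (a similarity-weighted sum of ratings) is just one of several reasonable choices.

```python
import math

# Hypothetical ratings on a 1-5 scale: user -> {movie: rating}.
RATINGS = {
    "ann": {"Alien": 5, "Heat": 4, "Up": 1},
    "bob": {"Alien": 5, "Heat": 5, "Jaws": 4},
    "cat": {"Up": 5, "Coco": 5, "Alien": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity between two users over the movies both have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank movies the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in ratings[user]:
                scores[movie] = scores.get(movie, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("ann", RATINGS))
```

Here ann’s tastes line up closely with bob’s and poorly with cat’s, so bob’s movie ranks ahead of cat’s in ann’s suggestions.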

5. Boston Housing Data Set

Problem: Predict the median value of owner-occupied homes. This is another popular data set in the pattern recognition literature. It comes from the real estate industry in Boston (US). This is a regression problem. The data has 506 rows and 14 columns, so it’s a fairly small data set on which you can attempt any technique without worrying about overusing your laptop’s memory.
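
To show what a regression problem looks like in code, here is a minimal sketch of ordinary least squares with a single feature, written without any libraries. The numbers are made up (an average room count and a home value in thousands of dollars) and are not taken from the Boston data; a real attempt would use all 14 columns and a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: average rooms per home and median value in $1000s.
rooms = [4, 5, 6, 7, 8]
values = [12, 14, 21, 24, 29]

slope, intercept = fit_line(rooms, values)
predicted = slope * 6.5 + intercept
print(f"value = {slope:.1f} * rooms + {intercept:.1f}")
print(f"predicted value for 6.5 rooms: {predicted:.1f}")
```

The fitted line is the one that minimizes the squared distance between the predicted and the actual values, which is the starting point for most regression techniques.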

6. Time Series Data Set

Problem: Predict the future price of stocks.
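
A good first step with any time series is a naive baseline to beat. The sketch below forecasts the next value as the average of the last few observations (a simple moving average). The prices and the window size are hypothetical, and this is only a baseline: predicting real stock prices is far harder than this suggests.

```python
def moving_average_forecast(prices, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]
    return sum(recent) / window


# Hypothetical daily closing prices, oldest first.
closing_prices = [101.0, 102.0, 104.0, 103.0, 105.0, 107.0]

forecast = moving_average_forecast(closing_prices)
print(f"forecast for the next day: {forecast:.2f}")
```

Any model you build later should do better than this baseline on data it has not seen, otherwise the extra complexity is not paying for itself.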

What Kind of Skills Will You Need?

According to mastersindatascience.org, you need the following:

Technical Skills

  • Statistical methods and packages (e.g. SPSS)
  • R, Python and/or SAS languages
  • Data warehousing and business intelligence platforms
  • SQL databases and database querying languages
  • Programming (e.g. XML, JavaScript or ETL frameworks)
  • Database design
  • Data mining
  • Data cleaning and munging
  • Data visualization and reporting techniques
  • Working knowledge of Hadoop & MapReduce
  • Machine learning techniques

Business Skills

  • Analytic Problem-Solving: Employing best practices to analyze large amounts of data while maintaining intense attention to detail.
  • Effective Communication: Using reports and presentations to explain complex technical ideas and methods to an audience of laymen.
  • Creative Thinking: Questioning established business practices and brainstorming new approaches to data analysis.
  • Industry Knowledge: Understanding what drives your chosen industry and how data can contribute to the success of a company/organization strategy.

What About Certifications?

There are many big data certifications available (e.g. SAS and SQL).

Where Can You Learn More About Data Science / Analytics

Dive Right Into Machine Learning

Here are some of the best free introductory courses on the “interwebs”.

Free online YouTube Videos for CS231n Winter 2016.

Machine Learning Crash Course By Google

Andrew Ng Machine Learning Course on Coursera

Foundations of Data Science

Prediction: Machine Learning

How To Become A Data Scientist / Analyst

  1. Earn a bachelor’s degree in math, statistics, computer science, information management, finance or economics.
  2. Decide if you want to earn a master’s or doctoral degree, as most data scientists have one of these degrees.
  3. Sign up for classes that target a specific subject.
  4. Master college-level algebra
  5. Learn statistics
  6. Learn to program.
  7. Have strong communication and presentation skills so you can explain a complicated subject simply.
  8. Learn how to use Microsoft Excel
  9. Learn machine learning
  10. Look for data analyst jobs
  11. Get an internship as a data analyst

There is a great book on data science and data analytics called Python Data Science Handbook: Essential Tools for Working with Data. It covers machine learning, Python visualizations, and data manipulation.

Python Data Science Handbook: Essential Tools for Working with Data

If you are interested in data science and data analytics after reading this article, you should check out FiveThirtyEight.com, which has many interesting articles and statistics on politics, sports, science, health, economics, and culture. Thanks for reading. I hope you found this article as helpful and enjoyable as I found writing it! Happy learning, and I will see you all in my next article.

If you enjoyed this article, please leave some claps and share it. Thanks for taking the time out of your busy day to read it!

Check out the following for content and videos on Computer Science, Algorithm Analysis, Programming and Logic:

YouTube Channel:
randerson112358: https://www.youtube.com/channel/UCaV_0qp2NZd319K4_K8Z5SQ

compsci112358:
https://www.youtube.com/channel/UCbmb5IoBtHZTpYZCDBOC1CA

Website:
http://everythingcomputerscience.com/

Video Tutorials on Recurrence Relation:
https://www.udemy.com/recurrence-relation-made-easy/

Video Tutorial on Algorithm Analysis:
https://www.udemy.com/algorithm-analysis/

Twitter:
https://twitter.com/CsEverything

RESOURCES:

NBA