{"id":8090,"date":"2019-01-10T11:50:46","date_gmt":"2019-01-10T16:50:46","guid":{"rendered":"https:\/\/blog.brainstation.io\/?p=8090"},"modified":"2020-05-15T11:26:46","modified_gmt":"2020-05-15T15:26:46","slug":"5-free-tools-to-make-data-science-easier","status":"publish","type":"post","link":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier","title":{"rendered":"5 Free Tools to Make Data Science Easier"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">One of the great things about data science is that many of the state-of-the-art tools used by Data Scientists are free. In fact, the volume of free data tools available is so large it can sometimes be overwhelming. To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Anaconda Distribution<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">What makes Python a great tool for data science is the large community of Developers who have built Python-based data science libraries. Libraries like NumPy, SciPy, Pandas, scikit-learn, and many others are indispensable to Data Scientists working in Python. Unfortunately, juggling all of these Python libraries is challenging even for the most seasoned Programmer. They can be difficult to install, and many of them have dependencies on certain software outside of Python. <\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/www.anaconda.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Anaconda<\/a> is a freely available Python distribution and package manager that solves this problem. The Anaconda Python distribution comes pre-installed with over 200 of the most popular data science Python libraries, and the Anaconda package manager provides an easy way to install over 2,000 additional packages without worrying about software dependencies. 
Anaconda also comes with many other popular tools, including Jupyter Notebook, which enables Data Scientists to work interactively in a browser-based environment.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">RStudio &amp; RStudio Server<\/span><\/h2>\n<p><span style=\"font-weight: 400;\"><a href=\"https:\/\/www.rstudio.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">RStudio<\/a> is an Integrated Development Environment (IDE) tailor-made for performing interactive data analysis and more formal programming in R. It combines an environment for interactive work, with an R console and a data visualization panel, and a fully featured text editor with syntax highlighting and code completion.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A lesser-known tool is RStudio Server, a fully functional version of the RStudio IDE that runs on a server and is accessed through the browser. This means that you can access the RStudio IDE from anywhere with an internet connection and offload the computation to dedicated resources. This permits Data Scientists to work with potentially sensitive data without having to download it onto personal machines and to perform complex and computationally heavy work in R from any device.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">OpenRefine<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Originally developed at Metaweb as Freebase Gridworks and later maintained by Google as Google Refine, <a href=\"http:\/\/openrefine.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">OpenRefine<\/a> is an open-source tool for data cleaning. It allows practitioners to read in data that is messy or corrupted, perform bulk transformations to fix errors and generate clean data, and export the results in a range of useful formats.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One of the best features of OpenRefine is that it tracks every operation performed on a dataset, making it easy to retrace steps and recreate workflows. 
This is especially useful when you have multiple files that have the same data integrity issues and require the same transformations. OpenRefine allows you to export the sequence of changes that you made to the first data file and apply it to the second, saving hours of repeated work and reducing the potential for human error.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">OpenRefine also has very powerful tools for dealing with messy text fields. For instance, if you have a column in your dataset with the entries \u201cVancouver, BC.\u201d, \u201cVANCOUVER BC\u201d, and \u201cvancouver b.c.,\u201d OpenRefine\u2019s text clustering tools can recognize that these probably refer to the same place, and perform bulk transformations to apply a single label to each occurrence.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Apache Airflow<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">At most organizations, data does not reside in a single place, nor is it accessible by a single method. There are usually multiple databases, data stores, APIs, and other processes keeping track of data across the organization. A big part of the data team\u2019s job is to move that data from where it resides to where it needs to be for analytics, transforming it as necessary along the way. Ideally, this work should be as automated as possible, and <a href=\"https:\/\/airflow.apache.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Apache Airflow<\/a> can help.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Airflow was developed for internal use by Engineers at Airbnb and open-sourced in 2015. It is a tool for mapping out, automating, and scheduling complex workflows that involve many different systems with interdependencies. It provides tools for monitoring the success of these pipelines and alerting Engineers if something goes wrong. 
Airflow also has a web-based user interface that presents workflows as a network of small jobs so that dependencies can be easily visualized.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">H2O<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">As machine learning has matured, a few basic algorithms have become widely applicable. Generalized linear models, tree-based models, and neural networks have all become fundamental parts of the machine learning toolkit. However, while many of the usual implementations of these algorithms in R and Python are great for prototyping and proofs-of-concept, they don\u2019t scale well to production.<\/span><\/p>\n<p><span style=\"font-weight: 400;\"><a href=\"http:\/\/h2o.ai\" target=\"_blank\" rel=\"noopener noreferrer\">H2O<\/a> is an open-source tool that provides efficient and scalable implementations of the most popular statistical and machine learning algorithms. It can connect to many different types of data stores and will run on anything from a single laptop to a massive computing cluster. It has robust and flexible tools for building and fine-tuning model prototypes, and models built in H2O are easy to deploy in production environments. 
Best of all, H2O has Python and R APIs so that data scientists can seamlessly integrate it with their existing environments.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">While there are many data tools available, free tools are an excellent place to start in order to speed up and refine your data processes.<\/span><\/p>\n<div class=\"lead-grid-container\">\n<div class=\"lead__card\">\n<div class=\"lead__image\"><img decoding=\"async\" class=\"hide--mobile\" src=\"https:\/\/brainstation.io\/blog\/wp-content\/uploads\/2020\/03\/Data.jpg\" alt=\"Icon\" \/><\/div>\n<div class=\"lead__content\">\n<p id=\"lead__heading\" class=\"heading--4\">Become a Data Scientist in just 12 weeks!<\/p>\n<p class=\"lead__description\">BrainStation&#8217;s <a href=\"https:\/\/brainstation.io\/course\/online\/remote-data-science-bootcamp?utm_source=Blog&amp;utm_medium=BlogPost&amp;utm_campaign=lead_bookCall\" target=\"_blank\" rel=\"noopener noreferrer\">Data Science Diploma Program<\/a> is a full-time, 12-week program that provides professionals with the skills and experience to start a new career in data.<\/p>\n<p id=\"lead__button--margin\"><a id=\"lead__button--hover\" class=\"lead__button\" href=\"https:\/\/brainstation.io\/book-call\/data-science-bootcamp?utm_source=Blog&amp;utm_medium=BlogPost&amp;utm_campaign=lead_bookCall\" target=\"_blank\" rel=\"noopener noreferrer\" data-wplink-edit=\"true\">Speak to a Learning Advisor<\/a><\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.<\/p>\n","protected":false},"author":7,"featured_media":8091,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[343],"tags":[717,405,716],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v18.9 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ 
-->\n<title>5 Free Tools to Make Data Science Easier | BrainStation\u00ae Blog<\/title>\n<meta name=\"description\" content=\"Trying to make sense of data? To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"5 Free Tools to Make Data Science Easier | BrainStation\u00ae Blog\" \/>\n<meta property=\"og:description\" content=\"Trying to make sense of data? To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier\" \/>\n<meta property=\"og:site_name\" content=\"BrainStation\u00ae Blog\" \/>\n<meta property=\"article:published_time\" content=\"2019-01-10T16:50:46+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-05-15T15:26:46+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/d2re7sjnpekmig.cloudfront.net\/prod\/wp-content\/uploads\/2019\/01\/adeolu-eletu-13086-unsplash.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1159\" \/>\n\t<meta property=\"og:image:height\" content=\"400\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"BrainStation\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/brainstation.io\/blog\/#website\",\"url\":\"https:\/\/brainstation.io\/blog\/\",\"name\":\"BrainStation\u00ae Blog\",\"description\":\"The Digital Learning Company\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/brainstation.io\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#primaryimage\",\"url\":\"https:\/\/d2re7sjnpekmig.cloudfront.net\/prod\/wp-content\/uploads\/2019\/01\/adeolu-eletu-13086-unsplash.jpg\",\"contentUrl\":\"https:\/\/d2re7sjnpekmig.cloudfront.net\/prod\/wp-content\/uploads\/2019\/01\/adeolu-eletu-13086-unsplash.jpg\",\"width\":1159,\"height\":400,\"caption\":\"data scientist\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#webpage\",\"url\":\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier\",\"name\":\"5 Free Tools to Make Data Science Easier | BrainStation\u00ae Blog\",\"isPartOf\":{\"@id\":\"https:\/\/brainstation.io\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#primaryimage\"},\"datePublished\":\"2019-01-10T16:50:46+00:00\",\"dateModified\":\"2020-05-15T15:26:46+00:00\",\"author\":{\"@id\":\"https:\/\/brainstation.io\/blog\/#\/schema\/person\/9f37983a6c4da6cf5dd422481ac8cf11\"},\"description\":\"Trying to make sense of data? 
To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.\",\"breadcrumb\":{\"@id\":\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/brainstation.io\/blog\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"5 Free Tools to Make Data Science Easier\"}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/brainstation.io\/blog\/#\/schema\/person\/9f37983a6c4da6cf5dd422481ac8cf11\",\"name\":\"BrainStation\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/brainstation.io\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/80c14b8388838ae1453aec36606b232d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/80c14b8388838ae1453aec36606b232d?s=96&d=mm&r=g\",\"caption\":\"BrainStation\"},\"description\":\"BrainStation is a global leader in digital skills training, empowering businesses and brands to succeed in the digital age. Established in 2012, BrainStation has worked with over 250 instructors from the most innovative companies, developing cutting-edge, real-world digital education that has empowered more than 50,000 professionals and some of the largest corporations in the world.\",\"url\":\"https:\/\/brainstation.io\/blog\/author\/brainstation\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"5 Free Tools to Make Data Science Easier | BrainStation\u00ae Blog","description":"Trying to make sense of data? 
To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier","og_locale":"en_US","og_type":"article","og_title":"5 Free Tools to Make Data Science Easier | BrainStation\u00ae Blog","og_description":"Trying to make sense of data? To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.","og_url":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier","og_site_name":"BrainStation\u00ae Blog","article_published_time":"2019-01-10T16:50:46+00:00","article_modified_time":"2020-05-15T15:26:46+00:00","og_image":[{"width":1159,"height":400,"url":"https:\/\/d2re7sjnpekmig.cloudfront.net\/prod\/wp-content\/uploads\/2019\/01\/adeolu-eletu-13086-unsplash.jpg","type":"image\/jpeg"}],"twitter_card":"summary_large_image","twitter_misc":{"Written by":"BrainStation","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebSite","@id":"https:\/\/brainstation.io\/blog\/#website","url":"https:\/\/brainstation.io\/blog\/","name":"BrainStation\u00ae Blog","description":"The Digital Learning Company","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/brainstation.io\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#primaryimage","url":"https:\/\/d2re7sjnpekmig.cloudfront.net\/prod\/wp-content\/uploads\/2019\/01\/adeolu-eletu-13086-unsplash.jpg","contentUrl":"https:\/\/d2re7sjnpekmig.cloudfront.net\/prod\/wp-content\/uploads\/2019\/01\/adeolu-eletu-13086-unsplash.jpg","width":1159,"height":400,"caption":"data scientist"},{"@type":"WebPage","@id":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#webpage","url":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier","name":"5 Free Tools to Make Data Science Easier | BrainStation\u00ae Blog","isPartOf":{"@id":"https:\/\/brainstation.io\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#primaryimage"},"datePublished":"2019-01-10T16:50:46+00:00","dateModified":"2020-05-15T15:26:46+00:00","author":{"@id":"https:\/\/brainstation.io\/blog\/#\/schema\/person\/9f37983a6c4da6cf5dd422481ac8cf11"},"description":"Trying to make sense of data? 
To help you cut through the noise and identify which tools to use, here is a list of the best free software tools for working with data.","breadcrumb":{"@id":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/brainstation.io\/blog\/5-free-tools-to-make-data-science-easier#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/brainstation.io\/blog"},{"@type":"ListItem","position":2,"name":"5 Free Tools to Make Data Science Easier"}]},{"@type":"Person","@id":"https:\/\/brainstation.io\/blog\/#\/schema\/person\/9f37983a6c4da6cf5dd422481ac8cf11","name":"BrainStation","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/brainstation.io\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/80c14b8388838ae1453aec36606b232d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/80c14b8388838ae1453aec36606b232d?s=96&d=mm&r=g","caption":"BrainStation"},"description":"BrainStation is a global leader in digital skills training, empowering businesses and brands to succeed in the digital age. 
Established in 2012, BrainStation has worked with over 250 instructors from the most innovative companies, developing cutting-edge, real-world digital education that has empowered more than 50,000 professionals and some of the largest corporations in the world.","url":"https:\/\/brainstation.io\/blog\/author\/brainstation"}]}},"_links":{"self":[{"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/posts\/8090"}],"collection":[{"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/comments?post=8090"}],"version-history":[{"count":7,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/posts\/8090\/revisions"}],"predecessor-version":[{"id":11606,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/posts\/8090\/revisions\/11606"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/media\/8091"}],"wp:attachment":[{"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/media?parent=8090"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/categories?post=8090"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brainstation.io\/blog\/wp-json\/wp\/v2\/tags?post=8090"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}