Introduction to Data Science (for beginners & the curious)


Data Science 2018: An Introduction

We continue our kickoff for the year with an introduction to the world of Data Science.

What is Data Science?

Data Science is the study of the generalizable extraction of knowledge from data. Being a data scientist requires an integrated skill set spanning mathematics, statistics, machine learning, databases and other branches of computer science, along with a good understanding of the craft of problem formulation to engineer effective solutions. This short lecture will introduce you to this rapidly growing field, covering some of its basic principles and tools as well as its general mindset. Students will learn the concepts, techniques and tools they need to deal with various facets of data science practice, including data collection and integration, exploratory data analysis, predictive modeling, descriptive modeling, data product creation, evaluation, and effective communication. The focus will be on a very high-level treatment of these topics, the synthesis of concepts, and their application to solving problems.

Following this there will be a general discussion, Q&A and an update on the current state of the landscape (recent major announcements will also be shared).


Sharan Kalwani of dataSwing LLC


Data Science for all of us

In the world of High Performance Computing (HPC), data analysis has been part and parcel of the work since day one. In those days it was known as "Data Intensive Computing". These days the term 'Big Data' is more fashionable, popular and catchy. All of this has been enabled by techniques pioneered by HPC and the silent world of stats, plus low-cost compute clusters, storage and of course boatloads of open source software. The formal term 'data scientist' sounds odd, but it describes a new sort of profession, in which one combines knowledge of one's own domain (e.g. engineering data, marketing, retail, the medical field, whatever...) with mathematical techniques and a sound understanding of statistics, plus comprehensive ingest of all data, in order to make diagnostic, predictive, now prescriptive and pretty soon very pervasive cognitive decisions.

We will be introduced to the glossary of terms used in this field, along with several examples. This will NOT make you an instant overnight data scientist or expert, but you will at least be able to recognize and navigate many pieces of the larger picture without bumping into them.

Note: CEUs/PDHs available, but must be requested in advance (free to IEEE members)



A seasoned scientific, technical and computing professional, Sharan has spent over 20 years implementing many new and pioneering technologies, from operating systems (*nix), high performance computing (Cray, SGI, compute clusters), engineering applications (CAE simulations), optimization, networking (TCP/IP, InfiniBand) and operations (ITIL, ITSM) to scientific domains (BioInformatics) and project management. Sharan looks to increase the professional approach of every individual he interacts with. He enjoys teaching, contributing to STEM activities and publishing. He is a senior member of IEEE, ACM, an Emeritus member of Michigan!/usr/group, and the recently elected President of SEMCO (one of the pioneer computer user groups in Michigan). He is also currently the Chair of the IEEE SE Michigan Education Society Chapter for 2017. He is also a published author on the topic "UNIX and TCP/IP network security" (ISBN: 1581430213, ISBN-13: 9781581430219, Publisher: ProsoftTraining (May 1999), Format: paperback). Currently he is working hard on writing his next book and trying to become an expert in an emerging new parallel SD storage system.


