CS 102: Big Data Tools and Techniques Discoveries and Pitfalls. Spring 2018

What's This Course About?
Aimed at non-CS undergraduate and graduate students who want to learn the basics of big data tools and techniques and apply that knowledge in their areas of study.
Many of the world's biggest discoveries and decisions in science, technology, business, medicine, politics, and society as a whole are now being made on the basis of analyzing massive data sets. At the same time, it is surprisingly easy to make errors or come to false conclusions from data analysis alone.
This course provides a broad and practical introduction to big data:
- Data analysis techniques, including databases, data mining, and machine learning
- Data analysis tools, including spreadsheets, relational databases and SQL, Python, and R
- Data visualization techniques and tools
- Pitfalls in data collection and analysis
- Historical context, privacy, and other ethical issues
Tools and techniques are covered hands-on but at a cursory level, providing a basis for future exploration and application.
Prerequisites: comfort with basic logic and mathematical concepts, along with high school AP computer science, CS106A, or other equivalent programming experience.

Who Should Take It?
Non-CS undergraduate and graduate students who want to learn the basics of big data tools and techniques and apply that knowledge in their areas of study.
Prerequisites: comfort with basic logic and mathematical concepts, along with high school AP computer science, CS106A, or other equivalent programming experience.

Who Shouldn't Take It?
Computer Science or MCS students (except by petition)
If you're in the wrong place, it's okay to leave now

Course Staff
Instructor: Jennifer Widom
Course Assistants: Steven Chen, Alex Haigh, Arjun Kunna, Jesse Min, Lucy Wang

History of the Course
Fall 2015 - Freshman seminar
Spring 2016 - First offering of the course
2016-17 - Basis for sabbatical instructional odyssey: 30+ institutions in 18 countries
Spring 2017 - Second offering, by graduate students
Fall 2017 - First offering as Dean, had fun!

Who's Taking It, Spring 2018
Undergraduates, Masters, MBA, JD, PhD, DCI
Biochemistry; Bioengineering; Biomedical Informatics; Business Administration; Chemical Engineering; Chemistry; Civil & Environmental Engineering; Classics; Communication; Community Health; Earth Systems; Economics; Education; Electrical Engineering; Energy Resource Engineering; English; Environment & Resources; Epidemiology; Geological & Environmental Science; History; Human Biology; International Policy Studies; International Relations; Law; Management; Materials Science & Engineering; Mathematics; Management Science & Engineering; Mechanical Engineering; Public Policy; Science, Technology, & Society; Sociology; Symbolic Systems; Urban Studies; Undeclared

Assigned Work
Assignment/Project (Assigned; Due)
Assignment #1 - Spreadsheets for Data Analysis and Visualization (April 9; April 16)
Project #1 - Personal Data Analysis (April 9; April 23, May 14)
Assignment #2 - Data Visualization Using Tableau, SQL (April 16; April 26)
Assignment #3 - Python for Data Analysis and Visualization (April 26; May 7)
Assignment #4 - Machine Learning (May 14; May 24)
Project #2 - Movie-Rating Predictions (May 14; May 31)
Assignment #5 - Data Mining, R Language, Network Analysis (May 24; June 6)

Exams
Midterm exam - May 10, in class
Final exam - June 8, 12:15-3:15 PM (2 hours)

Logistics
Units - 4 for undergraduates, 3-4 for graduates
WAYS requirement - Applied Quantitative Reasoning (WAY-AQR)
Textbook? No
Readings? Recommended
Class attendance - Expected
  - Hands-on activities
  - Only cursory notes
  - All class material is fair game for exams

Logistics
Grade weighting - 1/3 each for assignments, projects, and exams
Graded on a curve? Not really
Late policy - 10%/30% penalty for work up to 24/48 hours late, four free late days
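As a rough illustration of the arithmetic on this slide, the sketch below computes a late penalty and an equally weighted course score. It is not the official grading procedure. The function names are hypothetical, the penalty is assumed to be a fraction of the earned score, and the four free late days are not modeled because the slide does not say how they are applied.

```python
def late_penalty(hours_late: float) -> float:
    """Fraction deducted for late work: 10% up to 24 hours, 30% up to 48 hours.

    Assumption: the slide does not state what happens beyond 48 hours,
    so that case raises an error instead of guessing.
    """
    if hours_late <= 0:
        return 0.0
    if hours_late <= 24:
        return 0.10
    if hours_late <= 48:
        return 0.30
    raise ValueError("policy beyond 48 hours late is not stated on the slide")


def penalized_score(raw_score: float, hours_late: float) -> float:
    """Apply the late penalty as a fraction of the earned score (an assumption)."""
    return raw_score * (1.0 - late_penalty(hours_late))


def course_score(assignments: float, projects: float, exams: float) -> float:
    """Weight the three components equally, 1/3 each."""
    return (assignments + projects + exams) / 3.0


if __name__ == "__main__":
    # Hypothetical component averages on a 0-100 scale.
    print(penalized_score(100.0, hours_late=30))
    print(course_score(assignments=90.0, projects=80.0, exams=70.0))
```

Raising an error past 48 hours keeps the sketch from inventing a rule the slide does not give.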

Office Hours
TA office hours - 20 hours per week
  - Times and locations can vary
  - Always check the course calendar!
Prof. Widom office hours - Wednesdays 4:00-5:00 PM
  - Huang building, 2nd floor, Dean's Office #227

Online
Website - http://cs102.stanford.edu
Piazza - Announcements, Q&A (private and public), Discussion
Gradescope - Assignment submission & grading

For Thursday's Class
1) Get set up on Google Drive if you're not already
2) Download the Europe city temperatures data from the course website (two files)
3) Copy the data files into Google Drive and make sure you can open them with Google Sheets
4) Bring a laptop to class (or share)

Questions?