Joe Ganser's CV
Employment
TMCI International / Healthcare - Data/Machine Learning Engineer
- Designed a big data machine learning system to identify 100,000+ duplicated hospital records across 200 million+ data points in the military hospital data system, on behalf of the Defense Health Agency (DoD).
- Trained & led a team of five in using PySpark ML unsupervised learning models (Splink/Fellegi-Sunter) to cluster hundred-million-row tabular datasets via distributed cloud computing.
- Designed automation systems connecting AWS S3, AWS EMR, and AWS Redshift to run repetitive ETL tasks via PySpark, using shell scripts scheduled as Linux cron jobs
- Automated the cleaning and distribution of multi-million-row datasets using PySpark, AWS EMR, MySQL & Redshift
- Worked with Python, unsupervised machine learning, Spark, Shell, Linux, distributed computing, MySQL, PostgreSQL, Windows, PuTTY, Databricks
- Acquired and maintained a Tier 2.5 secret government security clearance for access to tabular data for
Saint Peter's University / Education - Data Science/Engineering Adjunct Professor
- Lectured on Python, Spark, Machine Learning & SQL to in-person & online audiences of 30+ graduate students
- Wrote and designed in-person & online courses on Big Data Computing, Spark and Hadoop, Databricks, and Intro to Python for data science.
- Provided mentorship and guidance to students changing their careers
- Courses taught:
- Python for data science - DS542
- Big Data computing with Spark & Hadoop - DS610
Berkeley College / Education - Data Science/Machine Learning Adjunct Professor
- Lectured on Python, supervised learning & unsupervised learning to an online audience of 10+ senior undergraduates.
- Wrote and developed educational course content for beginner coders
- Courses taught: Advanced Programming for AI and Big Data - BDS4440
IAC / App Marketing - Data Scientist/Engineer
- Designed & implemented cloud-based cLTV automation tools used by a finance team of 20+ people, including executives & stakeholders
- Led a team of five people to replace outdated financial/marketing forecasting models with cloud-automated statistical models
- Built extract, transform, load (ETL) Python/SQL frameworks on cloud platforms connecting Google Cloud to Snowflake to Google Sheets
- Designed & implemented PySpark ALS collaborative filtering recommendation engines for app advertisements using implicit user behavior data
- Performed causal time series analysis to measure user subscription lift in response to in-app advertising
- Worked with Python, SQL, Snowflake, PySpark, Databricks, Amazon Web Services, HTTP/Lambda functions, SageMaker, Google Cloud
- Promoted to Data Scientist from Senior Data Analyst in August 2020
- Research focused on subscription user behaviors and the marketing of mobile web apps
Flatiron School / Education - Data Science Assistant Instructor
- Lectured on intro to Python for data science to an in-person audience of 30+ adult students
- Wrote and developed educational course content for beginner Python/machine learning coders
- Provided mentorship and guidance to students changing their careers
Simulmedia / TV Marketing - Data Scientist Intern
- Analyzed past television marketing campaign data to extract key features used to determine the types of individuals that converted on an advertisement versus the types that didn't
- Assisted in the design of a user-based collaborative filtering recommendation system to find individuals who should be targeted for future advertisement campaigns
- Worked with Python 3.x, Amazon Web Services Redshift & Postgres
- Data science applied to television marketing
PebblePost / Mail Marketing - Data Scientist Intern
- Used PySpark to automate data cleaning of millions of snail-mail postage addresses
- Created forecasting models to describe and predict customer conversion rates
- Worked with cloud-based cluster computing to analyze multi-billion-row datasets
- Used Python, PySpark, MySQL & PostgreSQL to analyze big data hosted on Amazon Web Services
- Data science applied to mail-based marketing
General Assembly - Data Science Fellow
- Coded with the Python packages NumPy, Pandas, scikit-learn, Seaborn, ARIMA, et al. to produce graphical visualizations, calculations and predictions for real-world data problems
- Led teams to solve data science problems and created presentations weekly
- Created a predictive model of the price of Bitcoin using time series data modeled with ARIMA
- Solved problems in regression, prediction, classification and clustering
Rutgers Newark - Graduate Physics Teaching Assistant
- Teaching assistant for college Physics 1 & 2 for STEM undergraduate classes
- Lectured on Physics during recitation sessions
New Jersey Institute of Technology/Rutgers Newark - Research Assistant/PhD student in Physics
- Coded in Python and Wolfram Mathematica to create computer models of radiation fields
- Performed mathematical analysis on electromagnetic equations to calculate Wi-Fi transmission during snowstorms
- Lab work on multispectral imaging
City University of New York - Adjunct Lecturer & Physics Lab Technician
- Lectured on Physics 1, Physics 2, Physics 1 lab and Physics 2 lab courses for undergraduate students
- Taught courses for both science and non-science majors
- Six semesters of teaching
- Taught Physics 110, Physics 111, Physics 210, Physics 211 and their corresponding lab sections
Education
Columbia University 2011 - New York, NY
- Master of Science - Applied Physics/Applied Mathematics
- School of Engineering and Applied Science
- Master's project - Designed a LabVIEW control system for a voltage control rod to be installed in a nuclear fusion machine
Pace University 2008 - Pleasantville, NY
- Bachelor of Science - Applied Mathematics/Physics
- Dyson School of Arts & Sciences
- Won the Who's Who award in Colleges & Universities for founding a campus science club
Skills
Some of the algorithms & technologies I've worked with
- Python - expert - coding daily since 2016
- SQL - expert - coding daily since 2016
- Snowflake platform
- AWS Redshift platform
- Web development - competent - HTML/JavaScript/CSS et al.
- Made my first .com web page in 1999 ("mechanicalthought.com"). Have dabbled in it since.
- Machine Learning:
- Supervised Learning:
- Linear and nonlinear models, ensemble learning
- Worked with everything from ordinary least squares to XGBoost
- Unsupervised Learning:
- K-means, hierarchical clustering, density-based clustering (DBSCAN), Gaussian mixture models, t-SNE, PCA, SVD
- Neural networks:
- Computer Vision Applications
- Keras framework
- Transfer learning
- Recommendation engines:
- Alternating Least Squares, Cosine/Pearson Similarity
- Statistics:
- A/B testing in frequentist & Bayesian frameworks
- Causal inference in time series
- Hypothesis testing
- Big Data Technologies:
- Spark
- MapReduce
- Databricks
- Google Cloud Platform & Amazon Web Services:
- Lambda/HTTP functions
- cloud functions
- load balancers
- buckets
- SageMaker/Colab