Joe Ganser's CV

Employment

TMCI International / Healthcare - Data/Machine Learning Engineer

Remote - Jan. 2023 to Nov. 2023

  • Designed a big data machine learning system to identify 100,000+ duplicated hospital records across 200 million+ data points in the military hospital data system, on behalf of the Defense Health Agency (DoD).
  • Trained & led a team of five on the usage of PySpark ML unsupervised learning models (Splink/Fellegi-Sunter) for the clustering of hundred-million-row tabular datasets via distributed cloud computing.
  • Designed automation systems connecting AWS S3, AWS EMR, and AWS Redshift to run repetitive ETL tasks via PySpark using shell scripts on Linux cron jobs
  • Automated the cleaning and distribution of multi-million-row datasets using PySpark, AWS EMR, MySQL & Redshift
  • Worked with Python, unsupervised machine learning, Spark, Shell, Linux, distributed computing, MySQL, PostgreSQL, Windows, PuTTY, Databricks
  • Acquired and maintained a Tier 2.5 secret government security clearance for access to tabular data for

Saint Peter's University / Education - Data Science/Engineering Adjunct Professor

Jersey City, NJ - Sept. 2022 to Mar. 2023

  • Lectured on Python, Spark, Machine Learning & SQL to in-person & online audiences of 30+ graduate students
  • Wrote and designed in-person & online courses on Big Data Computing, Spark and Hadoop, Databricks, and Intro to Python for Data Science.
  • Provided mentorship and guidance to students changing their careers
  • Courses taught:

Berkeley College / Education - Data Science/Machine Learning Adjunct Professor

Remote/Online - Aug. 2021 to Dec. 2022

  • Lectured on Python, supervised learning & unsupervised learning to an online audience of 10+ senior undergraduates.
  • Wrote and developed course educational content for beginner coders
  • Courses taught: Advanced Programming for AI and Big Data - BDS4440

IAC / App Marketing - Data Scientist/Engineer

New York, NY / Hybrid - Sept. 2019 to May 2022

  • Designed & implemented cloud-based cLTV automation tools used by a finance team of 20+ people, including executives & stakeholders
  • Led a team of five people to replace outdated financial/marketing forecasting models with cloud-automated statistical models
  • Built extract-transform-load (ETL) Python/SQL frameworks on cloud platforms connecting Google Cloud to Snowflake to Google Sheets
  • Designed & implemented PySpark ALS collaborative filtering recommendation engines for app advertisements using implicit user behavior data
  • Performed causal time series analysis to measure user subscription lift in response to in-app advertising
  • Worked with Python, SQL, Snowflake, PySpark, Databricks, Amazon Web Services, HTTP/Lambda functions, SageMaker, Google Cloud
  • Promoted to Data Scientist from Senior Data Analyst in August 2020
  • Research focused on subscription user behaviors and the marketing of mobile web apps

Flatiron School / Education - Data Science Assistant Instructor

New York, NY - Sept. 2018 to Apr. 2019

  • Lectured on intro to Python for data science to an in-person audience of 30+ adult students
  • Wrote and developed course educational content for beginner Python/machine learning coders
  • Provided mentorship and guidance to students changing their careers

Simulmedia / TV Marketing - Data Scientist Intern

New York, NY - June 2018 to Sept. 2018

  • Analyzed past television marketing campaign data to extract key features used to determine the types of individuals that converted on an advertisement versus the types that didn't
  • Assisted in the design of a user-based collaborative filtering recommendation system to find individuals who should be targeted for future advertisement campaigns
  • Worked with Python 3.x, Amazon Web Services Redshift & Postgres
  • Data science applied to television marketing

PebblePost / Mail Marketing - Data Scientist Intern

New York, NY - Jan. 2018 to May 2018

  • Used PySpark to automate data cleaning of millions of snail-mail postage addresses
  • Created forecasting models to describe and predict customer conversion rates
  • Worked with cloud-based cluster computing to analyze multi-billion-row datasets
  • Used Python, PySpark, MySQL & PostgreSQL to analyze big data hosted on Amazon Web Services
  • Data science applied to mail-based marketing

General Assembly - Data Science Fellow

New York, NY - June 2017 to Sept. 2017

  • Coded with Python packages NumPy, pandas, scikit-learn, Seaborn, ARIMA, et al. to produce graphical visualizations, calculations and predictions for real-world data problems
  • Led teams to solve data science problems and created presentations weekly
  • Created a predictive model on the price of bitcoin using time series data modeled with ARIMA
  • Solved problems in regression, prediction, classification and clustering

Rutgers Newark - Graduate Physics Teaching Assistant

Newark, NJ - June 2015 to Sept. 2015

  • Teaching assistant for college Physics 1 & 2 in STEM undergraduate classes
  • Lectured on Physics during recitation sessions

New Jersey Institute of Technology/Rutgers Newark - Research Assistant/PhD student in Physics

Newark, NJ - Aug. 2014 to June 2016

  • Coded in Python and Wolfram Mathematica to create computer models of radiation fields
  • Performed mathematical analysis on electromagnetic equations to calculate WiFi transmission during snow storms
  • Lab work on multispectral imaging

City University of New York - Adjunct Lecturer & Physics Lab Technician

New York, NY - Jan. 2012 to Jan. 2015

  • Lectured on Physics 1, Physics 2, Physics 1 lab and Physics 2 lab courses for undergraduate students
  • Taught courses for both science and non-science majors
  • Six semesters of teaching
  • Taught Physics 110, Physics 111, Physics 210, Physics 211 and their corresponding lab sections

Education

Columbia University 2011 - New York, NY

  • Master of Science - Applied Physics/Applied Mathematics
    • School of Engineering and Applied Science
    • Master's Project - Designed a LabVIEW control system for a voltage control rod to be installed in a nuclear fusion machine

Pace University 2008 - Pleasantville, NY

  • Bachelor of Science - Applied Mathematics/Physics
    • Dyson School of Arts & Sciences
    • Won the Who's Who award in Colleges & Universities for founding a campus science club

Skills

Some of the algorithms & technologies I've worked with

  • Python - expert - coding daily since 2016
  • SQL - expert - coding daily since 2016
    • Snowflake platform
    • AWS Redshift platform
  • Web development - competent - HTML/JavaScript/CSS et al.
    • Made my first .com web page in 1999 ("mechanicalthought.com"). Dabbled with it since.
  • Machine Learning:
    • Supervised Learning:
      • Linear and nonlinear models, ensemble learning
      • Worked with everything from ordinary least squares to XGBoost
    • Unsupervised Learning:
      • K-means, hierarchical clustering, density-based clustering, Gaussian Mixture Models, t-SNE, PCA, SVD
    • Neural networks:
      • Computer Vision Applications
      • Keras framework
      • Transfer learning
    • Recommendation engines:
      • Alternating Least Squares, Cosine/Pearson Similarity
  • Statistics:
    • A/B testing in frequentist & Bayesian frameworks
    • Causal inference in time series
    • Hypothesis testing
  • Big Data Technologies:
    • Spark
    • MapReduce
    • Databricks
  • Google Cloud Platform & Amazon Web Services:
    • Lambda/HTTP functions
    • cloud functions
    • load balancers
    • buckets
    • SageMaker/Colab