Keyur Vaidya

Boston

Summary

Data Scientist at the Bermuda Monetary Authority specializing in machine learning and data pipelines. Improved the predictive accuracy of macroeconomic forecasting models by 15%. Skilled in data visualization and agile project management, delivering insights that support financial decision-making and regulatory compliance.

Work History

Data Scientist

Bermuda Monetary Authority

06.2024 - Current

Fine-tuned BERT-based models for macroeconomic trend analysis, leveraging domain-specific pretraining to improve text-based financial risk assessment
Enhanced model interpretability and reduced forecasting errors, leading to more reliable insights for market stability analysis
Developed an end-to-end ETL pipeline using DLT Hub and Apache Spark to efficiently process macroeconomic data from the Nasdaq API, ensuring high-throughput data ingestion, transformation, and storage for regulatory risk assessments
Designed Power BI dashboards integrated with LSTM-based time-series forecasting models, enabling dynamic visualization of macroeconomic indicators and delivering real-time predictive insights to enhance financial decision-making
Built and fine-tuned a document classification pipeline using BERT embeddings, TF-IDF vectorization, and RoBERTa, automating regulatory compliance document processing
Reduced manual review time by 40%, increased classification F1-score by 12%, and improved document retrieval efficiency for audit workflows
Implemented A/B testing using Double Machine Learning and Bayesian Structural Time Series, combined with uplift modeling (XGBoost), to optimize macroeconomic forecasting
Improved predictive accuracy by 15%, enabling better detection of causal relationships between economic indicators and financial market stability, driving data-driven policy interventions
Developed an AI-driven fraud detection system to mitigate financial crime risks in regulatory markets, identifying anomalous transaction patterns and high-risk activities using Isolation Forest
Optimized hyperparameters (contamination factor, estimators, and max samples) to improve fraud detection precision and reduce false positives
Implemented cryptographic security measures by integrating RSA encryption (2048-bit) with anomaly detection pipelines, ensuring secure transaction flagging and tamper-proof audit trails to enhance regulatory compliance and fraud prevention
Applied feature selection techniques (RFE, PCA, correlation analysis) to refine fraud detection models, reducing computational overhead and improving risk assessment efficiency for financial market monitoring
Strengthened financial risk oversight by embedding fraud detection insights into market surveillance frameworks, enabling proactive intervention strategies to mitigate fraudulent activities in macroeconomic data streams

Data Scientist Intern

Boston University

05.2023 - 06.2023

Implemented neural style transfer using VGG19 pre-trained on ImageNet, extracting multi-layer feature representations to separately compute content loss (mean squared error) and style loss (computed from Gram matrices of the feature maps)
Optimized model performance by using the L-BFGS optimizer for efficient convergence and tuning content-to-style weight ratios, reducing artifacts and enhancing stylization quality
Pre-processed and normalized images using OpenCV and NumPy, applying per-channel mean subtraction and re-scaling to ensure compatibility with the neural network
Integrated dynamic hyperparameter tuning, enabling users to adjust iterations, learning rate, and regularization weights (total variation loss) to balance stylization detail and computational efficiency

Data Scientist Intern

J.B. Boda Insurance & Reinsurance Brokers

06.2022 - 11.2022

Enhanced underwriting risk assessment by leveraging Generalized Linear Models (GLMs) and Cox Proportional Hazards Models to analyze insurance policies
Applied logistic regression for claim probability estimation and decision trees for rule-based risk segmentation, identifying key risk factors to optimize policy structuring and improve loss ratio management through Monte Carlo simulations
Developed high-performance data extraction scripts utilizing vectorized operations in Pandas and NumPy, optimizing structured and unstructured claims data processing
Implemented parallelized batch processing and memory-efficient data handling to enhance data throughput and computational efficiency
Applied Isolation Forest for anomaly detection in fraudulent claims and DBSCAN for clustering claim patterns, improving predictive model accuracy and reducing claim settlement times by 30% through feature engineering and outlier detection techniques
Developed a logistic regression model achieving 85% accuracy for automated insurance claim approvals, leveraging L1 regularization for feature selection and SMOTE for handling class imbalance
Improved prediction reliability, reduced manual review time, and enhanced underwriting efficiency by streamlining risk assessment and decision-making
Conducted a 5-year time series analysis using ARIMA and Prophet, incorporating trend decomposition and seasonality adjustments to enhance actuarial risk modeling
Improved forecasting accuracy by 12%, enabling more precise underwriting decisions, better capital reserve planning, and proactive risk mitigation strategies

Data Scientist

Deep Agency

05.2020 - 04.2022

Developed a predictive text analytics system using TF-IDF vectorization, dependency parsing, and named entity recognition (NER) to classify customer inquiries and detect priority cases
Integrated the system with CRM databases and workflow automation, enabling real-time case routing, monitoring, and sentiment-based prioritization
Deployed via a database-integrated API, reducing resolution time by 25%, improving response prioritization, and enhancing customer satisfaction in policy servicing and claims management
Implemented an anomaly detection framework combining Z-score analysis, Isolation Forest, and K-Means clustering with anomaly score thresholds to identify suspicious claims and transactions
Integrated real-time monitoring with SQL triggers and rule-based alerts, reducing false positives by 15% and detecting 20% more fraudulent claims
Strengthened fraud prevention measures, minimized unwarranted payouts, and enhanced regulatory compliance in underwriting and claims processing
Engineered a high-performance logistic regression model for binary classification of insurance claim approvals, leveraging L2 (ridge) regularization to mitigate overfitting and enhance generalizability
Conducted feature selection using logistic regression coefficients and variance thresholding, isolating critical predictors to improve model interpretability and decision-making transparency in claim evaluations
Accelerated model training and inference by implementing batch processing and NumPy-based vectorized computations, significantly reducing computational overhead and enabling real-time, high-throughput predictions
Integrated model tracking and versioning directly within a Streamlit-based interactive interface, ensuring seamless deployment, real-time monitoring, and reproducibility for continuous model optimization
Delivered a robust, scalable, and production-ready solution that enhances operational efficiency, reduces manual processing delays, and provides a data-driven approach to claim approval decisions

Education

Master's - Applied Data Science

Boston University

05.2024

Bachelor of Science - Statistics

Ramnarain Ruia Autonomous College

04.2022

Skills

Machine Learning Algorithms
Deep Learning & LLMs
Time Series Forecasting
Statistical & Probabilistic Modeling
Programming Languages
Data Preprocessing & Cleaning
Data Pipelines & ETL
Big Data Technologies
Data Visualization
Agile & Project Management

Certification

Fine-Tuning Large Language Models - DeepLearning.AI
Neural Networks and Deep Learning - DeepLearning.AI
Python for Data Science, AI & Development - IBM
Machine Learning Specialization - DeepLearning.AI

Publications

Comparative study on Llama 3.2 3B vs. DeepSeek V3, https://medium.com/@vaidya.keyur2/llama-3-2-3b-vs-deepseek-v3-performance-cost-and-use-case-comparison-522e072ad3c9
Implementing different models using NumPy and PyTorch, https://medium.com/@vaidya.keyur2/machine-learning-implementing-different-models-using-numpy-and-pytorch-626dc78e1090
