B.Sc. Statistics & Data Scientist

DECODING DATA DYNAMICS.

Hi, I'm Bhaskar Jha. I bridge the gap between rigorous mathematical theory and modern data engineering, building intelligent tools with Python, SQL, and Deep Learning. I'm an open source contributor at Open Food Facts.

MACHINE LEARNING • DATA ENGINEERING • STATISTICS • PYTHON • DEEP LEARNING

Data Tools Built

B.Sc. Stats Cohort

GitHub Contributions

GSoC

Participant 2025

The Journey

2025 — PRES

Google Summer of Code

Contributor for Open Food Facts. Navigating large-scale legacy codebases (Perl, HTML) and fixing critical UI persistence bugs. Bridging the gap between stats and global production systems.

2024 — PRES

Independent Project Builder

Engineering reactive business dashboards, spaced repetition systems, and local EDA automation tools. Deployed multiple high-polish web applications using Python and JS.

ONGOING

B.Sc. Statistics

Deep specialization in Probability, Inference, and Sampling Distributions. Applying theory to practical data science projects to detect and explain real-world anomalies.

Selected Work

DataSense EDA

Python • NLP • Automation

Offline CSV EDA tool with NLP insights, Pearson correlation engine, and predictive ML models for rapid data analysis. Built for efficiency and depth in local environments.

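import pandas as pd
from datasense import AutoExplorer

# Load the raw CSV into a DataFrame
df = pd.read_csv('raw_data.csv')

# Wrap the DataFrame in the DataSense explorer and run the correlation analysis
engine = AutoExplorer(df)
engine.analyze_correlations()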
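For readers curious about what a Pearson correlation engine computes, here is a minimal standard-library sketch of the coefficient itself. The `pearson` function is a hypothetical helper written for illustration; it is not taken from DataSense and makes no claim about how `AutoExplorer` works internally.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 or -1 indicates a strong linear relationship between two columns, while a value near 0 indicates little linear association.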

NexusSales

Data Pipeline • CRM Analysis • Visualization

Full data pipeline transformation. Ingesting messy CRM data and outputting a reactive business dashboard with actionable statistical insights for sales growth.
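To make the idea of turning messy CRM data into dashboard-ready figures concrete, the sketch below cleans a small in-memory export and totals revenue by region. Everything in it is an assumption made for illustration: the column names (`deal_id`, `region`, `amount`), the sample rows, and the helper functions are hypothetical and do not come from NexusSales.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CRM export: inconsistent casing, stray whitespace, a duplicate
# row, and a missing amount. None of these fields come from NexusSales itself.
RAW_CSV = """deal_id,region,amount
1, North ,1200.50
2,south,800
2,south,800
3,NORTH,
4,South,450.25
"""


def clean_rows(text):
    """Yield de-duplicated rows with normalized regions and numeric amounts."""
    seen = set()
    for row in csv.DictReader(io.StringIO(text)):
        deal_id = row["deal_id"].strip()
        amount = row["amount"].strip()
        if deal_id in seen or not amount:
            continue  # skip duplicates and rows with no amount
        seen.add(deal_id)
        yield {
            "deal_id": int(deal_id),
            "region": row["region"].strip().title(),
            "amount": float(amount),
        }


def revenue_by_region(rows):
    """Sum deal amounts per region, the kind of figure a dashboard would chart."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


summary = revenue_by_region(clean_rows(RAW_CSV))
```

Dropping rows with a missing amount is only one possible policy; a real pipeline might impute or flag them instead.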

MNEMO System

Gamification • Algorithms • Python

Ebbinghaus-based spaced repetition tool. Features an XP-based leveling system and custom scheduling algorithms to optimize information retention for complex subjects.
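As a rough illustration of how an Ebbinghaus-style scheduler can work, the sketch below models retention as exponential decay, R = exp(-t / S), and schedules the next review for when predicted retention falls to a target. The `Card` class, the growth factor of 2, the reset-on-failure rule, and the 100-XP-per-level rule are all assumptions chosen for the example, not MNEMO's actual algorithms.

```python
import math
from dataclasses import dataclass


@dataclass
class Card:
    stability_days: float = 1.0  # assumed memory strength S, in days


def retention(days_elapsed: float, stability_days: float) -> float:
    """Exponential forgetting curve: R = exp(-t / S)."""
    return math.exp(-days_elapsed / stability_days)


def next_interval(stability_days: float, target_retention: float = 0.9) -> float:
    """Days until predicted retention falls to the target: t = -S * ln(R)."""
    return -stability_days * math.log(target_retention)


def review(card: Card, recalled: bool, growth: float = 2.0) -> float:
    """Update stability after a review and return the next interval in days."""
    # Assumption: success multiplies stability, failure resets it to one day.
    card.stability_days = card.stability_days * growth if recalled else 1.0
    return next_interval(card.stability_days)


def level_for_xp(xp: int, xp_per_level: int = 100) -> int:
    """Hypothetical linear leveling rule: one level per 100 XP, starting at 1."""
    return 1 + xp // xp_per_level
```

Under these assumptions, each successful recall doubles the gap before the next review, while a lapse brings the card back to a short interval.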
