SAURABH SINGH

PROJECTS

Reinforcement Learning AI

Q-Learning | Monte Carlo Methods

Overview

Implemented reinforcement learning algorithms in JavaScript to train an AI agent for the game of blackjack. The project demonstrates the application of Q-learning and Monte Carlo methods to derive an optimal playing policy.

Technical Features

Algorithms Implemented

Q-Learning: Tabular approach that estimates state-action values by iteratively updating Q-values from observed rewards and the estimated value of the next state.
Monte Carlo Methods: Episode-based learning that averages the returns observed after each state-action pair to estimate state-action values.
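
As a rough illustration of the two update rules, the sketch below shows one tabular Q-learning step and an every-visit Monte Carlo averaging pass. It is not the project's actual code: the function names (`getRow`, `qLearningUpdate`, `monteCarloUpdate`), the two-action set, the `Map`-based table, and the default `alpha` and `gamma` values are all assumptions made for the example.

```javascript
// Illustrative sketch only; names and default hyperparameters are assumptions.
const ACTIONS = ["hit", "stand"];

// Return the per-action row for a state, creating it with zeros on first use.
function getRow(table, state) {
  if (!table.has(state)) {
    table.set(state, { hit: 0, stand: 0 });
  }
  return table.get(state);
}

// One tabular Q-learning step:
// Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
function qLearningUpdate(q, state, action, reward, nextState, done, alpha = 0.1, gamma = 1.0) {
  const row = getRow(q, state);
  // A terminal transition has no future value to bootstrap from.
  const nextBest = done ? 0 : Math.max(...ACTIONS.map((a) => getRow(q, nextState)[a]));
  row[action] += alpha * (reward + gamma * nextBest - row[action]);
  return row[action];
}

// Every-visit Monte Carlo update over one finished episode.
// `episode` is an array of { state, action, reward } steps in play order.
function monteCarloUpdate(q, counts, episode, gamma = 1.0) {
  let g = 0; // return accumulated backwards from the end of the episode
  for (let t = episode.length - 1; t >= 0; t--) {
    const { state, action, reward } = episode[t];
    g = reward + gamma * g;
    const row = getRow(q, state);
    const n = getRow(counts, state);
    n[action] += 1;
    // Incremental mean of all returns seen so far for this (state, action).
    row[action] += (g - row[action]) / n[action];
  }
}
```

The Q-learning step bootstraps from the current estimate of the next state, while the Monte Carlo pass waits for the episode to finish and uses the full observed return.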

AI Agent Development

Designed and trained an AI agent to learn optimal blackjack strategies.
State representation comprised the player's hand value, the dealer's visible card, and whether the player holds a usable ace.
Implemented reward functions reflecting wins, losses, and ties.
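
One possible way to encode such a state and reward is sketched below. The card encoding (1 for an ace, 10 for face cards), the string key format, the reward values of +1, -1, and 0, and the function names `handValue`, `encodeState`, and `reward` are assumptions for illustration, not details taken from the project.

```javascript
// Illustrative sketch only. Cards are numbers: 1 for an ace, 2-10 otherwise
// (face cards count as 10). This encoding is an assumption.
function handValue(cards) {
  const total = cards.reduce((sum, c) => sum + c, 0);
  // An ace is "usable" if counting it as 11 does not bust the hand.
  const usableAce = cards.includes(1) && total + 10 <= 21;
  return { total: usableAce ? total + 10 : total, usableAce };
}

// Build a string key suitable for a tabular value store.
function encodeState(playerCards, dealerCard) {
  const { total, usableAce } = handValue(playerCards);
  return `${total}|${dealerCard}|${usableAce ? 1 : 0}`;
}

// Terminal reward from the player's perspective (assumed values):
// +1 for a win, -1 for a loss, 0 for a tie.
function reward(playerTotal, dealerTotal) {
  if (playerTotal > 21) return -1; // player busts first
  if (dealerTotal > 21) return 1; // dealer busts
  if (playerTotal > dealerTotal) return 1;
  if (playerTotal < dealerTotal) return -1;
  return 0;
}
```

For example, an ace and a six against a dealer ten would be keyed as `17|10|1` under this scheme.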

Training & Policy Optimization

Simulated many episodes of play so that the value estimates converge toward the optimal policy.
Evaluated performance by comparing the learned policy against known optimal blackjack strategies.
Applied an epsilon-greedy strategy to balance exploration and exploitation.
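
The action selection and policy comparison described above might look roughly like the following. Both `epsilonGreedy` and `agreementRate` are hypothetical helpers written for this example, and the plain-object shapes they take (action-to-value rows and state-to-action policies) are assumptions.

```javascript
// Illustrative sketch only; function names and data shapes are assumptions.

// Pick a random action with probability epsilon, otherwise the highest-valued one.
// `qRow` maps action names to estimated values, e.g. { hit: 0.2, stand: 0.5 }.
// `random` is injectable so the behaviour can be tested deterministically.
function epsilonGreedy(qRow, epsilon, random = Math.random) {
  const actions = Object.keys(qRow);
  if (random() < epsilon) {
    // Explore: uniform choice over all actions.
    return actions[Math.floor(random() * actions.length)];
  }
  // Exploit: best estimated action; ties go to the first action listed.
  return actions.reduce((best, a) => (qRow[a] > qRow[best] ? a : best), actions[0]);
}

// Fraction of reference states where the learned policy picks the same action.
// Both policies map state keys to action names.
function agreementRate(learnedPolicy, referencePolicy) {
  const states = Object.keys(referencePolicy);
  if (states.length === 0) return 0;
  const matches = states.filter((s) => learnedPolicy[s] === referencePolicy[s]).length;
  return matches / states.length;
}
```

With `epsilon` set to 0 the agent always exploits, which is the usual setting when measuring agreement with a reference strategy after training.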

Implementation Details

Entire RL environment and agent implemented in JavaScript for browser-based simulation.
Modular architecture separating environment logic, agent, and policy evaluation for maintainability.
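
To show what that separation could look like in practice, here is a minimal training loop that only talks to the environment and the agent through small interfaces. The method names (`reset`, `step`, `act`, `learn`) and the `train` function are assumptions for this sketch and are not taken from the project's code.

```javascript
// Illustrative sketch only. Assumed interfaces:
//   env.reset()        -> initial state
//   env.step(action)   -> { nextState, reward, done }
//   agent.act(state)   -> action
//   agent.learn(state, action, reward, nextState, done)
function train(env, agent, episodes) {
  let totalReward = 0;
  for (let i = 0; i < episodes; i++) {
    let state = env.reset();
    let done = false;
    while (!done) {
      const action = agent.act(state);
      const result = env.step(action);
      // The loop does not know which learning rule the agent applies.
      agent.learn(state, action, result.reward, result.nextState, result.done);
      totalReward += result.reward;
      state = result.nextState;
      done = result.done;
    }
  }
  // Average reward per episode, a simple progress signal.
  return totalReward / episodes;
}
```

Because the loop depends only on these four methods, a Q-learning agent and a Monte Carlo agent could be swapped without touching the environment logic.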

Tech Stack


Copyright © 2025 Saurabh Singh