PROJECTS

Reinforcement Learning AI

Q-Learning | Monte Carlo Methods

Overview

Implemented reinforcement learning algorithms in JavaScript to train an AI agent for the game of blackjack. The project demonstrates the application of Q-learning and Monte Carlo methods to derive an optimal playing policy.

Technical Features

Algorithms Implemented
  • Q-Learning: Tabular approach to estimate state-action values and iteratively update Q-values based on rewards.
  • Monte Carlo Methods: Episode-based learning with returns averaging to estimate state-action value functions.
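
The two update rules can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the string state keys, the `hit`/`stand` action set, and the default step size and discount are all assumptions made for the example.

```javascript
// Hypothetical action set and helpers, for illustration only.
const ACTIONS = ['hit', 'stand'];

// Read Q(state, action) from a Map, defaulting to 0 for unseen pairs.
function getQ(table, state, action) {
  return table.get(`${state}|${action}`) ?? 0;
}

// Tabular Q-learning: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
// Pass nextState === null for a terminal transition.
function qLearningUpdate(table, state, action, reward, nextState, alpha = 0.1, gamma = 1.0) {
  const bestNext = nextState === null
    ? 0
    : Math.max(...ACTIONS.map((a) => getQ(table, nextState, a)));
  const old = getQ(table, state, action);
  const updated = old + alpha * (reward + gamma * bestNext - old);
  table.set(`${state}|${action}`, updated);
  return updated;
}

// Every-visit Monte Carlo: walk the episode backwards, accumulate the return G,
// and fold it into a running average for each visited (state, action) pair.
// episode is an array of { state, action, reward } steps.
function monteCarloUpdate(table, counts, episode, gamma = 1.0) {
  let G = 0;
  for (let t = episode.length - 1; t >= 0; t--) {
    const { state, action, reward } = episode[t];
    G = reward + gamma * G;
    const key = `${state}|${action}`;
    const n = (counts.get(key) ?? 0) + 1;
    counts.set(key, n);
    const old = table.get(key) ?? 0;
    table.set(key, old + (G - old) / n);
  }
}
```

Q-learning updates after every step using a bootstrapped estimate of the next state's value, while the Monte Carlo version waits for the episode to finish and averages the observed returns.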
AI Agent Development
  • Designed and trained an AI agent to learn optimal blackjack strategies.
  • State representation included the player's hand value, the dealer's visible card, and whether the player holds a usable ace.
  • Implemented reward functions reflecting wins, losses, and ties.
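
One way the state and reward described above might be encoded is sketched below. The card encoding (ace as 1, face cards as 10), the string state key, and the +1 / -1 / 0 reward values are assumptions for illustration; the original does not specify them.

```javascript
// Hypothetical card encoding: ranks 1..10, where ace = 1 and face cards = 10.
// An ace is "usable" when it can count as 11 without busting the hand.
function handValue(cards) {
  const sum = cards.reduce((a, b) => a + b, 0);
  const usableAce = cards.includes(1) && sum + 10 <= 21;
  return { total: usableAce ? sum + 10 : sum, usableAce };
}

// Encode (hand value, dealer card, usable ace) as a compact string key.
function encodeState(playerCards, dealerUpCard) {
  const { total, usableAce } = handValue(playerCards);
  return `${total}-${dealerUpCard}-${usableAce ? 1 : 0}`;
}

// Assumed terminal reward: +1 for a win, -1 for a loss, 0 for a tie.
function reward(playerTotal, dealerTotal) {
  if (playerTotal > 21) return -1; // player busts first and loses
  if (dealerTotal > 21) return 1;  // dealer busts
  if (playerTotal > dealerTotal) return 1;
  if (playerTotal < dealerTotal) return -1;
  return 0;
}
```

For example, an ace and a six against a dealer ten would be keyed as `17-10-1` under this scheme.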
Training & Policy Optimization
  • Simulated many episodes of play so that the value estimates converge toward the optimal policy.
  • Evaluated performance by comparing the learned policy against known optimal blackjack strategy.
  • Applied epsilon-greedy strategy for balancing exploration and exploitation.
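
Epsilon-greedy action selection and a simple policy comparison could look like the sketch below. The function names, the object-based Q-value and policy shapes, and the injectable `rng` parameter are hypothetical choices made so the example is testable, not details from the project.

```javascript
// Epsilon-greedy selection over an object of Q-values, e.g. { hit: 0.2, stand: 0.5 }.
// With probability epsilon pick a uniformly random action, otherwise the greedy one.
// rng is injectable (defaults to Math.random) so the behaviour can be made deterministic.
function epsilonGreedy(qValues, epsilon, rng = Math.random) {
  const actions = Object.keys(qValues);
  if (rng() < epsilon) {
    return actions[Math.floor(rng() * actions.length)];
  }
  return actions.reduce(
    (best, a) => (qValues[a] > qValues[best] ? a : best),
    actions[0]
  );
}

// Fraction of reference states where the learned policy picks the same action.
// Both policies are plain objects mapping a state key to an action name.
function policyAgreement(learned, reference) {
  const states = Object.keys(reference);
  if (states.length === 0) return 0;
  const matches = states.filter((s) => learned[s] === reference[s]).length;
  return matches / states.length;
}
```

A common design choice is to decay epsilon over training so the agent explores early and exploits later; whether this project did so is not stated.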
Implementation Details
  • Entire RL environment and agent implemented in JavaScript for browser-based simulation.
  • Modular architecture separating environment logic, agent, and policy evaluation for maintainability.
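
The separation between environment, agent, and evaluation can be illustrated with a small episode loop. The interfaces here (`env.reset()`, `env.step(action)`, `agent.act(state)`, `agent.learn(...)`) are assumed for the sketch and are not taken from the project's code.

```javascript
// Hypothetical interfaces:
//   env.reset() -> initial state
//   env.step(action) -> { state, reward, done }
//   agent.act(state) -> action
//   agent.learn(state, action, reward, nextState, done) -> void
// The loop knows nothing about blackjack rules or the learning algorithm.
function runEpisode(env, agent) {
  let state = env.reset();
  let totalReward = 0;
  let done = false;
  while (!done) {
    const action = agent.act(state);
    const result = env.step(action);
    agent.learn(state, action, result.reward, result.state, result.done);
    totalReward += result.reward;
    state = result.state;
    done = result.done;
  }
  return totalReward;
}

// Average total reward over a number of episodes (the agent keeps learning throughout).
function averageReturn(env, agent, episodes) {
  let sum = 0;
  for (let i = 0; i < episodes; i++) {
    sum += runEpisode(env, agent);
  }
  return sum / episodes;
}
```

Because the loop depends only on these method names, a Q-learning agent and a Monte Carlo agent could be swapped in against the same environment.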

Tech Stack


Copyright © 2025 Saurabh Singh