https://github.com/AashiqRamachandran/i-am-a-bot
This project provides a solution for automatically solving various types of CAPTCHAs using a multi-modal Large Language Model (LLM). It leverages the capabilities of Google’s Vertex AI and a custom set of agents to interpret and solve CAPTCHA challenges.
Features
- Identification of whether an image is a CAPTCHA.
- Determination of CAPTCHA type (text, math equation, image rotation, puzzle, image selection, etc.).
- Solving text and mathematical CAPTCHAs.
- Integration with Google Cloud’s Vertex AI for model inference (Using gemini-vision-pro).
Installation
Before you can use the CAPTCHA solver, you need to install the required dependencies:pip install –upgrade google-cloud-aiplatform
Usage
To use the CAPTCHA solver, you need to have a Google Cloud project with Vertex AI enabled and a service account with access to Vertex AI endpoints.
Example
from iamabot import solve # Initialize the solver with your Google Cloud project ID and service account credentials solver = solve.Solve( project_id=1077607249524, credential_file_path=”google-service-account-credential-file.json” ) # Run the solver on a CAPTCHA image solved_response = solver.run(“sample_captchas/text_moderate.png”) # Print the solution print(solved_response)
Project Structure
agents.py
: Contains the definitions of the agents used to identify and solve CAPTCHAs.gemini_core.py
: Handles the interaction with Google Cloud’s Vertex AI to process the CAPTCHA images.solve.py
: The main entry point for the CAPTCHA solver. It orchestrates the process of solving a CAPTCHA using the defined agents.sample.py
: An example script demonstrating how to use the CAPTCHA solver.
Agents
The project defines four agents, each with a specific role in the CAPTCHA solving process:
CheckIfImageLooksLikeCaptchaAgent
: Determines if an image is a CAPTCHA.DecideCaptchaTypeAgent
: Identifies the type of CAPTCHA presented.TextSolveAgent
: Solves CAPTCHAs that require text recognition.MathSolveAgent
: Solves CAPTCHAs that present a mathematical equation.
Configuration
To configure the CAPTCHA solver, you must provide your Google Cloud project ID and the path to your service account JSON file. These are used to authenticate with Google Cloud’s Vertex AI service.
Flow Of The Tool
+-----------------------------------+
| Start run function |
+-----------------------------------+
|
v
+-----------------------------------+
| Load Agents |
+-----------------------------------+
|
v
+-----------------------------------+
| Generate prompt for image check |
+-----------------------------------+
|
v
+-----------------------------------+
| Check if image looks like captcha |
+-----------------------------------+
|
+--------------------------+ No
| |
Yes v
| +-------------------+
v | Raise ValueError |
+-----------------------------------+ |
| Generate prompt for captcha type | |
+-----------------------------------+ |
| |
v |
+-----------------------------------+ |
| Determine captcha type | |
+-----------------------------------+ |
| |
+----------+------------+ |
| | | |
v v | |
+---------+--+ +---+---------+ | |
| Text captcha| | Math captcha| | |
+---------+--+ +---+---------+ | |
| | | |
v v | |
+---------+--+ +---+---------+ | |
| Solve text | | Solve math | | |
| captcha | | captcha | | |
+---------+--+ +---+---------+ | |
| | | |
v v | |
+---------+--+ +---+---------+ | |
| Return text | | Return math | | |
| captcha | | captcha | | |
| solution | | solution | | |
+---------+--+ +---+---------+ | |
| | | |
+----------+------------+ |
| |
+---------------------------+
|
+--------------------v-----------------------------+
| Raise ValueError if captcha type is unsupported |
+--------------------------------------------------+
|
v
+-----------------------------------+
| End run function |
+-----------------------------------+
Limitations
The current implementation supports text and mathematical CAPTCHAs. Other types of CAPTCHAs, such as image rotation, puzzles, and image selection, are recognized but not solved in this version.
What do you think?
Show comments / Leave a comment