Before an agent writes code, it needs to understand the existing codebase. GrepCodebaseTool finds patterns across files. DatabaseSchemaTool reveals table structures. Together, they give agents the context that prevents blind code generation.
GrepCodebaseTool
Regex-powered search across the codebase:
public function getInputSchema(): array
{
    return [
        'type' => 'object',
        'properties' => [
            'pattern' => ['type' => 'string', 'description' => 'Regex pattern to search for'],
            'file_type' => ['type' => 'string', 'description' => 'File extension filter: php, js, css'],
            'path' => ['type' => 'string', 'description' => 'Subdirectory to search in'],
            'max_results' => ['type' => 'integer', 'description' => 'Max matching lines (default 50)'],
        ],
        'required' => ['pattern'],
    ];
}
The tool walks the directory tree with PHP's RecursiveDirectoryIterator and tests each line with preg_match(). It is not as fast as ripgrep, but it has zero external dependencies.
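To make the RecursiveDirectoryIterator plus preg_match() approach concrete, here is a minimal sketch of how such a search could look. It is not the tool's actual implementation: the function name grepCodebase(), its parameter order, and the shape of the returned rows are assumptions for illustration. Only the schema fields (pattern, file_type, max_results) come from the tool itself.

```php
<?php
declare(strict_types=1);

/**
 * Illustrative sketch only: grepCodebase() is a hypothetical name,
 * not the real GrepCodebaseTool code.
 *
 * @return list<array{file: string, line: int, text: string}>
 */
function grepCodebase(string $root, string $pattern, ?string $fileType = null, int $maxResults = 50): array
{
    // Wrap the raw pattern in '~' delimiters and escape any '~' inside it.
    // (A pattern that already contains an escaped '~' would need more care.)
    $regex = '~' . str_replace('~', '\~', $pattern) . '~';

    // Reject invalid patterns once, up front, instead of failing on every line.
    if (@preg_match($regex, '') === false) {
        throw new InvalidArgumentException('Invalid regex pattern');
    }

    $matches = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $file) {
        if (!$file->isFile()) {
            continue;
        }
        if ($fileType !== null && $file->getExtension() !== $fileType) {
            continue; // extension filter, e.g. 'php'
        }
        $lines = file($file->getPathname(), FILE_IGNORE_NEW_LINES);
        if ($lines === false) {
            continue; // unreadable file
        }
        foreach ($lines as $i => $text) {
            if (preg_match($regex, $text) === 1) {
                $matches[] = ['file' => $file->getPathname(), 'line' => $i + 1, 'text' => trim($text)];
                if (count($matches) >= $maxResults) {
                    return $matches; // stop early once the cap is reached
                }
            }
        }
    }

    return $matches;
}
```

Returning as soon as max_results is reached keeps the cost of broad patterns bounded, which matters more here than it would with ripgrep because every file is read line by line in PHP.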
Why Agents Grep
The most common agent workflow before patching:
- "Fix the N+1 in InvoicesController" → agent reads the controller
- Agent identifies the query that needs eager loading
- Agent greps for all callers of the affected method
- Agent verifies no other code depends on the current behavior
- Agent generates the patch
The grep_codebase tool is what makes the caller search and the dependency check possible.
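As a rough illustration of the "grep for all callers" step, the sketch below builds a regex for call sites of a method and wraps it in a tool input that matches the schema shown earlier. The helper callerPattern() and the method name getInvoices are hypothetical and chosen only for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build a regex that matches instance or static
 * calls to $method, such as ->getInvoices( or ::getInvoices(.
 */
function callerPattern(string $method): string
{
    // preg_quote() escapes regex metacharacters and the '~' delimiter.
    return '(->|::)' . preg_quote($method, '~') . '\s*\(';
}

// Input an agent might send to grep_codebase (keys taken from the input schema).
$toolInput = [
    'pattern'     => callerPattern('getInvoices'), // assumed method name
    'file_type'   => 'php',
    'max_results' => 50,
];
```

Requiring -> or :: in front of the name skips the method's own declaration, and requiring the opening parenthesis avoids matching longer names that merely start with the same prefix.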
Truncation
Results are capped at ~6K characters (configured in AgentExecutor's smart truncation). Grep results are repetitive by nature — 50 matches of the same pattern show the same code structure. The first matches (6K worth) give enough context; the rest adds noise.
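The exact logic of AgentExecutor's smart truncation is not shown here, so the following is only a sketch of one plausible approach: keep the first ~6K of output, cut back to the last complete line, and tell the model how much was dropped. The function name truncateToolOutput() and the wording of the marker are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of output truncation; not AgentExecutor's real code.
 * Note: strlen()/substr() count bytes, which equals characters only for ASCII.
 */
function truncateToolOutput(string $output, int $maxChars = 6000): string
{
    if (strlen($output) <= $maxChars) {
        return $output; // small enough, pass through unchanged
    }

    $head = substr($output, 0, $maxChars);

    // Cut back to the last complete line so no match is split in half.
    $lastNewline = strrpos($head, "\n");
    if ($lastNewline !== false) {
        $head = substr($head, 0, $lastNewline);
    }

    $omitted = strlen($output) - strlen($head);

    return $head . "\n[truncated: {$omitted} more characters omitted]";
}
```

The trailing marker lets the agent know the result set was incomplete, so it can narrow the pattern or path instead of assuming it has seen every match.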
DatabaseSchemaTool
Read-only database introspection:
public function execute(array $params): array
{
    $action = $params['action'] ?? 'list_tables';
    return match ($action) {
        'list_tables' => $this->listTables(),
        'describe' => $this->describeTable($params['table']),
        'indexes' => $this->showIndexes($params['table']),
        default => ['success' => false, 'error' => 'Unknown action'],
    };
}
Three actions:
- list_tables → SHOW TABLES (what tables exist)
- describe → DESCRIBE table_name (columns, types, nullability, defaults)
- indexes → SHOW INDEX FROM table_name (index structure)
Read-only. No CREATE, ALTER, DROP, INSERT, UPDATE, DELETE. The tool can't modify the database — it can only inspect it. Agents that need to write SQL (like KnjigAgent) use dedicated domain tools that enforce business rules.
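One way to keep a tool like this read-only is to never pass agent-supplied SQL through at all, and instead map each action to a fixed statement. The sketch below shows that idea under stated assumptions: buildIntrospectionSql() is a hypothetical helper, and the table-name rule (letters, digits, underscore) is an assumption rather than the tool's documented behavior.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: map an action to a fixed, read-only statement.
 * The agent only ever chooses an action and a table name, never raw SQL.
 */
function buildIntrospectionSql(string $action, ?string $table = null): string
{
    if ($action === 'list_tables') {
        return 'SHOW TABLES';
    }

    // Identifiers cannot be bound as query parameters, so validate strictly instead.
    if ($table === null || preg_match('/^[A-Za-z0-9_]+$/', $table) !== 1) {
        throw new InvalidArgumentException('Invalid or missing table name');
    }

    return match ($action) {
        'describe' => "DESCRIBE `{$table}`",
        'indexes'  => "SHOW INDEX FROM `{$table}`",
        default    => throw new InvalidArgumentException('Unknown action'),
    };
}
```

Because the statement text is fixed and the only variable part is a validated identifier, there is no path from tool input to a write statement, even if the agent is prompted into trying one.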
Why Schema Context Matters
When VajbCoder writes a new controller that queries pm_subtasks, it needs to know:
- What columns exist (to write correct SELECT)
- Which are nullable (to handle null in PHP)
- What types they are (to add correct type hints)
- What indexes exist (to write queries that use them)
Without schema context, the agent guesses column names, and guesses wrong. DatabaseSchemaTool turns "I think there's a status column" into "There's a status VARCHAR(20) NOT NULL DEFAULT 'todo', indexed."
The truncation limit for schema is 8K chars (higher than grep's 6K) because schema completeness matters more than search result completeness.
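To show how a describe result can become the kind of compact column fact quoted above, here is a small sketch that formats one row of MySQL's DESCRIBE output (keys Field, Type, Null, Key, Default, Extra). The formatColumn() helper and the sample row are hypothetical. The row simply mirrors the status column from the example, not real data from pm_subtasks.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical formatter: turn one DESCRIBE row into a one-line summary.
 * Uppercasing the type is cosmetic and would also uppercase ENUM values.
 */
function formatColumn(array $row): string
{
    $parts = [$row['Field'], strtoupper($row['Type'])];
    $parts[] = $row['Null'] === 'NO' ? 'NOT NULL' : 'NULL';

    if ($row['Default'] !== null) {
        $parts[] = "DEFAULT '" . $row['Default'] . "'";
    }

    $line = implode(' ', $parts);

    // MySQL sets Key to PRI, UNI, or MUL for columns in a primary or unique
    // key, or leading a non-unique index; an empty Key means none of those.
    return $row['Key'] !== '' ? $line . ', indexed' : $line;
}

// Sample row invented for illustration, shaped like DESCRIBE output.
$row = ['Field' => 'status', 'Type' => 'varchar(20)', 'Null' => 'NO', 'Key' => 'MUL', 'Default' => 'todo', 'Extra' => ''];
echo formatColumn($row), "\n";
```

A one-line-per-column format like this also stretches the character budget further than raw result rows would, which helps keep wide tables inside the truncation limit.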
Up Next
Next up: GeneratePatchTool: The Crown Jewel — From AI Suggestion to Atomic File Write — the tool that turns AI suggestions into actual code changes, with PatchValidator's 6-step security check and PatchApplier's atomic write + git branch isolation.