I go to example.com
I click the search bar
I type in “fruit”
Expect to see a list of fruit
Then the LLM would drive something like Playwright to carry out each of these steps and report back if any step failed…
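A rough sketch of what that run-and-report loop might look like. This is only an illustration: `runSteps` and the step objects are hypothetical names, and the step bodies are stand-ins for whatever Playwright calls would actually be generated.

```javascript
// Hypothetical runner: executes steps in order, records pass/fail for each,
// and marks every step after the first failure as skipped.
async function runSteps(steps) {
  const report = [];
  let failed = false;
  for (const step of steps) {
    if (failed) {
      report.push({ text: step.text, status: 'skipped' });
      continue;
    }
    try {
      await step.run();
      report.push({ text: step.text, status: 'passed' });
    } catch (err) {
      failed = true;
      report.push({ text: step.text, status: 'failed', error: err.message });
    }
  }
  return report;
}

// Stand-in steps; in practice each run() would wrap Playwright calls.
// The third step throws on purpose to show how a failure is reported.
const exampleSteps = [
  { text: 'I go to example.com', run: async () => {} },
  { text: 'I click the search bar', run: async () => {} },
  { text: 'I type in "fruit"', run: async () => { throw new Error('search bar not found'); } },
  { text: 'Expect to see a list of fruit', run: async () => {} },
];

runSteps(exampleSteps).then((report) => {
  for (const r of report) {
    console.log(`${r.status.toUpperCase()}: ${r.text}${r.error ? ` (${r.error})` : ''}`);
  }
});
```

Stopping at the first failure is a design choice here; later steps usually depend on earlier ones, so running them anyway would mostly produce noise.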
convert this free text
I go to example.com
I click the search bar
I type in “fruit”
Expect to see a list of fruit
to playwright script

const { chromium } = require('playwright');

(async () => {
  const browser = await chromium.launch();
  const context = await browser.newContext();
  const page = await context.newPage();

  // Navigate to example.com
  await page.goto('https://example.com');

  // Click the search bar
  await page.click('input[type="search"]');

  // Type in "fruit"
  await page.type('input[type="search"]', 'fruit');

  // Press Enter to submit the search
  await page.press('input[type="search"]', 'Enter');

  // Wait for the page to load
  await page.waitForLoadState('networkidle');

  // Expect to see a list of fruit
  const fruitList = await page.$$('ul li');
  if (fruitList.length > 0) {
    console.log('List of fruit found!');
  } else {
    console.error('Failed to find the list of fruit.');
  }

  // Close the browser
  await browser.close();
})();
https://github.com/hitchdev/hitchstory
I don't think this type of task is really appropriate for an LLM, though. It is better to use hard, prewritten abstractions for the truly deterministic stuff (e.g. click).
For other situations where you need to make subtle trade-offs (e.g. choosing an HTML selector for the search bar), an LLM will generally do a bad job.
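One way to read the "hard, prewritten abstractions" idea: map each free-text step onto a fixed, human-written handler, with the selector chosen once by a person rather than guessed on every run. The sketch below is my assumption of how that could look, not hitchstory's API; `STEP_DEFINITIONS`, `planSteps` and the `input[type="search"]` selector are all illustrative.

```javascript
// Selector chosen once by a human; an assumption for this sketch.
const SEARCH_BAR = 'input[type="search"]';

// Each definition pairs a pattern with a deterministic, prewritten action.
// goto, click and fill are real Playwright Page methods.
const STEP_DEFINITIONS = [
  {
    pattern: /^I go to (\S+)$/,
    action: (page, host) => page.goto(`https://${host}`),
  },
  {
    pattern: /^I click the search bar$/,
    action: (page) => page.click(SEARCH_BAR),
  },
  {
    pattern: /^I type in "(.+)"$/,
    action: (page, text) => page.fill(SEARCH_BAR, text),
  },
];

// Resolve free-text lines to prewritten actions. An unknown step throws
// instead of being guessed at, so gaps are visible before anything runs.
function planSteps(lines) {
  return lines.map((line) => {
    for (const { pattern, action } of STEP_DEFINITIONS) {
      const match = line.match(pattern);
      if (match) {
        return { line, run: (page) => action(page, ...match.slice(1)) };
      }
    }
    throw new Error(`No step definition matches: ${line}`);
  });
}
```

With a real Playwright `page`, the plan would be run as `for (const step of planSteps(lines)) await step.run(page);`. The "Expect to see a list of fruit" step is left out on purpose: what counts as a list of fruit is exactly the kind of judgment a human would have to pin down in a handler.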