Commit 447a2d0

jk4235buttercannfly
authored and committed
feat(dom-snapshot): add DOM snapshot utility with serialization and search capabilities
1 parent 99ba4da commit 447a2d0

14 files changed: +2896 -2 lines changed

biome.json

Lines changed: 8 additions & 1 deletion
@@ -40,6 +40,13 @@
4040
}
4141
],
4242
"files": {
43-
"includes": ["**", "!**/dist", "!**/coverage", "!**/build", "!**/assets"]
43+
"includes": [
44+
"**",
45+
"!**/dist",
46+
"!**/coverage",
47+
"!**/build",
48+
"!**/assets",
49+
"!.history"
50+
]
4451
}
4552
}

packages/dom-snapshot/README.md

Lines changed: 225 additions & 0 deletions
@@ -0,0 +1,225 @@
1+
# @aipexstudio/dom-snapshot
2+
3+
A lightweight library for capturing DOM snapshots without relying on the Chrome DevTools Protocol (CDP) Accessibility Tree (AXTree). It provides a pure JavaScript/TypeScript way to build structured page snapshots for web automation, testing, and AI-powered browser agents.
4+
5+
## Why Not CDP AXTree?
6+
7+
Traditional approaches to capturing page structure often rely on CDP's Accessibility Tree, which has several limitations:
8+
9+
- **Browser dependency**: Requires Chrome/Chromium with DevTools Protocol
10+
- **Performance overhead**: CDP communication adds latency
11+
- **Complex setup**: Needs browser debugging port configuration
12+
- **Limited portability**: Doesn't work in all browser contexts
13+
14+
This library takes a different approach: it traverses the DOM directly and builds a semantic snapshot that mimics the structure of an accessibility tree, using only JavaScript that runs in any browser environment.
15+
16+
## Features
17+
18+
- **Pure DOM-based**: No CDP or browser extensions required
19+
- **Accessibility-aware**: Captures semantic roles, names, and states following ARIA patterns
20+
- **Interactive element focus**: Prioritizes buttons, links, inputs, and other actionable elements
21+
- **Hidden element filtering**: Automatically skips `aria-hidden`, `display:none`, `visibility:hidden`, and `inert` elements
22+
- **Stable node IDs**: Assigns persistent `data-aipex-nodeid` attributes for reliable element targeting
23+
- **Text content extraction**: Captures static text nodes for full page context
24+
- **Configurable options**: Control text length limits, hidden element inclusion, and text node capture
25+
- **Search functionality**: Built-in glob pattern search across snapshot text
26+
27+
## Installation
28+
29+
```bash
30+
npm install @aipexstudio/dom-snapshot
31+
# or
32+
pnpm add @aipexstudio/dom-snapshot
33+
```
34+
35+
## Usage
36+
37+
### Basic Snapshot Collection
38+
39+
```typescript
40+
import { collectDomSnapshot, collectDomSnapshotInPage } from '@aipexstudio/dom-snapshot';
41+
42+
// Collect snapshot from current page
43+
const snapshot = collectDomSnapshotInPage();
44+
45+
// Or specify a custom document
46+
const customSnapshot = collectDomSnapshot(document, {
47+
maxTextLength: 160, // Max characters for element text (default: 160, does not affect StaticText)
48+
includeHidden: false, // Include hidden elements (default: false)
49+
captureTextNodes: true, // Capture StaticText nodes (default: true)
50+
});
51+
52+
console.log(snapshot.totalNodes); // Total nodes captured
53+
console.log(snapshot.root); // Root node of the tree
54+
console.log(snapshot.idToNode); // Flat map of id -> node
55+
console.log(snapshot.metadata.url); // Page URL
56+
```
57+
58+
### Converting to Text Format
59+
60+
```typescript
61+
import { collectDomSnapshot, DomSnapshotManager } from '@aipexstudio/dom-snapshot';
62+
63+
const manager = new DomSnapshotManager();
64+
65+
// Collect raw snapshot
66+
const serialized = collectDomSnapshot(document);
67+
68+
// Convert to TextSnapshot format
69+
const textSnapshot = manager.buildTextSnapshot(serialized, { tabId: 1 });
70+
71+
// Format as readable text representation
72+
const formatted = manager.formatSnapshot(textSnapshot);
73+
console.log(formatted);
74+
```
75+
76+
Output example:
77+
```
78+
→uid=dom_abc123 RootWebArea "My Page" <body>
79+
uid=dom_def456 button "Submit" <button>
80+
uid=dom_ghi789 textbox "Email" <input> desc="Enter your email"
81+
StaticText "Welcome to our site"
82+
*uid=dom_jkl012 link "Learn More" <a>
83+
```
84+
85+
Markers:
86+
- `*` - Currently focused element
87+
- `→` - Ancestor of focused element
88+
- ` ` (space) - Regular element
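
The marker rules above can be sketched as a single post-order walk. This is a minimal illustration, not the library's implementation: `MarkerNode` and `assignMarkers` are hypothetical names, and the node shape is reduced to the fields the markers depend on.

```typescript
// Hypothetical minimal node shape for this sketch; the real snapshot node has more fields.
interface MarkerNode {
  id: string;
  focused?: boolean;
  children: MarkerNode[];
}

// Maps every node id to "*", "→", or " " following the marker legend.
function assignMarkers(root: MarkerNode): Map<string, string> {
  const markers = new Map<string, string>();

  // Returns true when `node` or any of its descendants is focused.
  const visit = (node: MarkerNode): boolean => {
    let containsFocus = false;
    for (const child of node.children) {
      // Visit every child (no short-circuit) so each one gets a marker.
      if (visit(child)) containsFocus = true;
    }
    if (node.focused) markers.set(node.id, "*");
    else if (containsFocus) markers.set(node.id, "→");
    else markers.set(node.id, " ");
    return Boolean(node.focused) || containsFocus;
  };

  visit(root);
  return markers;
}
```

Children are visited before their parent is marked, so one pass is enough to know whether a node lies on the path to the focused element.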
89+
90+
### Searching Snapshots
91+
92+
```typescript
93+
import { searchSnapshotText } from '@aipexstudio/dom-snapshot';
94+
95+
// `manager` and `textSnapshot` are created as in the previous example
const formatted = manager.formatSnapshot(textSnapshot);
96+
97+
// Simple text search
98+
const simpleResult = searchSnapshotText(formatted, 'Submit');
99+
100+
// Multiple terms with | separator
101+
const multiResult = searchSnapshotText(formatted, '登录 | Login | Sign In');
102+
103+
// Glob pattern search
104+
const result = searchSnapshotText(formatted, 'button* | *submit*', {
105+
useGlob: true,
106+
contextLevels: 2, // Lines of context around matches
107+
caseSensitive: false,
108+
});
109+
110+
console.log(result.matchedLines); // Line numbers of matches
111+
console.log(result.contextLines); // All lines to display (with context)
112+
console.log(result.totalMatches); // Total match count
113+
```
114+
115+
## API Reference
116+
117+
### `collectDomSnapshot(document, options?)`
118+
119+
Collects a DOM snapshot from the specified document.
120+
121+
**Parameters:**
122+
- `document` - The Document to snapshot
123+
- `options` - Optional configuration:
124+
- `maxTextLength` (number, default: 160) - Maximum text length for element nodes (does not affect StaticText nodes which preserve full content)
125+
- `includeHidden` (boolean, default: false) - Include hidden elements
126+
- `captureTextNodes` (boolean, default: true) - Capture text nodes as StaticText
127+
128+
**Returns:** `SerializedDomSnapshot`
129+
130+
### `collectDomSnapshotInPage(options?)`
131+
132+
Convenience function that calls `collectDomSnapshot` with the current `document`.
133+
134+
### `DomSnapshotManager`
135+
136+
Manager class for converting and formatting snapshots.
137+
138+
**Methods:**
139+
- `buildTextSnapshot(source, options?)` - Convert serialized snapshot to TextSnapshot
140+
- `formatSnapshot(snapshot)` - Format TextSnapshot as readable text
141+
142+
### `searchSnapshotText(text, query, options?)`
143+
144+
Search snapshot text with optional glob patterns.
145+
146+
**Parameters:**
147+
- `text` - The formatted snapshot text
148+
- `query` - Search query (use `|` to separate multiple terms)
149+
- `options`:
150+
- `contextLevels` (number, default: 1) - Lines of context around matches
151+
- `caseSensitive` (boolean, default: false) - Case-sensitive search
152+
- `useGlob` (boolean, default: auto-detected) - Enable glob pattern matching
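
To make the query semantics concrete, here is a rough sketch of how a `|`-separated glob query could be matched line by line. It is not the library's code: `globToRegExp`, `searchLinesSketch`, and `SketchSearchResult` are hypothetical names, matching is unanchored (a term may match anywhere in a line), and line indices are 0-based. All three are assumptions of this sketch.

```typescript
// Escape regex metacharacters, then translate glob wildcards: `*` -> `.*`, `?` -> `.`.
function globToRegExp(glob: string, caseSensitive = false): RegExp {
  const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(pattern, caseSensitive ? "" : "i");
}

interface SketchSearchResult {
  matchedLines: number[]; // 0-based indices of lines that matched (assumption)
  contextLines: number[]; // matched lines plus surrounding context, sorted
  totalMatches: number;
}

function searchLinesSketch(
  text: string,
  query: string,
  contextLevels = 1,
  caseSensitive = false,
): SketchSearchResult {
  // Split the query on `|`, dropping empty terms.
  const patterns = query
    .split("|")
    .map((term) => term.trim())
    .filter((term) => term.length > 0)
    .map((term) => globToRegExp(term, caseSensitive));

  const lines = text.split("\n");
  const matchedLines: number[] = [];
  const context = new Set<number>();

  lines.forEach((line, index) => {
    if (!patterns.some((p) => p.test(line))) return;
    matchedLines.push(index);
    // Add `contextLevels` lines on each side, clamped to the text bounds.
    const start = Math.max(0, index - contextLevels);
    const end = Math.min(lines.length - 1, index + contextLevels);
    for (let i = start; i <= end; i++) context.add(i);
  });

  const contextLines = Array.from(context).sort((a, b) => a - b);
  return { matchedLines, contextLines, totalMatches: matchedLines.length };
}
```

Collecting context into a set means overlapping windows around neighbouring matches do not produce duplicate lines.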
153+
154+
## Node Structure
155+
156+
Each captured node includes:
157+
158+
```typescript
159+
interface DomSnapshotNode {
160+
id: string; // Unique node identifier
161+
role: string; // Semantic role (button, link, textbox, etc.)
162+
name?: string; // Accessible name
163+
value?: string; // Current value (for inputs)
164+
description?: string; // Additional description
165+
children: DomSnapshotNode[]; // Child nodes
166+
tagName?: string; // HTML tag name
167+
168+
// State properties
169+
checked?: boolean | 'mixed'; // Checkbox/radio state
170+
pressed?: boolean | 'mixed'; // Toggle button state
171+
disabled?: boolean; // Disabled state
172+
focused?: boolean; // Focus state
173+
selected?: boolean; // Selection state
174+
expanded?: boolean; // Expanded state
175+
176+
// Additional properties
177+
placeholder?: string; // Input placeholder
178+
href?: string; // Link URL
179+
title?: string; // Element title
180+
textContent?: string; // Text content
181+
inputType?: string; // Input type attribute
182+
}
183+
```
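
As a usage illustration, the tree can be walked recursively to pull out actionable nodes. The sketch below declares its own reduced `SketchNode` type so it runs standalone. `collectInteractive` and the `INTERACTIVE_ROLES` set are hypothetical helpers, not part of the package.

```typescript
// Reduced, hypothetical copy of the node shape: only the fields this sketch reads.
interface SketchNode {
  id: string;
  role: string;
  name?: string;
  disabled?: boolean;
  children: SketchNode[];
}

// Roles treated as actionable here; the choice of roles is an assumption.
const INTERACTIVE_ROLES = new Set([
  "button", "link", "textbox", "checkbox", "radio", "slider", "combobox",
]);

// Depth-first, document-order walk that keeps enabled interactive nodes.
function collectInteractive(root: SketchNode): SketchNode[] {
  const found: SketchNode[] = [];
  const visit = (node: SketchNode): void => {
    if (INTERACTIVE_ROLES.has(node.role) && !node.disabled) found.push(node);
    node.children.forEach(visit);
  };
  visit(root);
  return found;
}
```

When only a lookup by id is needed, the flat `idToNode` map on the snapshot avoids walking the tree at all.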
184+
185+
## Role Mapping
186+
187+
The library maps HTML elements to semantic roles:
188+
189+
| HTML Element | Role |
190+
|-------------|------|
191+
| `<button>` | button |
192+
| `<a href="...">` | link |
193+
| `<input type="text">` | textbox |
194+
| `<input type="checkbox">` | checkbox |
195+
| `<input type="radio">` | radio |
196+
| `<input type="range">` | slider |
197+
| `<select>` | combobox |
198+
| `<textarea>` | textbox |
199+
| `<img>` | image |
200+
| Elements with `contenteditable` | textbox |
201+
202+
Explicit `role` attributes are respected and take precedence.
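
The mapping in the table, including the precedence of explicit `role` attributes, can be expressed roughly as follows. This sketch works on a plain descriptor instead of a real DOM `Element` so that it runs outside a browser. `ElementDescriptor`, `inferRoleSketch`, and the `"generic"` fallback for unlisted elements are assumptions, not the library's API.

```typescript
// Plain stand-in for a DOM element: tag name plus its attributes.
interface ElementDescriptor {
  tagName: string;
  attributes: Record<string, string>;
}

function inferRoleSketch(el: ElementDescriptor): string {
  // An explicit, non-empty role attribute wins over any tag-based mapping.
  const explicit = el.attributes["role"];
  if (explicit !== undefined && explicit.trim() !== "") return explicit.trim();

  const tag = el.tagName.toLowerCase();
  if (tag === "button") return "button";
  if (tag === "a" && "href" in el.attributes) return "link";
  if (tag === "select") return "combobox";
  if (tag === "textarea") return "textbox";
  if (tag === "img") return "image";
  if (tag === "input") {
    // A missing type attribute behaves like type="text".
    const type = (el.attributes["type"] ?? "text").toLowerCase();
    if (type === "text") return "textbox";
    if (type === "checkbox") return "checkbox";
    if (type === "radio") return "radio";
    if (type === "range") return "slider";
  }

  // contenteditable present and not explicitly "false" counts as a text box.
  const editable = el.attributes["contenteditable"];
  if (editable !== undefined && editable.toLowerCase() !== "false") return "textbox";

  return "generic"; // fallback chosen for this sketch only
}
```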
203+
204+
## Skipped Elements
205+
206+
The following are automatically excluded from snapshots:
207+
208+
- `<script>`, `<style>`, `<noscript>`, `<template>`, `<svg>`, `<head>`, `<meta>`, `<link>`
209+
- Elements with `aria-hidden="true"`
210+
- Elements with `hidden` attribute
211+
- Elements with `inert` attribute
212+
- Elements with `display: none`
213+
- Elements with `visibility: hidden`
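
The exclusion rules above amount to a predicate like the one below. It is a sketch under stated assumptions: the caller supplies the computed `display` and `visibility` values (in a browser these would come from `getComputedStyle`), and `SkipCandidate` and `shouldSkipSketch` are hypothetical names. The README does not say how `includeHidden` interacts with non-content tags such as `<script>`, so this sketch always skips them and only relaxes the visibility checks.

```typescript
// Plain stand-in for an element plus the two computed style values the rules need.
interface SkipCandidate {
  tagName: string;
  attributes: Record<string, string>;
  display: string; // computed `display`, supplied by the caller
  visibility: string; // computed `visibility`, supplied by the caller
}

const SKIPPED_TAGS = new Set([
  "script", "style", "noscript", "template", "svg", "head", "meta", "link",
]);

function shouldSkipSketch(el: SkipCandidate, includeHidden = false): boolean {
  // Non-content tags are skipped regardless of options (assumption of this sketch).
  if (SKIPPED_TAGS.has(el.tagName.toLowerCase())) return true;
  if (includeHidden) return false;
  return (
    el.attributes["aria-hidden"] === "true" ||
    "hidden" in el.attributes ||
    "inert" in el.attributes ||
    el.display === "none" ||
    el.visibility === "hidden"
  );
}
```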
214+
215+
## Use Cases
216+
217+
- **Web Automation**: Provide page context to AI agents for browser automation
218+
- **Testing**: Capture page state for snapshot testing
219+
- **Accessibility Auditing**: Analyze semantic structure of pages
220+
- **Content Extraction**: Extract meaningful content from web pages
221+
- **Browser Extensions**: Build tools that need page structure without CDP
222+
223+
## License
224+
225+
MIT

packages/dom-snapshot/package.json

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
1+
{
2+
"name": "@aipexstudio/dom-snapshot",
3+
"version": "0.0.6",
4+
"description": "DOM snapshot utility for capturing and serializing web page state",
5+
"main": "./dist/src/index.js",
6+
"types": "./dist/src/index.d.ts",
7+
"repository": {
8+
"type": "git",
9+
"url": "git+https://github.com/AIPexStudio/AIPex.git"
10+
},
11+
"exports": {
12+
".": {
13+
"types": "./dist/src/index.d.ts",
14+
"import": "./dist/src/index.js"
15+
}
16+
},
17+
"files": [
18+
"dist/src/**/*.js",
19+
"dist/src/**/*.d.ts",
20+
"!dist/src/**/*.test.js",
21+
"!dist/src/**/*.test.d.ts"
22+
],
23+
"scripts": {
24+
"build": "tsc",
25+
"test": "vitest run",
26+
"typecheck": "tsc --project tsconfig.json",
27+
"prepublishOnly": "npm run build"
28+
},
29+
"keywords": [
30+
"dom",
31+
"snapshot",
32+
"serialize",
33+
"web",
34+
"browser"
35+
],
36+
"author": "AIPex Studio",
37+
"license": "MIT",
38+
"type": "module",
39+
"devDependencies": {
40+
"@types/node": "^24.10.1"
41+
}
42+
}
