@@ -12,26 +12,7 @@ Transform your data validation workflow with conversational AI in VS Code or Pos
 ### 1. Install

 ```bash
-pip install pointblank[mcp,pd,excel]
-```
-
-**What this installs:**
-
-- `mcp` - Model Context Protocol server dependencies
-- `pd` - pandas backend for data processing
-- `excel` - Excel file support (`openpyxl`)
-
-**Alternative installs based on your needs:**
-
-```bash
-# Minimal MCP server only
 pip install pointblank[mcp]
-
-# Add Polars for faster data processing
-pip install pointblank[mcp,pd,pl]
-
-# Full installation with all backends
-pip install pointblank[mcp,pd,pl,excel]
 ```

 ### 2. Configure Your IDE
@@ -79,38 +60,57 @@ That's basically how you get started.

 ## Essential Commands

-Master these four command patterns and you'll be able to handle most data validation scenarios. Think of these as your fundamental vocabulary for talking to Pointblank.
+Master these five command patterns and you'll be able to handle most data validation scenarios. Think of these as your fundamental vocabulary for talking to Pointblank.

 ### Load Data

 ```
 "Load the file /path/to/data.csv"
-"Open my customer data from Downloads"
-"Load the Excel file with sales metrics"
+"Load my Netflix dataset from Downloads"
+"Load the CSV file with sales metrics"
+"Load customer_data.csv as my main dataset"
+```
+
+### Explore Data
+
+```
+"Analyze the data for netflix_data"
+"Show me a preview of the loaded data"
+"Create a column summary table"
+"Generate a missing values analysis"
 ```

+**What you'll get**: Comprehensive data profiling with statistics including missing values, data types, distributions, and summary statistics for each column. The preview and summary tables are automatically generated as beautiful HTML files that open in your browser. This gives you a complete picture of your dataset's structure and characteristics before you define quality rules.
+
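To make the "missing values analysis" idea concrete, here is a minimal sketch in plain Python of the kind of per-column count and percentage such a profile reports. It is not Pointblank's implementation: the function name `missing_value_summary`, the sample rows, and the choice to treat `None` and blank strings as missing are all assumptions made for illustration.

```python
from typing import Optional


def missing_value_summary(
    rows: list[dict[str, Optional[str]]],
) -> dict[str, tuple[int, float]]:
    """Return {column: (missing_count, missing_percent)} for a list of row dicts."""
    if not rows:
        return {}
    total = len(rows)
    summary = {}
    for column in rows[0]:
        # Assumption: None and empty or whitespace-only strings count as missing.
        missing = sum(
            1
            for row in rows
            if row.get(column) is None or str(row.get(column)).strip() == ""
        )
        summary[column] = (missing, round(100 * missing / total, 1))
    return summary


# Hypothetical sample rows, loosely modelled on the netflix_data example.
sample_rows = [
    {"show_id": "s1", "country": "US", "director": "A. Lee"},
    {"show_id": "s2", "country": "", "director": None},
    {"show_id": "s3", "country": "FR", "director": "B. Roy"},
    {"show_id": "s4", "country": None, "director": "C. Kim"},
]
summary = missing_value_summary(sample_rows)
```

Each entry in `summary` pairs a raw count with a percentage, which is the same shape as figures like "466 missing countries (2.9%)" in the conversational examples.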
 ### Check Quality

 ```
-"Analyze the data quality"
-"What issues should I worry about?"
-"Check for missing values and duplicates"
+"Create a validator for netflix_data"
+"Add validation that ratings are between 0 and 10"
+"Check that all release years are reasonable"
+"Apply the basic_quality template"
 ```

+**What you'll get**: Actual data quality validation that checks your data against business rules and domain knowledge. This tells you if your data meets your specific quality requirements and identifies rows that fail validation criteria.
+
 ### Create Data Validations

 ```
-"Set up validation rules for this data"
-"Check that all emails are valid"
-"Make sure amounts are positive"
+"Add validation that show_id values are unique"
+"Check that cast field is not empty for movies"
+"Ensure vote_count is greater than 0"
+"Validate that country field follows ISO format"
 ```

+**What you'll get**: Individual validation rules added to your validator. Each rule tests a specific business requirement and can be customized with thresholds and actions.
+
 ### Run and Export

 ```
-"Run the validation"
-"Show me what failed"
-"Export problem rows to CSV"
+"Run the validation and show results"
+"Export validation failures to CSV"
+"Get failed rows for step 2"
+"Save the validation report"
 ```

 ---
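The rule prompts and the export step above boil down to simple per-row checks. The sketch below shows one way that logic could look in plain Python, assuming rows are dictionaries of strings as they would come from a CSV reader. It does not use or reproduce Pointblank's API; the helper names (`check_unique`, `check_between`, `check_greater_than`, `failures_to_csv`) and the sample data are hypothetical.

```python
import csv
import io


def check_unique(rows, column):
    """Return indices of rows whose value in `column` repeats an earlier row."""
    seen = set()
    failed = []
    for i, row in enumerate(rows):
        if row[column] in seen:
            failed.append(i)
        seen.add(row[column])
    return failed


def check_between(rows, column, low, high):
    """Return indices of rows whose numeric value falls outside [low, high]."""
    return [i for i, row in enumerate(rows) if not low <= float(row[column]) <= high]


def check_greater_than(rows, column, threshold):
    """Return indices of rows whose numeric value is not above `threshold`."""
    return [i for i, row in enumerate(rows) if not float(row[column]) > threshold]


def failures_to_csv(rows, failed_indices):
    """Serialize each failing row once, in original order, as CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    for i in sorted(set(failed_indices)):
        writer.writerow(rows[i])
    return buffer.getvalue()


# Hypothetical rows; column names follow the prompts above.
rows = [
    {"show_id": "s1", "rating": "7.5", "vote_count": "120"},
    {"show_id": "s2", "rating": "11.0", "vote_count": "15"},
    {"show_id": "s2", "rating": "6.1", "vote_count": "0"},
]
duplicate_ids = check_unique(rows, "show_id")
bad_ratings = check_between(rows, "rating", 0, 10)
zero_votes = check_greater_than(rows, "vote_count", 0)
report = failures_to_csv(rows, duplicate_ids + bad_ratings + zero_votes)
```

Keeping each rule as a function that returns failing row indices mirrors the "failed rows for step 2" idea: every step can be inspected on its own, and the export simply takes the union.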
@@ -124,61 +124,99 @@ Once you have Pointblank set up, these scenarios show the most common and powerf
 **When**: You get a new dataset

 ```
-👤 "I just got customer_data.csv. Is it clean?"
-🤖 "Loading... Found 15K customers. Issues:
-- ✅ No duplicate IDs
-- ⚠️ 2% missing phones
-- ❌ 234 invalid emails
-Want me to fix the emails?"
+👤 "I just got netflix_data.csv. Is it clean?"
+🤖 "Loading dataset... Found 16,000 movies/shows with 18 columns.
+Quality analysis complete:
+- ✅ No duplicate show IDs
+- ⚠️ 466 missing countries (2.9%)
+- ⚠️ 132 missing directors (0.8%)
+Opening data preview in your browser..."
 ```

 ### 2. 🔄 Daily Validation

 **When**: Regular data quality checks

 ```
-👤 "Run our standard sales validation"
-🤖 "Running daily_sales template...
-✅ All checks passed!
-3,247 transactions validated
-Ready for dashboard update"
+👤 "Apply the basic_quality template to my data"
+🤖 "Creating validator with basic quality checks...
+✅ Data types validated
+✅ Missing value thresholds checked
+✅ Duplicate detection passed
+Validation complete! All checks passed."
 ```

 ### 3. 🚨 Issue Investigation

 **When**: Something looks wrong

 ```
-👤 "Conversion rates jumped to 95%. That's impossible"
-🤖 "Found the problem! Test data contamination:
-- 1,247 test emails with 100% conversion
-- Real rate is 4.2% (normal)
-Should I clean the data?"
+👤 "Show me movies with ratings above 9.5"
+🤖 "Creating validator for high ratings...
+Found 47 entries with ratings > 9.5
+Exporting suspicious rows to CSV
+Most are documentaries - could be valid!"
 ```

 These examples show how natural conversation can quickly identify and resolve data quality issues that might take hours to diagnose manually.

+## Core Capabilities
+
+Pointblank's MCP server provides powerful tools for comprehensive data validation with beautiful, interactive HTML reports:
+
+### Data Exploration
+
+- **Interactive HTML previews** with automatic browser opening showing head/tail rows
+- **Column summary tables** with detailed statistics and color-coded data types
+- **Missing values analysis** with visual patterns and percentages
+- **Data quality analysis** with comprehensive profiling insights
+
+### Validation Workflows
+
+- **Validator creation** with flexible thresholds and configuration
+- **Many validation types** for comprehensive data quality checking
+- **Step-by-step validation** building with natural language commands
+- **Template-based validation** for common data quality patterns
+
+### HTML Reports & Analysis
+
+- **Interactive validation reports** automatically opened in your browser
+- **Timestamped HTML files** for easy sharing and documentation
+- **Python code generation** for reproducible validation scripts
+
+All interactions use natural language, making advanced data validation accessible to users at any technical level while producing publication-ready HTML reports.
+
 ## Common Validation Rules

-Understanding what validation rules to ask for will help you quickly build comprehensive data quality checks. These examples cover the most frequent validation scenarios across different industries and data types.
+Understanding what validation rules to ask for will help you quickly build comprehensive data quality checks. These examples cover the most frequent validation scenarios using Pointblank's built-in validation functions.

 ### Data Integrity

-- "Check for duplicate IDs"
-- "Ensure no missing required fields"
-- "Validate that dates are reasonable"
+- "Check for duplicate show IDs"
+- "Ensure no missing required fields like title"
+- "Validate that release years are between 1900 and 2025"

 ### Business Logic

-- "Amounts must be positive"
-- "Email addresses must be valid format"
-- "Status must be active, inactive, or pending"
+- "Ratings must be between 0 and 10"
+- "Budget must be positive numbers"
+- "Duration should be greater than 0"

 ### Cross-Field Validation

-- "End date must be after start date"
-- "Discount percentage between 0 and 100"
-- "Age must match birth date"
+- "Release year should match date_added year"
+- "Vote count should correlate with popularity"
+- "Movies should have directors specified"
+
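A cross-field rule compares two columns within the same row instead of testing one column against a constant. As a rough illustration of the "release year should match date_added year" prompt, the sketch below assumes `date_added` is stored as an ISO `YYYY-MM-DD` string. That format, the function name `year_mismatches`, and the sample rows are assumptions for the example and do not describe how Pointblank evaluates the rule.

```python
from datetime import date


def year_mismatches(rows):
    """Return show_ids whose release_year differs from the year of date_added."""
    mismatched = []
    for row in rows:
        # Assumption: date_added is an ISO date string such as "2021-09-25".
        added_year = date.fromisoformat(row["date_added"]).year
        if int(row["release_year"]) != added_year:
            mismatched.append(row["show_id"])
    return mismatched


# Hypothetical rows for illustration only.
rows = [
    {"show_id": "s1", "release_year": "2021", "date_added": "2021-09-25"},
    {"show_id": "s2", "release_year": "2019", "date_added": "2021-01-01"},
    {"show_id": "s3", "release_year": "2020", "date_added": "2020-12-31"},
]
mismatched_ids = year_mismatches(rows)
```

Returning identifiers instead of a single pass/fail flag keeps the failing rows easy to review, which matters for a rule like this one where a mismatch can be legitimate.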
+### Available Templates
+
+Pointblank includes pre-built validation templates:
+
+- `basic_quality` - Essential data quality checks
+- `financial_data` - Money and numeric validations
+- `customer_data` - Personal information validations
+- `sensor_data` - Time series and measurement checks
+- `survey_data` - Response and rating validations

 These rule patterns can be combined and customized for your specific data and business requirements. The natural language interface makes it easy to express complex validation logic without learning technical syntax.

@@ -213,31 +251,54 @@ These recommendations will help you get more value from your Pointblank MCP serv
 "Use our standard survey validation"
 ```

+### Interactive Visual Tables
+
+Pointblank automatically generates beautiful, interactive HTML tables for data exploration:
+
+```
+"Show me a preview of the data"
+"Generate a column summary table"
+"Create a missing values analysis"
+```
+
+These commands create professional HTML tables with:
+
+- **Color-coded data types** (numeric in purple, text in yellow)
+- **Gradient styling** tailored to each table type
+- **Automatic browser opening** for immediate viewing
+- **Timestamped files** for easy reference and sharing
+
+The tables open automatically in your default browser, making it easy to share data insights with colleagues or include in presentations.
+
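For readers curious how "timestamped files" and "automatic browser opening" can fit together, here is a small sketch using only the Python standard library. The file-naming pattern, the function names, and the temporary output directory are assumptions for illustration; the guide does not specify how Pointblank names or stores its HTML files.

```python
import tempfile
import webbrowser
from datetime import datetime
from pathlib import Path
from typing import Optional


def write_timestamped_html(
    html: str,
    directory: Path,
    stem: str = "preview",
    now: Optional[datetime] = None,
) -> Path:
    """Write `html` to <stem>_<YYYYMMDD_HHMMSS>.html and return the path."""
    now = now or datetime.now()
    # Hypothetical naming scheme: the timestamp keeps successive reports apart.
    path = directory / f"{stem}_{now:%Y%m%d_%H%M%S}.html"
    path.write_text(html, encoding="utf-8")
    return path


def open_in_browser(path: Path) -> bool:
    """Ask the system's default browser to open the file; True if a browser launched."""
    return webbrowser.open(path.resolve().as_uri())


report_dir = Path(tempfile.mkdtemp())
report_path = write_timestamped_html(
    "<table><tr><td>ok</td></tr></table>",
    report_dir,
    now=datetime(2025, 1, 2, 3, 4, 5),
)
# Call open_in_browser(report_path) to view the file interactively.
```

`webbrowser.open` returns `False` when no browser could be launched, which is one reason a report might be written to disk without ever appearing on screen.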
 These practices help you build data quality workflows that scale with your needs while remaining accessible to those with varying technical backgrounds.

 ## File Support

 Pointblank works with many major data file formats, making it easy to validate data regardless of how it's stored. This support means you can maintain consistent validation practices across your entire data ecosystem.
-The consistent natural language interface works the same regardless of file format, so you can focus on validation logic rather than technical details.
+The consistent natural language interface works the same regardless of file format, so you can focus on validation logic rather than technical details. Polars provides faster processing for large datasets, while Pandas offers broader format support.

 ## Quick Troubleshooting

 When you encounter issues, these quick fixes resolve the most common problems. Furthermore, the natural language interface means you can always ask for help and explanations.

 | Problem | Quick Fix |
 |---------|-----------|
-| "File not found" | Use full file path: `/Users/name/data.csv`|
-| Validation too slow | "Use a sample for testing" |
| "Need validation ideas" | Ask "Show me validation templates" or "Suggest validations for my data" |
+
+**Browser Issues**: The HTML tables automatically open in your default browser. If they don't appear, check that your browser isn't blocking pop-ups and that you have a default browser set in your system preferences.

 Remember, you can always ask the AI to explain what's happening or suggest solutions when you run into problems.