Commit 2a9bca0

Merge pull request #265 from posit-dev/fix-mcp-server-issues
fix: MCP server issues
2 parents 72136d0 + 81ad29d commit 2a9bca0

3 files changed, +1171 -244 lines changed

docs/user-guide/mcp-quick-start.qmd

Lines changed: 129 additions & 68 deletions
@@ -12,26 +12,7 @@ Transform your data validation workflow with conversational AI in VS Code or Pos
 ### 1. Install
 
 ```bash
-pip install pointblank[mcp,pd,excel]
-```
-
-**What this installs:**
-
-- `mcp` - Model Context Protocol server dependencies
-- `pd` - pandas backend for data processing
-- `excel` - Excel file support (`openpyxl`)
-
-**Alternative installs based on your needs:**
-
-```bash
-# Minimal MCP server only
 pip install pointblank[mcp]
-
-# Add Polars for faster data processing
-pip install pointblank[mcp,pd,pl]
-
-# Full installation with all backends
-pip install pointblank[mcp,pd,pl,excel]
 ```
 
 ### 2. Configure Your IDE
@@ -79,38 +60,57 @@ That's basically how you get started.
 
 ## Essential Commands
 
-Master these four command patterns and you'll be able to handle most data validation scenarios. Think of these as your fundamental vocabulary for talking to Pointblank.
+Master these five command patterns and you'll be able to handle most data validation scenarios. Think of these as your fundamental vocabulary for talking to Pointblank.
 
 ### Load Data
 
 ```
 "Load the file /path/to/data.csv"
-"Open my customer data from Downloads"
-"Load the Excel file with sales metrics"
+"Load my Netflix dataset from Downloads"
+"Load the CSV file with sales metrics"
+"Load customer_data.csv as my main dataset"
+```
+
+### Explore Data
+
+```
+"Analyze the data for netflix_data"
+"Show me a preview of the loaded data"
+"Create a column summary table"
+"Generate a missing values analysis"
 ```
 
+**What you'll get**: Comprehensive data profiling with statistics including missing values, data types, distributions, and summary statistics for each column. The preview and summary tables are automatically generated as beautiful HTML files that open in your browser. This gives you a complete picture of your dataset's structure and characteristics before you define quality rules.
+
 ### Check Quality
 
 ```
-"Analyze the data quality"
-"What issues should I worry about?"
-"Check for missing values and duplicates"
+"Create a validator for netflix_data"
+"Add validation that ratings are between 0 and 10"
+"Check that all release years are reasonable"
+"Apply the basic_quality template"
 ```
 
+**What you'll get**: Actual data quality validation that checks your data against business rules and domain knowledge. This tells you if your data meets your specific quality requirements and identifies rows that fail validation criteria.
+
 ### Create Data Validations
 
 ```
-"Set up validation rules for this data"
-"Check that all emails are valid"
-"Make sure amounts are positive"
+"Add validation that show_id values are unique"
+"Check that cast field is not empty for movies"
+"Ensure vote_count is greater than 0"
+"Validate that country field follows ISO format"
 ```
 
+**What you'll get**: Individual validation rules added to your validator. Each rule tests a specific business requirement and can be customized with thresholds and actions.
+
 ### Run and Export
 
 ```
-"Run the validation"
-"Show me what failed"
-"Export problem rows to CSV"
+"Run the validation and show results"
+"Export validation failures to CSV"
+"Get failed rows for step 2"
+"Save the validation report"
 ```
 
 ---
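
As a reading aid for the "Check Quality" and "Create Data Validations" prompts in this hunk, here is a minimal plain-Python sketch of what a range rule and a uniqueness rule compute, and how failing rows can be looked up per step. It is not Pointblank's API or the MCP server's implementation: `failed_between`, `failed_unique`, the sample rows, and the choice to count a missing value as a failure are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample rows; the column names mirror the prompts in the hunk.
rows = [
    {"show_id": "s1", "rating": 7.5},
    {"show_id": "s2", "rating": 11.0},  # rating out of range
    {"show_id": "s2", "rating": 6.0},   # duplicate show_id
    {"show_id": "s3", "rating": None},  # missing rating
]


def failed_between(rows, column, left, right):
    """Return rows whose value is missing or outside [left, right] (assumed policy)."""
    return [
        r for r in rows
        if r[column] is None or not (left <= r[column] <= right)
    ]


def failed_unique(rows, column):
    """Return every row whose value in `column` occurs more than once."""
    counts = Counter(r[column] for r in rows)
    return [r for r in rows if counts[r[column]] > 1]


# Each step pairs a label with the rows that failed it; steps are numbered from 1.
steps = [
    ("rating between 0 and 10", failed_between(rows, "rating", 0, 10)),
    ("show_id values are unique", failed_unique(rows, "show_id")),
]

for number, (label, failures) in enumerate(steps, start=1):
    print(f"step {number}: {label}: {len(failures)} failing row(s)")
```

In this sketch, `steps[1][1]` holds the kind of subset a prompt such as "Get failed rows for step 2" asks for.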
@@ -124,61 +124,99 @@ Once you have Pointblank set up, these scenarios show the most common and powerf
 **When**: You get a new dataset
 
 ```
-👤 "I just got customer_data.csv. Is it clean?"
-🤖 "Loading... Found 15K customers. Issues:
-- ✅ No duplicate IDs
-- ⚠️ 2% missing phones
-- ❌ 234 invalid emails
-Want me to fix the emails?"
+👤 "I just got netflix_data.csv. Is it clean?"
+🤖 "Loading dataset... Found 16,000 movies/shows with 18 columns.
+Quality analysis complete:
+- ✅ No duplicate show IDs
+- ⚠️ 466 missing countries (2.9%)
+- ⚠️ 132 missing directors (0.8%)
+Opening data preview in your browser..."
 ```
 
 ### 2. 🔄 Daily Validation
 
 **When**: Regular data quality checks
 
 ```
-👤 "Run our standard sales validation"
-🤖 "Running daily_sales template...
-✅ All checks passed!
-3,247 transactions validated
-Ready for dashboard update"
+👤 "Apply the basic_quality template to my data"
+🤖 "Creating validator with basic quality checks...
+✅ Data types validated
+✅ Missing value thresholds checked
+✅ Duplicate detection passed
+Validation complete! All checks passed."
 ```
 
 ### 3. 🚨 Issue Investigation
 
 **When**: Something looks wrong
 
 ```
-👤 "Conversion rates jumped to 95%. That's impossible"
-🤖 "Found the problem! Test data contamination:
-- 1,247 test emails with 100% conversion
-- Real rate is 4.2% (normal)
-Should I clean the data?"
+👤 "Show me movies with ratings above 9.5"
+🤖 "Creating validator for high ratings...
+Found 47 entries with ratings > 9.5
+Exporting suspicious rows to CSV
+Most are documentaries - could be valid!"
 ```
 
 These examples show how natural conversation can quickly identify and resolve data quality issues that might take hours to diagnose manually.
 
+## Core Capabilities
+
+Pointblank's MCP server provides powerful tools for comprehensive data validation with beautiful, interactive HTML reports:
+
+### Data Exploration
+
+- **Interactive HTML previews** with automatic browser opening showing head/tail rows
+- **Column summary tables** with detailed statistics and color-coded data types
+- **Missing values analysis** with visual patterns and percentages
+- **Data quality analysis** with comprehensive profiling insights
+
+### Validation Workflows
+
+- **Validator creation** with flexible thresholds and configuration
+- **Many validation types** for comprehensive data quality checking
+- **Step-by-step validation** building with natural language commands
+- **Template-based validation** for common data quality patterns
+
+### HTML Reports & Analysis
+
+- **Interactive validation reports** automatically opened in your browser
+- **Timestamped HTML files** for easy sharing and documentation
+- **Python code generation** for reproducible validation scripts
+
+All interactions use natural language, making advanced data validation accessible to users at any technical level while producing publication-ready HTML reports.
+
 ## Common Validation Rules
 
-Understanding what validation rules to ask for will help you quickly build comprehensive data quality checks. These examples cover the most frequent validation scenarios across different industries and data types.
+Understanding what validation rules to ask for will help you quickly build comprehensive data quality checks. These examples cover the most frequent validation scenarios using Pointblank's built-in validation functions.
 
 ### Data Integrity
 
-- "Check for duplicate IDs"
-- "Ensure no missing required fields"
-- "Validate that dates are reasonable"
+- "Check for duplicate show IDs"
+- "Ensure no missing required fields like title"
+- "Validate that release years are between 1900 and 2025"
 
 ### Business Logic
 
-- "Amounts must be positive"
-- "Email addresses must be valid format"
-- "Status must be active, inactive, or pending"
+- "Ratings must be between 0 and 10"
+- "Budget must be positive numbers"
+- "Duration should be greater than 0"
 
 ### Cross-Field Validation
 
-- "End date must be after start date"
-- "Discount percentage between 0 and 100"
-- "Age must match birth date"
+- "Release year should match date_added year"
+- "Vote count should correlate with popularity"
+- "Movies should have directors specified"
+
+### Available Templates
+
+Pointblank includes pre-built validation templates:
+
+- `basic_quality` - Essential data quality checks
+- `financial_data` - Money and numeric validations
+- `customer_data` - Personal information validations
+- `sensor_data` - Time series and measurement checks
+- `survey_data` - Response and rating validations
 
 These rule patterns can be combined and customized for your specific data and business requirements. The natural language interface makes it easy to express complex validation logic without learning technical syntax.
 
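
The first-look dialog added in this hunk quotes missing-value counts alongside percentages (466 of 16,000 as 2.9%, 132 of 16,000 as 0.8%). The sketch below shows one way such a summary could be computed and confirms the arithmetic. It is an illustration in plain Python, not the server's code: `missing_summary` is a hypothetical helper, the rows are synthetic and sized only to match the quoted counts, and treating both `None` and the empty string as missing is an assumption.

```python
def missing_summary(rows, columns):
    """Return {column: (missing_count, percent_missing)} with percent rounded to 0.1."""
    total = len(rows)
    summary = {}
    for column in columns:
        # Assumed policy: None and "" both count as missing.
        missing = sum(1 for row in rows if row.get(column) in (None, ""))
        percent = round(100 * missing / total, 1) if total else 0.0
        summary[column] = (missing, percent)
    return summary


# Synthetic rows: 16,000 records with 466 missing countries and 132 missing directors.
# The placeholder values "US" and "X" are made up.
rows = [
    {
        "country": None if i < 466 else "US",
        "director": "" if i < 132 else "X",
    }
    for i in range(16_000)
]

summary = missing_summary(rows, ["country", "director"])
for column, (missing, percent) in summary.items():
    print(f"{column}: {missing} missing ({percent}%)")
```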
@@ -213,31 +251,54 @@ These recommendations will help you get more value from your Pointblank MCP serv
 "Use our standard survey validation"
 ```
 
+### Interactive Visual Tables
+
+Pointblank automatically generates beautiful, interactive HTML tables for data exploration:
+
+```
+"Show me a preview of the data"
+"Generate a column summary table"
+"Create a missing values analysis"
+```
+
+These commands create professional HTML tables with:
+
+- **Color-coded data types** (numeric in purple, text in yellow)
+- **Gradient styling** tailored to each table type
+- **Automatic browser opening** for immediate viewing
+- **Timestamped files** for easy reference and sharing
+
+The tables open automatically in your default browser, making it easy to share data insights with colleagues or include in presentations.
+
 These practices help you build data quality workflows that scale with your needs while remaining accessible to those with varying technical backgrounds.
 
 ## File Support
 
 Pointblank works with many major data file formats, making it easy to validate data regardless of how it's stored. This support means you can maintain consistent validation practices across your entire data ecosystem.
 
-| Type | Extensions | Example |
-|------|------------|---------|
-| **CSV** | `.csv` | `sales_data.csv` |
-| **Excel** | `.xlsx`, `.xls` | `monthly_report.xlsx` |
-| **Parquet** | `.parquet` | `big_data.parquet` |
-| **JSON** | `.json`, `.jsonl` | `api_response.json` |
+| Type | Extensions | Example | Backend Support |
+|------|------------|---------|-----------------|
+| **CSV** | `.csv` | `sales_data.csv` | pandas, polars |
+| **Parquet** | `.parquet` | `big_data.parquet` | pandas, polars |
+| **JSON** | `.json` | `api_response.json` | pandas, polars |
+| **JSON Lines** | `.jsonl` | `streaming_data.jsonl` | pandas, polars |
 
-The consistent natural language interface works the same regardless of file format, so you can focus on validation logic rather than technical details.
+The consistent natural language interface works the same regardless of file format, so you can focus on validation logic rather than technical details. Polars provides faster processing for large datasets, while Pandas offers broader format support.
 
 ## Quick Troubleshooting
 
 When you encounter issues, these quick fixes resolve the most common problems. Furthermore, the natural language interface means you can always ask for help and explanations.
 
 | Problem | Quick Fix |
 |---------|-----------|
-| "File not found" | Use full file path: `/Users/name/data.csv` |
-| Validation too slow | "Use a sample for testing" |
-| Don't understand error | "Explain why validation failed" |
-| Need help | "Show me examples of data quality checks" |
+| "File not found" | Use absolute path: `/Users/name/Downloads/data.csv` |
+| "DataFrame not found" | Check loaded datasets with "List my loaded dataframes" |
+| "Validator not found" | Use "List active validators" to see available validators |
+| "Validation too slow" | Try "Use pandas backend" or sample your data first |
+| "HTML tables won't open" | Check your default browser settings |
+| "Need validation ideas" | Ask "Show me validation templates" or "Suggest validations for my data" |
+
+**Browser Issues**: The HTML tables automatically open in your default browser. If they don't appear, check that your browser isn't blocking pop-ups and that you have a default browser set in your system preferences.
 
 Remember, you can always ask the AI to explain what's happening or suggest solutions when you run into problems.
 
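
The revised File Support table in this hunk drops the Excel row and lists four extensions, and the troubleshooting table now recommends an absolute path. A small pre-flight check along those lines might look like the sketch below. It only mirrors the table as shown in this diff: `SUPPORTED_FORMATS`, `detect_format`, and `is_absolute_posix` are hypothetical names, POSIX-style paths are assumed, and nothing here reflects how the MCP server actually resolves or loads files.

```python
from pathlib import PurePosixPath

# Extension-to-type mapping copied from the updated File Support table.
SUPPORTED_FORMATS = {
    ".csv": "CSV",
    ".parquet": "Parquet",
    ".json": "JSON",
    ".jsonl": "JSON Lines",
}


def detect_format(path):
    """Return the table's type name for `path`, or raise ValueError if unlisted."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"extension {suffix!r} is not in the File Support table")
    return SUPPORTED_FORMATS[suffix]


def is_absolute_posix(path):
    """True when `path` is an absolute POSIX-style path such as /Users/name/data.csv."""
    return PurePosixPath(path).is_absolute()


example = "/Users/name/Downloads/data.csv"
print(detect_format(example), is_absolute_posix(example))
```

Under this mapping, a name such as `monthly_report.xlsx` from the removed Excel row would raise `ValueError`.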