|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend
IN
cd /volume2/docker/finance/finance-hub/mcp && \
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"spending_by_tag","arguments":{"dateFrom":"2026-05-01","dateTo":"2026-05-09"}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 8 node server.js 2>/dev/null | grep '"id":2' | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'][:500])"
Copy code to clipboard
OUT
{
"period": {
"from": "2026-05-01",
"to": "2026-05-09"
},
"totalTransactions": 36,
"breakdown": [
{
"tag": "Bills",
"total": 117.9,
"count": 6
},
{
"tag": "(untagged)",
"total": 1153.87,
"count": 22
}
]
}
MCP tools work correctly against real data. Let me write the Claude desktop config example and the
.gitignore
for the mcp dir:
Write
claude_desktop_config.example.json
claude_desktop_config.example.json
14 lines
{
"_comment": "Add this block to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)",
"mcpServers": {
"finance-hub": {
"command": "node",
"args": ["/volume2/docker/finance/finance-hub/mcp/server.js"],
"env": {
"FINANCE_HUB_URL": "http://localhost:3001",
"FINANCE_HUB_API_KEY": "94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c"
}
}
}
}
Write
.gitignore
.gitignore
2 lines
node_modules/
Update Todos
Fix auth.js: add API key support + remove upload endpoints from PUBLIC_PATHS
Update index.js: add CORS origin restriction via env var
Update .env and .env.example with API_KEY variable
Build MCP server (stdio, Claude desktop compatible)
Test: restart backend, verify API key auth and MCP connection
Now let me save relevant project context to memory:
Read
MEMORY.md
MEMORY.md
Write
MEMORY.md
MEMORY.md
4 lines
# Memory Index
- [project_finance_hub.md](project_finance_hub.md) — Finance Hub architecture, auth design, MCP server, API key setup
Write
project_finance_hub.md...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend
IN
cd /volume2/docker/finance/finance-hub/mcp && \
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"spending_by_tag","arguments":{"dateFrom":"2026-05-01","dateTo":"2026-05-09"}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 8 node server.js 2>/dev/null | grep '"id":2' | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'][:500])"
Copy code to clipboard
OUT
{
"period": {
"from": "2026-05-01",
"to": "2026-05-09"
},
"totalTransactions": 36,
"breakdown": [
{
"tag": "Bills",
"total": 117.9,
"count": 6
},
{
"tag": "(untagged)",
"total": 1153.87,
"count": 22
}
]
}
MCP tools work correctly against real data. Let me write the Claude desktop config example and the
.gitignore
for the mcp dir:
Write
claude_desktop_config.example.json
claude_desktop_config.example.json
14 lines
{
"_comment": "Add this block to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)",
"mcpServers": {
"finance-hub": {
"command": "node",
"args": ["/volume2/docker/finance/finance-hub/mcp/server.js"],
"env": {
"FINANCE_HUB_URL": "http://localhost:3001",
"FINANCE_HUB_API_KEY": "94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c"
}
}
}
}
Write
.gitignore
.gitignore
2 lines
node_modules/
Update Todos
Fix auth.js: add API key support + remove upload endpoints from PUBLIC_PATHS
Update index.js: add CORS origin restriction via env var
Update .env and .env.example with API_KEY variable
Build MCP server (stdio, Claude desktop compatible)
Test: restart backend, verify API key auth and MCP connection
Now let me save relevant project context to memory:
Read
MEMORY.md
MEMORY.md
Write
MEMORY.md
MEMORY.md
4 lines
# Memory Index
- [project_finance_hub.md](project_finance_hub.md) — Finance Hub architecture, auth design, MCP server, API key setup
Write...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend
IN
cd /volume2/docker/finance/finance-hub/mcp && \
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"spending_by_tag","arguments":{"dateFrom":"2026-05-01","dateTo":"2026-05-09"}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 8 node server.js 2>/dev/null | grep '"id":2' | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'][:500])"
Copy code to clipboard
OUT
{
"period": {
"from": "2026-05-01",
"to": "2026-05-09"
},
"totalTransactions": 36,
"breakdown": [
{
"tag": "Bills",
"total": 117.9,
"count": 6
},
{
"tag": "(untagged)",
"total": 1153.87,
"count": 22
}
]
}
MCP tools work correctly against real data. Let me write the Claude desktop config example and the
.gitignore
for the mcp dir:
Write
claude_desktop_config.example.json
claude_desktop_config.example.json
14 lines
{
"_comment": "Add this block to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)",
"mcpServers": {
"finance-hub": {
"command": "node",
"args": ["/volume2/docker/finance/finance-hub/mcp/server.js"],
"env": {
"FINANCE_HUB_URL": "http://localhost:3001",
"FINANCE_HUB_API_KEY": "94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c"
}
}
}
}
Write
.gitignore
.gitignore
2 lines
node_modules/
Update Todos...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
auth
certs
db
.env
.env.example
docker-compose.yml
README.md
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend
IN
cd /volume2/docker/finance/finance-hub/mcp && \
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"spending_by_tag","arguments":{"dateFrom":"2026-05-01","dateTo":"2026-05-09"}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 8 node server.js 2>/dev/null | grep '"id":2' | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'][:500])"
Copy code to clipboard
OUT
{
"period": {
"from": "2026-05-01",
"to": "2026-05-09"
},
"totalTransactions": 36,
"breakdown": [
{
"tag": "Bills",
"total": 117.9,
"count": 6
},
{
"tag": "(untagged)",
"total": 1153.87,
"count": 22
}
]
}
MCP tools work correctly against real data. Let me write the Claude desktop config example and the
.gitignore
for the mcp dir:
Write
claude_desktop_config.example.json
claude_desktop_config.example.json
14 lines
{
"_comment": "Add this block to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)",
"mcpServers": {
"finance-hub": {
"command": "node",
"args": ["/volume2/docker/finance/finance-hub/mcp/server.js"],
"env": {
"FINANCE_HUB_URL": "http://localhost:3001",
"FINANCE_HUB_API_KEY": "94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c"
}
}
}
}
Write
.gitignore
.gitignore
2 lines
node_modules/
Update Todos
Fix auth.js: add API key support + remove upload endpoints from PUBLIC_PATHS
Update index.js: add CORS origin restriction via env var
Update .env and .env.example with API_KEY variable
Build MCP server (stdio, Claude desktop compatible)
Test: restart backend, verify API key auth and MCP connection
Now let me save relevant project context to memory:
Read
MEMORY.md
MEMORY.md
Write
MEMORY.md
MEMORY.md
4 lines
# Memory Index
- [project_finance_hub.md](project_finance_hub.md) — Finance Hub architecture, auth design, MCP server, API key setup
Write
project_finance_hub.md
project_finance_hub.md
35 lines
---
name: Finance Hub project state
description: Architecture, auth design, API key, MCP server location and tools for the finance-hub project
type: project
---
Finance Hub is a personal finance tracker at `/volume2/docker/finance/finance-hub/`.
**Running services** (as of 2026-05-09):
- Backend: port 3001 (Express/Prisma, PostgreSQL)
- Frontend: port 5175 (React/Vite)
- Adminer: port 8092
- Authentik: port 9100 (auth.lakylak.xyz, tag 2026.2.1)
**Auth design:**
- Production: NPM (Nginx Proxy Manager) + Authentik forward-auth outpost → injects `X-authentik-username` header
- `DEV_BYPASS_AUTH=true` is currently set in `.env` — app runs without real auth
- API key auth added: `Authorization: Bearer <API_KEY>` bypasses Authentik header check (used by MCP/scripts)
- API_KEY in `.env`: `94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c`
**Public endpoints** (no auth): `/api/health`, `/api/payments/ingest`
**Security fix applied (2026-05-09):** Removed `/api/upload/csv` and `/api/upload/preview` from PUBLIC_PATHS — they were incorrectly public.
**MCP server:** `/volume2/docker/finance/finance-hub/mcp/server.js`
- Stdio transport, works with Claude desktop
- Tools: list_transactions, spending_by_tag, get_transaction, list_tags, add_tag, remove_tag, health_check
- Env: FINANCE_HUB_URL, FINANCE_HUB_API_KEY
- Config example: `mcp/claude_desktop_config.example.json`
**To enable real Authentik auth:** Set `DEV_BYPASS_AUTH=false` in `.env...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
auth
certs
db
.env
.env.example
docker-compose.yml
README.md
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend
IN
cd /volume2/docker/finance/finance-hub/mcp && \
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"spending_by_tag","arguments":{"dateFrom":"2026-05-01","dateTo":"2026-05-09"}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 8 node server.js 2>/dev/null | grep '"id":2' | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'][:500])"
Copy code to clipboard
OUT
{
"period": {
"from": "2026-05-01",
"to": "2026-05-09"
},
"totalTransactions": 36,
"breakdown": [
{
"tag": "Bills",
"total": 117.9,
"count": 6
},
{
"tag": "(untagged)",
"total": 1153.87,
"count": 22
}
]
}
MCP tools work correctly against real data. Let me write the Claude desktop config example and the
.gitignore
for the mcp dir:
Write
claude_desktop_config.example.json
claude_desktop_config.example.json
14 lines
{
"_comment": "Add this block to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)",
"mcpServers": {
"finance-hub": {
"command": "node",
"args": ["/volume2/docker/finance/finance-hub/mcp/server.js"],
"env": {
"FINANCE_HUB_URL": "http://localhost:3001",
"FINANCE_HUB_API_KEY": "94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c"
}
}
}
}
Write
.gitignore
.gitignore
2 lines
node_modules/
Update Todos
Fix auth.js: add API key support + remove upload endpoints from PUBLIC_PATHS
Update index.js: add CORS origin restriction via env var
Update .env and .env.example with API_KEY variable
Build MCP server (stdio, Claude desktop compatible)
Test: restart backend, verify API key auth and MCP connection
Now let me save relevant project context to memory:
Read
MEMORY.md
MEMORY.md
Write
MEMORY.md
MEMORY.md
4 lines
# Memory Index
- [project_finance_hub.md](project_finance_hub.md) — Finance Hub architecture, auth design, MCP server, API key setup
Write
project_finance_hub.md
project_finance_hub.md
35 lines
---
name: Finance Hub project state
description: Architecture, auth design, API key, MCP server location and tools for the finance-hub project
type: project
---
Finance Hub is a personal finance tracker at `/volume2/docker/finance/finance-hub/`.
**Running services** (as of 2026-05-09):
- Backend: port 3001 (Express/Prisma, PostgreSQL)
- Frontend: port 5175 (React/Vite)
- Adminer: port 8092
- Authentik: port 9100 (auth.lakylak.xyz, tag 2026.2.1)
**Auth design:**
- Production: NPM (Nginx Proxy Manager) + Authentik forward-auth outpost → injects `X-authentik-username` header
- `DEV_BYPASS_AUTH=true` is currently set in `.env` — app runs without real auth
- API key auth added: `Authorization: Bearer <API_KEY>` bypasses Authentik header check (used by MCP/scripts)
- API_KEY in `.env`: `94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c`
**Public endpoints** (no auth): `/api/health`, `/api/payments/ingest`
**Security fix applied (2026-05-09):** Removed `/api/upload/csv` and `/api/upload/preview` from PUBLIC_PATHS — they were incorrectly public.
**MCP server:** `/volume2/docker/finance/finance-hub/mcp/server.js`
- Stdio transport, works with Claude desktop
- Tools: list_transactions, spending_by_tag, get_transaction, list_tags, add_tag, remove_tag, health_check
- Env: FINANCE_HUB_URL, FINANCE_HUB_API_KEY
- Config example: `mcp/claude_desktop_config.example.json`
**To enable real Authentik auth:** Set `DEV_BYPASS_AUTH=false` in `.env...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
: [Platform] Refinemen... 24 m left100% 5• Mon 11 May 16:36:47Client.php - app - 9 problems in this fileClient.php 9 xDv 23 00EXPLORE:m RateLimitException.ohdV Appapp > Services > Crm › Hubspot › f* Client.php › °g Clientclass Client extends Baseclient 1mplements Hubspotclientintertacev Services123• Hubspol~OpportunitySynestrategy..• HubsnotLastModified@nenSvncStra..137# HubspotLastModifiedSyncStrategy....# HubsootSinaleSyncstrategy.onp## HubspotSyncStrategyBase.php• HubsnotWebhookBatchSyncstrateg.V Dadinationi# HubspotPaginationService.php# PaginationConfig.php# PaginationState.phpProspectsearchstrategy→Hoal~ Servicelraits# OpportunitySyncTrait.php• SvncCrmEntitiesTrait.phpwsuncrields.rait.ono• WriteCrmTrait.phr• Utils• WebhookBatchSvncCollector.phoBatchSvncRedisService.ohvCliient.oho|• ClosedDea|StagesService.oho* DealFieldsService.oho* DecorateActivitv.oho•FieldDefinitions.ohn• FieldTvneConverter.ohn# HubspotClientinterface.php# HubspotTokenManager.php€ DavloadRuilder nhn* PemoteCrmObiectManinulator.nhn«* PecnonceNormalize nhn• Service.php# SyncFieldAction.php# SyncRelatedActivityManager.php# WebhookSyncBatchProcessor.phpIntearationapo• Listeners> MetadataMiarationV Pipedrive• OpportunitySvncStrateav• ProspectSearchStrateavOUTIINE, TIMELING1MCALi8 JY-20725-handle-HS-search-rate-limit* ©SEEEBДБESpub lic tunction 1sHubspotRateL1m1t Throwable Se: boolreturn talsepubLic tunction parseretryArter(Throwable se: 1ny1t method exists(se, 'getResponseMeaders")) ‹Sheaders = Se->aetResponseHeaders@ ?:0•$value = $headers('Retry-After') ?? $headers('retry-after') ?? null;$value = $value[0] ?7 null;if (is numeric(Svalue)) {$message = strtolower($e->getMessage));if (str contains($message, 'daily')) {return 600if str contains(Smessage, "ten secondly') <return 10:if (str_contains($message, 'secondly')) {return 1.Sthis->loq->warning('(Hubspot) No retry-after header or known message, using default'. [get_class(Se)= Smessage,return 10.public function getMinimumApiVersion(): stringreturn celf.-MtN ApT VERSTON•public function getInstance(): Factoryreturnnew Factory([= Sthis→>accessToken.= true.''base url => Sthis->baseUruoublic function aetNewinstance: Discovervreturn HubSpot\Factorv::createWithAccessToken/Sthis-saccessToken@6A902|23 Review rate limit handli...Review rate limit handling for HubSpot jo..? handienuospotkateLimitfix @tests/Unit/Services/Crm/Hubspot/Pagination/HubspotPaginationServiceTest.php and @tests/Unit/Jobs/Crm/MatchActivityCrmDataTest.php• Edit MatchAct vitycrmbatalest.on:public function testralledhethoclogsError(): voitpublic function testfal ledMethodlogsError(): Voldsexcepcion = new kuncimeexcepcion("lest Talture);sexcepcion = new kuncimeexception("lest Talture);Sentry::shouldReceive'captureException->once)->with Sexception:Siob = new MatchActivitvCrmData(activitvid: 123.Siob = new MatchActivitvCrmDatafromConfiguration: null,remotoSearch. +ruefromConfiguration: null,createMock(Client::class) mocks search() toreturn (], sothe hubspotClientMock->request( chain is never reached. The fix is to mocksearch() directly and drop the HTTP client chain entirely.• Write HubspotPaginationServiceTest.php<?ohodeclare(strict types=1):MatchActivityCrmDataTest - two changes• $job->tries → $job->maxExceptions : the job has no $tries property; it uses retryUntil() for retry bounding and maxExceptions for exception counting• unquerone assertion 360→ 260Hmatches 120 * 60 + 60 afterithe REURY WINDOW MNULES Changerthe od value sbo diont matchithe o ditormula either solt was already broker•tectFailedMethodl.oocFrror•added \Sentrv:: chouldReceive( "cantureFycention' _soncel since failedo now calls itHubspotPaginationServiceTest — structural fix:he oricina litests mocked netainstanca 0E>dat dltenti0reduesiti@lexo ectina the rea ldltentzu seanch @ltolbe calledlthrouch that chain Buta oreat eMock di llentrura lassilmocksseanch<oo (return[] ), so request() was never reached - every test was silently returning 0 results. Fixed by mocking search() directly on SclientMock, which is the actual boundary the pagination service interacts withDronned the now-unused SfactorvMock. ShubsnotClientMock.and createHubsnotResnonse().$endpoint = self::BASE URL."/crm/v3/objects/{$objectType)/search":return $this->executeRequest(function () use (Sendpoint, $payload) <Sresponse = Sthis-›"POSt". Sendpoint. Wison' =>$payload]):return Sresponse->toArray0:11 Client.ohoAsk before edit.A tnkoe Vouolil 160 minutoe ondlSpaces: 4 UTF-8 LF ( PHP 8 SignIn 8.3P 0...
|
Code
|
Client.php — app — 9 problems in this file
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
i Client.ohp g XV APP• Hubspol~OpportunitySynestrategy..# HubspotLastModifiedOpenSyncStra...,# HubspotLastModifiedSyncStrategy....# HubsootSinaleSyncstrategy.onp## HubspotSyncStrategyBase.php• HubsnotWebhookBatchSyncstrateg.Pagination# HubspotPaginationService.php# PaginationConfig.php# PaginationState.phpProspectsearchstrategy→Hoal~ Servicelraits# OpportunitySyncTrait.php• SvncCrmEntitiesTrait.phpwsuncrields.rait.ono• WriteCrmTrait.phr• Utils• WebhookBatchSvncCollector.ohoBatchSvncRedisService.ohvClient.oho* ClosedDealStagesService.oho• DealFieldsService.oho* DecorateActivitv.oho•FieldDefinitions.ohn|• FieldTvoeConverter.oho# HubspotClientinterface.php#R HubspotTokenManager.phpR PayloadBuilder.php* PemoteCrmObiectManinulator.nhn«* PecnonceNormalize nhn• Service.php# SyncFieldAction.php# SyncRelatedActivityManager.php# WebhookSyncBatchProcessor.php> IntearationApp• Listeners> MetadataMiarationV Pipedrive• OpportunitySvncStrateav• ProspectSearchStrateavOUTIINETIMELINGSô JY-20725-handle-HS-search-rate-limite C ®6 A902app > Services > Crm › Hubspot › f* Client.php › °g Client › * executeRequest()class Client extends Baseclient 1mpLements HubspotclentIntertaceExecute a HubSoot APt call with rate Uimit handina.On a 429, stores the absolute expiry timestamp with SET NX (first writer wins).* This means all subsequent jobs that also receive 429 in the same burst do not* reset the liL = the window is anchored to the Tirst 44y, nor thelast.* Readers compute the remaining wait from the stored timestamp, so jobs that check* the cache near expiry are not delaved longer than necessary.* @param callable(): T $apiCall The API call to execute* @throws RateLimitException When rate limit is hit or cached rate limit is active* @return T The result of the API callprivate function executeRequest(callable $apiCall)meric($cachedexpiresat) ) •1>rematning= maxil,'lincscacheocxptresat - cimelrHubspot race Limit (cached circult-dreaker),Sremaining,try.return Sanicallo.J.catch.(Throwahle.&e).fifIethie-sicHuhenn+Patol imi+/coll.fSretryAfter = $this->parseRetryAfter(Se):so theRedis:•set(ScacheKev,-(strina)- (time() + SretrvAfter).."'nx'..'ey' => SretrvAfterl)Sthis->log->warning(' [Hubspot] Received 429 from API',•['team_id''confia id'=• Sth1s→>cont10→>team 10,• => Sthis->contio->qetido.= Se-saetMessage0.i• throw- new RateLimitException('Hubspot returned 429', $retryAfter, $e);throw-Se:: [Platform] Refinemen... 25 m left100% 5• Mon 11 May 16:35:480 000• 2 C›* Review rate limit handli... XKeview fate lmit nanaling tor nuospouo..<> HandleHubspotRateLimit…fix @tests/Unit/Services/Crm/Hubspot/Pagination/HubspotPaginationServiceTest.php and @tests/Unit/Jobs/Crm/MatchActivityCrmDataTest.php• The stries property doesn't exist on MatchActivityCrmData - it uses retryUntil() instead. That assertion should check maxExceptions . Let me fix that and continue with the remaining breaks• Edit MatchActivityCrmDataTest.phpSthis-sassertSame(3, Siob-stries):<this_sascertCame/200 cioh_stimeout())•Sjob->uniqueFor()):ertSame([30, 90, 180]. Siob>backoff()):Sthis-sassert.eptions):$this->assertSame(300, Sjob->timeout()):Sjob->uniqueFor()):Sthis->assertSame([3090,1801. Siob->backoff()):• Edit MatchActivitycrmDatarest.pnppublic function testFailedMethodLogsError(): voidSexception = new \RuntimeException('Test failure');public function testfailedMethodLogsErrorl: voidSexception = new \RuntimeException('Test failure');sencry:: shoulokeceivecapturecxception→>once→>wich sexcepcionSiob = new MatchAct ivitvermbatalactivitvid: 13.fromConfiauration: null.siob = new MatchAct ivitvermbataactivitvid: 123.fromConfiguration: null,remoteSearch: true• Now the pagination service test. The root problem is createMock(Client::class) mocks search() to return ! , so the hubspotClientMock->request() chain is never reached. The fix is to mocksearch() directly and drop the HTTP client chain entirely.• Write HubspotPaginationServiceTest.phpdeclare(strict tvnes-1).• Summary of what was fixed and whyMatchActiv1tycrmDatatest — two changes:• $job->tries → $job-•maxExceptions : the job has no $tries property; it uses retryUntil() for retry bounding and maxExceptions for exception counting•uniqueFor() assertion 360 → 7260 : matches 120 * 60 + 60 after the RETRY WINDOW MINUTES change: the old value 360 didn't match the old formula either, so it was already broken.testFailedMethodLoqsError:added (Sentrv::shouldReceive("cantureExcention')->once() since failed() now calls itHubspotPaginationServiceTest — structural fix:WHEo1Gn I1EstmoceedW9e1eindeS9eeen1ee9eseexoei1oeeieteeeto be Galledithrouatha cha n Bu aaEea teMoak Ca sten H lass limockse seaa te mool return[] ), so request() was never reached — every test was silently returning 0 results. Fixed by mocking search() directly on $clientMock, which is the actual boundary the pagination service interacts with.Dropped the now-unused sfactorvMock.ShubspotClientMock. and creatAdd coverade of now methads in1 38 lines selectedAsk before edit.8 SignIn...
|
Code
|
Review rate limit handli… — app
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
SlackFileEditViewGoHistoryWindowHelp•APPDOCKERO ₴1DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prope* @return array The search response with'results''total'*'paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lirHomeDMsActivityFilesLater..•More• [Platform) Refinemen…. 24 m left100% <78• Mon 11 May 16:36:47ED→Describe what you are looking forJiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of jimi...^ Direct messagesP. Aneliya Angelova®. Galya DimitrovaPetko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev• VesE Lukas Kovalik y... 0::: AppsS Jira CloudToastGanala Cala# support8 346 0MessagesC FilesBookmarksKara Joneswas added .Tuesday, April 28th~More v+Wednesday, April 29th ~Lauren Hudson 12:58 PMHi team, I'm trying to set up auto detect for LesMills, but when I add a new playbook (because theywant it applied across all teams), it's pulling in a loadof activity types that l am not able to delete. Anyideas please?Screenshot 2026-04-29 at 10.54.42.png •1 reply 12 days agoLauren Hudson 6:17 PMHello, request from Norstella to update theirJiminny to match this shared spreadsheet. lssomebody from the support team able to helpplease?Numbers Document -Message #support+Аа...
|
Code
|
Client.php — app — 9 problems in this file
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
SlackFileEditViewGoHistoryWindowHelpAPPDOCKER2881DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prope* @return array The search response with'results''total''paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lirHomeDMsActivityFilesLater..•More• [Platform) Refinemen…. 24 m left100% <78• Mon 11 May 16:36:31ED→Describe what you are looking forJiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of jimi...^ Direct messagesP. Aneliya Angelova®. Galya DimitrovaPetko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev• VesE Lukas Kovalik y... 0::: AppsS Jira CloudToastGanala Cala# support8 346 0MessagesC FilesBookmarksKara Joneswas added .Tuesday, April 28th~More v+Wednesday, April 29th ~Lauren Hudson 12:58 PMHi team, I'm trying to set up auto detect for LesMills, but when I add a new playbook (because theywant it applied across all teams), it's pulling in a loadof activity types that l am not able to delete. Anyideas please?Screenshot 2026-04-29 at 10.54.42.png •1 reply 12 days agoLauren Hudson 6:17 PMHello, request from Norstella to update theirJiminny to match this shared spreadsheet. lssomebody from the support team able to helpplease?Numbers Document -Message #support+Аа...
|
Code
|
Review rate limit handli… — app
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
SlackFileEditViewGoHistoryWindowHelpAPP ()DOCKER₴81DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prope* @return array The search response with'results''total'*'paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lir• 0EDHomeDMsActivityFilesLater..•More→Jiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of jimi...^ Direct messagesP. Aneliya Angelova®. Galya Dimitrova. Petko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev. VesE Lukas Kovalik y... O::: AppsS Jira CloudToast100% <78• Mon 11 May 17:12:48Describe what you are looking for# support8 346 0MessagesC FilesBookmarksKara Joneswas added .Tuesday, April 28th~More v+Wednesday, April 29th ~Lauren Hudson 12:58 PMHi team, I'm trying to set up auto detect for LesMills, but when I add a new playbook (because theywant it applied across all teams), it's pulling in a loadof activity types that l am not able to delete. Anyideas please?Screenshot 2026-04-29 at 10.54.42.png •1 reply 12 days agoLauren Hudson 6:17 PMHello, request from Norstella to update theirJiminny to match this shared spreadsheet. lssomebody from the support team able to helpplease?Numbers Document •Message #support+...
|
Code
|
Review rate limit handli… — app
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
Remote Explorer
selectionViewWindov• HandleHubspotRateLimit.php XwClient.php 9V APP" JoosVCrm# MatchActivitiesToNewOpportunity.phpw MatchacuivilvcrmData.onp## NoteObiect.phpw SaveActivity.phpwSavelranscription.phpw [EMAIL]#SvncObiects.oho#SvncOpportunities.lob.ohv# SvncOpportunitv.ohol# SyncProfileMetadata.php#SvncTeamFieldc.loh.nhn#. SvncTeamMetadata.nhnl# UpdateOpportunitySpecifications.php##UodateStage.pnpDealkisks> MeetingBotV Middlowarol#* HandleHubspotRateLimit.phpRA RateLimited.php> Streamine> TeamTelephony> UserR# BaseProcessingJob.phpDummyJob.phpmoortRecallA Recordinas.oo.onv#ImoortRemotetrack.Job.onv# Job.oho# JobDisoatcher.oho# JobDispatcherinterface.ohv# [EMAIL]#SasVisibilitvControl.ohnV ModelsiActivitvi>A.Calondarapp > Jobs › Middleware > «* HandleHubspotRateLimit.php › * HandleHubspotRateLimidectaretstriet_types-,namespace Jiminny Jobs Midd Leware;use Tluminate Suoport Facades Loo:useJiminnvExcent.ions RateLimitExceot.ion:: Job middleware that catches RateLimitException from HubSpot API callsclacs HandieHubsnotRatelimitprivate const int MaX RETRY DELAY = 600:private const int MIN RETRY DELAY = 1;orivate const int JIHER SECONDS =5NOAAWANNpublic function handte(object sjob, callable snext): voidtry} catchmax(self::MIN RETRY DELAY, min($e->getRetryAfter(), self::MAX RETRY DELAY)):Add ]lcter to prevent chundering nerd problem by randomizing recry cimesSoelay += random 1nc6, selt::sulitk sccunds"sattemots = s1ob->attemots:if Sattemots <= 3 Sattemots & 10 === 0 4Log:: info('[HandleHubspotRateLimit) Rate limit caught, releasing job with delay', ['job_class' = $job::class,Iattemntcl = Cattemntc.'retry after' = Se->getRetryAfter(),'delay' = $delay.Sioh-sre lease Sde llav).i8 JY-20725-handle-HS-search-rate-limit* ©₫6A90г"Platrorm Kerlnemen…. now100% 5Mon 11 May 10.01:03• 2 C→= Review rate limit handli..Review rate limit handling for HubSpot jo.Isentry. Is there scenario where the tlow would be silentlv falled. Do I havea, way of knowina is some of the loos are lostwhatever was actually thrown. It's worth logging and tagging these differently so in Sentry you can tell "this portal was hammered for 2 hours" from "there's an actual bug"• EditMatchact.vitvcrmbata.onupubuic tunction talled uhrowable sexcention: vo1dLog::error('(MatchActivityCrmData) Job permanently failed after allpublic tunctzon tazleatihrowable sexceptzon): vo2dScontext =configuration' = $this->fromConfiguration?->getId(),08 0 0 0* M&|$this->fromConfiguration?->getId(),• Now for the 100k spike concern specifically — 64,000 Sentry events would be a lot of noise. Worth knowing that Sentry has rate limiting on its own side, and you can also add \Sentry:: configureScope() togroup them, but that's probably overkill here. The failed_jobs table already gives you the full count; Sentry gives you the signal that it happened at all.To directly answer your question: yes, there was a silent failure scenario. Jobs hitting retryUntil() after being rate-limited were written to failed_jobs and logged at error level, but nothing reachedSenuva une vaxau cenouse xceeded excep on pau vate-uni exhaus ion neverropava eu nroueh he excep ton nanuer ony ea exceр lons Dв eros eс da, Now aшea cadures Dou andLiEloa level distinauishes them so vou can filter in Sentry by excention tvoe: MaxAttemotsExceededExcention = rate-limit death. anvthina else = real bua<> HandleHubspotRateLimit…fix @tests/Unit/Services/Crm/Hubsoot/Pacination/HubsootPacinationServiceTest.ohnand@tests/Unit/.lobs/Crm/MatchActivitvCrmDataTest.ohv• Thought for 299s >• Let me check what's actually broken before writina fixes.• Bash Run the affected tests to see what's failingIN php vendor/bin/phpunit tests/Unit/Jobs/Crm/MatchActivityCrmDataTest.php tests/Unit/Jobs/Middleware/HandleHubspotRateLimitTest.php tests/Unit/Services/Crm/Hubspot/Pagination/HubspotPagAllow this bash command?ono vendorbin/ onounit rests/unluJoos/crm/Marchacuiv1cvcrmbaralesc.onotests/Unit/Jobs/Middleware/HandleHubspotRateLimitTest.phptests/Unit/Services/Crm/Hubspot/Pagination/HubspotPaginationServiceTest.php -=no-coverage 2>81tail -40Run the affected tests to see what's failing7 Yes, alllow oho vendorlbin/oh.... for alll proiects3 NoTell Claude what to do insteadEse to cance8 SignIn...
|
Code
|
Review rate limit handli… — app
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
i Client.ohp g XV APP• Hubspol~OpportunitySynestrategy..# HubspotLastModifiedOpenSyncStra...,# HubspotLastModifiedSyncStrategy....# HubsootSinaleSyncstrategy.onp## HubspotSyncStrategyBase.php• HubsnotWebhookBatchSyncstrateg.Pagination# HubspotPaginationService.php# PaginationConfig.php# PaginationState.phpProspectsearchstrategy→Hoal~ Servicelraits# OpportunitySyncTrait.php• SvncCrmEntitiesTrait.phpwsuncrields.rait.ono• WriteCrmTrait.phr• Utils• WebhookBatchSvncCollector.ohoBatchSvncRedisService.ohvClient.oho* ClosedDealStagesService.oho* DealFieldsService.oho* DecorateActivitv.oho•FieldDefinitions.ohn|• FieldTvoeConverter.oho# HubspotClientinterface.php#R HubspotTokenManager.phpR PayloadBuilder.php* PemoteCrmObiectManinulator.nhn«* PecnonceNormalize nhn• Service.php#R SyncFieldAction.php# SyncRelatedActivityManager.php# WebhookSyncBatchProcessor.php> IntearationApp• Listeners> MetadataMiarationV Pipedrive• OpportunitySvncStrateav• ProspectSearchStrateavOUTIINETIMELINGSô JY-20725-handle-HS-search-rate-limite C ®6 A902app > Services > Crm › Hubspot › f* Client.php › °g Client › * executeRequest()class Client extends Baseclient 1mpLements HubspotclentIntertaceExecute a HubSoot APt call with rate Uimit handina.On a 429, stores the absolute expiry timestamp with SET NX (first writer wins).* This means all subsequent jobs that also receive 429 in the same burst do not* reset the liL = the window is anchored to the Tirst 44y, nor thelast.* Readers compute the remaining wait from the stored timestamp, so jobs that check* the cache near expiry are not delaved longer than necessary.* @param callable(): T $apiCall The API call to execute* @throws RateLimitException When rate limit is hit or cached rate limit is active* @return T The result of the API callprivate function executeRequest(callable $apiCall)meric($cachedexpiresat) ) •1>rematning= maxil,'lincscacheocxptresat - cimelrHubspot race Limit (cached circult-dreaker),Sremaining,try.return Sanicallo...catch. (Throwahle.&e).fifIethie-sicHuhenn+Patol imi+/coll.fSretryAfter = $this->parseRetryAfter(Se):so the* Keois..set(scacherey, • (String)• (time+ SretrvAfter),'U'nx'..'ex" = SretrvAfterl)• ISthis->log->warning(' [Hubspot] Received 429 from API',•['team_id''confia id'=• Sth1s→>cont10→>team 10,• => Sthis->contio->qetido.= Se-saetMessage0.i• throw- new RateLimitException('Hubspot returned 429', $retryAfter, $e);throw-Se:: [Platform] Refinemen... 25 m left100% 5• Mon 11 May 16:35:400 000• 2 C›* Review rate limit handli... XKeview fate lmit nanaling tor nuospouo..<> HandleHubspotRateLimit…fix @tests/Unit/Services/Crm/Hubspot/Pagination/HubspotPaginationServiceTest.php and @tests/Unit/Jobs/Crm/MatchActivityCrmDataTest.php• The stries property doesn't exist on MatchActivityCrmData - it uses retryUntil() instead. That assertion should check maxExceptions . Let me fix that and continue with the remaining breaks• Edit MatchActivityCrmDataTest.phpSthis-sassertSame(3, Siob-stries):<this_sascertCame/200 cioh_stimeout())•Sjob->uniqueFor()):ertSame([30, 90, 180]. Siob>backoff()):Sthis-sassert.eptions):$this->assertSame(300, Sjob->timeout()):Sjob->uniqueFor()):Sthis->assertSame([3090,1801. Siob->backoff()):• Edit MatchActivitycrmDatarest.pnppublic function testFailedMethodLogsError(): voidSexception = new \RuntimeException('Test failure');public function testfailedMethodLogsErrorl: voidSexception = new \RuntimeException('Test failure');sencry:: shoulokeceivecapturecxception→>once→>wich sexcepcionSiob = new MatchAct ivitvermbatalactivitvid: 13.fromConfiauration: null.siob = new MatchAct ivitvermbataactivitvid: 123.fromConfiguration: null,remoteSearch: true• Now the pagination service test. The root problem is createMock(Client::class) mocks search() to return ! , so the hubspotClientMock->request() chain is never reached. The fix is to mocksearch() directly and drop the HTTP client chain entirely.• Write HubspotPaginationServiceTest.phpdeclare(strict tvnes-1).• Summary of what was fixed and whyMatchActiv1tycrmDatatest — two changes:• $job->tries → $job-•maxExceptions : the job has no $tries property; it uses retryUntil() for retry bounding and maxExceptions for exception counting•uniqueFor() assertion 360 → 7260 : matches 120 * 60 + 60 after the RETRY WINDOW MINUTES change: the old value 360 didn't match the old formula either, so it was already broken.testFailedMethodLoqsError:added (Sentrv::shouldReceive("cantureExcention')->once() since failed() now calls itHubspotPaginationServiceTest — structural fix:WHEo1Gn I1EstmoceedW9e1eindeS9eeen1ee9eseexoei1oeeieteeeto be Galledithrouatha cha n Bu aaEea teMoak Ca sten H lass limockse seaa te mool return[] ), so request() was never reached — every test was silently returning 0 results. Fixed by mocking search() directly on $clientMock, which is the actual boundary the pagination service interacts with.Dropped the now-unused sfactorvMock.ShubspotClientMock. and creatAdd coverade of now methads in1 38 lines selectedAsk before edit.88 SignIn...
|
Code
|
Review rate limit handli… — app
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
SlackFileEditViewGoHistoryWindowHelpAPPDOCKER₴81DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prope* @return array The search response with'results''total''paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lirHomeDMsActivityFilesLater..•More• [Platform) Refinemen…. 25 m left100% <78• Mon 11 May 16:35:45•ED→Describe what you are looking forJiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of jimi...^ Direct messagesP. Aneliya Angelova®. Galya DimitrovaPetko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev• VesE Lukas Kovalik y... 0::: AppsS Jira CloudToastGanala Cala# support8 346 0MessagesC FilesBookmarksKara Joneswas added .Tuesday, April 28th~More v+Wednesday, April 29th ~Lauren Hudson 12:58 PMHi team, I'm trying to set up auto detect for LesMills, but when I add a new playbook (because theywant it applied across all teams), it's pulling in a loadof activity types that l am not able to delete. Anyideas please?Screenshot 2026-04-29 at 10.54.42.png •1 reply 12 days agoLauren Hudson 6:17 PMHello, request from Norstella to update theirJiminny to match this shared spreadsheet. lssomebody from the support team able to helpplease?Numbers Document -Message #support+Аа...
|
Code
|
Review rate limit handli… — app
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
SlackFileEditViewGoHistoryWindowHelpAPPDOCKER₴81DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prope* @return array The search response with'results''total''paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lirHomeDMsActivityFilesLater..•More• [Platform) Refinemen…. 24 m left100% <78• Mon 11 May 16:36:04ED→Describe what you are looking forJiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of jimi...^ Direct messagesP. Aneliya Angelova®. Galya DimitrovaPetko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev• VesE Lukas Kovalik y... 0::: AppsS Jira CloudToastGanala Cala# support8 346 0MessagesC FilesBookmarksKara Joneswas added .Tuesday, April 28th~More v+Wednesday, April 29th ~Lauren Hudson 12:58 PMHi team, I'm trying to set up auto detect for LesMills, but when I add a new playbook (because theywant it applied across all teams), it's pulling in a loadof activity types that l am not able to delete. Anyideas please?Screenshot 2026-04-29 at 10.54.42.png •1 reply 12 days agoLauren Hudson 6:17 PMHello, request from Norstella to update theirJiminny to match this shared spreadsheet. lssomebody from the support team able to helpplease?Numbers Document -Message #support+Аа...
|
Code
|
Client.php — app — 9 problems in this file
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
Run and Debug (⇧⌘D)
SlackFileEditViewGoHistoryWindowHelpAPPDOCKER0 ₴1DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prop‹* @return array The search response with'results''total'*'paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lirHomeDMsActivityFilesLater..•More(abl•[Platform) Refinemen... now100%8• Mon 11 May 16:05:01→[Platform] Refinement •now - 16:00-17:00Ci Join Google MeetJiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of_jimi...• MessagesC FilesBookmarks+^ Direct messagesP. Aneliya Angelova®. Galya DimitrovaPetko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev• VesE Lukas Kovalik y... 020832): RemoveToday ~ ok 4.0 and 4.1 models(#506) (steliyan-g)NewGitHub APP 2:13 PM9 new commits pushed to master by ilian-jiminny861859d2 - Stop sending signals to datadog forincresed/ decreased workers.23ad2bb1 - Drop usage of RunForSeconds.3aedca4a - Generage Workerld on the fly.86cb0855 - ignore-missing-request-id-on-transcript Handle wrong format for assemblytranscriptions, this happends for old activitiesthat had a different provider and were updatedto point to Assembly when the providers weredeleted49ca4dfe - ignore-missing-request-id-on-transcript added log importShow morejiminny/app| Added by GitHubCircleCI APP 2:38 PMDeployment Successful!Project: appWhen:05/11/202611:38:14Tag:View JobE:: AppsS Jira CloudToastGanala Cala112Message #releases+..•...
|
Code
|
✻ [Claude Code] MatchActivityCrmDataTest.php (Matc ✻ [Claude Code] MatchActivityCrmDataTest.php (MatchActivityCrmDataTest.php) — app...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
12
"Platrorm Kerlnemen… now100% L2Mon 11 May 16:03:40*EXPLORERV APPV appy JobsVCrm** MatchActivitiesToNewOpportunity.phpMatchActivityCrmData.php** NoteObject.php## SaveActivity.phpwSavelranscription.php## SetupLayout.php#R SyncActivity.phpwSvncFieldMetadata.pho1A SyncHubspotObjects.php#Suncleads.ono# SyncObjects.php#SvncOpoortunities.lob.ohv# SyncOpportunity.php## SyncProfileMetadata.php1# SyncTeamFieldsJob.php# SvncTeamMetadata.ohnR UpdateOpportunitySpecifications.php*R UpdateStage.php> DealRisks> MeetingBotV Middlowaro*R HandleHubspotRateLimit.phpR RateLimited.php> Streaming> Team> Telephony> User«* BaseProcessingJob.php* DummyJob.php14 ImportRecallAlRecordingsJob.php14 ImportRemoteTrackJob.php# .Job.ohp#A JobDispatcher.php#.JobDispatcherinterface.ohv# [EMAIL]# SasVisibilitvControl.nhn> Listeners> MailActivitvi> Ai> CalendarPa JY-20725-handle-HS-search-rate-limit* ©О6Д11О2*Claude Code MatchActivityCrmDatalest.php (MatchActivityCrmDatalest.php—•2 C3IA RateLimitException.phpR * [Claude Code] MatchActivityCrmDataTest.php 1 X# Clclassinny > app › tests › Unit > Jobs › Crm > 9 MatchActivityCrmDataTest.php > 4 MatchActivityCrmDataTest › ® testJobConfiguration()• Matchactiv1tycrmDataTest extends Testcaseclass MatchActivityermDatarest extends Testcasepublic tunction testSkipscrmcontiqurationSw1tchwhenAlreadyMatches: vo1123public tunction testSkipscrmcont1qurat1onSw1tchwhenALreadyMatches: vo1dSthis→>connection->expects(Sthis->once152Sthis->connection->expects(Sthis->once→ = Review rate limit handli..Review rate limit handling for HubSpot j...->wilReturncallback(function Scallback) «return $callback();7);Sob = new Matchactiv1tycrmbatafromConfiguration: $fromConfiguration,sjob->handle(crnAcelvlcyservice: sthis->ernAceivicyservice,connection: Sth1s->connection,oublic function testlobConfiaurationo: voidi$job = new MatchActivityCrmData(fromConfiquration: null.Sthis->assertSame(3, Sjob->tries);Sthis->assertSame([30, 90, 180], Sjob->backoff()):public function testUniqueId(): voidSconfauration = sthis-screateMockconfauration:.class).$configuration->method ( 'getId')->willReturn(5);sjob1 = new MatchActivityCrmData(activityId: 123,fromConfiguration: null,remoteSearch: false,Dinis-sascertSone(1123-0-10cal0, Stobl-sunioueTd0).$job2 = new MatchActivityCrmData(activityid:fromConfiguration: null,remotesearch: true.sthis-sassertSame('123:0:remote'. Siob-suniqueldo):Cioh? = new MatchActivitvGrmDataifromConfiguration: $configuration,sthis-sascertSame ('123:5:local'. Siob3->uniqueid0));184-wil ReturnCallback function Scallback)return $callback();7);S1ob = new Matchactiv1tycrmbatal+romcon iguration: stronconfiguration,$job→>harcrnAculvicyservice: schis→>crAceivicyservice,connection: sthis->connect1on,oublic function testlobConfiaurationo: void$job = new MatchActivityCrmData(activityId: 1,fromConfiguration: null,sthis assert5ame (30, s-1)(300, sjob→>timsthis->assertSame(7260, $job->uniqueFor());sthis->assertSame( [30, 90, 180], Sjob->backoff();public function testUniqueld(): voidSconfiauration = sthis-screateMock confaurat ion:cllass):$configuration->method('getId')->willReturn(5);$job1 = new MatchActivityCrmData(fromConfiguration: null,Dints-snssertSame/1123-0:1ocal0 stobl-sunioueTd0)).$job2 =MatchActivityCrmData(activityId: 123,fromConfiguration: null,true.sthis-sassertSame "123:0:remote'. Siob2-suniqueldon:Cioh? = new MatchActivitvGrmlatalactivitvtd: 122.fromConficuration, SconfiaurationSthis->assertSame('123:5:local'. Siob3->uniqueldO):</> HandleHubspotRateLimit..fix @tests/Unit/Services/Crm/Hubspot/Pagination/HubspotPaginationServiceTest.php and @tests/Unit/Jobs/Crm/MatchActivityCrmDataTest.phpwhich php8.2 php8.1 php82 php81 2>/dev/null; ls /usr/local/bin/php* 2>/dev/null; ls /opt/homebrew/bin/php* 2>/dev/nullOUT Exit code 1phpo.z not toundphpo.1 not tound• Bash Find how tests are run in this projectcat /Users/lukas/jiminny/app/sail 2>/dev/null | head -5; ls /Users/lukas/jiminny/app/*.sh 2>/dev/null; cat /Users/lukas/jiminny/app/Makefile 2>/dev/null | grep -j "phpunit\|test" | hOUT (eval):1: no matches found: /Users/lukas/liminny/app/*.sh• Can't run tests directlv — no PHP in PATH. I'l reason from the code and make the fixes.What's broken:1. MatchActivityCrmDataTest::testJobConfiguration - uniqueFor() now returns 7260 (120x60+60), test expects 3602. MatchActivitvCrmDataTest::testFailedMethodLoosError — failedo now calls Sentry:: captureExcentiono. not mocked3. All HubspotPaginationServiceTest tests - mock hubspotClientMock->request() through an getInstance()->getClient() chain, but since Client is a full mock via createMock(), search() isalso mocked and returns [1 by default, so request() is never reached• Edit MatchActivitvCrmDataTest.ohopublic function testjobconfiguration): voidSjob =new MatchActivityCrmData(acuivityso.fromConfiguration: null,remoresearch: Talse.public function testJobConfiguration): voidsjob = newMatchActivityCrmData(fromConfiguration: null,remoresearch: ralsesthis-sassertSame(3. Siob->tries:sthis-sascertSame 300. Siohsstimeouton+sthis->assertSame(3. Siob->tries):sthis-sascertSame 300. Siohsstimeouton:Make this edit to MatchActivityCrmDataTest.php?7Yes. alllow all edits this seccion3 NoTall Alaudh what te de tnstnadEsc to canceLn 180, Col 1 Spaces: 4 UTF-8 LF PHP 8 SignIn 8.3 P 0...
|
Code
|
✻ [Claude Code] MatchActivityCrmDataTest.php (Matc ✻ [Claude Code] MatchActivityCrmDataTest.php (MatchActivityCrmDataTest.php) — app...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 12 pending changes
SlackFileEditViewGoHistoryWindowHelpAPPDOCKER0 ₴1DEV (docker)₴2APP (-zsh)-zsh* @param string SobjectType The object type('deals''companies''contacts''cal)* @param array<string, mixed>Spayload Thesearch payload with filters, sorts, prope* @return array The search response with'results''total'*'paging'keys* @throws RateLimitException When rate limit is hit* @throws HubspotException On API errors** @return array The search response with 'results', 'total', 'paging' keys*/public function search(string SobjectType, array Spayload): arrayend diff4) app/Console/Commands/JiminnyDebugCommand.php (statement_indentation)begin diff --/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php+++/home/jiminny/app/Console/Commands/JiminnyDebugCommand.php-359,11+359,11 @ScrmService = ScrmResolver->prepareCrmService);-//-/1for ($i = 0; si < 3; Si++) {if (Si % 250) {Sthis->info("Syncing opportunity {Si}");Sthis->info("Matching contact {$i}");-1/-1/.+++++ScrmService->syncOpportunity('374720564');if ($i % 25=0{//Sthis->info("Syncing opportunity {$i}");Sthis->info("Matching contact {$i}");////}ScrmService->syncOpportunity('374720564');$crmService->matchByName('Robot');end diffFixed 4 of 5666 files in 146.870 seconds, 60.00 MB memory usedWhat's next:Try Docker Debug for seamless, persistent debugging tools in any containeror image →Learn more at https://docs.docker.com/go/debug-cli/lukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/app (JY-20725-handle-HS-search-rate-lirHomeDMsActivityFilesLater..•More• [Platform) Refinemen…. 24 m left100% <78• Mon 11 May 16:36:49ED→Describe what you are looking forJiminny ...# contusion-clinic# curiosity_lab# engineering# general# jiminny-bg# platform-tickets# product_launches# random# releases# sofia-office# support# thank-yous# the_people_of jimi...^ Direct messagesP. Aneliya Angelova®. Galya DimitrovaPetko Kashinski&. Stefka StoyanovaVasil Vasilev&. Nikolay IvanovAneliya Angelova, ...Stoyan Tanev• VesE Lukas Kovalik y... 0::: AppsS Jira CloudToastGanala Cala# support8 346 0MessagesC FilesBookmarksKara Joneswas added .Tuesday, April 28th~More v+Wednesday, April 29th ~Lauren Hudson 12:58 PMHi team, I'm trying to set up auto detect for LesMills, but when I add a new playbook (because theywant it applied across all teams), it's pulling in a loadof activity types that l am not able to delete. Anyideas please?Screenshot 2026-04-29 at 10.54.42.png •1 reply 12 days agoLauren Hudson 6:17 PMHello, request from Norstella to update theirJiminny to match this shared spreadsheet. lssomebody from the support team able to helpplease?Numbers Document -Message #support+Аа...
|
Code
|
Client.php — app — 9 problems in this file
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
iTerm2ShellEditViewSessionScriptsProfilesWindowHelpallSupport Daily • 2 m left100% <78• Tue 12 May 15:13:49DOCKERDOCKER (-zsh)STAGE (ssh)₴81DEV (-zsh)O $2APP (-zsh)*3ec2-user@ip-10-30-129-.84T2PROD (-zsh)181screenpipe*kibanaasticsearch"{"type" : "log""@timestamp": "2026-05-11T19:54:53Z","tags" : ["warning""el,"data"], "pid" :7,'"message": "Unabletorevive connection: [URL_WITH_CREDENTIALS] : "2026-05-11T19:54:53Z",asticsearch", "data"],"pid":7,"message": "No livingconnections"}"warning","elkibanaI {"type" : "log", "@timestamp" : "2026-05-11T19:54:53Z": ["warning"ugins""licensing"], "pid" :7,"message" : "License informationcould not be obtainedasticsearch due to Error: No Livingconnectionskibana1 {"type" : "log","@timestamp" : "2026-05-11T19:54:54Z","tags" : ["error"ticsearch", "data"], "pid":7, "message" : "[ConnectionError]: getaddrinfo ENOTFOUND elasticsearch elasticsearch: 9200"}kibana1 {"type" : "log","@timestamp": "2026-05-11T19:54:54Z", "tags" : ["warning", "elasticsearch", "data"], "pid" :7,"message":"Unable to reviveconnection: [URL_WITH_CREDENTIALS] "2026-05-11T19:54:54Z","tags" : ["warning", "elasticsearch", "data"],"pid":7,"message"• "No livingconnections"}kibana1 {"type": "log","@timestamp": "2026-05-11T19:54:54Z""tags" : ["error", "plugins", "taskManager","taskManager"], "pid" :7, "message": "Failed to poll for work: Error: NoLiving connections"}kibana1 {"type": "log", "@timestamp": "2026-05-11T19:54:572", "tags" : ["error", "elasticsearch".,"data"], "pid" :7, "message" :"[ConnectionError]: getaddrinfo ENOTFOUND elasticsearch elasticsearch:9200*}kibana1 {"type" : "Log","@timestamp" : "2026-05-11T19:54:57Z","tags" : ["warning","elasticsearch", "data"], "pid" :7, "message": "Unable to revive connection: [URL_WITH_CREDENTIALS] : "2026-05-11T19:54:57Z", "tags" : ["warning", "elasticsearch", "data"], "pid" :7, "message": "No living connections"}kibanaI {"type" : "log", "@timestamp" : "2026-05-11T19:54:57Z", "tags" : ["error","plugins", "taskManager", "taskManager"], "pid" :7, "message" : "Failed to pollfor work: Error: NoLiving connections"}kibanaI {"type" : "log", "@timestamp" : "2026-05-11T19:54:59Z","tags" : ["error", "elasticsearch", "data"], "pid" :7, "message" : "[ConnectionError]: getaddrinfo ENOTFOUND elasticsearch elasticsearch:9200"}1 {"type" : "log","@timestamp": "2026-05-11T19:55: 00Z""tags": ["warning"asticsearch", "data"], "pid" :7, "message": "Unable to revive connection: [URL_WITH_CREDENTIALS] : "2026-05-11T19:55 :00Z", "tags" : ["warning", "elasticsearch", "data"], "pid" :7, "message": "No living connections"}1 {"type": "log", "@timestamp" : "2026-05-11T19:55:00Z","tags" : ["error", "plug, "taskManager",, "taskManager"], "pid" :7, "message": "Failed to poll for work: Error: NoLiving connections"}unexpected EOFkas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~/jiminny/infrastructure/dev/docker (develop) $-zshX5O ₴6-zsh*** System restart required ***Last login: Thu May7 08:01:13 2026 from 212.5.153.87lukas@jiminny-prod-bastion:~$lukas@jiminny-prod-bastion:~$ client_loop: send disconnect: Broken pipeukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ ||T3 EU (ssh)New release'24.04.4 LTS' available.Run'do-release-upgrade'to upgrade to it.*** System restart required ***Last login: Wed Apr 22 08:09:46 2026 from 212.5.153.87lukas@jiminny-eu-bastion:~$ |T4STAGE (ssh)Run 'do-release-upgrade' to upgrade to it.System restart required ***Last login: Thu May7 11:01:47 2026 from 212.5.153.87bastion:-$QA (-zsh)Poetry could not find a pyproject.toml file in /Users/lukas or its parentsPoetry could not find a pyproject.toml file in /Users/lukas or its parentsT6FE (-zsh)Poetry could not find a pyproject.toml file in /Users/lukas or its parentsPRODSTAGEFRONTENDPoetry could not find a pyproject.toml file in /Users/lukas or its parentsLukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ |17 EXT (-zsh)Poetry could not find a pyproject.toml file in /Users/lukas or its parentsPoetry could not find a pyproject.toml file in /Users/lukas or its parentsukas@Lukas-Kovaliks-MacBook-Pro-Jiminny ~ $ lEXTENSION...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_src_role ON elements(frame_id, source, role) WHERE text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_onscreen_frame ON elements(frame_id) WHERE on_screen = 1 AND text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_timestamp ON ui_events(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_app_name ON ui_events(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_frame_id ON ui_events(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_ocr_text_frame_id ON ocr_text(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_meetings_start ON meetings(meeting_start);
CREATE INDEX IF NOT EXISTS nas.idx_video_chunks_device ON video_chunks(device_name);
-- audio
CREATE INDEX IF NOT EXISTS nas.idx_audio_chunks_timestamp ON audio_chunks(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_chunk_id ON audio_transcriptions(audio_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_timestamp ON audio_transcriptions(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_speaker ON audio_transcriptions(speaker_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS nas.idx_speaker_emb_speaker_id ON speaker_embeddings(speaker_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_tags_chunk_id ON audio_tags(audio_chunk_id);
DETACH nas;
"
run_sqlite_heredoc "creating FTS tables" "
ATTACH '$NAS_DB' AS nas;
CREATE VIRTUAL TABLE IF NOT EXISTS nas.elements_fts USING fts5(
text, role, frame_id UNINDEXED,
content='elements', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.frames_fts USING fts5(
full_text, app_name, window_name, browser_url, id UNINDEXED,
tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.ui_events_fts USING fts5(
text_content, app_name, window_title, element_name,
content='ui_events', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.audio_transcriptions_fts USING fts5(
transcription, device, speaker_id UNINDEXED, id UNINDEXED,
tokenize='unicode61'
);
DETACH nas;
"
# ─── BUILD EXPLICIT COLUMN LISTS ──────────────────────────────────────────
# Source columns + install_id appended. Same on both sides of the INSERT.
FRAMES_COLS=$(build_col_list frames)
ELEMENTS_COLS=$(build_col_list elements)
ELEMENTS_COLS_E=$(build_col_list elements e)
UI_EVENTS_COLS=$(build_col_list ui_events)
OCR_TEXT_COLS=$(build_col_list ocr_text)
OCR_TEXT_COLS_O=$(build_col_list ocr_text o)
VIDEO_CHUNKS_COLS=$(build_col_list video_chunks)
MEETINGS_COLS=$(build_col_list meetings)
ACHUNKS_COLS=$(build_col_list audio_chunks)
ATRANS_COLS=$(build_col_list audio_transcriptions)
ATRANS_COLS_T=$(build_col_list audio_transcriptions t)
SPEAKERS_COLS=$(build_col_list speakers)
SEMB_COLS=$(build_col_list speaker_embeddings)
ATAGS_COLS=$(build_col_list audio_tags)
ATAGS_COLS_AT=$(build_col_list audio_tags at)
# ─── SYNC VISION DATA ─────────────────────────────────────────────────────
step "Syncing vision data for $TARGET_DATE"
run_sqlite_heredoc "video_chunks" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.video_chunks ($VIDEO_CHUNKS_COLS, install_id)
SELECT $VIDEO_CHUNKS_COLS, '$INSTALL_ID' FROM main.video_chunks
WHERE id IN (
SELECT DISTINCT video_chunk_id FROM main.frames
WHERE date(timestamp) = '$TARGET_DATE' AND video_chunk_id IS NOT NULL
);
DETACH nas;
"
run_sqlite_heredoc "frames ($SRC_FRAMES rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames ($FRAMES_COLS, install_id)
SELECT $FRAMES_COLS, '$INSTALL_ID' FROM main.frames WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ocr_text ($SRC_OCR rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ocr_text ($OCR_TEXT_COLS, install_id)
SELECT $OCR_TEXT_COLS_O, '$INSTALL_ID' FROM main.ocr_text o
JOIN main.frames f ON o.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ui_events ($SRC_UI rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events ($UI_EVENTS_COLS, install_id)
SELECT $UI_EVENTS_COLS, '$INSTALL_ID' FROM main.ui_events WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "elements ($SRC_ELEMENTS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements ($ELEMENTS_COLS, install_id)
SELECT $ELEMENTS_COLS_E, '$INSTALL_ID' FROM main.elements e
JOIN main.frames f ON e.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "meetings ($SRC_MEETINGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.meetings ($MEETINGS_COLS, install_id)
SELECT $MEETINGS_COLS, '$INSTALL_ID' FROM main.meetings WHERE date(meeting_start) = '$TARGET_DATE';
DETACH nas;
"
# ─── SYNC AUDIO DATA ──────────────────────────────────────────────────────
step "Syncing audio data for $TARGET_DATE"
# Speakers + embeddings are install-global, not per-date. Sync everything
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_e...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_src_role ON elements(frame_id, source, role) WHERE text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_onscreen_frame ON elements(frame_id) WHERE on_screen = 1 AND text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_timestamp ON ui_events(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_app_name ON ui_events(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_frame_id ON ui_events(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_ocr_text_frame_id ON ocr_text(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_meetings_start ON meetings(meeting_start);
CREATE INDEX IF NOT EXISTS nas.idx_video_chunks_device ON video_chunks(device_name);
-- audio
CREATE INDEX IF NOT EXISTS nas.idx_audio_chunks_timestamp ON audio_chunks(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_chunk_id ON audio_transcriptions(audio_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_timestamp ON audio_transcriptions(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_speaker ON audio_transcriptions(speaker_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS nas.idx_speaker_emb_speaker_id ON speaker_embeddings(speaker_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_tags_chunk_id ON audio_tags(audio_chunk_id);
DETACH nas;
"
run_sqlite_heredoc "creating FTS tables" "
ATTACH '$NAS_DB' AS nas;
CREATE VIRTUAL TABLE IF NOT EXISTS nas.elements_fts USING fts5(
text, role, frame_id UNINDEXED,
content='elements', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.frames_fts USING fts5(
full_text, app_name, window_name, browser_url, id UNINDEXED,
tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.ui_events_fts USING fts5(
text_content, app_name, window_title, element_name,
content='ui_events', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.audio_transcriptions_fts USING fts5(
transcription, device, speaker_id UNINDEXED, id UNINDEXED,
tokenize='unicode61'
);
DETACH nas;
"
# ─── BUILD EXPLICIT COLUMN LISTS ──────────────────────────────────────────
# Source columns + install_id appended. Same on both sides of the INSERT.
FRAMES_COLS=$(build_col_list frames)
ELEMENTS_COLS=$(build_col_list elements)
ELEMENTS_COLS_E=$(build_col_list elements e)
UI_EVENTS_COLS=$(build_col_list ui_events)
OCR_TEXT_COLS=$(build_col_list ocr_text)
OCR_TEXT_COLS_O=$(build_col_list ocr_text o)
VIDEO_CHUNKS_COLS=$(build_col_list video_chunks)
MEETINGS_COLS=$(build_col_list meetings)
ACHUNKS_COLS=$(build_col_list audio_chunks)
ATRANS_COLS=$(build_col_list audio_transcriptions)
ATRANS_COLS_T=$(build_col_list audio_transcriptions t)
SPEAKERS_COLS=$(build_col_list speakers)
SEMB_COLS=$(build_col_list speaker_embeddings)
ATAGS_COLS=$(build_col_list audio_tags)
ATAGS_COLS_AT=$(build_col_list audio_tags at)
# ─── SYNC VISION DATA ─────────────────────────────────────────────────────
step "Syncing vision data for $TARGET_DATE"
run_sqlite_heredoc "video_chunks" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.video_chunks ($VIDEO_CHUNKS_COLS, install_id)
SELECT $VIDEO_CHUNKS_COLS, '$INSTALL_ID' FROM main.video_chunks
WHERE id IN (
SELECT DISTINCT video_chunk_id FROM main.frames
WHERE date(timestamp) = '$TARGET_DATE' AND video_chunk_id IS NOT NULL
);
DETACH nas;
"
run_sqlite_heredoc "frames ($SRC_FRAMES rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames ($FRAMES_COLS, install_id)
SELECT $FRAMES_COLS, '$INSTALL_ID' FROM main.frames WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ocr_text ($SRC_OCR rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ocr_text ($OCR_TEXT_COLS, install_id)
SELECT $OCR_TEXT_COLS_O, '$INSTALL_ID' FROM main.ocr_text o
JOIN main.frames f ON o.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ui_events ($SRC_UI rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events ($UI_EVENTS_COLS, install_id)
SELECT $UI_EVENTS_COLS, '$INSTALL_ID' FROM main.ui_events WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "elements ($SRC_ELEMENTS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements ($ELEMENTS_COLS, install_id)
SELECT $ELEMENTS_COLS_E, '$INSTALL_ID' FROM main.elements e
JOIN main.frames f ON e.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "meetings ($SRC_MEETINGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.meetings ($MEETINGS_COLS, install_id)
SELECT $MEETINGS_COLS, '$INSTALL_ID' FROM main.meetings WHERE date(meeting_start) = '$TARGET_DATE';
DETACH nas;
"
# ─── SYNC AUDIO DATA ──────────────────────────────────────────────────────
step "Syncing audio data for $TARGET_DATE"
# Speakers + embeddings are install-global, not per-date. Sync everything
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_e...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_src_role ON elements(frame_id, source, role) WHERE text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_onscreen_frame ON elements(frame_id) WHERE on_screen = 1 AND text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_timestamp ON ui_events(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_app_name ON ui_events(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_frame_id ON ui_events(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_ocr_text_frame_id ON ocr_text(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_meetings_start ON meetings(meeting_start);
CREATE INDEX IF NOT EXISTS nas.idx_video_chunks_device ON video_chunks(device_name);
-- audio
CREATE INDEX IF NOT EXISTS nas.idx_audio_chunks_timestamp ON audio_chunks(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_chunk_id ON audio_transcriptions(audio_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_timestamp ON audio_transcriptions(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_speaker ON audio_transcriptions(speaker_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS nas.idx_speaker_emb_speaker_id ON speaker_embeddings(speaker_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_tags_chunk_id ON audio_tags(audio_chunk_id);
DETACH nas;
"
run_sqlite_heredoc "creating FTS tables" "
ATTACH '$NAS_DB' AS nas;
CREATE VIRTUAL TABLE IF NOT EXISTS nas.elements_fts USING fts5(
text, role, frame_id UNINDEXED,
content='elements', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.frames_fts USING fts5(
full_text, app_name, window_name, browser_url, id UNINDEXED,
tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.ui_events_fts USING fts5(
text_content, app_name, window_title, element_name,
content='ui_events', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.audio_transcriptions_fts USING fts5(
transcription, device, speaker_id UNINDEXED, id UNINDEXED,
tokenize='unicode61'
);
DETACH nas;
"
# ─── BUILD EXPLICIT COLUMN LISTS ──────────────────────────────────────────
# Source columns + install_id appended. Same on both sides of the INSERT.
FRAMES_COLS=$(build_col_list frames)
ELEMENTS_COLS=$(build_col_list elements)
ELEMENTS_COLS_E=$(build_col_list elements e)
UI_EVENTS_COLS=$(build_col_list ui_events)
OCR_TEXT_COLS=$(build_col_list ocr_text)
OCR_TEXT_COLS_O=$(build_col_list ocr_text o)
VIDEO_CHUNKS_COLS=$(build_col_list video_chunks)
MEETINGS_COLS=$(build_col_list meetings)
ACHUNKS_COLS=$(build_col_list audio_chunks)
ATRANS_COLS=$(build_col_list audio_transcriptions)
ATRANS_COLS_T=$(build_col_list audio_transcriptions t)
SPEAKERS_COLS=$(build_col_list speakers)
SEMB_COLS=$(build_col_list speaker_embeddings)
ATAGS_COLS=$(build_col_list audio_tags)
ATAGS_COLS_AT=$(build_col_list audio_tags at)
# ─── SYNC VISION DATA ─────────────────────────────────────────────────────
step "Syncing vision data for $TARGET_DATE"
run_sqlite_heredoc "video_chunks" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.video_chunks ($VIDEO_CHUNKS_COLS, install_id)
SELECT $VIDEO_CHUNKS_COLS, '$INSTALL_ID' FROM main.video_chunks
WHERE id IN (
SELECT DISTINCT video_chunk_id FROM main.frames
WHERE date(timestamp) = '$TARGET_DATE' AND video_chunk_id IS NOT NULL
);
DETACH nas;
"
run_sqlite_heredoc "frames ($SRC_FRAMES rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames ($FRAMES_COLS, install_id)
SELECT $FRAMES_COLS, '$INSTALL_ID' FROM main.frames WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ocr_text ($SRC_OCR rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ocr_text ($OCR_TEXT_COLS, install_id)
SELECT $OCR_TEXT_COLS_O, '$INSTALL_ID' FROM main.ocr_text o
JOIN main.frames f ON o.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ui_events ($SRC_UI rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events ($UI_EVENTS_COLS, install_id)
SELECT $UI_EVENTS_COLS, '$INSTALL_ID' FROM main.ui_events WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "elements ($SRC_ELEMENTS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements ($ELEMENTS_COLS, install_id)
SELECT $ELEMENTS_COLS_E, '$INSTALL_ID' FROM main.elements e
JOIN main.frames f ON e.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "meetings ($SRC_MEETINGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.meetings ($MEETINGS_COLS, install_id)
SELECT $MEETINGS_COLS, '$INSTALL_ID' FROM main.meetings WHERE date(meeting_start) = '$TARGET_DATE';
DETACH nas;
"
# ─── SYNC AUDIO DATA ──────────────────────────────────────────────────────
step "Syncing audio data for $TARGET_DATE"
# Speakers + embeddings are install-global, not per-date. Sync everything
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_e...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: Inline Suggestions, next: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 674, Col 18
git-commit Not Committed Yet, Git Blame Information
Not Committed Yet
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: Inline Suggestions, next: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 674, Col 18
git-commit Not Committed Yet, Git Blame Information
Not Committed Yet
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: Inline Suggestions, next: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 674, Col 18
git-commit Not Committed Yet, Git Blame Information
Not Committed Yet
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 675, Col 1
git-commit Not Committed Yet, Git Blame Information
Not Committed Yet
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 675, Col 1
git-commit Not Committed Yet, Git Blame Information
Not Committed Yet
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 675, Col 1
git-commit Not Committed Yet, Git Blame Information
Not Committed Yet
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
Remote - SSH - Output
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 675, Col 1
Info: Downloading VS Code Server...
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update, 1 requires restart
3
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas] — Modif screenpipe_sync.sh — screenpipe [SSH: nas] — Modified...
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: finance [SSH: nas]
Explorer Section: finance [SSH: nas]
FINANCE [SSH: NAS]
auth
dsk-uploader
finance-hub
backend
frontend
mcp
scripts
.env
.env.example
M
.gitignore
M
docker-compose.yml
M
README.md
sms_export.json
payments-logger
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
docker-compose.yml, Editor Group 1
.env, Editor Group 1
.env.example, preview, Editor Group 1
report(1).csv, Editor Group 1
report(2).csv, Editor Group 1
sms_export.json, Editor Group 1
# ── Database ───────────────────────────────────────────────────────────────────
[ENV_SECRET]
# ── Notifier service ──────────────────────────────────────────────────────────
NOTIFIER_URL=[URL_WITH_CREDENTIALS]
# Secret key — generate with: openssl rand -base64 36
[ENV_SECRET]
# Public URL authentik is reachable at (used for redirects and the outpost)
AUTHENTIK_HOST=[URL_WITH_CREDENTIALS] version 2>/dev/null && node --version
Copy code to clipboard
OUT
Exit code 127
Bash
Verify public path configuration
IN
# Check if the upload endpoints are really public
grep -n 'PUBLIC_PATHS\|upload\|ingest' /volume2/docker/finance/finance-hub/backend/src/auth.js
Copy code to clipboard
OUT
<tool_use_error>Cancelled: parallel tool call Bash(npm view @modelcontextprotocol/sdk versi…) errored</tool_use_error>
Bash
Check node version and MCP SDK latest
IN
node --version 2>/dev/null && python3 -c "import urllib.request; import json; d=json.loads(urllib.request.urlopen('[URL_WITH_CREDENTIALS] "^1.29.0"
},
"engines": {
"node": ">=18"
}
}
Write
server.js
server.js
236 lines
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import {
CallToolRequestSchema,
ListToolsRequestSchema,
} from '@modelcontextprotocol/sdk/types.js';
const BASE_URL = (process.env.FINANCE_HUB_URL || '[URL_WITH_CREDENTIALS] '^1.29.0'}
Thinking
Thinking
Good. Now let me test the MCP server can at least start and call the health check:
Bash
Smoke test MCP server startup
IN
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | timeout 5 node server.js 2>/dev/null | head -3 || echo "NODE test (expected timeout on stdin wait)"
Copy code to clipboard
OUT
{"result":{"tools":[{"name":"list_transactions","description":"List finance transactions with optional filters. Returns paginated results including total amount for the filtered set.","inputSchema":{"type":"object","properties":{"page":{"type":"number","description":"Page number (default 1)"},"limit":{"type":"number","description":"Results per page, max 200 (default 50)"},"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD (inclusive)"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD (inclusive)"},"tag":{"type":"string","description":"Filter by tag name"},"recipient":{"type":"string","description":"Substring match on payee name"},"type":{"type":"string","description":"Transaction type: POS | ATM | WALLET"},"source":{"type":"string","description":"Import source: INGEST | UPLOAD"},"search":{"type":"string","description":"Full-text search across rawMessage and recipient"},"hideBalanceAlerts":{"type":"boolean","description":"Exclude balance-notification SMS (default false)"},"sortBy":{"type":"string","description":"Sort field: date | amount | recipient | createdAt"},"sortDir":{"type":"string","description":"asc or desc (default desc)"}}}},{"name":"spending_by_tag","description":"Aggregate spending totals by tag for a date range. Fetches up to 200 transactions and groups them. Good for monthly summaries.","inputSchema":{"type":"object","properties":{"dateFrom":{"type":"string","description":"Start date YYYY-MM-DD"},"dateTo":{"type":"string","description":"End date YYYY-MM-DD"}}}},{"name":"get_transaction","description":"Get a single transaction by its numeric ID.","inputSchema":{"type":"object","required":["id"],"properties":{"id":{"type":"number","description":"Transaction import ID"}}}},{"name":"list_tags","description":"List all available tags with their colors.","inputSchema":{"type":"object","properties":{}}},{"name":"add_tag","description":"Add a tag to a transaction. Creates the tag globally if it does not exist.","inputSchema":{"type":"object","required":["id","name"],"properties":{"id":{"type":"number","description":"Transaction ID"},"name":{"type":"string","description":"Tag name (e.g. \"Groceries\")"},"color":{"type":"string","description":"Hex color (e.g. \"#22c55e\"). Optional — auto-assigned if omitted."}}}},{"name":"remove_tag","description":"Remove a tag from a transaction.","inputSchema":{"type":"object","required":["id","tagId"],"properties":{"id":{"type":"number","description":"Transaction ID"},"tagId":{"type":"number","description":"Tag ID (from list_tags or the transaction object)"}}}},{"name":"health_check","description":"Check Finance Hub backend connectivity and database status.","inputSchema":{"type":"object","properties":{}}}]},"jsonrpc":"2.0","id":1}
Thinking
Thinking
MCP server responds correctly. Now test an actual tool call:
Bash
Test MCP health_check tool against live backend
IN
\
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"health_check","arguments":{}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 5 node server.js 2>/dev/null | tail -1 | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'])"
Copy code to clipboard
OUT
{
"status": "ok",
"timestamp": "2026-05-09T16:19:09.977Z",
"storage": {
"type": "PostgreSQL",
"host": "db",
"database": "finance_hub"
}
}
Bash
Rebuild backend with API key auth changes
IN
# Test the API key auth directly against the backend (backend still has DEV_BYPASS_AUTH=true, need to restart first)
# First rebuild with the new code
cd /volume2/docker/finance/finance-hub && docker compose up -d --build backend 2>&1 | tail -8
Copy code to clipboard
OUT
time="2026-05-09T19:19:13+03:00" level=warning msg="/volume2/docker/finance/finance-hub/docker-compose.yml: `version` is obsolete"
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.config-hash%22%3Atrue%2C%22com.docker.compose.project%3Dfinance-hub%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
Bash
Rebuild backend with sudo
IN
sudo docker compose -f /volume2/docker/finance/finance-hub/docker-compose.yml up -d --build backend 2>&1 | tail -10
Copy code to clipboard
OUT
sudo: a terminal is required to read the password; either use the -S option to read from standard input or configure an askpass helper
sudo: a password is required
Thinking
Thinking
Docker needs sudo+password. Let me verify the auth changes are correct by testing the logic directly:
Bash
Unit test the auth middleware logic
IN
node - << 'EOF'
// Simulate the auth middleware logic
const [ENV_SECRET];
process.env.API_KEY = API_KEY;
const { authentikMiddleware } = require('/volume2/docker/finance/finance-hub/backend/src/auth.js');
function testReq(path, headers = {}) {
const res = { status(code) { this.code = code; return this; }, json(body) { return body; } };
const req = { path, headers };
let passed = false;
authentikMiddleware(req, res, () => { passed = true; });
return { passed, code: res.code, user: req.user };
}
// Public path — no auth needed
let r = testReq('/api/health', {});
console.assert(r.passed, 'health should be public');
console.log('✓ /api/health is public');
// Valid API key
r = testReq('/api/payments', { authorization: `Bearer ${API_KEY}` });
console.assert(r.passed, 'valid key should pass');
console.assert(r.user.username === 'api-key', 'api user');
console.log('✓ valid API key passes');
// Wrong API key
r = testReq('/api/payments', { authorization: 'Bearer wrongkey' });
console.assert(!r.passed, 'wrong key should fail');
console.assert(r.code === 401, '401 on bad key');
console.log('✓ wrong API key returns 401');
// No auth + no bypass + no authentik header
process.env.DEV_BYPASS_AUTH = 'false';
r = testReq('/api/payments', {});
console.assert(!r.passed && r.code === 401, 'no creds → 401');
console.log('✓ no credentials returns 401');
// Upload endpoints no longer public
r = testReq('/api/upload/csv', {});
console.assert(!r.passed && r.code === 401, 'upload/csv requires auth');
console.log('✓ /api/upload/csv now requires auth (was public, now fixed)');
// Authentik header pass-through
r = testReq('/api/payments', { 'x-authentik-username': 'lukastest', 'x-authentik-email': '[EMAIL]' });
console.assert(r.passed && r.user.username === 'lukastest', 'authentik header works');
console.log('✓ Authentik X-authentik-username header works');
console.log('\nAll auth tests passed.');
EOF
Copy code to clipboard
OUT
✓ /api/health is public
✓ valid API key passes
✓ wrong API key returns 401
✓ no credentials returns 401
✓ /api/upload/csv now requires auth (was public, now fixed)
✓ Authentik X-authentik-username header works
All auth tests passed.
All auth cases pass. Now test the MCP spending_by_tag tool against the live backend (with DEV_BYPASS_AUTH still on):
Bash
Test spending_by_tag MCP tool against live backend
IN
cd /volume2/docker/finance/finance-hub/mcp && \
printf '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}\n{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"spending_by_tag","arguments":{"dateFrom":"2026-05-01","dateTo":"2026-05-09"}}}\n' | \
FINANCE_HUB_URL=http://localhost:3001 [ENV_SECRET] \
timeout 8 node server.js 2>/dev/null | grep '"id":2' | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['result']['content'][0]['text'][:500])"
Copy code to clipboard
OUT
{
"period": {
"from": "2026-05-01",
"to": "2026-05-09"
},
"totalTransactions": 36,
"breakdown": [
{
"tag": "Bills",
"total": 117.9,
"count": 6
},
{
"tag": "(untagged)",
"total": 1153.87,
"count": 22
}
]
}
MCP tools work correctly against real data. Let me write the Claude desktop config example and the
.gitignore
for the mcp dir:
Write
claude_desktop_config.example.json
claude_desktop_config.example.json
14 lines
{
"_comment": "Add this block to ~/.config/claude/claude_desktop_config.json (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)",
"mcpServers": {
"finance-hub": {
"command": "node",
"args": ["/volume2/docker/finance/finance-hub/mcp/server.js"],
"env": {
"FINANCE_HUB_URL": "http://localhost:3001",
"FINANCE_HUB_API_KEY": "94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c"
}
}
}
}
Write
.gitignore
.gitignore
2 lines
node_modules/
Update Todos
Fix auth.js: add API key support + remove upload endpoints from PUBLIC_PATHS
Update index.js: add CORS origin restriction via env var
Update .env and .env.example with API_KEY variable
Build MCP server (stdio, Claude desktop compatible)
Test: restart backend, verify API key auth and MCP connection
Now let me save relevant project context to memory:
Read
MEMORY.md
MEMORY.md
Write
MEMORY.md
MEMORY.md
4 lines
# Memory Index
- [project_finance_hub.md](project_finance_hub.md) — Finance Hub architecture, auth design, MCP server, API key setup
Write
project_finance_hub.md
project_finance_hub.md
35 lines
---
name: Finance Hub project state
description: Architecture, auth design, API key, MCP server location and tools for the finance-hub project
type: project
---
Finance Hub is a personal finance tracker at `/volume2/docker/finance/finance-hub/`.
**Running services** (as of 2026-05-09):
- Backend: port 3001 (Express/Prisma, PostgreSQL)
- Frontend: port 5175 (React/Vite)
- Adminer: port 8092
- Authentik: port 9100 (auth.lakylak.xyz, tag 2026.2.1)
**Auth design:**
- Production: NPM (Nginx Proxy Manager) + Authentik forward-auth outpost → injects `X-authentik-username` header
- `DEV_BYPASS_AUTH=true` is currently set in `.env` — app runs without real auth
- API key auth added: `Authorization: Bearer <API_KEY>` bypasses Authentik header check (used by MCP/scripts)
- API_KEY in `.env`: `94bb2917b9a1b9dcbaec49e7009c2a6b222fc310d32f530aa07e411e8d4a058c`
**Public endpoints** (no auth): `/api/health`, `/api/payments/ingest`
**Security fix applied (2026-05-09):** Removed `/api/upload/csv` and `/api/upload/preview` from PUBLIC_PATHS — they were incorrectly public.
**MCP server:** `/volume2/docker/finance/finance-hub/mcp/server.js`
- Stdio transport, works with Claude desktop
- Tools: list_transactions, spending_by_tag, get_transaction, list_tags, add_tag, remove_tag, health_check
- Env: FINANCE_HUB_URL, FINANCE_HUB_API_KEY
- Config example: `mcp/claude_desktop_config.example.json`
**To enable real Authentik auth:** Set `DEV_BYPASS_AUTH=false` in `.env`, configure NPM proxy host for finance-hub w...
|
Code
|
Implement Authentik inte… — finance [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 11 pending changes
11
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X)
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_src_role ON elements(frame_id, source, role) WHERE text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_onscreen_frame ON elements(frame_id) WHERE on_screen = 1 AND text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_timestamp ON ui_events(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_app_name ON ui_events(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_frame_id ON ui_events(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_ocr_text_frame_id ON ocr_text(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_meetings_start ON meetings(meeting_start);
CREATE INDEX IF NOT EXISTS nas.idx_video_chunks_device ON video_chunks(device_name);
-- audio
CREATE INDEX IF NOT EXISTS nas.idx_audio_chunks_timestamp ON audio_chunks(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_chunk_id ON audio_transcriptions(audio_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_timestamp ON audio_transcriptions(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_speaker ON audio_transcriptions(speaker_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS nas.idx_speaker_emb_speaker_id ON speaker_embeddings(speaker_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_tags_chunk_id ON audio_tags(audio_chunk_id);
DETACH nas;
"
run_sqlite_heredoc "creating FTS tables" "
ATTACH '$NAS_DB' AS nas;
CREATE VIRTUAL TABLE IF NOT EXISTS nas.elements_fts USING fts5(
text, role, frame_id UNINDEXED,
content='elements', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.frames_fts USING fts5(
full_text, app_name, window_name, browser_url, id UNINDEXED,
tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.ui_events_fts USING fts5(
text_content, app_name, window_title, element_name,
content='ui_events', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.audio_transcriptions_fts USING fts5(
transcription, device, speaker_id UNINDEXED, id UNINDEXED,
tokenize='unicode61'
);
DETACH nas;
"
# ─── BUILD EXPLICIT COLUMN LISTS ──────────────────────────────────────────
# Source columns + install_id appended. Same on both sides of the INSERT.
FRAMES_COLS=$(build_col_list frames)
ELEMENTS_COLS=$(build_col_list elements)
ELEMENTS_COLS_E=$(build_col_list elements e)
UI_EVENTS_COLS=$(build_col_list ui_events)
OCR_TEXT_COLS=$(build_col_list ocr_text)
OCR_TEXT_COLS_O=$(build_col_list ocr_text o)
VIDEO_CHUNKS_COLS=$(build_col_list video_chunks)
MEETINGS_COLS=$(build_col_list meetings)
ACHUNKS_COLS=$(build_col_list audio_chunks)
ATRANS_COLS=$(build_col_list audio_transcriptions)
ATRANS_COLS_T=$(build_col_list audio_transcriptions t)
SPEAKERS_COLS=$(build_col_list speakers)
SEMB_COLS=$(build_col_list speaker_embeddings)
ATAGS_COLS=$(build_col_list audio_tags)
ATAGS_COLS_AT=$(build_col_list audio_tags at)
# ─── SYNC VISION DATA ─────────────────────────────────────────────────────
step "Syncing vision data for $TARGET_DATE"
run_sqlite_heredoc "video_chunks" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.video_chunks ($VIDEO_CHUNKS_COLS, install_id)
SELECT $VIDEO_CHUNKS_COLS, '$INSTALL_ID' FROM main.video_chunks
WHERE id IN (
SELECT DISTINCT video_chunk_id FROM main.frames
WHERE date(timestamp) = '$TARGET_DATE' AND video_chunk_id IS NOT NULL
);
DETACH nas;
"
run_sqlite_heredoc "frames ($SRC_FRAMES rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames ($FRAMES_COLS, install_id)
SELECT $FRAMES_COLS, '$INSTALL_ID' FROM main.frames WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ocr_text ($SRC_OCR rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ocr_text ($OCR_TEXT_COLS, install_id)
SELECT $OCR_TEXT_COLS_O, '$INSTALL_ID' FROM main.ocr_text o
JOIN main.frames f ON o.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ui_events ($SRC_UI rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events ($UI_EVENTS_COLS, install_id)
SELECT $UI_EVENTS_COLS, '$INSTALL_ID' FROM main.ui_events WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "elements ($SRC_ELEMENTS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements ($ELEMENTS_COLS, install_id)
SELECT $ELEMENTS_COLS_E, '$INSTALL_ID' FROM main.elements e
JOIN main.frames f ON e.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "meetings ($SRC_MEETINGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.meetings ($MEETINGS_COLS, install_id)
SELECT $MEETINGS_COLS, '$INSTALL_ID' FROM main.meetings WHERE date(meeting_start) = '$TARGET_DATE';
DETACH nas;
"
# ─── SYNC AUDIO DATA ──────────────────────────────────────────────────────
step "Syncing audio data for $TARGET_DATE"
# Speakers + embeddings are install-global, not per-date. Sync everything
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
iTerm2ShellEditViewSessionScriptsProfilesWindowHelpDOCKER-rw-r--r=--rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r---rw-r--r--₴811 lukaslukaslukas111lukaslukaslukaslukaslukaslukaslukas1lukaslukaslukaslukaslukaslukaslukas1lukas1lukas1 lukaslukaslukaslukas1lukaslukaslukaslukas1lukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukaslukasstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaffstaff-zshDEV (docker)₴2APP (-zsh)883-zsh|728611May14:52SystemAudio (output)_2026-05-11_11-51-57.mp4462011May14:52462011SystemAudio (output)_2026-05-11_11-52-19.mp4May14:53SystemAudio (output)_2026-05-11_11-52-42.mp4462811May14:53SystemAudio (output)_2026-05-11_11-53-05.mp4462011May14:53462811SystemAudio (output)_2026-05-11_11-53-27.mp4May14:54SystemAudio (output)_2026-05-11_11-53-49.mp4462011May14:54SystemAudio (output)_2026-05-11_11-54-12.mp4462011May14:54SystemAudio (output)_2026-05-11_11-54-34.mp4462811May14:55SystemAudio (output)_2026-05-11_11-54-57.mp4462011May14:55SystemAudio (output)_2026-05-11_11-55-19.mp4462011May14:56System462011May14:56SystemAudio (output)_2026-05-11_11-55-42.mp4Audio (output)_2026-05-11_11-56-04.mp4462011May14:56SystemAudio (output)_2026-05-11_11-56-26.mp4462011May14:57SystemAudio (output)_2026-05-11_11-56-49.mp4462011May14:57462011May14:57SystemAudio (output)_2026-05-11_11-57-11.mp4SystemAudio (output)_2026-05-11_11-57-34.mp4462011May14:58System462011Audio (output)_2026-05-11_11-57-56.mp4May14:58SystemAudio (output)_2026-05-11_11-58-19.mp4462011May14:59System Audio (output)_2026-05-11_11-58-41.mp42105011May14:59System Audio (output)_2026-05-11_11-59-04.mp4462811May14:59462011May15:00System Audio (output)_2026-05-11_11-59-26.mp4462011System Audio (output)_2026-05-11_11-59-49.mp4May15:00System Audio (output)_2026-05-11_12-00-11.mp44628 11 May15:00SystemAudio (output)_2026-05-11_12-00-34.mp4462011May15:01462011SystemAudio (output)_2026-05-11_12-00-56.mp4May15:01SystemAudio (output)_2026-05-11_12-01-19.mp4462811May15:02SystemAudio (output)_2026-05-11_12-01-41.mp4462811May15:02SystemAudio (output)_2026-05-11_12-02-04.mp4462811May15:02SystemAudio (output)_2026-05-11_12-02-27.mp4462011May15:03System Audio (output)_2026-05-11_12-02-49.mp4462811May15:03System Audio (output)_2026-05-11_12-03-12.mp4462011May15:03System Audio (output)_2026-05-11_12-03-35.mp4462811May15:04System Audio (output)_2026-05-11_12-03-57.mp4462011May15:04System Audio (output)_2026-05-11_12-04-20.mp4462011May15:05System Audio (output)_2026-05-11_12-04-42.mp4462011May15:05System Audio (output)_2026-05-11_12-05-04.mp4462011May15:05System Audio (output)_2026-05-11_12-05-27.mp4462011May15:06System Audio (output)_2026-05-11_12-05-49.mp4462011May15:06SystemAudio (output)_2026-05-11_12-06-12.mp4462011May15:06System Audio (output)_2026-05-11_12-06-35.mp4462811May15:07System Audio (output)_2026-05-11_12-06-57.mp4462011May15:07System Audio (output)_2026-05-11_12-07-20.mp4462011May15:08System Audio (output)_2026-05-11_12-07-42.mp4462011 May15:08System Audio (output)_2026-05-11_12-08-05 .mp4462811 May15:08 System Audio (output)_2026-05-11_12-08-27.mp484-zsh*5screenpipe"100% <8• Mon 11 May 20:50:37181O ₴6-zshX7...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_src_role ON elements(frame_id, source, role) WHERE text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_onscreen_frame ON elements(frame_id) WHERE on_screen = 1 AND text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_timestamp ON ui_events(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_app_name ON ui_events(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_frame_id ON ui_events(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_ocr_text_frame_id ON ocr_text(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_meetings_start ON meetings(meeting_start);
CREATE INDEX IF NOT EXISTS nas.idx_video_chunks_device ON video_chunks(device_name);
-- audio
CREATE INDEX IF NOT EXISTS nas.idx_audio_chunks_timestamp ON audio_chunks(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_chunk_id ON audio_transcriptions(audio_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_timestamp ON audio_transcriptions(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_speaker ON audio_transcriptions(speaker_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS nas.idx_speaker_emb_speaker_id ON speaker_embeddings(speaker_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_tags_chunk_id ON audio_tags(audio_chunk_id);
DETACH nas;
"
run_sqlite_heredoc "creating FTS tables" "
ATTACH '$NAS_DB' AS nas;
CREATE VIRTUAL TABLE IF NOT EXISTS nas.elements_fts USING fts5(
text, role, frame_id UNINDEXED,
content='elements', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.frames_fts USING fts5(
full_text, app_name, window_name, browser_url, id UNINDEXED,
tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.ui_events_fts USING fts5(
text_content, app_name, window_title, element_name,
content='ui_events', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.audio_transcriptions_fts USING fts5(
transcription, device, speaker_id UNINDEXED, id UNINDEXED,
tokenize='unicode61'
);
DETACH nas;
"
# ─── BUILD EXPLICIT COLUMN LISTS ──────────────────────────────────────────
# Source columns + install_id appended. Same on both sides of the INSERT.
FRAMES_COLS=$(build_col_list frames)
ELEMENTS_COLS=$(build_col_list elements)
ELEMENTS_COLS_E=$(build_col_list elements e)
UI_EVENTS_COLS=$(build_col_list ui_events)
OCR_TEXT_COLS=$(build_col_list ocr_text)
OCR_TEXT_COLS_O=$(build_col_list ocr_text o)
VIDEO_CHUNKS_COLS=$(build_col_list video_chunks)
MEETINGS_COLS=$(build_col_list meetings)
ACHUNKS_COLS=$(build_col_list audio_chunks)
ATRANS_COLS=$(build_col_list audio_transcriptions)
ATRANS_COLS_T=$(build_col_list audio_transcriptions t)
SPEAKERS_COLS=$(build_col_list speakers)
SEMB_COLS=$(build_col_list speaker_embeddings)
ATAGS_COLS=$(build_col_list audio_tags)
ATAGS_COLS_AT=$(build_col_list audio_tags at)
# ─── SYNC VISION DATA ─────────────────────────────────────────────────────
step "Syncing vision data for $TARGET_DATE"
run_sqlite_heredoc "video_chunks" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.video_chunks ($VIDEO_CHUNKS_COLS, install_id)
SELECT $VIDEO_CHUNKS_COLS, '$INSTALL_ID' FROM main.video_chunks
WHERE id IN (
SELECT DISTINCT video_chunk_id FROM main.frames
WHERE date(timestamp) = '$TARGET_DATE' AND video_chunk_id IS NOT NULL
);
DETACH nas;
"
run_sqlite_heredoc "frames ($SRC_FRAMES rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames ($FRAMES_COLS, install_id)
SELECT $FRAMES_COLS, '$INSTALL_ID' FROM main.frames WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ocr_text ($SRC_OCR rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ocr_text ($OCR_TEXT_COLS, install_id)
SELECT $OCR_TEXT_COLS_O, '$INSTALL_ID' FROM main.ocr_text o
JOIN main.frames f ON o.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ui_events ($SRC_UI rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events ($UI_EVENTS_COLS, install_id)
SELECT $UI_EVENTS_COLS, '$INSTALL_ID' FROM main.ui_events WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "elements ($SRC_ELEMENTS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements ($ELEMENTS_COLS, install_id)
SELECT $ELEMENTS_COLS_E, '$INSTALL_ID' FROM main.elements e
JOIN main.frames f ON e.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "meetings ($SRC_MEETINGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.meetings ($MEETINGS_COLS, install_id)
SELECT $MEETINGS_COLS, '$INSTALL_ID' FROM main.meetings WHERE date(meeting_start) = '$TARGET_DATE';
DETACH nas;
"
# ─── SYNC AUDIO DATA ──────────────────────────────────────────────────────
step "Syncing audio data for $TARGET_DATE"
# Speakers + embeddings are install-global, not per-date. Sync everything
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF N...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
#!/bin/bash
# screenpipe_sync.sh
# Syncs Screenpipe SQLite data (vision + audio) to a NAS archive database.
# Append-only, no deletions.
#
# Key design points
# -----------------
# 1. Multi-install safe via install_id.
# Source IDs reset to 1 on every screenpipe reinstall. To avoid collisions
# in the NAS archive, every synced table gets an extra `install_id` column,
# and the logical primary key becomes (install_id, id) enforced by a
# unique index. The install_id is a UUID stored in
# ~/.screenpipe/.sync_install_id — wiping ~/.screenpipe/ (which is what
# happens on reinstall) discards it, so the next run generates a new one.
#
# 2. Schema-drift tolerant. If screenpipe migrations add new columns to the
# source DB, the NAS gets ALTER TABLE'd to match. Inserts use explicit
# column lists so positional mismatches can't occur.
#
# 3. FTS caveat. FTS tables in the NAS use source `id` as rowid. After a
# reinstall, INSERT OR IGNORE will silently skip rows whose id collides
# with a previous install's id, so FTS only reliably indexes the most
# recent install. Falls back to LIKE queries on the base tables for
# multi-install searches (which can filter by install_id).
#
# Usage
# -----
# ./screenpipe_sync.sh # syncs yesterday
# ./screenpipe_sync.sh 2026-04-15 # syncs a specific date
# ./screenpipe_sync.sh today # syncs today so far
# ./screenpipe_sync.sh --reset-install-id # rotate install_id and exit
# ./screenpipe_sync.sh --show-install-id # print install_id and exit
set -euo pipefail
# ─── CONFIG ───────────────────────────────────────────────────────────────────
DB_SRC="${SCREENPIPE_DB:-$HOME/.screenpipe/db.sqlite}"
NAS_MOUNT="${NAS_MOUNT:-/Volumes/screenpipe}"
NAS_DB="$NAS_MOUNT/archive.db"
NAS_DATA="$NAS_MOUNT/data"
LOG_FILE="$HOME/.screenpipe/sync.log"
INSTALL_ID_FILE="$HOME/.screenpipe/.sync_install_id"
# Sync table groups. Order matters for FK-ish references
# (parents before children).
VISION_TABLES=(video_chunks frames elements ocr_text ui_events meetings)
AUDIO_TABLES=(speakers speaker_embeddings audio_chunks audio_transcriptions audio_tags)
ALL_SYNC_TABLES=("${VISION_TABLES[@]}" "${AUDIO_TABLES[@]}")
# ──────────────────────────────────────────────────────────────────────────────
SCRIPT_START=$(date +%s)
# ─── HELPERS ──────────────────────────────────────────────────────────────────
log() {
local msg="[$(date '+%Y-%m-%d %H:%M:%S')] $*"
echo "$msg" | tee -a "$LOG_FILE"
}
step() {
local now=$(date +%s)
local elapsed=$(( now - SCRIPT_START ))
local min=$(( elapsed / 60 ))
local sec=$(( elapsed % 60 ))
printf "\n[+%02dm%02ds] ▶ %s\n" "$min" "$sec" "$*" | tee -a "$LOG_FILE"
}
run_sqlite_heredoc() {
local label="$1"
local sql="$2"
local start=$(date +%s)
printf " %-36s " "$label"
sqlite3 "$DB_SRC" <<< "$sql" &
local pid=$!
local spin=[PASSWORD] '⠙' '⠹' '⠸' '⠼' '⠴' '⠦' '⠧' '⠇' '⠏')
local i=0
while kill -0 "$pid" 2>/dev/null; do
printf "\r %-36s %s " "$label" "${spin[$i]}"
i=$(( (i + 1) % 10 ))
sleep 0.2
done
wait "$pid"
local rc=$?
if [ $rc -ne 0 ]; then
printf "\r %-36s ✗ FAILED\n" "$label" | tee -a "$LOG_FILE"
exit $rc
fi
local dur=$(( $(date +%s) - start ))
printf "\r %-36s ✓ %dm%02ds\n" "$label" "$(( dur / 60 ))" "$(( dur % 60 ))" | tee -a "$LOG_FILE"
}
check() {
local label="$1" got="$2" expected="$3"
if [ "$got" -eq "$expected" ]; then
printf " %-25s %s / %s ✓\n" "$label:" "$got" "$expected"
else
printf " %-25s %s / %s ✗ MISMATCH\n" "$label:" "$got" "$expected"
fi
}
table_columns_with_types() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2 "|" $3}'
}
table_columns() {
local db="$1" table="$2"
sqlite3 "$db" "PRAGMA table_info($table);" | awk -F'|' '{print $2}'
}
table_exists() {
local db="$1" table="$2"
local count
count=$(sqlite3 "$db" "SELECT COUNT(*) FROM sqlite_master WHERE type='table' AND name='$table';")
[ "$count" -gt 0 ]
}
# Adds any columns present in source but missing in NAS for the given table.
# Skips install_id (which is NAS-only and managed separately).
ensure_columns() {
local table="$1"
local label="schema: $table"
printf " %-36s " "$label"
if ! table_exists "$DB_SRC" "$table"; then
printf "✗ source missing — skipping\n"
return 0
fi
if ! table_exists "$NAS_DB" "$table"; then
printf "✓ fresh (created above)\n"
return 0
fi
local src_cols
src_cols=$(table_columns_with_types "$DB_SRC" "$table")
local nas_cols
nas_cols=$(table_columns "$NAS_DB" "$table")
local added=0
local added_names=""
while IFS='|' read -r name type; do
[ -z "$name" ] && continue
if ! echo "$nas_cols" | grep -Fxq "$name"; then
sqlite3 "$NAS_DB" "ALTER TABLE $table ADD COLUMN \"$name\" $type;"
added=$((added + 1))
added_names="$added_names $name"
fi
done <<< "$src_cols"
if [ "$added" -gt 0 ]; then
printf "✓ added %d:%s\n" "$added" "$added_names"
else
printf "✓ in sync\n"
fi
}
# Comma-separated, double-quoted column list for a table from source DB.
# Optional alias is prefixed (e.g. `o."col"`) for JOIN selects where column
# names would otherwise collide.
build_col_list() {
local table="$1"
local alias="${2:-}"
local prefix=""
[ -n "$alias" ] && prefix="${alias}."
table_columns "$DB_SRC" "$table" | awk 'NF' | awk -v p="$prefix" '{print p "\"" $0 "\""}' | paste -sd, -
}
# ──────────────────────────────────────────────────────────────────────────────
# ─── ARG HANDLING ─────────────────────────────────────────────────────────────
if [ "${1:-}" = "--reset-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
OLD=$(cat "$INSTALL_ID_FILE")
rm -f "$INSTALL_ID_FILE"
echo "Removed install_id: $OLD"
else
echo "No install_id file at $INSTALL_ID_FILE"
fi
echo "A new install_id will be generated on the next sync run."
exit 0
fi
if [ "${1:-}" = "--show-install-id" ]; then
if [ -f "$INSTALL_ID_FILE" ]; then
cat "$INSTALL_ID_FILE"
else
echo "(none — will be generated on next run)"
fi
exit 0
fi
if [ "${1:-}" = "today" ]; then
TARGET_DATE=$(date +%Y-%m-%d)
elif [ -n "${1:-}" ]; then
TARGET_DATE="$1"
if ! [[ "$TARGET_DATE" =~ ^[0-9]{4}-[0-9]{2}-[0-9]{2}$ ]]; then
echo "ERROR: Invalid date. Use YYYY-MM-DD, 'today', or no argument for yesterday."
exit 1
fi
else
TARGET_DATE=$(date -v-1d +%Y-%m-%d)
fi
# ─── INSTALL ID ───────────────────────────────────────────────────────────────
INSTALL_ID=""
if [ -f "$INSTALL_ID_FILE" ]; then
INSTALL_ID=$(tr -d '[:space:]' < "$INSTALL_ID_FILE")
fi
if [ -z "$INSTALL_ID" ]; then
INSTALL_ID=$(uuidgen | tr 'A-Z' 'a-z')
echo "$INSTALL_ID" > "$INSTALL_ID_FILE"
log "Generated new install_id: $INSTALL_ID"
fi
log "========================================"
log "Screenpipe sync starting for: $TARGET_DATE"
log "install_id: $INSTALL_ID"
log "========================================"
# ─── PREFLIGHT ────────────────────────────────────────────────────────────────
step "Preflight checks"
if [ ! -f "$DB_SRC" ]; then
log "ERROR: Source DB not found at $DB_SRC"; exit 1
fi
printf " %-25s %s (%s)\n" "Source DB:" "OK" "$(du -sh "$DB_SRC" | cut -f1)"
if [ ! -d "$NAS_MOUNT" ]; then
log "ERROR: NAS not mounted at $NAS_MOUNT"; exit 1
fi
printf " %-25s %s\n" "NAS mount:" "OK $NAS_MOUNT"
# Check if DB already synced for this date+install_id
DB_ALREADY_SYNCED=false
if [ -f "$NAS_DB" ]; then
if table_exists "$NAS_DB" "frames"; then
HAS_INSTALL_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('frames') WHERE name='install_id';")
if [ "$HAS_INSTALL_COL" -gt "0" ]; then
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE' AND install_id='$INSTALL_ID';" 2>/dev/null || echo "0")
else
EXISTING=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp)='$TARGET_DATE';" 2>/dev/null || echo "0")
fi
if [ "$EXISTING" -gt "0" ]; then
log "Date $TARGET_DATE / install $INSTALL_ID already has $EXISTING frames — skipping DB sync"
DB_ALREADY_SYNCED=true
else
printf " %-25s %s (%s)\n" "Archive DB:" "exists" "$(du -sh "$NAS_DB" | cut -f1)"
fi
else
printf " %-25s %s\n" "Archive DB:" "exists, no frames table yet"
fi
else
printf " %-25s %s\n" "Archive DB:" "will be created"
fi
# Source data dir for this date (video frames)
DATA_SRC="$HOME/.screenpipe/data/data/$TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
DATA_SIZE=$(du -sh "$DATA_SRC" | cut -f1)
DATA_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
printf " %-25s %s (%s files, %s)\n" "Frame data dir:" "OK" "$DATA_FILES" "$DATA_SIZE"
else
printf " %-25s %s\n" "Frame data dir:" "not found — skipping"
fi
# Audio files (flat in ~/.screenpipe/data/, dated by filename)
shopt -s nullglob
AUDIO_SRC_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_SRC_FILES[@]} -gt 0 ]; then
AUDIO_TOTAL=$(du -ch "${AUDIO_SRC_FILES[@]}" | tail -1 | cut -f1)
printf " %-25s %s (%s files, %s)\n" "Audio files:" "OK" "${#AUDIO_SRC_FILES[@]}" "$AUDIO_TOTAL"
else
printf " %-25s %s\n" "Audio files:" "none for this date"
fi
# ─── SCHEMA MIGRATION: install_id ─────────────────────────────────────────────
# Adds install_id column to existing NAS tables, backfills NULLs with a
# legacy tag, and creates the (install_id, id) unique index. Idempotent.
if [ -f "$NAS_DB" ]; then
step "Schema migration: install_id"
LEGACY_TAG="legacy-$(date +%Y%m%d)"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
if ! table_exists "$NAS_DB" "$tbl"; then
continue
fi
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
ROW_COUNT=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM $tbl;")
printf " %-36s adding install_id, backfill %s rows → %s\n" "$tbl" "$ROW_COUNT" "$LEGACY_TAG"
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
sqlite3 "$NAS_DB" "UPDATE $tbl SET install_id = '$LEGACY_TAG' WHERE install_id IS NULL;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
fi
# ─── DB SYNC ──────────────────────────────────────────────────────────────────
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── COUNT SOURCE ROWS ────────────────────────────────────────────────────
step "Counting source rows for $TARGET_DATE"
SRC_FRAMES=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ELEMENTS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM elements WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_UI=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE';")
SRC_OCR=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM ocr_text WHERE frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE');")
SRC_MEETINGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE';")
SRC_ACHUNKS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE';")
SRC_ATRANS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_transcriptions WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
SRC_ATAGS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM audio_tags WHERE audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE');")
# speakers + speaker_embeddings are install-global, not per-date; we sync all.
SRC_SPEAKERS=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speakers;")
SRC_SEMB=$(sqlite3 "$DB_SRC" "SELECT COUNT(*) FROM speaker_embeddings;")
printf " %-25s %s\n" "frames:" "$SRC_FRAMES"
printf " %-25s %s\n" "elements:" "$SRC_ELEMENTS"
printf " %-25s %s\n" "ui_events:" "$SRC_UI"
printf " %-25s %s\n" "ocr_text:" "$SRC_OCR"
printf " %-25s %s\n" "meetings:" "$SRC_MEETINGS"
printf " %-25s %s\n" "audio_chunks:" "$SRC_ACHUNKS"
printf " %-25s %s\n" "audio_transcriptions:" "$SRC_ATRANS"
printf " %-25s %s\n" "audio_tags:" "$SRC_ATAGS"
printf " %-25s %s (all-time)\n" "speakers:" "$SRC_SPEAKERS"
printf " %-25s %s (all-time)\n" "speaker_embeddings:" "$SRC_SEMB"
if [ "$SRC_FRAMES" -eq "0" ] && [ "$SRC_ACHUNKS" -eq "0" ]; then
log "No frames or audio chunks for $TARGET_DATE — skipping DB sync"
DB_ALREADY_SYNCED=true
fi
fi
if [ "$DB_ALREADY_SYNCED" = false ]; then
# ─── INIT TABLES ──────────────────────────────────────────────────────────
step "Initialising tables (CREATE IF NOT EXISTS)"
run_sqlite_heredoc "creating vision tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.frames AS SELECT * FROM main.frames WHERE 0;
CREATE TABLE IF NOT EXISTS nas.elements AS SELECT * FROM main.elements WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ui_events AS SELECT * FROM main.ui_events WHERE 0;
CREATE TABLE IF NOT EXISTS nas.ocr_text AS SELECT * FROM main.ocr_text WHERE 0;
CREATE TABLE IF NOT EXISTS nas.video_chunks AS SELECT * FROM main.video_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.meetings AS SELECT * FROM main.meetings WHERE 0;
DETACH nas;
"
run_sqlite_heredoc "creating audio tables" "
ATTACH '$NAS_DB' AS nas;
CREATE TABLE IF NOT EXISTS nas.audio_chunks AS SELECT * FROM main.audio_chunks WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_transcriptions AS SELECT * FROM main.audio_transcriptions WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speakers AS SELECT * FROM main.speakers WHERE 0;
CREATE TABLE IF NOT EXISTS nas.speaker_embeddings AS SELECT * FROM main.speaker_embeddings WHERE 0;
CREATE TABLE IF NOT EXISTS nas.audio_tags AS SELECT * FROM main.audio_tags WHERE 0;
DETACH nas;
"
# Re-run install_id + index setup so freshly-created tables get them too.
for tbl in "${ALL_SYNC_TABLES[@]}"; do
HAS_COL=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM pragma_table_info('$tbl') WHERE name='install_id';")
if [ "$HAS_COL" = "0" ]; then
sqlite3 "$NAS_DB" "ALTER TABLE $tbl ADD COLUMN install_id TEXT;"
fi
sqlite3 "$NAS_DB" "CREATE UNIQUE INDEX IF NOT EXISTS idx_${tbl}_install_pk ON ${tbl}(install_id, id);"
done
# ─── SCHEMA DRIFT ─────────────────────────────────────────────────────────
step "Reconciling NAS schema with source"
for tbl in "${ALL_SYNC_TABLES[@]}"; do
ensure_columns "$tbl"
done
run_sqlite_heredoc "creating indexes" "
ATTACH '$NAS_DB' AS nas;
-- vision
CREATE INDEX IF NOT EXISTS nas.idx_frames_timestamp ON frames(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_frames_app_name ON frames(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_window_name ON frames(window_name);
CREATE INDEX IF NOT EXISTS nas.idx_frames_video_chunk_id ON frames(video_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_frames_document_path ON frames(document_path) WHERE document_path IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_id ON elements(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_elements_frame_src_role ON elements(frame_id, source, role) WHERE text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_elements_onscreen_frame ON elements(frame_id) WHERE on_screen = 1 AND text IS NOT NULL;
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_timestamp ON ui_events(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_app_name ON ui_events(app_name);
CREATE INDEX IF NOT EXISTS nas.idx_ui_events_frame_id ON ui_events(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_ocr_text_frame_id ON ocr_text(frame_id);
CREATE INDEX IF NOT EXISTS nas.idx_meetings_start ON meetings(meeting_start);
CREATE INDEX IF NOT EXISTS nas.idx_video_chunks_device ON video_chunks(device_name);
-- audio
CREATE INDEX IF NOT EXISTS nas.idx_audio_chunks_timestamp ON audio_chunks(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_chunk_id ON audio_transcriptions(audio_chunk_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_timestamp ON audio_transcriptions(timestamp);
CREATE INDEX IF NOT EXISTS nas.idx_audio_trans_speaker ON audio_transcriptions(speaker_id, timestamp DESC);
CREATE INDEX IF NOT EXISTS nas.idx_speaker_emb_speaker_id ON speaker_embeddings(speaker_id);
CREATE INDEX IF NOT EXISTS nas.idx_audio_tags_chunk_id ON audio_tags(audio_chunk_id);
DETACH nas;
"
run_sqlite_heredoc "creating FTS tables" "
ATTACH '$NAS_DB' AS nas;
CREATE VIRTUAL TABLE IF NOT EXISTS nas.elements_fts USING fts5(
text, role, frame_id UNINDEXED,
content='elements', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.frames_fts USING fts5(
full_text, app_name, window_name, browser_url, id UNINDEXED,
tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.ui_events_fts USING fts5(
text_content, app_name, window_title, element_name,
content='ui_events', content_rowid='id', tokenize='unicode61'
);
CREATE VIRTUAL TABLE IF NOT EXISTS nas.audio_transcriptions_fts USING fts5(
transcription, device, speaker_id UNINDEXED, id UNINDEXED,
tokenize='unicode61'
);
DETACH nas;
"
# ─── BUILD EXPLICIT COLUMN LISTS ──────────────────────────────────────────
# Source columns + install_id appended. Same on both sides of the INSERT.
FRAMES_COLS=$(build_col_list frames)
ELEMENTS_COLS=$(build_col_list elements)
ELEMENTS_COLS_E=$(build_col_list elements e)
UI_EVENTS_COLS=$(build_col_list ui_events)
OCR_TEXT_COLS=$(build_col_list ocr_text)
OCR_TEXT_COLS_O=$(build_col_list ocr_text o)
VIDEO_CHUNKS_COLS=$(build_col_list video_chunks)
MEETINGS_COLS=$(build_col_list meetings)
ACHUNKS_COLS=$(build_col_list audio_chunks)
ATRANS_COLS=$(build_col_list audio_transcriptions)
ATRANS_COLS_T=$(build_col_list audio_transcriptions t)
SPEAKERS_COLS=$(build_col_list speakers)
SEMB_COLS=$(build_col_list speaker_embeddings)
ATAGS_COLS=$(build_col_list audio_tags)
ATAGS_COLS_AT=$(build_col_list audio_tags at)
# ─── SYNC VISION DATA ─────────────────────────────────────────────────────
step "Syncing vision data for $TARGET_DATE"
run_sqlite_heredoc "video_chunks" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.video_chunks ($VIDEO_CHUNKS_COLS, install_id)
SELECT $VIDEO_CHUNKS_COLS, '$INSTALL_ID' FROM main.video_chunks
WHERE id IN (
SELECT DISTINCT video_chunk_id FROM main.frames
WHERE date(timestamp) = '$TARGET_DATE' AND video_chunk_id IS NOT NULL
);
DETACH nas;
"
run_sqlite_heredoc "frames ($SRC_FRAMES rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames ($FRAMES_COLS, install_id)
SELECT $FRAMES_COLS, '$INSTALL_ID' FROM main.frames WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ocr_text ($SRC_OCR rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ocr_text ($OCR_TEXT_COLS, install_id)
SELECT $OCR_TEXT_COLS_O, '$INSTALL_ID' FROM main.ocr_text o
JOIN main.frames f ON o.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "ui_events ($SRC_UI rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events ($UI_EVENTS_COLS, install_id)
SELECT $UI_EVENTS_COLS, '$INSTALL_ID' FROM main.ui_events WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "elements ($SRC_ELEMENTS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements ($ELEMENTS_COLS, install_id)
SELECT $ELEMENTS_COLS_E, '$INSTALL_ID' FROM main.elements e
JOIN main.frames f ON e.frame_id = f.id
WHERE date(f.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "meetings ($SRC_MEETINGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.meetings ($MEETINGS_COLS, install_id)
SELECT $MEETINGS_COLS, '$INSTALL_ID' FROM main.meetings WHERE date(meeting_start) = '$TARGET_DATE';
DETACH nas;
"
# ─── SYNC AUDIO DATA ──────────────────────────────────────────────────────
step "Syncing audio data for $TARGET_DATE"
# Speakers + embeddings are install-global, not per-date. Sync everything...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: Inline Suggestions, next: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 691, Col 77
No results found
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
screenpipe_sync_updated.sh, preview, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
# the source currently has; INSERT OR IGNORE handles the duplicate case.
run_sqlite_heredoc "speakers ($SRC_SPEAKERS rows, all-time)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speakers ($SPEAKERS_COLS, install_id)
SELECT $SPEAKERS_COLS, '$INSTALL_ID' FROM main.speakers;
DETACH nas;
"
run_sqlite_heredoc "speaker_embeddings ($SRC_SEMB rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.speaker_embeddings ($SEMB_COLS, install_id)
SELECT $SEMB_COLS, '$INSTALL_ID' FROM main.speaker_embeddings;
DETACH nas;
"
run_sqlite_heredoc "audio_chunks ($SRC_ACHUNKS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_chunks ($ACHUNKS_COLS, install_id)
SELECT $ACHUNKS_COLS, '$INSTALL_ID' FROM main.audio_chunks WHERE date(timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions ($SRC_ATRANS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions ($ATRANS_COLS, install_id)
SELECT $ATRANS_COLS_T, '$INSTALL_ID' FROM main.audio_transcriptions t
JOIN main.audio_chunks c ON t.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
run_sqlite_heredoc "audio_tags ($SRC_ATAGS rows)" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_tags ($ATAGS_COLS, install_id)
SELECT $ATAGS_COLS_AT, '$INSTALL_ID' FROM main.audio_tags at
JOIN main.audio_chunks c ON at.audio_chunk_id = c.id
WHERE date(c.timestamp) = '$TARGET_DATE';
DETACH nas;
"
# ─── FTS UPDATE ───────────────────────────────────────────────────────────
step "Updating FTS indexes"
run_sqlite_heredoc "elements_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.elements_fts(rowid, text, role)
SELECT e.id, e.text, e.role
FROM nas.elements e
JOIN nas.frames f ON e.frame_id = f.id AND e.install_id = f.install_id
WHERE date(f.timestamp) = '$TARGET_DATE'
AND e.install_id = '$INSTALL_ID'
AND e.text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "frames_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.frames_fts(rowid, full_text, app_name, window_name, browser_url, id)
SELECT id, full_text, app_name, window_name, browser_url, id
FROM nas.frames
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND full_text IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "ui_events_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.ui_events_fts(rowid, text_content, app_name, window_title, element_name)
SELECT id, text_content, app_name, window_title, element_name
FROM nas.ui_events
WHERE date(timestamp) = '$TARGET_DATE'
AND install_id = '$INSTALL_ID'
AND text_content IS NOT NULL;
DETACH nas;
"
run_sqlite_heredoc "audio_transcriptions_fts" "
ATTACH '$NAS_DB' AS nas;
INSERT OR IGNORE INTO nas.audio_transcriptions_fts(rowid, transcription, device, speaker_id, id)
SELECT t.id, t.transcription, COALESCE(t.device,''), t.speaker_id, t.id
FROM nas.audio_transcriptions t
JOIN nas.audio_chunks c ON t.audio_chunk_id = c.id AND t.install_id = c.install_id
WHERE date(c.timestamp) = '$TARGET_DATE'
AND t.install_id = '$INSTALL_ID'
AND t.transcription IS NOT NULL AND t.transcription != '';
DETACH nas;
"
# ─── VERIFY ───────────────────────────────────────────────────────────────
step "Verifying DB"
V_FRAMES=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ELEMENTS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM elements WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_UI=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ui_events WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_OCR=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM ocr_text WHERE install_id='$INSTALL_ID' AND frame_id IN (SELECT id FROM frames WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_MEETINGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM meetings WHERE date(meeting_start) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ACHUNKS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID';")
V_ATRANS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_transcriptions WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
V_ATAGS=$(sqlite3 "$NAS_DB" "SELECT COUNT(*) FROM audio_tags WHERE install_id='$INSTALL_ID' AND audio_chunk_id IN (SELECT id FROM audio_chunks WHERE date(timestamp) = '$TARGET_DATE' AND install_id='$INSTALL_ID');")
check "frames" "$V_FRAMES" "$SRC_FRAMES"
check "elements" "$V_ELEMENTS" "$SRC_ELEMENTS"
check "ui_events" "$V_UI" "$SRC_UI"
check "ocr_text" "$V_OCR" "$SRC_OCR"
check "meetings" "$V_MEETINGS" "$SRC_MEETINGS"
check "audio_chunks" "$V_ACHUNKS" "$SRC_ACHUNKS"
check "audio_transcriptions" "$V_ATRANS" "$SRC_ATRANS"
check "audio_tags" "$V_ATAGS" "$SRC_ATAGS"
fi
# ─── COPY FRAME DATA FOLDER ──────────────────────────────────────────────────
# Always runs regardless of DB sync status.
step "Copying frame data folder for $TARGET_DATE"
if [ -d "$DATA_SRC" ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync frames → NAS"
rsync -a --ignore-existing "$DATA_SRC/" "$NAS_DATA/$TARGET_DATE/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_FILES=$(ls "$NAS_DATA/$TARGET_DATE" 2>/dev/null | grep -v '^audio$' | wc -l | tr -d ' ')
SRC_FILES=$(ls "$DATA_SRC" | wc -l | tr -d ' ')
COPIED_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE" | cut -f1)
if [ "$COPIED_FILES" -ge "$SRC_FILES" ]; then
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync frames → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_FILES" "$COPIED_SIZE" | tee -a "$LOG_FILE"
else
printf "\r %-36s ✗ %s / %s files\n" \
"rsync frames → NAS" "$COPIED_FILES" "$SRC_FILES" | tee -a "$LOG_FILE"
fi
else
printf " %-36s %s\n" "rsync frames → NAS" "skipped (no source dir)"
fi
# ─── COPY AUDIO FILES ────────────────────────────────────────────────────────
# Audio is flat in ~/.screenpipe/data/ with the date in the filename, e.g.
# System Audio (output)_2026-05-11_13-48-12.mp4
# soundcore AeroClip (input)_2026-05-10_11-10-32.mp4
# Mirrored to $NAS_DATA/<date>/audio/ so each day's archive is self-contained.
step "Copying audio files for $TARGET_DATE"
shopt -s nullglob
AUDIO_FILES=( "$HOME/.screenpipe/data/"*_"${TARGET_DATE}"_*.mp4 )
shopt -u nullglob
if [ ${#AUDIO_FILES[@]} -gt 0 ]; then
mkdir -p "$NAS_DATA/$TARGET_DATE/audio"
RSYNC_START=$(date +%s)
printf " %-36s " "rsync audio → NAS"
rsync -a --ignore-existing "${AUDIO_FILES[@]}" "$NAS_DATA/$TARGET_DATE/audio/" 2>>"$LOG_FILE"
RSYNC_DUR=$(( $(date +%s) - RSYNC_START ))
COPIED_AUDIO=$(ls "$NAS_DATA/$TARGET_DATE/audio" | wc -l | tr -d ' ')
AUDIO_SIZE=$(du -sh "$NAS_DATA/$TARGET_DATE/audio" | cut -f1)
printf "\r %-36s ✓ %dm%02ds (%s files, %s)\n" \
"rsync audio → NAS" \
"$(( RSYNC_DUR / 60 ))" "$(( RSYNC_DUR % 60 ))" \
"$COPIED_AUDIO" "$AUDIO_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync audio → NAS" "skipped (no audio for date)"
fi
# ─── COPY LOGS ────────────────────────────────────────────────────────────────
step "Copying screenpipe logs for $TARGET_DATE"
NAS_LOGS="$NAS_MOUNT/logs"
mkdir -p "$NAS_LOGS"
shopt -s nullglob
LOG_FILES=( "$HOME/.screenpipe/screenpipe.$TARGET_DATE."*.log )
shopt -u nullglob
if [ ${#LOG_FILES[@]} -gt 0 ]; then
printf " %-36s " "rsync logs → NAS"
rsync -a "${LOG_FILES[@]}" "$NAS_LOGS/" 2>>"$LOG_FILE"
TOTAL_SIZE=$(du -ch "${LOG_FILES[@]}" | tail -1 | cut -f1)
printf "✓ %d file(s), %s\n" "${#LOG_FILES[@]}" "$TOTAL_SIZE" | tee -a "$LOG_FILE"
else
printf " %-36s %s\n" "rsync logs → NAS" "skipped (no matching logs)"
fi
# ─── SUMMARY ──────────────────────────────────────────────────────────────────
TOTAL_ELAPSED=$(( $(date +%s) - SCRIPT_START ))
DB_SIZE=$(du -sh "$NAS_DB" | cut -f1)
echo ""
log "Archive DB size: $DB_SIZE"
log "Total time: $(( TOTAL_ELAPSED / 60 ))m$(( TOTAL_ELAPSED % 60 ))s"
log "Sync complete for $TARGET_DATE (install $INSTALL_ID)"
log "========================================"
rsync -a "$HOME/.screenpipe/sync.log" "$NAS_LOGS/sync.log" 2>/dev/null || true
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Shell Script
Editor Language Status: Inline Suggestions, next: $(copilot) No inline suggestion available, Inline suggestions
LF
UTF-8
Spaces: 2
Ln 691, Col 77
No results found
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard...
|
Code
|
screenpipe_sync.sh — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in
_parse_frontmatter
and pass it through
run_pipe
. Want me to add that?
Message actions
pipe.md
pipe.md
can I use mistral or other llm? or is is build for antrophic only
Thinking
Thinking
Currently it's Anthropic-only — the code calls the
anthropic
SDK directly in both
consumers/pipes.py
consumers/pipes.py
and
consumers/summary.py
consumers/summary.py
.
Adding other providers is straightforward since Mistral, Ollama, OpenRouter, Groq, and most modern LLMs expose an OpenAI-compatible API. The cleanest approach: add the
openai
package and let you configure the provider via
.env
:
Copy code to clipboard
# Anthropic (default)
[ENV_SECRET]
# OR: any OpenAI-compatible provider
LLM_PROVIDER=openai-compatible
LLM_BASE_URL=https://api.mistral.ai/v1
[ENV_SECRET]
LLM_MODEL=mistral-small-latest
This would cover Mistral, Ollama (
http://localhost:11434/v1
), OpenRouter, Groq, etc. with one code path.
Want me to implement that? It's a small change — a
_call_llm(system, user_prompt)
helper in
pipes.py
that branches on
LLM_PROVIDER
, and
openai
added to
requirements.txt
.
Message actions
lets test it with anthropic first, how to run pipe
Thinking
Thinking
Bash
Check if container is running
IN
cd /volume1/screenpipe/app && docker compose ps 2>/dev/null || docker ps --filter name=screenpipe
Copy code to clipboard
OUT
Exit code 1
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?filters=%7B%22name%22%3A%7B%22screenpipe%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
I don't have Docker socket access, so you'll need to run these yourself. Here are your options:
Option A — via Docker (if container is already running):
Copy code to clipboard
# List available pipes
docker exec -it screenpipe-app python cli.py pipe --list
# Run a specific pipe
docker exec -it screenpipe-app python cli.py pipe day-recap
Option B — rebuild first (since we added new files), then run:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
docker exec -it screenpipe-app python cli.py pipe day-recap
Option C — via the API
(if container is up on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST http://localhost:8766/api/pipes/day-recap/run
The
day-recap
or
standup-update
pipes are the best ones to test first since they use all data sources. Which option works for you?
Message actions
how ot run for specifiday
Thinking
Thinking
Docker exec:
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe day-recap --date 2026-04-25
# or shortcuts:
docker exec -it screenpipe-app python cli.py pipe day-recap --date yesterday
docker exec -it screenpipe-app python cli.py pipe day-recap --date today
API:
Copy code to clipboard...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in
_parse_frontmatter
and pass it through
run_pipe
. Want me to add that?
Message actions
pipe.md
pipe.md
can I use mistral or other llm? or is is build for antrophic only
Thinking
Thinking
Currently it's Anthropic-only — the code calls the
anthropic
SDK directly in both
consumers/pipes.py
consumers/pipes.py
and
consumers/summary.py
consumers/summary.py
.
Adding other providers is straightforward since Mistral, Ollama, OpenRouter, Groq, and most modern LLMs expose an OpenAI-compatible API. The cleanest approach: add the
openai
package and let you configure the provider via
.env
:
Copy code to clipboard
# Anthropic (default)
[ENV_SECRET]
# OR: any OpenAI-compatible provider
LLM_PROVIDER=openai-compatible
LLM_BASE_URL=https://api.mistral.ai/v1
[ENV_SECRET]
LLM_MODEL=mistral-small-latest
This would cover Mistral, Ollama (
http://localhost:11434/v1
), OpenRouter, Groq, etc. with one code path.
Want me to implement that? It's a small change — a
_call_llm(system, user_prompt)
helper in
pipes.py
that branches on
LLM_PROVIDER
, and
openai
added to
requirements.txt
.
Message actions
lets test it with anthropic first, how to run pipe
Thinking
Thinking
Bash
Check if container is running
IN
cd /volume1/screenpipe/app && docker compose ps 2>/dev/null || docker ps --filter name=screenpipe
Copy code to clipboard
OUT
Exit code 1
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/json?filters=%7B%22name%22%3A%7B%22screenpipe%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied
I don't have Docker socket access, so you'll need to run these yourself. Here are your options:
Option A — via Docker (if container is already running):
Copy code to clipboard
# List available pipes
docker exec -it screenpipe-app python cli.py pipe --list
# Run a specific pipe
docker exec -it screenpipe-app python cli.py pipe day-recap
Option B — rebuild first (since we added new files), then run:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
docker exec -it screenpipe-app python cli.py pipe day-recap
Option C — via the API
(if container is up on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST http://localhost:8766/api/pipes/day-recap/run
The...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in
_parse_frontmatter
and pass it through
run_pipe
. Want me to add that?
Message actions
pipe.md
pipe.md
can I use mistral or other llm? or is is build for antrophic only
Thinking
Thinking
Currently it's Anthropic-only — the code calls the
anthropic
SDK directly in both
consumers/pipes.py
consumers/pipes.py
and
consumers/summary.py
consumers/summary.py
.
Adding other providers is straightforward since Mistral, Ollama, OpenRouter, Groq, and most modern LLMs expose an OpenAI-compatible API. The cleanest approach: add the
openai
package and let you configure the provider via
.env
:
Copy code to clipboard
# Anthropic (default)
[ENV_SECRET]
# OR: any OpenAI-compatible provider
LLM_PROVIDER=openai-compatible
LLM_BASE_URL=https://api.mistral.ai/v1
[ENV_SECRET]
LLM_MODEL=mistral-small-latest
This would cover Mistral, Ollama (
http://localhost:11434/v1
), OpenRouter, Groq, etc. with one code path.
Want me to implement that? It's a small change — a
_call_llm(system, user_prompt)
helper in
pipes.py
that branches on
LLM_PROVIDER...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in
_parse_frontmatter
and pass it through
run_pipe
. Want me to add that?
Message actions
pipe.md
pipe.md
can I use mistral or other llm? or is is build for antrophic only
Thinking
Thinking
Currently it's Anthropic-only — the code calls the
anthropic
SDK directly in both
consumers/pipes.py
consumers/pipes.py
and
consumers/summary.py
consumers/summary.py
.
Adding other providers is straightforward since Mistral, Ollama, OpenRouter, Groq, and most modern LLMs expose an OpenAI-compatible API. The cleanest approach: add the
openai
package and let you configure the provider via
.env
:...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in
_parse_frontmatter
and pass it through
run_pipe
. Want me to add that?
Message actions
pipe.md
pipe.md
can I use mistral or other llm? or is is build for antrophic only
Thinking
Thinking
Currently it's Anthropic-only — the code calls the
anthropic
SDK directly in both
consumers/pipes.py
consumers/pipes.py
and
consumers/summary.py
consumers/summary.py
.
Adding other providers is straightforward since Mistral, Ollama, OpenRouter, Groq, and most modern LLMs expose an OpenAI-compatible API. The cleanest approach: add the...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in
_parse_frontmatter
and pass it through
run_pipe
. Want me to add that?
Message actions
pipe.md
pipe.md
can I use mistral or other llm? or is is build for antrophic only
Thinking
Thinking
Currently it's Anthropic-only — the code calls the
anthropic
SDK directly in both...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 10 pending changes
10
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Claude Code
Containers
EXPLORER
EXPLORER
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
SCREENPIPE [SSH: NAS]
#recycle
app
__pycache__
consumers
static
.env
cli.py
M
db.py
docker-compose.yml
Dockerfile
M
main.py
M
requirements.txt
M
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
screenpipe.db
Outline Section
OUTLINE
OUTLINE
Timeline Section
TIMELINE
TIMELINE
app_settings.json, Editor Group 1
screenpipe_sync.sh, Editor Group 1
pipe.md, Editor Group 1
.env, Editor Group 1
[ENV_SECRET]
[ENV_SECRET]
Implement and run pipe l…, Editor Group 2
Problems (⇧⌘M)
PROBLEMS
Output (⇧⌘U)
OUTPUT
Debug Console (⇧⌘Y)
DEBUG CONSOLE
Terminal (⌃`)
TERMINAL
Ports - 1 forwarded port
PORTS
1
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 8766
1
Notifications
Sign In
Sign In
Info: Setting up SSH Host nas: Setting up SSH tunnel
Implement and run pipe logic
Session history
New session
Message actions
main.py
main.py
lets implement logic with pipes. how can I run it?
Thinking
Thinking
Let me look at the current project structure to understand what's already in place.
Bash
List project files
IN
find /volume1/screenpipe -type f | head -50 && echo "---" && ls /volume1/screenpipe
Copy code to clipboard
OUT
/volume1/screenpipe/.DS_Store
/volume1/screenpipe/screenpipe.db
/volume1/screenpipe/screenpipe_sync.sh
/volume1/screenpipe/app_settings.json
/volume1/screenpipe/.gitignore
/volume1/screenpipe/archive.db
/volume1/screenpipe/db.sqlite
/volume1/screenpipe/db.sqlite-wal
/volume1/screenpipe/db.sqlite-shm
/volume1/screenpipe/#recycle/desktop.ini
/volume1/screenpipe/#recycle/db.sqlite
/volume1/screenpipe/#recycle/Copy #1 of db.sqlite
/volume1/screenpipe/#recycle/.DS_Store
/volume1/screenpipe/#recycle/Copy #2 of db.sqlite
/volume1/screenpipe/#recycle/archive.db
/volume1/screenpipe/#recycle/Copy #1 of archive.db
/volume1/screenpipe/#recycle/Copy #2 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of archive.db
/volume1/screenpipe/#recycle/Copy #4 of archive.db
/volume1/screenpipe/#recycle/Copy #5 of archive.db
/volume1/screenpipe/#recycle/Copy #3 of db.sqlite
/volume1/screenpipe/#recycle/Copy #4 of db.sqlite
/volume1/screenpipe/#recycle/Copy #6 of archive.db
/volume1/screenpipe/#recycle/test.txt
/volume1/screenpipe/#recycle/Copy #7 of archive.db
/volume1/screenpipe/#recycle/Copy #8 of archive.db
/volume1/screenpipe/#recycle/archive.db-journal
/volume1/screenpipe/#recycle/Copy #1 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #2 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #3 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #4 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #6 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #7 of archive.db-journal
/volume1/screenpipe/#recycle/Copy #5 of db.sqlite
/volume1/screenpipe/pipes/.DS_Store
/volume1/screenpipe/app/.env
/volume1/screenpipe/app/cli.py
/volume1/screenpipe/app/.DS_Store
/volume1/screenpipe/app/db.py
/volume1/screenpipe/app/docker-compose.yml
/volume1/screenpipe/app/main.py
/volume1/screenpipe/app/requirements.txt
/volume1/screenpipe/app/Dockerfile
/volume1/screenpipe/data/.DS_Store
/volume1/screenpipe/.git/description
/volume1/screenpipe/.git/HEAD
/volume1/screenpipe/.git/index
/volume1/screenpipe/.git/COMMIT_EDITMSG
/volume1/screenpipe/#recycle/data/.DS_Store
---
app
app_settings.json
archive.db
data
db.sqlite
db.sqlite-shm
db.sqlite-wal
pipes
#recycle
screenpipe.db
screenpipe_sync.sh
Read
main.py
main.py
Bash
Check pipes directory and docker-compose
IN
ls /volume1/screenpipe/pipes/ && ls /volume1/screenpipe/app/ && cat /volume1/screenpipe/app/docker-compose.yml
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
cli.py
consumers
db.py
docker-compose.yml
Dockerfile
main.py
__pycache__
requirements.txt
static
services:
screenpipe-app:
build: .
container_name: screenpipe-app
restart: unless-stopped
ports:
- "8766:8000"
volumes:
- /volume1/screenpipe:/data
env_file:
- .env
environment:
TZ: Europe/Sofia
DB_PATH: /data/db.sqlite
sqlite-web-archive:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-archive
restart: unless-stopped
ports:
- "8767:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/archive.db"]
sqlite-web-live:
image: ghcr.io/coleifer/sqlite-web:latest
container_name: screenpipe-sqlweb-live
restart: unless-stopped
ports:
- "8768:8080"
volumes:
- /volume1/screenpipe:/data
command: ["-H", "[IP_ADDRESS]", "-r", "/data/db.sqlite"]
Read
requirements.txt
requirements.txt
Bash
Inspect a pipe directory
IN
ls /volume1/screenpipe/pipes/day-recap/ && cat /volume1/screenpipe/pipes/day-recap/pipe.json 2>/dev/null || ls /volume1/screenpipe/pipes/day-recap/
Copy code to clipboard
OUT
pipe.md
pipe.md
Read
cli.py
cli.py
Bash
List all pipes and their contents
IN
ls /volume1/screenpipe/pipes/ && for d in /volume1/screenpipe/pipes/*/; do echo "=== $d ==="; ls "$d"; done
Copy code to clipboard
OUT
ai-habits
day-recap
meeting-summary
standup-update
time-breakdown
video-export
=== /volume1/screenpipe/pipes/ai-habits/ ===
pipe.md
=== /volume1/screenpipe/pipes/day-recap/ ===
pipe.md
=== /volume1/screenpipe/pipes/meeting-summary/ ===
pipe.md
=== /volume1/screenpipe/pipes/standup-update/ ===
pipe.md
=== /volume1/screenpipe/pipes/time-breakdown/ ===
pipe.md
=== /volume1/screenpipe/pipes/video-export/ ===
pipe.md
Bash
Read pipe markdown files
IN
cat /volume1/screenpipe/pipes/day-recap/pipe.md && echo "===" && cat /volume1/screenpipe/pipes/meeting-summary/pipe.md 2>/dev/null | head -50
Copy code to clipboard
OUT
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
Thinking
Thinking
Bash
Read summary consumer
IN
ls /volume1/screenpipe/app/consumers/ && cat /volume1/screenpipe/app/consumers/summary.py
Copy code to clipboard
OUT
activity.py
audio.py
elements.py
__init__.py
interactions.py
ocr.py
__pycache__
report.py
summary.py
"""AI summary generator using Claude."""
import os
from datetime import date
from typing import Any
from consumers.activity import app_time, summary_stats, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
SUMMARY_TYPES = {
"day_recap": {
"title": "Day Recap",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a concise day recap using EXACTLY this format:
## Summary
One sentence: what I mainly worked on today.
## Accomplishments
- Top 3 things I finished, with approximate times
- Name specific apps, files, or projects
## Key Moments
- Important things seen or heard, with timestamps
## Unfinished Work
- What to continue tomorrow
## Patterns
- Apps used most, topics that came up repeatedly
Only report what you can verify from the data. End with:
**Next step:** [most important thing to continue]""",
},
"standup": {
"title": "Standup Update",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
## Audio/speech:
{audio}
Write a brief standup update:
**Yesterday:** What I worked on (2-3 bullets)
**Today:** What I plan to continue (1-2 bullets)
**Blockers:** Any issues observed (or "None")
Keep it under 150 words. Only report what's verifiable from the data.""",
},
"focus_time": {
"title": "Focus & Distraction Analysis",
"prompt": """\
You are analyzing a person's computer activity for the day.
## App usage (minutes per app):
{app_time}
## Browser URLs visited:
{browser_urls}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_events}
## Screen content (sampled every 5 minutes):
{ocr_text}
Analyze focus vs distraction patterns:
## Deep Work Periods
- When and how long were sustained focus sessions?
- What was being worked on?
## Distractions
- Which apps/sites were used recreationally?
- How much time was lost to context switching?
## Recommendations
- 2-3 concrete suggestions to improve focus tomorrow
Be specific with times and app names.""",
},
}
def _build_context(for_date: date) -> dict[str, str]:
apps = app_time(for_date)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(for_date, limit=20)
url_lines = "\n".join(
f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls
)
ui_rows = ui_event_summary(for_date)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(for_date, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(for_date)
return {
"app_time": app_lines or "(no app data)",
"browser_urls": url_lines or "(no browser data)",
"ui_events": ui_lines or "(no UI event data)",
"ocr_text": ocr_lines or "(no screen data)",
"audio": audio[:3000] if len(audio) > 3000 else audio,
}
def generate(
summary_type: str = "day_recap",
for_date: date | None = None,
api_key: str | None = None,
) -> dict[str, Any]:
"""Generate an AI summary. Returns dict with 'title', 'content', 'date'."""
import anthropic
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {
"title": "Error",
"content": "ANTHROPIC_API_KEY not set. Add it to your .env file.",
"date": str(d),
}
spec = SUMMARY_TYPES.get(summary_type)
if not spec:
return {"title": "Error", "content": f"Unknown summary type: {summary_type}", "date": str(d)}
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {
"title": spec["title"],
"content": f"No screen data found for {d}.",
"date": str(d),
}
ctx = _build_context(d)
prompt = spec["prompt"].format(**ctx)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
messages=[{"role": "user", "content": prompt}],
)
return {
"title": spec["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Thinking
Thinking
Bash
Read all pipe.md files
IN
for f in /volume1/screenpipe/pipes/*/pipe.md; do echo "=== $f ==="; cat "$f"; echo; done
Copy code to clipboard
OUT
=== /volume1/screenpipe/pipes/ai-habits/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: AI Habits
description: "How you use AI tools — patterns and insights"
icon: "🤖"
featured: true
---
Search my recordings from the last 24 hours for AI tool usage. Use app_name filter for each tool separately: ChatGPT, Claude, Copilot, Cursor, Gemini, Perplexity. Use limit=5 per search, max 6 searches total.
Read screenpipe skill first.
Use this exact format:
## AI Tools Used
- List each tool with approximate time spent (e.g. "Claude: ~45min")
## What I Used Them For
- For each tool: coding, writing, research, or brainstorming
## Usage Patterns
- Do I switch between tools? Use them in bursts or steadily?
## Effectiveness
- Which tool appeared alongside completed work vs. abandoned attempts
If no AI usage is found, say so clearly. End with: "**Tip:** [one suggestion to use AI tools more effectively]"
=== /volume1/screenpipe/pipes/day-recap/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Day Recap
description: "Today's accomplishments, key moments, and unfinished work"
icon: "📋"
featured: true
---
Analyze my screen and audio recordings from today (last 16 hours only).
Read screenpipe skill first.
Use this exact format:
## Summary
One sentence: what I mainly did today.
## Accomplishments
- Top 3 things I finished, with timestamps (e.g. "2:30 PM")
- Name specific apps, files, or projects
## Key Moments
- Important things I saw, said, or heard — with timestamps
## Unfinished Work
- What I should continue tomorrow — name the app/file/task
## Patterns
- Apps I used most, topics that came up repeatedly
Only report what you can verify from the data. End with: "**Next step:** [most important thing to continue]"
=== /volume1/screenpipe/pipes/meeting-summary/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Meeting Summary
description: "Summarize meeting transcript with key takeaways and action items"
icon: "🤝"
featured: false
---
Summarize the meeting transcript provided in the context. Include key takeaways and action items. If the meeting is marked as ongoing, note that and summarize what's available so far.
Read screenpipe skill first.
Use this exact format:
## Meeting Summary
One sentence: what this meeting was about.
## Key Takeaways
- Top 3-5 important points discussed
- Include who said what when relevant
## Action Items
- [ ] Task — assigned to whom, deadline if mentioned
- [ ] Task — assigned to whom
## Decisions Made
- List any decisions or agreements reached
## Open Questions
- Anything unresolved or needing follow-up
Keep it concise and actionable.
=== /volume1/screenpipe/pipes/standup-update/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Standup Update
description: "What you did, what's next, and any blockers"
icon: "🏢"
featured: true
---
Based on my recordings from the last 24 hours, generate a standup update. Use limit=10 per search, max 3 searches total.
Read screenpipe skill first.
Use this exact format:
## Yesterday
- What I worked on (name specific projects, files, tools, PRs)
## Today
- What I will work on next (based on unfinished tasks and recent activity)
## Blockers
- Issues I hit — errors, slow builds, waiting on someone
- If no blockers, write "None"
Keep it under 150 words. Copy-paste ready for a team standup.
=== /volume1/screenpipe/pipes/time-breakdown/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Time Breakdown
description: "Where your time went — by app, project, and category"
icon: "⏱"
featured: false
---
Analyze my app usage from today (last 12 hours). Use limit=10 per search, max 4 searches. Prefer /raw_sql with COUNT/GROUP BY queries.
Read screenpipe skill first.
Use this exact format with durations and percentages:
## By Application
- List each app with duration and percentage, sorted by time (e.g. "VS Code: 2h 15min (28%)")
## By Category
- Group into: coding, meetings, browsing, writing, communication, other
- Show hours and percentage per category
## By Project
- Group related activities by project/topic. Name specific repos or tasks.
## Productivity Score
- Calculate: focused_work_hours / total_hours as a percentage
- Focused = coding + writing. Unfocused = browsing + switching.
End with: "**Suggestion:** [one specific change to improve tomorrow's productivity]"
=== /volume1/screenpipe/pipes/video-export/pipe.md ===
---
schedule: manual
enabled: true
template: true
title: Export Video Clip
description: "Create a video of your recent screen activity"
icon: "🎬"
featured: false
---
Export a video of my screen activity from the last 5 minutes.
Read screenpipe skill first.
Use the POST /frames/export endpoint with the time range and fps=1.0. Then show me the exported video file path as an inline code block so I can watch it.
If the export is large, suggest a lower fps or shorter time range.
Thinking
Thinking
Now I have a clear picture. The pipes are markdown prompt templates in
/volume1/screenpipe/pipes/
, but there's no code to actually load and run them —
consumers/summary.py
has its own hardcoded prompts instead.
Here's what I'll implement:
consumers/pipes.py
— dynamically loads pipe.md files and runs them via Claude (data injected as system message, pipe body as user prompt)
main.py
— two new endpoints:
GET /api/pipes
and
POST /api/pipes/{pipe_id}/run
cli.py
— a
pipe
command to run pipes from the terminal
Let me build it:
Bash
Check browser_urls signature
IN
cat /volume1/screenpipe/app/consumers/activity.py | grep "def browser_urls" -A 5
Copy code to clipboard
OUT
def browser_urls(
for_date: date | None = None,
app_name: str | None = None,
ts_from: str | None = None,
ts_to: str | None = None,
limit: int = 30,
Write
pipes.py
pipes.py
143 lines
"""Dynamic pipe loader and runner."""
import os
import re
from datetime import date
from pathlib import Path
from typing import Any
def _parse_frontmatter(text: str) -> tuple[dict, str]:
if not text.startswith("---"):
return {}, text
try:
end = text.index("---", 3)
except ValueError:
return {}, text
fm_text = text[3:end].strip()
body = text[end + 3:].strip()
meta: dict = {}
for line in fm_text.splitlines():
if ":" in line:
k, _, v = line.partition(":")
meta[k.strip()] = v.strip().strip('"')
return meta, body
def load_pipes(pipes_dir: Path) -> list[dict]:
if not pipes_dir.exists():
return []
pipes = []
for pipe_dir in sorted(pipes_dir.iterdir()):
if not pipe_dir.is_dir() or pipe_dir.name.startswith("."):
continue
pipe_md = pipe_dir / "pipe.md"
if not pipe_md.exists():
continue
text = pipe_md.read_text(encoding="utf-8")
meta, body = _parse_frontmatter(text)
if meta.get("enabled", "true").lower() == "false":
continue
body = re.sub(r"Read screenpipe skill first\.?\n?", "", body).strip()
pipes.append({
"id": pipe_dir.name,
"title": meta.get("title", pipe_dir.name),
"description": meta.get("description", ""),
"icon": meta.get("icon", ""),
"schedule": meta.get("schedule", "manual"),
"featured": meta.get("featured", "false").lower() == "true",
"prompt": body,
})
return pipes
def _build_system(d: date, stats: dict) -> str:
from consumers.activity import app_time, browser_urls, ui_event_summary
from consumers.ocr import deduplicated_text
from consumers.audio import formatted_transcript
apps = app_time(d)
app_lines = "\n".join(f"- {a['app_name']}: {a['minutes']} min" for a in apps[:20])
urls = browser_urls(d, limit=20)
url_lines = "\n".join(f"- {u['browser_url'][:100]} ({u['minutes']} min)" for u in urls)
ui_rows = ui_event_summary(d)
ui_lines = "\n".join(
f"- {r['app_name']}: {r['clicks']} clicks, {r['keystrokes']} keys, {r['switches']} switches"
for r in ui_rows[:15]
)
ocr_rows = deduplicated_text(d, chunk_minutes=5)
ocr_lines = "\n".join(
f"[{r['bucket']}] {r['app_name']} / {r['window_name'] or ''}: {r['text'][:200]}"
for r in ocr_rows[:100]
)
audio = formatted_transcript(d)
if len(audio) > 3000:
audio = audio[:3000]
return f"""You are analyzing screenpipe data — recordings of a person's computer activity.
## App usage (minutes per app):
{app_lines or '(no app data)'}
## Browser URLs visited:
{url_lines or '(no browser data)'}
## UI activity per app (clicks / keystrokes / app-switches):
{ui_lines or '(no UI event data)'}
## Screen content (sampled every 5 minutes):
{ocr_lines or '(no screen data)'}
## Audio / speech:
{audio or '(no audio data)'}
Date: {d}
Total frames recorded: {stats['total_frames']}"""
def run_pipe(
pipe_id: str,
for_date: date | None = None,
api_key: str | None = None,
pipes_dir: Path | None = None,
) -> dict[str, Any]:
import anthropic
from consumers.activity import summary_stats
d = for_date or date.today()
key = api_key or os.environ.get("ANTHROPIC_API_KEY", "")
if not key:
return {"title": "Error", "content": "ANTHROPIC_API_KEY not set.", "date": str(d)}
_pipes_dir = pipes_dir or Path(os.environ.get("PIPES_DIR", "/data/pipes"))
all_pipes = {p["id"]: p for p in load_pipes(_pipes_dir)}
if pipe_id not in all_pipes:
return {"title": "Error", "content": f"Pipe not found: {pipe_id}", "date": str(d)}
pipe = all_pipes[pipe_id]
stats = summary_stats(d)
if stats["total_frames"] == 0:
return {"title": pipe["title"], "content": f"No screen data found for {d}.", "date": str(d)}
system = _build_system(d, stats)
client = anthropic.Anthropic(api_key=key)
message = client.messages.create(
model="claude-haiku-4-5-20251001",
max_tokens=1024,
system=system,
messages=[{"role": "user", "content": pipe["prompt"]}],
)
return {
"title": pipe["title"],
"content": message.content[0].text,
"date": str(d),
"stats": stats,
}
Now add the API endpoints to
main.py
:
Edit
main.py
main.py
Added 1 line
Edit
main.py
main.py
Added 1 line
Now add the pipe endpoints at the end of main.py, before the closing summary section:
Edit
main.py
main.py
Added 14 lines
Now add the CLI command to
cli.py
cli.py
:
Edit
cli.py
cli.py
Added 40 lines
Done. Here's how to run pipes:
How to run pipes
How to run pipes
Via Docker (inside the running container):
Copy code to clipboard
docker exec -it screenpipe-app python cli.py pipe --list
docker exec -it screenpipe-app python cli.py pipe day-recap
docker exec -it screenpipe-app python cli.py pipe standup-update --date yesterday
Via the API
(once the container is running on port 8766):
Copy code to clipboard
# List pipes
curl http://localhost:8766/api/pipes
# Run a pipe
curl -X POST "http://localhost:8766/api/pipes/day-recap/run"
curl -X POST "http://localhost:8766/api/pipes/standup-update/run?d=2026-04-25"
Rebuild and restart
to pick up the changes:
Copy code to clipboard
cd /volume1/screenpipe/app
docker compose up --build -d
How it works:
Each
pipe.md
file is loaded dynamically — the frontmatter provides metadata and the body becomes the user prompt. Screenpipe data (app usage, browser URLs, OCR, audio) is injected as a Claude system message so the pipe prompt can reference it naturally. The "Read screenpipe skill first" line is stripped automatically.
The pipes directory inside Docker is
/data/pipes
(since
/volume1/screenpipe
→
/data
). You can override it with a
PIPES_DIR
env var if needed.
Message actions
pipe.md
pipe.md
where to place api key and can I use different keys
Thinking
Thinking
Bash
Read .env file
IN
cat /volume1/screenpipe/app/.env
Copy code to clipboard
OUT
[ENV_SECRET]
Where to place the key:
app/.env
app/.env
— replace
your_key_here
with your actual key:
Copy code to clipboard
[ENV_SECRET]
That file is loaded by
load_dotenv()
at startup and is excluded from git via
.gitignore
.
Can you use different keys per pipe?
Not currently — all pipes share the one key from
.env
. But you can add per-pipe key support by adding an
api_key
field to the pipe's frontmatter:
Copy code to clipboard
---
title: Meeting Summary
api_key_env: ANTHROPIC_KEY_MEETINGS
---
Then in
consumers/pipes.py
consumers/pipes.py
, read it in...
|
Code
|
Implement and run pipe l… — screenpipe [SSH: nas]
|
NULL
|