Vedant Misra
CogniLink
ML · Active prototype / MVP-like system

Workspace Intelligence Platform - Hybrid RAG, Knowledge Graphs, and Operational Intelligence

Unified fragmented engineering context into a queryable graph-and-vector memory layer with cited answers, graph exploration, expert discovery, alerts, and intelligence workflows.

Role: Full-stack AI platform engineering across ingestion, retrieval, backend APIs, and product surfaces
Timeline: 2025-2026
Context: Workspace intelligence and engineering memory
Architecture: FastAPI + Neo4j + Pinecone + Vue 3
Focus: Grounded retrieval, citations, and graph UX

Graph Memory: Neo4j (entities, relationships, traversal)
Retrieval: Hybrid (structured + graph + vector)
Query UX: SSE (streaming cited answers)

FastAPI, Vue 3, RAG, Neo4j, Pinecone, LangGraph, LiteLLM, OAuth, Redis, Postgres, Knowledge Graph

Overview

CogniLink is a full-stack workspace intelligence platform for making fragmented engineering knowledge searchable, navigable, and explainable. It combines connector-driven ingestion, Neo4j graph storage, Pinecone vector retrieval, a FastAPI backend, LiteLLM-based model abstraction, and a Vue 3 interface to turn scattered workplace artifacts into a shared memory layer.

The central idea is that useful workplace AI needs more than document search. A question like "who has context on the authentication refactor?" may require GitHub reviews, Slack discussions, linked documents, project ownership, meeting notes, and incident history. CogniLink models those artifacts as entities and relationships, then uses hybrid retrieval to answer from both semantic content and graph structure.

The project is best framed as a sophisticated active prototype. It demonstrates deep end-to-end architecture and many implemented product surfaces, while leaving production questions such as connector completeness, tenancy, deployment hardening, and retrieval-quality evaluation as explicit limitations.

Problem and Motivation

Engineering context rarely lives in one system. Code history is in GitHub, informal decisions are in Slack, project docs are in Drive or Notion, tickets live in Jira, meetings happen in Calendar, and follow-ups disappear into email. Each tool has its own search box, but most real questions cross tool boundaries:

  • Which pull requests, documents, and Slack discussions led to a decision?
  • Who has worked near this service recently?
  • What incidents are related to this bug or project?
  • What changed since the last meeting?
  • Which projects look blocked, stale, or operationally risky?

Plain vector search helps find similar text, but it does not naturally understand ownership, authorship, dependency, review, or temporal relationships. Pure graph lookup preserves relationships, but it misses the rich language inside documents, tickets, and messages. CogniLink was designed around that tension: combine graph-native structure with semantic retrieval so the system can answer questions that are both content-heavy and relationship-heavy.

Product Scope

CogniLink spans both a user-facing product and the platform machinery behind it.

The user-facing scope includes:

  • natural-language querying with streaming responses and citations
  • graph exploration with filtering, layouts, shortest-path tracing, and entity drilldowns
  • connector management and sync-status visibility
  • expert discovery for topics, services, or projects
  • natural-language alerts translated into graph queries
  • daily briefings, incident context, and meeting intelligence
  • dashboard-style operational triage around blocked PRs, reopened bugs, anomalies, and status changes

The platform scope includes:

  • OAuth-based connectors for systems such as GitHub, Slack, Jira, Google Drive, Gmail, Google Calendar, and Notion
  • normalized graph entities and relationships written into Neo4j
  • chunking, embedding, and Pinecone indexing for text-bearing content
  • a stateful RAG orchestration pipeline for query reformulation, classification, retrieval, ranking, answer generation, citation verification, and enrichment
  • operational state for sync runs, connector status, action audit logs, artifacts, health, and model budgets

System Architecture

The architecture separates ingestion, storage, retrieval, orchestration, and presentation.

System Architecture Diagram

  • External tools (GitHub, Slack, Jira, Google Drive, Gmail, Calendar, Notion)
    • Synced via Connector layer
      • OAuth and token refresh
      • Source-specific sync logic
      • Normalized entities, relationships, and chunks
    • Processed by Ingestion pipeline
      • Graph upserts into Neo4j
      • Text and code chunking
      • Embedding generation
      • Pinecone namespace upserts
    • Served by Retrieval and intelligence backend
      • FastAPI routers
      • Structured entity lookup
      • Vector retrieval & Graph traversal
      • Evidence ranking & LiteLLM answer synthesis
      • Citation verification
    • Presented in Vue 3 frontend
      • Streaming query UI & Graph explorer
      • Integrations, experts, alerts, briefings, incidents, meetings

Neo4j stores typed entities and relationships such as people, teams, projects, bugs, decisions, documents, meetings, services, incidents, code reviews, discussions, sprints, communications, alerts, graph snapshots, and data sources. Pinecone stores citation-ready chunks with metadata such as source type, source ID, title, URL, author, freshness, and sync timestamp.
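To make the chunk metadata concrete, here is a minimal sketch of what one citation-ready record could look like before it is written to the vector index. The class name, field names, and the numeric `freshness` score are assumptions for illustration; the case study lists only the kinds of metadata stored, not the actual schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChunkRecord:
    """One citation-ready chunk (hypothetical shape, not the project's schema)."""

    chunk_id: str
    text: str
    source_type: str   # e.g. "github_pr" or "slack_message" (assumed labels)
    source_id: str
    title: str
    url: str
    author: str
    freshness: float   # assumed 0..1 score; the real representation is not documented
    synced_at: datetime

    def metadata(self) -> dict:
        """Return JSON-friendly metadata; the text itself is kept out of it."""
        data = asdict(self)
        data.pop("text")
        data["synced_at"] = self.synced_at.isoformat()
        return data


record = ChunkRecord(
    chunk_id="pr-42#0",
    text="Refactor token validation into a shared middleware.",
    source_type="github_pr",
    source_id="pr-42",
    title="Auth refactor",
    url="https://example.com/pr/42",
    author="alice",
    freshness=0.9,
    synced_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
)
```

Keeping the metadata flat and string-friendly is what lets a retrieved chunk be turned directly into a citation with a title, link, and author.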

FastAPI exposes the backend product surface through routers for queries, entities, graph exploration, experts, connectors, alerts, intelligence workflows, and dashboard endpoints. The Vue frontend consumes JSON APIs for most views and reads an SSE stream through fetch for the query experience.

Key Engineering Decisions

Hybrid retrieval instead of one search mode

CogniLink intentionally combines structured lookup, graph traversal, and vector retrieval. Structured lookup helps identify known entities. Vector retrieval finds relevant text snippets. Graph traversal expands from those seeds to related people, decisions, projects, services, and artifacts.

That choice adds complexity, but it matches the domain. Workplace questions often depend on both "what was said?" and "how is it connected?" A purely semantic system can retrieve similar content while missing the author, reviewer, owner, or dependency chain that makes the answer useful.

Citation verification as a trust boundary

The answer-generation path builds prompts from numbered evidence references, then verifies citations after generation. The verifier strips invalid citation markers and emits structured citation objects so the UI can show source-backed answers instead of opaque model output.

This does not prove every claim is correct, but it is an important guardrail. It makes unsupported references easier to detect and gives the product a path toward stronger evidence auditing.
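A minimal sketch of this kind of post-generation check is shown below. It assumes citations appear in the answer as bracketed numbers such as `[1]` and that evidence is keyed by the same numbers; the function name and the citation object fields are hypothetical, not taken from the codebase.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def verify_citations(answer: str, evidence: dict) -> tuple:
    """Strip markers that point at no evidence; return (clean_answer, citations)."""
    cited = []
    seen = set()

    def keep_or_strip(match):
        ref = int(match.group(1))
        if ref not in evidence:
            return ""  # marker references evidence that was never provided
        if ref not in seen:
            seen.add(ref)
            cited.append({"ref": ref, **evidence[ref]})
        return match.group(0)

    cleaned = CITATION.sub(keep_or_strip, answer)
    cleaned = re.sub(r"\s+([.,;:])", r"\1", cleaned)  # close gaps left by removed markers
    cleaned = re.sub(r" {2,}", " ", cleaned).strip()
    return cleaned, cited


evidence = {1: {"title": "PR 12", "url": "https://example.com/pr/12"}}
clean, citations = verify_citations(
    "Auth was refactored in PR 12 [1]. It broke login [7].", evidence
)
```

Note what this does and does not check: marker `[7]` is removed because no such evidence exists, but the sentence it was attached to stays. Verifying that the sentence is actually supported would need a separate step.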

Local-first development fallbacks

The system includes pragmatic fallbacks for incomplete local environments. Embeddings can use deterministic local hash vectors to avoid API cost during iteration. Operational state prefers Postgres but can fall back to SQLite. Request budgeting prefers Redis but can fall back to in-memory state.

Those decisions make the prototype easier to run and test without every external service configured. The tradeoff is that local behavior can diverge from production-quality retrieval, persistence, and rate-limit behavior.
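One common way to get deterministic local vectors is feature hashing, sketched below. This is an assumption about the general technique, not a description of CogniLink's actual fallback; the function name, dimension, and tokenization are illustrative.

```python
import hashlib
import math


def hash_embedding(text: str, dim: int = 64) -> list:
    """Map text to a fixed-size unit vector using only token hashes (no model)."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0       # signed to reduce collision bias
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


first = hash_embedding("authentication")
second = hash_embedding("authentication")
```

The same input always yields the same vector, which makes tests repeatable and free. The vectors only reflect token overlap, though, so retrieval quality measured with them says little about behavior with real semantic embeddings.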

In-app scheduled intelligence jobs

APScheduler runs jobs for alert checks, weekly graph diffs, daily briefings, and token refresh. Keeping this inside the FastAPI process is simple and appropriate for a prototype, but a production version would likely move scheduled work into a dedicated worker or queue.

Explicitly evolving orchestration

The codebase contains both a newer active RAG orchestration path and an older workflow/approval path. That history is useful: it shows iteration from a simpler action workflow toward a richer retrieval-and-intelligence engine. It also creates a documentation responsibility, because future maintainers need to know which path is canonical.

Implementation Details

The active query path is organized as a stateful orchestration graph. A user question is reformulated, classified, enriched with extracted entities, sent through structured lookup, vector retrieval, and graph retrieval, then ranked before answer generation. The final response is citation-verified and enriched with related entities, freshness signals, recommended actions, risk level, and dashboard signals.

At a high level, the retrieval flow looks like this:

Retrieval Flow

  1. User question
  2. Reformulate and classify
  3. Extract entities
  4. Structured Neo4j lookup
  5. Pinecone vector search
  6. Entity-centered graph traversal
  7. Rank and deduplicate evidence
  8. Generate answer with numbered references
  9. Verify citations
  10. Enrich response for the UI
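Steps 7 and 8 can be sketched as follows. The sketch assumes each retriever returns candidates with a score already normalized to a comparable range, which is a simplification; the `Evidence` type and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    source_id: str
    text: str
    score: float      # assumed comparable across retrievers
    retriever: str    # "structured", "vector", or "graph"


def rank_and_dedupe(candidates, limit=5):
    """Keep the best-scoring hit per source, then return the top `limit`."""
    best = {}
    for item in candidates:
        current = best.get(item.source_id)
        if current is None or item.score > current.score:
            best[item.source_id] = item
    ranked = sorted(best.values(), key=lambda e: e.score, reverse=True)
    return ranked[:limit]


def numbered_context(evidence):
    """Render evidence as numbered references for the generation prompt."""
    return "\n".join(f"[{i}] {e.text}" for i, e in enumerate(evidence, start=1))


candidates = [
    Evidence("pr-42", "PR 42 moved token checks into middleware.", 0.82, "vector"),
    Evidence("pr-42", "PR 42 reviewed by Bob.", 0.60, "graph"),
    Evidence("doc-7", "Design doc: auth refactor goals.", 0.91, "structured"),
]
ranked = rank_and_dedupe(candidates)
prompt_context = numbered_context(ranked)
```

Because the numbers in the prompt are assigned after ranking, the same numbering can later be used to check which citations in the answer map to real evidence.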

Connectors implement a shared interface for OAuth, sync, disconnect, and token refresh. Sync results can include normalized graph entities, relationships, and prebuilt chunks. When connectors provide raw text-bearing entities, a post-sync ingestion path chunks the content, generates embeddings, and indexes records into Pinecone.
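A shared connector contract along these lines might look like the sketch below. The class and method names are assumptions chosen to mirror the four responsibilities named above, and the fake connector exists only to exercise the contract. A real implementation would likely be asynchronous and talk to an external API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class SyncResult:
    """Normalized output of one sync run (field names are assumptions)."""

    entities: list = field(default_factory=list)
    relationships: list = field(default_factory=list)
    chunks: list = field(default_factory=list)


class Connector(ABC):
    """Hypothetical contract each source connector would implement."""

    name = "base"

    @abstractmethod
    def authorize_url(self, state: str) -> str:
        """Return the OAuth URL the user is redirected to."""

    @abstractmethod
    def sync(self) -> SyncResult:
        """Pull source data and return it in normalized form."""

    @abstractmethod
    def refresh_token(self) -> None:
        """Exchange the refresh token for a new access token."""

    @abstractmethod
    def disconnect(self) -> None:
        """Drop credentials and stop syncing."""


class FakeNotesConnector(Connector):
    """In-memory stand-in used only to demonstrate the interface."""

    name = "fake_notes"

    def __init__(self) -> None:
        self.connected = True

    def authorize_url(self, state: str) -> str:
        return f"https://example.com/oauth/authorize?state={state}"

    def sync(self) -> SyncResult:
        return SyncResult(
            entities=[
                {"id": "doc-1", "type": "Document", "title": "Auth notes"},
                {"id": "alice", "type": "Person"},
            ],
            relationships=[{"from": "alice", "type": "AUTHORED", "to": "doc-1"}],
            chunks=[{"source_id": "doc-1", "text": "Auth refactor plan"}],
        )

    def refresh_token(self) -> None:
        pass  # nothing to refresh for the fake source

    def disconnect(self) -> None:
        self.connected = False
```

Returning entities, relationships, and chunks together lets the ingestion path write the graph and the vector index from one normalized result, regardless of which tool produced it.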

The frontend is a Vue 3 SPA with Pinia state and route-level views for the dashboard, query interface, graph explorer, entity details, integrations, alerts, expert discovery, and briefings. The query UI keeps optimistic local messages, consumes SSE chunks incrementally, supports aborting in-flight generation, and refreshes dashboard data after completion.

Core Features

Connector-driven organizational memory

CogniLink registers connectors for multiple workplace systems and normalizes their outputs into a common graph representation. GitHub is one of the deepest connector paths, with support for repositories, issues, pull requests, README/wiki documents, source files, and code-aware chunking.

Streaming cited answers

The /api/query path streams thread creation, progress updates, and a final answer payload over SSE. When Supabase is configured, conversation history and messages can be persisted; when it is not configured, the chat layer degrades gracefully.
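The wire format for a stream like this can be sketched with the standard library alone. The event names (`thread`, `progress`, `final`) and payload fields are assumptions; only the `event:` / `data:` framing with a blank-line terminator comes from the SSE format itself.

```python
import json
from typing import Iterator


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Events frame; json.dumps keeps data on one line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def query_stream(thread_id: str, answer: dict) -> Iterator[str]:
    """Yield thread creation, progress updates, then the final payload."""
    yield sse_event("thread", {"thread_id": thread_id})
    for step in ("retrieving", "ranking", "generating"):  # illustrative step names
        yield sse_event("progress", {"step": step})
    yield sse_event("final", answer)


frames = list(query_stream("t1", {"answer": "ok", "citations": []}))
```

In a FastAPI app, a generator like this could be returned through `StreamingResponse` with `media_type="text/event-stream"`. Sending the complete answer in one final frame keeps the client simple, since it can treat progress events as purely cosmetic.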

Graph explorer and entity drilldowns

The frontend includes a graph canvas with layouts, filtering, path tracing, context panels, and entity-type-specific rendering. Users can search entities, inspect neighbors grouped by relationship type, and move between answer citations and graph context.

Expert discovery

The expert-finding workflow scores people through graph proximity and relationship evidence, then augments results with LLM-generated explanations and confidence dimensions. This makes "who should I ask?" a first-class retrieval problem rather than a static directory lookup.
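As a rough illustration of proximity-based scoring, the sketch below walks outward from a topic node and scores people by how few hops away they are. The decay formula, hop limit, and adjacency-dict graph are assumptions; the real workflow also weighs relationship evidence and adds LLM-generated explanations, which are not modeled here.

```python
from collections import deque


def score_experts(graph, people, topic, max_hops=3, decay=0.5):
    """Breadth-first search from `topic`; closer people score higher."""
    distances = {topic: 0}
    queue = deque([topic])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    scores = {
        person: decay ** (distances[person] - 1)
        for person in people
        if person in distances and person != topic
    }
    # Highest score first; ties broken by name for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


graph = {
    "auth-service": ["alice", "pr-42"],  # alice owns the service
    "pr-42": ["bob"],                    # bob reviewed a PR touching it
}
experts = score_experts(graph, {"alice", "bob", "dave"}, "auth-service")
```

Here `alice` is one hop from the service and `bob` is two, so `alice` ranks first, and `dave` is omitted because no path connects him to the topic.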

Natural-language alerts

Users can describe alerts in plain English. The backend translates those alert definitions into Cypher, stores alert nodes, and evaluates them on a scheduled cadence. This is a useful bridge between natural-language UX and graph-native monitoring.
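The evaluation half of that loop can be sketched without a database by injecting the query runner. Everything named here is hypothetical: the `Alert` fields, the `PullRequest` label and its properties in the Cypher string, and the rule that an alert fires when its query returns any rows.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Alert:
    description: str  # the user's plain-English wording
    cypher: str       # the translated graph query


def evaluate_alerts(alerts, run_query: Callable) -> list:
    """Run each alert's query; an alert fires if the query returns rows."""
    fired = []
    for alert in alerts:
        rows = run_query(alert.cypher)
        if rows:
            fired.append((alert.description, len(rows)))
    return fired


stale_prs = Alert(
    description="Tell me when a PR has been open for more than a week",
    cypher=(
        "MATCH (pr:PullRequest {status: 'open'}) "
        "WHERE pr.updated_at < datetime() - duration('P7D') "
        "RETURN pr.id AS id"
    ),
)


def fake_run_query(cypher: str) -> list:
    """Stand-in for a Neo4j session; returns canned rows for the demo."""
    return [{"id": "pr-42"}, {"id": "pr-57"}] if "PullRequest" in cypher else []


fired = evaluate_alerts([stale_prs, Alert("Nothing matches", "RETURN 1 LIMIT 0")],
                        fake_run_query)
```

Passing the runner in as a callable keeps the scheduled job testable without a live graph. Since the Cypher is machine-generated, a production version would presumably also need to restrict it to read-only queries.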

Daily briefing, incident, and meeting intelligence

CogniLink includes services for daily operational briefings, incident context, and meeting preparation. These workflows query the graph for blocked PRs, reopened bugs, follow-ups, status changes, related services, nearest engineers, similar incidents, and recent activity, then synthesize a user-facing narrative.

Operational telemetry

The system exposes health checks, sync-run tracking, connector status, action audit logs, workspace artifacts, model policy, and budget status. These features matter because an AI workspace system is only useful if users can understand what was synced, what is healthy, and where data may be stale.
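A small sketch of health aggregation with a readiness flag is shown below. The three status values and the split between required and optional components are assumptions for illustration; the source states only that health-report aggregation and readiness behavior exist and are tested.

```python
def aggregate_health(components: dict, required: set) -> dict:
    """Combine per-component checks into one report with a readiness flag."""
    failing = sorted(name for name, ok in components.items() if not ok)
    # The service counts as ready unless a required dependency is failing.
    ready = not any(name in required for name in failing)
    if not failing:
        status = "ok"
    elif ready:
        status = "degraded"  # only optional components are down
    else:
        status = "unavailable"
    return {"status": status, "ready": ready, "failing": failing}


report = aggregate_health(
    {"neo4j": True, "pinecone": True, "redis": False},
    required={"neo4j", "pinecone"},
)
```

Treating Redis as optional here mirrors the fallback design described earlier: losing it degrades budgeting to in-memory state rather than taking the service down.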

Validation and Results

The strongest result is the breadth of implemented end-to-end behavior: CogniLink is not just a retrieval script. It includes ingestion, graph modeling, vector indexing, orchestration, cited response generation, frontend exploration, connector management, alerts, briefings, and operational state.

Documented automated validation covers:

  • request-budget enforcement behavior
  • deterministic local embedding generation
  • health-report aggregation and readiness behavior
  • operational-store fallback behavior for sync runs, audit logs, and artifacts
  • SSE query streaming final-payload behavior
  • workspace action recommendation and freshness summarization

The source note reports a local backend test run in which 12 tests passed. That is useful evidence of targeted reliability work, but its scope is narrow: the repository does not show broad live connector integration tests, end-to-end browser tests, load testing, or formal retrieval-quality benchmarks.

Challenges and Tradeoffs

The hardest engineering challenge is that the system joins several difficult problems at once: external API ingestion, graph modeling, vector retrieval, model orchestration, streaming UI state, local infrastructure, and operational observability.

Important tradeoffs include:

  • Relationship fidelity versus schema complexity: Typed relationships make graph retrieval more useful, but they require careful connector normalization and ongoing schema discipline.
  • Prototype ergonomics versus production parity: SQLite, in-memory budgets, and local hash embeddings make development smoother, but they are not substitutes for production-grade persistence, budgeting, and semantic embeddings.
  • Citations versus full factual guarantees: Citation verification reduces invalid references, but it does not fully prove every generated sentence is supported.
  • Broad connector ambition versus connector depth: The architecture supports many integrations, while individual connector completeness varies based on available OAuth configuration and implementation depth.
  • In-process scheduling versus operational robustness: APScheduler is simple for an active prototype, but production scheduling should be isolated from the API process.

What This Demonstrates

CogniLink demonstrates practical AI systems engineering across product, infrastructure, and retrieval architecture:

  • designing a graph-and-vector memory layer for fragmented workplace data
  • building connector interfaces and ingestion pipelines for heterogeneous APIs
  • modeling entities and relationships in Neo4j for cross-system reasoning
  • indexing citation-ready chunks into Pinecone for semantic retrieval
  • orchestrating multi-step RAG flows with classification, ranking, generation, verification, and enrichment
  • exposing streaming backend responses through a polished frontend experience
  • adding health checks, sync tracking, audit logs, budgets, and fallback stores for operational confidence

The project is strongest as evidence of system design maturity. It treats AI answers as one piece of a larger product loop: source ingestion, evidence retrieval, citation display, graph navigation, expert discovery, alerts, and operational visibility all reinforce each other.

Future Work

The next phase would focus on hardening the prototype into a production system:

  • document the canonical orchestration path and retire or quarantine legacy workflow code
  • validate each connector against live external APIs and add integration test coverage
  • add end-to-end browser tests for query, graph exploration, alerts, and connector flows
  • evaluate retrieval quality, citation accuracy, and expert ranking with repeatable benchmark fixtures
  • define a stronger multi-tenant security and authorization model
  • replace local development embedding defaults with production-grade embeddings in deployed environments
  • move scheduled intelligence jobs into a worker or queue
  • add deployment documentation, CI, and production observability

These limitations do not weaken the case study; they clarify the project stage. CogniLink shows a serious architecture for workspace intelligence, with a clear path from active prototype to production-grade knowledge platform.