Relationship to research BRD:
Visual_Asset_Similarity_Search_BRD.md defines the broad problem space and research plan. This document narrows the scope to a concrete, buildable utility and provides enough specification for a developer to implement it.
A web-based tool that lets a user upload an image and find the most visually similar images from a pre-indexed repository. Similarity is decomposed into four independent dimensions — Color, Shape, Composition, and Style — that the user can select in any combination to control the search.
┌──────────────────────────────────────────────────────────────┐
│  1. Upload Image                                             │
│     User drags, pastes, or file-picks a single image.        │
│                                                              │
│  2. Select Similarity Dimensions (optional)                  │
│     Multi-select dropdown: Color | Shape | Composition |     │
│     Style. Default = all four selected.                      │
│                                                              │
│  3. Click "Search"                                           │
│     System generates query embedding(s) for the selected     │
│     dimensions, queries the vector index, and computes       │
│     per-dimension + overall scores.                          │
│                                                              │
│  4. Review Results                                           │
│     Grid of matching assets with thumbnail previews.         │
│     When >1 dimension is selected, a sortable table view     │
│     shows individual dimension scores and an overall score.  │
│     User can click any result to see a larger preview        │
│     and metadata.                                            │
└──────────────────────────────────────────────────────────────┘
| Step | User action | System behavior |
|---|---|---|
| 1 — Upload | Drag-drop, paste, or file-pick an image (PNG, JPG, WEBP, SVG, or PDF rendered to raster). | Validate file type and size. Display a thumbnail preview of the uploaded image. |
| 2 — Configure | Optionally open the similarity dropdown and select/deselect dimensions (Color, Shape, Composition, Style). | Update the UI to reflect the active dimensions. Default state: all four selected. |
| 3 — Search | Click the "Search" button. | Generate embedding(s) for the query image across the selected dimensions. Execute ANN query against the index. Compute per-dimension similarity scores and an overall (combined) score. Return ranked results. |
| 4 — Results | Browse the results grid. Sort by any score column. Click a result for detail view. | Display results as a grid of thumbnail previews. Show sortable score columns when multiple dimensions are active. Provide a detail panel with full-size preview and metadata. |
| ID | Requirement | Details |
|---|---|---|
| SI-1 | Image upload | Accept a single image via drag-drop, clipboard paste, or file picker. Supported formats: PNG, JPG/JPEG, WEBP, SVG, PDF (first page). Max file size: 20 MB. |
| SI-2 | Upload preview | Display a thumbnail of the uploaded image before search is executed. |
| SI-3 | Similarity dimension selector | Multi-select dropdown with options: Color, Shape, Composition, Style. Default: all four selected. At least one must be selected to search. |
| SI-4 | Search action | A "Search" button that submits the query. Disabled until an image is uploaded and at least one dimension is selected. |
| SI-5 | Loading state | Show a progress indicator while the search is in progress. |
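
The checks implied by SI-1, SI-3, and SI-4 can be sketched as a single validation function. This is a minimal illustration, not a prescribed implementation: the function name, the constant names, and the reading of "20 MB" as 20 MiB are assumptions made for the example.

```python
from pathlib import PurePath

# Limits taken from SI-1 and SI-3; all names here are illustrative.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".svg", ".pdf"}
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" is read as 20 MiB
DIMENSIONS = ("color", "shape", "composition", "style")


def validate_search_input(filename: str, size_bytes: int, dimensions: list[str]) -> list[str]:
    """Return a list of problems; an empty list means Search may be enabled (SI-4)."""
    problems = []
    # SI-1: only the listed formats are accepted (case-insensitive extension check).
    if PurePath(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {filename}")
    # SI-1: enforce the maximum upload size.
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 20 MB")
    # SI-3: at least one dimension must be selected.
    if not dimensions:
        problems.append("select at least one dimension")
    unknown = [d for d in dimensions if d not in DIMENSIONS]
    if unknown:
        problems.append(f"unknown dimensions: {unknown}")
    return problems
```

An extension check alone does not prove the file decodes as an image; a real implementation would likely also inspect the file contents.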
Each dimension captures a distinct aspect of visual similarity. The implementation must produce an independent score (0–1, where 1 = identical) for each dimension.
| Dimension | Definition | Examples of what it captures |
|---|---|---|
| Color | Similarity of color palette, distribution, and dominant hues. | Two icons that both use the same blue-and-white palette score high even if their shapes differ. |
| Shape | Similarity of geometric forms, contours, and silhouettes. | A circle-based logo and another circle-based logo score high regardless of color. |
| Composition | Similarity of spatial layout — how elements are arranged within the frame. | Two images with a centered subject over a bottom bar score high even if the subjects differ. |
| Style | Similarity of visual treatment — line weight, shading, gradients, flat vs. skeuomorphic, corner radii, texture. | Two flat-design icons with thin strokes score high; a flat icon vs. a 3D-rendered icon scores low. |
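
To make the 0–1 score contract concrete, here is one possible way to score the Color dimension: histogram intersection over coarsely quantized RGB values. This BRD does not prescribe a technique, so the method, the bin count, and the function names are assumptions chosen only because the result naturally falls in [0, 1] and ignores shape.

```python
from collections import Counter

Pixel = tuple[int, int, int]  # (R, G, B), each channel 0-255


def color_histogram(pixels: list[Pixel], bins_per_channel: int = 4) -> dict[Pixel, float]:
    """Quantize RGB pixels into coarse bins and return normalized bin frequencies."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def color_similarity(a: list[Pixel], b: list[Pixel]) -> float:
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones."""
    ha, hb = color_histogram(a), color_histogram(b)
    # Sum the overlap of each bin that appears in both histograms.
    return sum(min(ha[k], hb[k]) for k in ha.keys() & hb.keys())
```

Because only the color distribution is compared, two images with the same blue-and-white palette score high regardless of where the pixels sit, which matches the Color row above.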
| ID | Requirement | Details |
|---|---|---|
| RD-1 | Results grid | Display matching assets as a grid of thumbnail previews, ordered by overall similarity score (descending). |
| RD-2 | Score columns | When more than one similarity dimension is selected, display a sortable table/column for each active dimension's score plus an Overall score column. Clicking a column header sorts results by that score. |
| RD-3 | Overall score | Computed as the mean of the active dimension scores (equal weighting). |
| RD-4 | Result count | Display the total number of results. Return a maximum of 100 results per query. |
| RD-5 | Detail view | Clicking a result opens a detail panel showing: full-size preview, all dimension scores, and available metadata (filename, file path, file type, image dimensions, indexed timestamp). |
| RD-6 | No-results state | If no results meet the minimum similarity threshold, display a clear "No similar assets found" message. |
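
RD-1 through RD-4 and RD-6 together describe a small ranking routine, sketched below. The function name, the candidate dictionary shape, and the `min_score` parameter are hypothetical; in particular, this BRD mentions a minimum similarity threshold but does not give its value, so the default of 0.0 is a placeholder.

```python
def rank_results(candidates, active_dimensions, sort_by="overall", min_score=0.0, limit=100):
    """Rank candidates per RD-1..RD-4; each candidate is {"image_id": ..., "scores": {...}}.

    active_dimensions must contain at least one dimension (SI-3).
    """
    ranked = []
    for candidate in candidates:
        # Keep only the scores for the dimensions the user selected.
        scores = {d: candidate["scores"][d] for d in active_dimensions}
        overall = sum(scores.values()) / len(scores)  # RD-3: equal-weight mean
        if overall >= min_score:  # RD-6: an empty list signals "No similar assets found"
            ranked.append({"image_id": candidate["image_id"],
                           "scores": scores,
                           "overall_score": overall})
    # RD-1 and RD-4: order by overall score, then cap the result set.
    ranked.sort(key=lambda r: r["overall_score"], reverse=True)
    ranked = ranked[:limit]
    # RD-2: a column sort reorders the capped set without changing its members.
    if sort_by != "overall":
        ranked.sort(key=lambda r: r["scores"][sort_by], reverse=True)
    return ranked
```

Capping before the column sort is a design choice in this sketch: it keeps the set of returned assets stable while the user switches sort columns.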
| ID | Requirement | Details |
|---|---|---|
| IX-1 | Batch folder scan | Accept a root directory path. Recursively scan for image files (PNG, JPG/JPEG, WEBP, SVG, PDF). |
| IX-2 | Embedding generation | For each discovered image, generate embeddings for all four similarity dimensions and store them in the vector index along with file metadata. |
| IX-3 | Idempotent re-indexing | If an image has already been indexed and its content has not changed (e.g., same file hash), skip re-processing. If content has changed, update the index entry. |
| IX-4 | Scale target | Index 10,000 images in under 4 hours on commodity hardware (4-core CPU, 16 GB RAM, SSD). The pipeline should support parallelism/batching to meet this target. |
| IX-5 | Progress reporting | Expose progress metrics during indexing: total files found, files processed, files skipped, errors encountered. |
| IX-6 | Error handling | Log and skip unreadable/corrupt files without halting the batch. Produce a summary report at completion listing any failures. |
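
The scan, skip, and error-handling rules in IX-1, IX-3, IX-5, and IX-6 can be sketched as one loop. This is an illustration under stated assumptions: a plain dictionary stands in for the vector index, `embed` is a hypothetical callable supplied by the embedding engine, SHA-256 is one possible choice of file hash, and the sketch is sequential (no parallelism or batching as IX-4 would require).

```python
import hashlib
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp", ".svg", ".pdf"}  # IX-1


def index_directory(root, index, embed):
    """Index every image under root.

    index: dict mapping file path -> {"hash", "embeddings"} (stand-in for the vector index).
    embed: hypothetical callable, bytes -> {dimension: vector}; may raise on corrupt input.
    Returns the IX-5 progress counters plus the IX-6 failure list.
    """
    report = {"total_files_found": 0, "files_processed": 0,
              "files_skipped": 0, "errors": 0, "error_details": []}
    for path in sorted(Path(root).rglob("*")):  # recursive scan
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        report["total_files_found"] += 1
        try:
            data = path.read_bytes()
            digest = hashlib.sha256(data).hexdigest()
            entry = index.get(str(path))
            if entry is not None and entry["hash"] == digest:
                report["files_skipped"] += 1  # IX-3: unchanged content, skip
                continue
            # New or changed file: (re)generate embeddings and overwrite the entry.
            index[str(path)] = {"hash": digest, "embeddings": embed(data)}
            report["files_processed"] += 1
        except Exception as exc:  # IX-6: record the failure and keep going
            report["errors"] += 1
            report["error_details"].append({"file_path": str(path), "error": str(exc)})
    return report
```

Running the function twice over an unchanged folder should report every readable file as skipped on the second pass, which is the idempotency property IX-3 asks for.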
The system is composed of five logical components. This section defines their responsibilities and interfaces without prescribing specific technologies.
┌─────────────┐       ┌─────────────────┐       ┌────────────────────┐
│  Frontend   │──────▶│   Backend API   │──────▶│  Embedding Engine  │
│  (Web UI)   │◀──────│                 │◀──────│                    │
└─────────────┘       └────────┬────────┘       └────────────────────┘
                               │
                               ▼
                      ┌─────────────────┐
                      │ Vector Index /  │
                      │    Storage      │
                      └─────────────────┘
                               ▲
                               │
                      ┌─────────────────┐
                      │    Indexing     │
                      │    Pipeline     │
                      └─────────────────┘
Frontend (Web UI). Responsibility: Provide the browser-based interface for uploading images, configuring search parameters, and viewing results.
Backend API. Responsibility: Orchestrate search and indexing requests and act as the single entry point for the frontend.
Embedding Engine. Responsibility: Convert an image into one or more embedding vectors, one per similarity dimension. Output: a map of { dimension → embedding vector }.
Vector Index / Storage. Responsibility: Persist embedding vectors and support fast approximate nearest-neighbor (ANN) retrieval.
Indexing Pipeline. Responsibility: Scan a directory of images, generate embeddings, and populate the vector index.
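
To show how the embedding map and the index fit together, the sketch below implements a tiny in-memory index. It is deliberately not ANN: it performs an exact linear scan, which is only a stand-in for whatever index technology is eventually chosen. The class and method names, and the rescaling of cosine similarity from [-1, 1] to [0, 1], are assumptions made for the example.

```python
import math


def cosine_score(u, v):
    """Cosine similarity rescaled to [0, 1] (the rescaling is an assumption of this sketch)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        return 0.0  # treat zero vectors as having no similarity
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding drift
    return (cosine + 1.0) / 2.0


class InMemoryVectorIndex:
    """Exact brute-force index; a placeholder for a real ANN-backed store."""

    def __init__(self):
        self._entries = {}  # image_id -> {dimension: vector}

    def upsert(self, image_id, embeddings):
        self._entries[image_id] = dict(embeddings)

    def query(self, query_embeddings, max_results=50):
        """query_embeddings: {dimension: vector} with at least one dimension."""
        hits = []
        for image_id, stored in self._entries.items():
            scores = {dim: cosine_score(vec, stored[dim])
                      for dim, vec in query_embeddings.items()}
            overall = sum(scores.values()) / len(scores)
            hits.append({"image_id": image_id, "scores": scores, "overall_score": overall})
        hits.sort(key=lambda h: h["overall_score"], reverse=True)
        return hits[:max_results]
```

A linear scan is O(n) per query, so it illustrates the interface but not the sub-linear retrieval the Vector Index component is expected to provide.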
Stack-agnostic interface definitions. Implementations may use REST, gRPC, or any protocol that satisfies these contracts.
Request: Search
{
"image": <binary image data or base64-encoded string>,
"dimensions": ["color", "shape", "composition", "style"], // 1–4 selected
"max_results": 50 // optional, default 50, max 100
}
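
A backend might normalize the Search request above along these lines. This is a sketch: the `SearchRequest` dataclass and `parse_search_request` function are hypothetical names, treating a missing `dimensions` field as "all four" mirrors the UI default but is an assumption at the API level, and rejecting out-of-range `max_results` (rather than clamping) is likewise one option among several.

```python
import base64
from dataclasses import dataclass

VALID_DIMENSIONS = ("color", "shape", "composition", "style")


@dataclass(frozen=True)
class SearchRequest:
    image: bytes
    dimensions: tuple[str, ...]
    max_results: int = 50


def parse_search_request(payload: dict) -> SearchRequest:
    """Validate a Search payload and return a normalized request object."""
    image = payload["image"]
    if isinstance(image, str):  # base64-encoded string form
        image = base64.b64decode(image, validate=True)
    # Assumption: an absent field means all four dimensions.
    requested = payload.get("dimensions", list(VALID_DIMENSIONS))
    dimensions = tuple(dict.fromkeys(requested))  # de-duplicate, keep order
    if not dimensions:
        raise ValueError("at least one dimension is required")
    unknown = [d for d in dimensions if d not in VALID_DIMENSIONS]
    if unknown:
        raise ValueError(f"unknown dimensions: {unknown}")
    max_results = payload.get("max_results", 50)  # default 50
    if not 1 <= max_results <= 100:  # hard cap of 100
        raise ValueError("max_results must be between 1 and 100")
    return SearchRequest(image=image, dimensions=dimensions, max_results=max_results)
```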
Response: SearchResults
{
"query_image_preview": "<url or data URI of uploaded image thumbnail>",
"dimensions_searched": ["color", "shape", "composition", "style"],
"total_results": 42,
"results": [
{
"image_id": "abc-123",
"filename": "logo_v2.png",
"file_path": "/assets/logos/logo_v2.png",
"file_type": "png",
"image_width": 512,
"image_height": 512,
"thumbnail_url": "<url or data URI>",
"scores": {
"color": 0.92,
"shape": 0.78,
"composition": 0.85,
"style": 0.64
},
"overall_score": 0.7975,
"indexed_at": "2026-01-15T10:30:00Z"
}
// ... more results
]
}
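
The `scores` and `overall_score` fields in the example response can be reproduced with a few lines. The helper name is hypothetical, and two behaviors are assumptions of this sketch rather than stated requirements: restricting the `scores` map to the searched dimensions, and rounding the mean to four decimal places.

```python
def build_result_scores(all_scores: dict, dimensions_searched: list) -> dict:
    """Return the scores/overall_score portion of one result entry."""
    # Assumption: only the searched dimensions are reported back.
    scores = {d: all_scores[d] for d in dimensions_searched}
    # Arithmetic mean of the active dimension scores, rounded for display (assumption).
    overall = round(sum(scores.values()) / len(scores), 4)
    return {"scores": scores, "overall_score": overall}


example = {"color": 0.92, "shape": 0.78, "composition": 0.85, "style": 0.64}
full = build_result_scores(example, ["color", "shape", "composition", "style"])
partial = build_result_scores(example, ["color", "shape"])
```

With all four dimensions, the mean is (0.92 + 0.78 + 0.85 + 0.64) / 4 = 3.19 / 4 = 0.7975, which matches the sample response; with only Color and Shape selected it becomes 0.85.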
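Score semantics:
- Each dimension score is a value in `[0.0, 1.0]`, where 1.0 = maximum similarity.
- `overall_score` = arithmetic mean of the active dimension scores.
- Results are sorted by `overall_score` descending by default.
- Only the dimensions that were searched appear in the `scores` map.

Request: StartIndexing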
{
"directory_path": "/path/to/asset/folder",
"recursive": true // optional, default true
}
Response: IndexingJob
{
"job_id": "job-456",
"status": "running",
"directory_path": "/path/to/asset/folder",
"started_at": "2026-02-10T14:00:00Z"
}
Request: GetIndexingStatus
{
"job_id": "job-456"
}
Response: IndexingStatus
{
"job_id": "job-456",
"status": "running" | "completed" | "failed",
"total_files_found": 10000,
"files_processed": 6500,
"files_skipped": 120,
"errors": 3,
"started_at": "2026-02-10T14:00:00Z",
"completed_at": null, // null while running
"error_details": [
{
"file_path": "/path/to/corrupt.png",
"error": "Unable to decode image"
}
]
}
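
A client polling GetIndexingStatus could derive a completion fraction from the counters above. This sketch assumes the three counters are disjoint, that is, `files_processed` does not include skipped or failed files; the contract does not say so explicitly, and the function name is hypothetical.

```python
def progress_fraction(status: dict) -> float:
    """Fraction of discovered files already handled (processed, skipped, or failed)."""
    # Assumption: the three counters never count the same file twice.
    handled = status["files_processed"] + status["files_skipped"] + status["errors"]
    total = status["total_files_found"]
    return handled / total if total else 0.0


sample = {"total_files_found": 10000, "files_processed": 6500,
          "files_skipped": 120, "errors": 3}
fraction = progress_fraction(sample)
```

For the sample status, 6500 + 120 + 3 = 6623 of 10,000 files have been handled, so the job is about 66% complete.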
| Category | Requirement | Target |
|---|---|---|
| Search latency | Time from "Search" click to first results rendered. | < 3 seconds for a 10K-image index. |
| Indexing throughput | Time to index a folder of images. | 10,000 images in < 4 hours on commodity hardware (4-core CPU, 16 GB RAM, SSD). |
| Concurrent users | Number of simultaneous search users supported. | At least 5 concurrent searches without degradation. |
| Image size handling | Maximum input image dimensions and file size. | Up to 20 MB file size; images larger than 2048px on the longest edge are resized before embedding. |
| Index size | Number of images the index supports without architectural changes. | Up to 100,000 images. |
| Availability | Uptime target for the search service. | Best-effort (internal tool); graceful error messaging when service is unavailable. |
| Browser support | Supported browsers for the frontend. | Latest versions of Chrome, Firefox, Edge, and Safari. |
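
Two of the targets above reduce to simple arithmetic, sketched here. The function names are hypothetical, and preserving the aspect ratio when resizing is an assumption; the table only says that images larger than 2048px on the longest edge are resized.

```python
def resized_dimensions(width: int, height: int, max_edge: int = 2048) -> tuple[int, int]:
    """Scale so the longest edge is at most max_edge, keeping the aspect ratio (assumption)."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # small images are left untouched
    scale = max_edge / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def per_image_budget_seconds(images: int = 10_000, hours: float = 4.0, workers: int = 1) -> float:
    """Wall-clock seconds each worker may spend per image to meet the throughput target."""
    return hours * 3600 * workers / images
```

Indexing 10,000 images in 4 hours leaves 14,400 s / 10,000 = 1.44 s per image end to end; with four parallel workers on the 4-core reference machine, each worker has roughly 5.76 s per image, assuming the work parallelizes cleanly.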
The following are explicitly deferred and are not part of this development phase. They may be addressed in future iterations (see the research BRD's Phase 2/3).
| Item | Why deferred |
|---|---|
| Access control / ACL-filtered results | Adds significant complexity; not needed for an internal prototype. |
| User authentication | Same as above; prototype operates as a single-user or trusted-network tool. |
| Region/crop-based search | Valuable but adds UX and algorithmic complexity; better suited for Phase 2. |
| "More like this" refinement | Requires session state and re-query logic; deferred to Phase 2. |
| Metadata / keyword search | This BRD covers image-only similarity; hybrid search is a Phase 2 concern. |
| Continuous / real-time indexing | Batch indexing is sufficient for the initial version. |
| Feedback loop ("good match / bad match") | Requires a data collection pipeline and model retraining strategy; Phase 3. |
| Usage analytics and monitoring dashboards | Phase 3 concern. |
| Multi-user collaboration features | Not needed for prototype. |
| Deployment automation (CI/CD, containerization) | Operational concern, not a functional requirement for the prototype. |
| Term | Definition |
|---|---|
| ANN | Approximate Nearest Neighbor — an algorithm that finds vectors close to a query vector in sub-linear time. |
| Embedding | A fixed-length numeric vector that represents an image's visual properties in a high-dimensional space. |
| Dimension (similarity) | One of the four independent axes of visual similarity defined in this BRD: Color, Shape, Composition, Style. |
| Overall score | The arithmetic mean of the active similarity dimension scores for a query–result pair. |
| Vector index | A data structure optimized for storing and querying embedding vectors via ANN search. |