Business Requirements Document (BRD) --- Visual Asset Similarity Search Utility (High-Level / Research-Oriented)
1) Purpose
Design and evaluate viable approaches for a utility that helps UI/UX
designers quickly find visually similar existing assets (logos, icons,
illustrations, UI fragments, brand elements) in a large internal
repository, so that teams can reuse work and maintain design
consistency across products.
This BRD is intentionally high-level to keep the research space open:
algorithms, vendors, architectures, and UI patterns are all in scope
for exploration.
2) Background & Problem Statement
Designers often produce or iterate on a logo/icon/visual artifact during
prototyping and then need to answer:
- "Do we already have something like this somewhere?"
- "Is there an approved asset with similar geometry/composition?"
- "What's the closest match that follows the same brand visual
language?"
Traditional keyword/tag search fails because similarity is often about:
- Color palette (but not only color)
- Structural similarity (shapes, edges, layout/composition)
- Stylistic similarity (flat vs. skeuomorphic, stroke weights, corner
radii, gradients, etc.)
- Partial similarity (a symbol inside a logo, or an icon within a UI
screenshot)
A dedicated similarity tool should reduce duplicate creation, speed up
reuse, and improve cross-product consistency.
3) Goals and Desired Outcomes
Primary goals
- Enable designers to search by image input (upload, paste, drag-drop,
or select a region) and retrieve ranked similar assets.
- Support similarity beyond color: shape/layout/composition and
"visual language" similarity.
- Provide fast iteration loops: query → results → refine → reuse.
Business outcomes (examples)
- Reduced duplicate asset creation
- Faster discovery of reusable components
- Improved adherence to brand / design system conventions
4) Non-Goals (to keep scope flexible)
- Not prescribing a single algorithm (research should compare
multiple).
- Not requiring perfect "semantic understanding" (e.g., "this is a
fox") unless it helps.
- Not mandating a specific repository platform (DAM, drive, git,
design system tool).
- Not limiting results to a single asset type (icons/logos/UI
screenshots can all be considered).
5) Stakeholders & Users
Primary users
- UI/UX designers, brand designers, design system contributors
Secondary users
- Product teams, marketers, developers (occasionally searching for
approved assets)
Stakeholders
- Design leadership, brand governance, platform/search team,
security/compliance, IT/infra
6) Key Use Cases (Illustrative)
- Find similar logo marks
Input: a draft logo sketch/export → Output: similar symbols, marks,
and compositions.
- Find icons with similar geometry
Input: new icon → Output: icons with similar stroke style/shape
proportions.
- Find assets that match a visual style
Input: example illustration → Output: assets with similar style
(line weight, shading, palette).
- Find partial matches
Input: cropped area of an image / symbol inside a bigger image →
Output: assets containing that motif.
- Consistency check across products
Input: UI screenshot or component image → Output: similar UI visuals
from other products.
7) Functional Requirements (High-Level)
Querying
- Search by image (upload/paste/drag-drop)
- Optional: search by selecting a region (crop/box select)
- Optional: search by "more like this" from a result set
- Optional: filters (asset type, product, brand, date, owner,
license/approval status)
Results & interaction
- Ranked results with similarity score (or relative "more/less
similar" indicators)
- Quick preview + metadata (source product, tags, owner, usage rights,
last updated)
- "Open original," "Copy link," "Download," "Request access"
(depending on permissions)
- Save searches / collections for reuse workflows
Ingestion / indexing
- Continuous or scheduled indexing of new/updated assets
- Metadata ingestion (existing tags, product mapping, ownership,
approval status)
Administration / governance
- Permission-aware search results (don't leak restricted assets)
- Monitoring: usage, most-searched patterns, gaps (where designers
create new assets because none exist)
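The permission-aware requirement above can be illustrated with a minimal post-filtering sketch. Everything here is an assumption for illustration: the function name, the group-based ACL shape, and the asset ids are hypothetical, and a real deployment would read ACLs from whichever repository platform is chosen.

```python
def filter_by_permission(ranked_results, user_groups, asset_acl):
    """Keep only results whose ACL shares at least one group with the user.

    ranked_results: list of (asset_id, score), best first.
    user_groups:    set of group names the searching user belongs to.
    asset_acl:      dict mapping asset_id -> set of groups allowed to see it.
    """
    visible = []
    for asset_id, score in ranked_results:
        # Assets without an ACL entry are hidden (deny by default).
        allowed_groups = asset_acl.get(asset_id, set())
        if allowed_groups & user_groups:
            visible.append((asset_id, score))
    return visible


# Hypothetical data: one restricted asset sits between two visible ones.
results = [("logo_a", 0.95), ("logo_secret", 0.93), ("icon_b", 0.80)]
acl = {
    "logo_a": {"design"},
    "logo_secret": {"brand-governance"},
    "icon_b": {"design", "marketing"},
}
print(filter_by_permission(results, {"design"}, acl))
```

Filtering after retrieval is simple but can shrink the result list below the requested size; applying the permission filter inside the index query is an alternative worth comparing during research.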
8) Non-Functional Requirements (NFRs)
- Performance: interactive response time suitable for design workflows
(target defined during research)
- Scalability: large asset repositories; growth over time
- Security & privacy: enforce ACLs; audit access; avoid leaking
restricted brand materials
- Reliability: graceful degradation if an embedding/index service is
down
- Explainability (practical): provide lightweight "why this matched"
signals (e.g., "shape similarity high", "palette similarity
moderate")
- Extensibility: ability to swap/upgrade embedding models and indexes
as better approaches emerge
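As one possible reading of the explainability item in this list, per-signal scores could be mapped to short labels. This is only a sketch: the signal names, the 0 to 1 score range, and both thresholds are hypothetical and would need tuning against designer feedback.

```python
def describe_match(signal_scores, high=0.8, moderate=0.5):
    """Turn per-signal similarity scores (assumed 0..1) into short labels.

    The thresholds are illustrative defaults, not calibrated values.
    """
    labels = []
    # Strongest signal first, so the most relevant reason leads.
    for signal, score in sorted(signal_scores.items(), key=lambda kv: kv[1], reverse=True):
        if score >= high:
            level = "high"
        elif score >= moderate:
            level = "moderate"
        else:
            level = "low"
        labels.append(f"{signal} similarity {level}")
    return labels


print(describe_match({"palette": 0.6, "shape": 0.91}))
# ['shape similarity high', 'palette similarity moderate']
```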
9) Data & Content Considerations
- Asset types: raster (PNG/JPG), vector (SVG/PDF), design files
(Figma/Sketch exports), UI screenshots
- Metadata: product, component name, design system token usage,
approval status, authorship, licensing
- Quality issues: duplicates, near-duplicates, multiple sizes,
transparent backgrounds, watermarks, outdated brand versions
10) Solution Options (Build vs. Buy vs. Hybrid)
This initiative is naturally suited to vector similarity search:
represent each asset as an embedding and retrieve candidates with
nearest-neighbor search.
Option A --- Build (custom pipeline)
- Generate embeddings for each asset (and optionally multi-embeddings
per asset: whole image + regions).
- Store embeddings in a vector index and retrieve nearest neighbors.
Option B --- Buy (managed vector DB)
- Outsource parts of scaling/ops to a managed vector DB.
- Still requires an embedding strategy and ingestion pipeline.
Option C --- Hybrid
- Use existing enterprise search plus vector search for similarity,
combining keyword + metadata + vectors.
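One way the hybrid option could combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF), which merges ranked lists without needing comparable scores. This is a swapped-in illustration, not something this BRD prescribes; the function name and asset ids are hypothetical, and `k=60` is a commonly used default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of asset ids into one ranking.

    Each asset earns 1 / (k + rank) per list it appears in (rank starts at 1);
    assets are returned by descending total score.
    """
    scores = {}
    for ranking in rankings:
        for rank, asset_id in enumerate(ranking, start=1):
            scores[asset_id] = scores.get(asset_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda asset_id: scores[asset_id], reverse=True)


keyword_hits = ["a", "b", "c"]  # e.g. from existing enterprise search
vector_hits = ["b", "d", "a"]   # e.g. from embedding similarity
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# ['b', 'a', 'd', 'c']
```

Metadata filters (approval status, product, license) would typically be applied before or after fusion, depending on what the chosen search platform supports.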
11) Algorithm / Technique Research Areas (Keep Wide)
A) Embedding-based similarity (recommended baseline)
- Use modern image embeddings so visually similar items are close in
vector space; nearest-neighbor search retrieves candidates.
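To make the baseline concrete, here is a minimal brute-force sketch of nearest-neighbor retrieval by cosine similarity. The three-dimensional vectors and asset ids are hypothetical stand-ins; real embeddings would come from whichever image model the research selects and would have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_assets(query, index, k=3):
    """Return the k (asset_id, score) pairs most similar to the query, best first."""
    scored = [(asset_id, cosine_similarity(query, vec)) for asset_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, for illustration only.
ASSET_INDEX = {
    "logo_fox_v2": [0.9, 0.1, 0.0],
    "icon_arrow": [0.0, 1.0, 0.0],
    "logo_fox_mono": [0.8, 0.2, 0.1],
}
for asset_id, score in nearest_assets([1.0, 0.0, 0.0], ASSET_INDEX, k=2):
    print(asset_id, round(score, 3))
```

Brute-force comparison scans every asset per query, which is fine for a proof of concept but is the cost that ANN indexing is meant to avoid at scale.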
B) Approximate Nearest Neighbor (ANN) indexing strategies
- Compare multiple ANN approaches and engines for scalability and
performance.
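As one example of an ANN strategy that could be part of the comparison, the sketch below uses random-hyperplane locality-sensitive hashing (LSH): vectors pointing in similar directions tend to land in the same bucket, so a query only inspects one bucket instead of the whole index. This is a toy illustration under stated assumptions (hypothetical function names, tiny vectors, a single hash table); production engines use more elaborate structures.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def lsh_signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(v * h for v, h in zip(vec, plane)) >= 0 else 0
        for plane in hyperplanes
    )


def build_buckets(index, hyperplanes):
    """Group asset ids by their LSH signature."""
    buckets = {}
    for asset_id, vec in index.items():
        buckets.setdefault(lsh_signature(vec, hyperplanes), []).append(asset_id)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return asset ids sharing the query's bucket (may miss true neighbors)."""
    return buckets.get(lsh_signature(query, hyperplanes), [])


planes = make_hyperplanes(dim=3, n_bits=8, seed=42)
toy_index = {"logo_fox_v2": [0.9, 0.1, 0.0], "icon_arrow": [0.0, 1.0, 0.0]}
toy_buckets = build_buckets(toy_index, planes)
# A rescaled copy of an indexed vector points the same way, so it shares the bucket.
print(candidates([1.8, 0.2, 0.0], toy_buckets, planes))
```

More bits give smaller buckets (faster, lower recall); fewer bits give larger buckets (slower, higher recall). That recall versus latency trade-off is the main thing to measure when comparing ANN options.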
C) Multi-stage retrieval & reranking
- Stage 1: fast vector retrieval (top N)
- Stage 2: rerank with a stronger model or additional heuristics
(style similarity, shape emphasis, palette alignment)
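The two stages above can be sketched as follows. The asset fields (`embedding`, `structure`, `palette`), the weights, and the toy vectors are all hypothetical; in practice the stage 2 signals would come from a stronger model or dedicated feature extractors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_then_rerank(query, assets, top_n=2, top_k=2, weights=None):
    """Stage 1 shortlists by general embedding; stage 2 reorders by weighted signals."""
    weights = weights or {"structure": 0.7, "palette": 0.3}  # illustrative weights
    # Stage 1: cheap shortlist of the top_n assets by embedding similarity.
    shortlist = sorted(
        assets, key=lambda a: cosine(query["embedding"], a["embedding"]), reverse=True
    )[:top_n]

    # Stage 2: rescore only the shortlist with per-signal similarities.
    def rerank_score(asset):
        return sum(w * cosine(query[s], asset[s]) for s, w in weights.items())

    reranked = sorted(shortlist, key=rerank_score, reverse=True)
    return [a["id"] for a in reranked[:top_k]]


query = {"embedding": [1, 0], "structure": [1, 0], "palette": [0, 1]}
assets = [
    {"id": "a", "embedding": [1, 0.1], "structure": [0.2, 1], "palette": [0, 1]},
    {"id": "b", "embedding": [1, 0.3], "structure": [1, 0], "palette": [1, 0]},
    {"id": "c", "embedding": [0, 1], "structure": [1, 0], "palette": [0, 1]},
]
print(retrieve_then_rerank(query, assets))  # ['b', 'a']
```

In this toy data, asset `c` would score best in stage 2 but never reaches it because stage 1 drops it. That illustrates why the shortlist size N matters: reranking can only reorder what the first stage retrieves.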
D) Classical computer vision signals (useful as complements)
- Perceptual hashing for near-duplicate detection
- Keypoint/feature matching methods for geometric similarity
- Edge/contour-based features to emphasize structure over color
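For the perceptual hashing item, a minimal average-hash sketch is shown below. It assumes the image has already been converted to grayscale and downscaled to a small grid (commonly 8x8), which is represented here as plain nested lists so the example needs no imaging library; the 2x2 grids are toy data.

```python
def average_hash(pixels):
    """Average hash of a small grayscale grid (rows of 0-255 ints).

    Each bit is 1 where the pixel is brighter than the grid's mean.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming_distance(h1, h2):
    """Number of differing bits; small distances suggest near-duplicates."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))


original = [[10, 200], [220, 30]]
brighter = [[30, 220], [240, 50]]   # same pattern, uniformly brighter
inverted = [[245, 55], [35, 225]]   # light and dark areas swapped

print(hamming_distance(average_hash(original), average_hash(brighter)))  # 0
print(hamming_distance(average_hash(original), average_hash(inverted)))  # 4
```

Because the hash compares each pixel to the image's own mean, uniform brightness changes leave it unchanged, which makes it useful for flagging re-exports and resized copies before embedding-based search runs.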
E) Region-based / component-based similarity
- Index whole-image embeddings + embeddings of regions (tiles/crops)
to enable motif-level similarity.
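A simple way to produce the regions is a sliding window over the image. The sketch below only computes crop boxes; the function name, tile size, and stride are hypothetical, and each box would then be cropped and embedded with whatever model is chosen.

```python
def tile_boxes(width, height, tile, stride):
    """Return (left, top, right, bottom) crop boxes for a sliding window.

    Images smaller than the tile yield one box covering the whole image.
    Edge remainders that do not fit the stride are not covered in this sketch.
    """
    boxes = []
    for top in range(0, max(height - tile, 0) + 1, stride):
        for left in range(0, max(width - tile, 0) + 1, stride):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


print(tile_boxes(4, 4, tile=2, stride=2))
# [(0, 0, 2, 2), (2, 0, 4, 2), (0, 2, 2, 4), (2, 2, 4, 4)]
```

A stride smaller than the tile size produces overlapping regions, which improves the chance that a motif falls fully inside some tile but multiplies the number of embeddings stored per asset.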
12) UX Research Directions (Non-prescriptive)
- "Search by example" vs "search by region" workflows
- Controls that help designers steer similarity:
  - weight color vs structure vs style
  - "more like this / less like this"
  - filtering by product / design system version / approval status
- Result presentation options:
  - grid view with hover-compare
  - side-by-side overlay comparison
  - "closest approved alternative" callout
13) Success Metrics (Research-Friendly)
- Time-to-find reusable asset (median)
- Reuse rate / downloads / "Open original" actions
- Duplicate creation rate (before vs after)
- Search satisfaction (designer rating)
- Precision@K / Recall@K on a curated evaluation set
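For the last metric, Precision@K and Recall@K can be computed per query as sketched below and then averaged over the evaluation set. The asset ids are hypothetical; the relevant set for each query would come from designer-curated examples.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant (k must be positive)."""
    hits = sum(1 for asset_id in ranked_ids[:k] if asset_id in relevant_ids)
    return hits / k


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant assets that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for asset_id in ranked_ids[:k] if asset_id in relevant_ids)
    return hits / len(relevant_ids)


ranked = ["a", "b", "c", "d"]   # system output for one query, best first
relevant = {"a", "c", "e"}      # assets designers marked as similar
print(precision_at_k(ranked, relevant, k=3))  # 2 of the top 3 are relevant
print(recall_at_k(ranked, relevant, k=3))     # 2 of the 3 relevant assets were found
```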
14) Risks & Open Questions
Risks
- "Similarity" is subjective and may vary by designer intent.
- Models may overweight semantics and underweight geometry (or vice
versa).
- Permission/ACL complexity may be a larger engineering challenge than
modeling.
Open research questions
- What similarity definition best matches designer intent
(geometry-first, style-first, hybrid)?
- Do we need separate models for logos/icons vs UI screenshots?
- Should the system learn from internal feedback loops ("this match is
good/bad")?
15) Recommended Research Plan (Phased)
Phase 0 --- Discovery
- Audit repositories, formats, metadata quality, and the access control
model.
- Collect a small "golden set" of similarity examples from designers.
Phase 1 --- Prototype baseline
- Embedding + vector search proof-of-concept.
- Compare at least 2 storage/index options.
Phase 2 --- Improve relevance
- Add reranking and/or multi-signal scoring (structure/palette/style)
- Add region search
Phase 3 --- Productization
- Governance (approved assets surfaced)
- Analytics + feedback loop
- Operational hardening + scale testing