# Data Fingerprinting and Watermarking (Datasets)
One-sentence definition: Embedding identifiers or tracking features in datasets to trace leaks and deter misuse.
## Key Facts
- Fingerprint a dataset with canary records, seeded IDs, or statistical marks.
- Watermarking applies to structured and unstructured datasets; schemes may be reversible or irreversible.
- Back the marks with contracts and acceptable-use acknowledgments for legal support.
- Take care that inserted marks do not bias analytics; document the insertion methods.
- Monitor external sources for reappearance of marked data.
- **Verify:** check official (ISC)² CBK and current exam outline.
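
As an illustration of the canary-record idea, here is a minimal Python sketch, not a prescribed method. It assumes a dataset held as a list of dicts with `customer_id` and `email` fields; the helper names `make_canary` and `seed_dataset`, the field names, and the HMAC-based derivation are all hypothetical choices made for this example.

```python
import hashlib
import hmac


def make_canary(secret_key: bytes, recipient_id: str, index: int = 0) -> dict:
    """Derive a deterministic synthetic record unique to one recipient."""
    msg = f"{recipient_id}:{index}".encode("utf-8")
    digest = hmac.new(secret_key, msg, hashlib.sha256).hexdigest()
    return {
        # Hypothetical schema: adapt the fields to the real dataset.
        "customer_id": f"C{digest[:10]}",
        # .invalid is a reserved TLD, so the address can never be a real one.
        "email": f"user.{digest[10:18]}@example.invalid",
    }


def seed_dataset(rows: list, secret_key: bytes, recipient_id: str,
                 count: int = 2) -> list:
    """Return a copy of rows with `count` canaries appended for this recipient."""
    canaries = [make_canary(secret_key, recipient_id, i) for i in range(count)]
    return list(rows) + canaries
```

Because the canaries are derived from a secret key and the recipient ID, they can be regenerated later for comparison without storing them alongside the data. Appending them at the end is a simplification; a real insertion would likely scatter them, and the method should be documented so analysts can exclude the synthetic rows.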
## Exam Relevance
- Choose the appropriate tracing method after a suspected dataset leak.
**Mnemonic:** “Seed to **see**.”
## Mini Scenario
Q: A competitor holds a record that exists only in your dataset. What proves the leak?
A: A canary record match tied to a specific sharing event.
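
The scenario's answer can be sketched in Python as a lookup against a registry of issued canaries. This is an illustrative assumption rather than a standard tool: the `SharingEvent` fields, the `find_canary_matches` helper, and the choice of `email` as the matching field are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingEvent:
    """Hypothetical record of who received a marked copy, and when."""
    recipient: str
    shared_on: str       # ISO date of the sharing event
    agreement_ref: str   # contract or acceptable-use acknowledgment ID


def find_canary_matches(suspect_rows: list, registry: dict,
                        key_field: str = "email") -> list:
    """Return (canary_value, SharingEvent) pairs found in the suspect data.

    `registry` maps each issued canary value to the sharing event it
    was created for.
    """
    matches = []
    for row in suspect_rows:
        value = row.get(key_field)
        if value in registry:
            matches.append((value, registry[value]))
    return matches
```

A non-empty result links the external data to one recipient and one agreement, which is the evidence the legal remedies rely on. An empty result does not prove there was no leak, since canaries can be filtered out or altered.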
## Revision Checklist
- Define canary record.
- List one risk to data quality.
- Tie fingerprinting evidence to legal remedies.
## Related
[[Digital Rights Management (DRM) and Watermarking]] · [[Data Sharing and External Collaboration Controls]] · [[eDiscovery and Data Retention]] · [[Data Catalogs and Metadata Management]] · [[Logs and Telemetry as Sensitive Data]] · [[Domain 2 - Index]]