Data Standards
This spec defines how Solstice FC collects, stores, accesses, and presents player and operational data. Every decision here traces to a debate verdict. Where debates produced tension rather than resolution, that tension is preserved in the Dissents & Risks section.
1. Tiered Access Architecture
Informed by: Round 6 (Player Data)
Player data is not open. It is structured into three access tiers with consent boundaries between them. The Round 6 verdict was decisive: the AFF's own qualifications -- age-gating, consent frameworks, credentialed tiers, anonymization -- collectively transformed "open metrics" into "regulated metrics with controlled access." The NEG's HIPAA-analogy model is the governing framework.
Tier 1: Player & Parent
Players and their parents have full access to the player's own development record. This includes:
- Technical assessments (pass completion, first touch quality, decision-making ratings)
- Physical metrics with maturation context (see Section 3)
- Tactical assessments and positional development notes
- Season-over-season progression data
- Coach narrative evaluations
Parents and players see everything the club holds on that player. No exceptions. Data ownership sits with the family.
Tier 2: Coach
Club coaches see their own roster's data. This means:
- Full development records for players currently on their roster
- Aggregate anonymized data across the league for benchmarking (e.g., "U-14 league median pass completion: 68%")
- No access to individual records from other clubs' rosters
Coaching staff access is tied to their current roster assignment. When a coach's assignment changes, access to the previous roster's records ends and access to the new roster's records begins.
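The Tier 2 rule can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the Player and Coach types, their field names, and the two helper functions are all hypothetical, and pass completion stands in for whichever metric is being benchmarked.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Player:
    # Hypothetical minimal player row; field names are assumptions.
    player_id: str
    club_id: str
    roster_id: str
    pass_completion: float  # percent


@dataclass(frozen=True)
class Coach:
    # roster_id is the coach's CURRENT assignment.
    coach_id: str
    club_id: str
    roster_id: str


def coach_can_read(coach: Coach, player: Player) -> bool:
    """A coach reads full records only for players on their current roster."""
    return coach.club_id == player.club_id and coach.roster_id == player.roster_id


def league_median_pass_completion(players: list[Player]) -> float:
    """Aggregate benchmark: returns one number and no individual rows."""
    return median(p.pass_completion for p in players)
```

Because the check reads the coach's current roster_id every time, reassigning a coach changes what they can read without any separate revocation step.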
Tier 3: Scout & College
External stakeholders access player data only through explicit, per-request family consent. The consent mechanism must satisfy:
- Written opt-in from a parent or legal guardian
- Specification of which data fields are shared (families can consent to technical metrics while withholding physical data)
- Time-bounded access (consent expires after 90 days by default and is renewable)
- Audit trail of every access event visible to the family
No open dashboards for minors. No bulk data exports for external parties. No "directory" where scouts browse player profiles without prior consent.
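One way to picture the Tier 3 consent mechanism is as a grant object that carries its own field scope and expiry. The sketch below is illustrative only: ConsentGrant, read_fields, and the shape of the audit entries are assumptions, and logging denied attempts alongside successful ones is a design choice of this sketch rather than something the spec states.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta

DEFAULT_CONSENT_DAYS = 90  # default from the spec; renewable by the family


@dataclass
class ConsentGrant:
    # Hypothetical per-request grant created from a guardian's written opt-in.
    player_id: str
    requester: str
    fields: frozenset[str]          # only these fields may be shared
    granted_at: datetime
    duration: timedelta = timedelta(days=DEFAULT_CONSENT_DAYS)
    revoked: bool = False

    def expires_at(self) -> datetime:
        return self.granted_at + self.duration

    def is_active(self, now: datetime) -> bool:
        return not self.revoked and now < self.expires_at()


def read_fields(grant: ConsentGrant, record: dict, requested: set[str],
                now: datetime, audit_log: list[dict]) -> dict:
    """Return only consented fields, and log the access attempt either way."""
    allowed = (grant.fields & requested) if grant.is_active(now) else frozenset()
    audit_log.append({
        "player_id": grant.player_id,
        "requester": grant.requester,
        "requested": sorted(requested),
        "released": sorted(allowed),
        "at": now,
    })
    return {name: record[name] for name in allowed if name in record}
```

A family that consents to technical metrics but withholds physical data would create a grant whose fields set simply omits the physical entries; the scout's request for them returns nothing but still appears in the audit trail.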
COPPA Compliance
For players under 13, all data collection requires verifiable parental consent before any record is created. The system must support:
- Parental consent capture at registration
- Parental review and deletion rights at any time
- No collection of data beyond what is necessary for participation and development tracking
This is not optional. It is a legal requirement that shapes the data schema from day one.
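As a rough sketch of the ordering constraint, the check below refuses record creation for an under-13 player until verifiable parental consent exists. The function names and the boolean consent flag are hypothetical; how consent is actually verified is outside this example, and the spec's other consent rules still apply to older players.

```python
from datetime import date

COPPA_AGE = 13  # COPPA covers children under 13


def age_on(birth_date: date, today: date) -> int:
    """Whole years of age on the given day."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def coppa_allows_record_creation(birth_date: date, today: date,
                                 verified_parental_consent: bool) -> bool:
    """For under-13 players, no record may exist before consent is verified."""
    if age_on(birth_date, today) < COPPA_AGE:
        return verified_parental_consent
    return True  # COPPA itself does not block; other consent rules still apply
```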
2. Privacy Controls in the Schema
Informed by: Round 6 (Player Data)
The Round 6 verdict established that consent architecture must precede data architecture. Privacy is not a feature layered on top of a data model. It is a constraint the data model is built around.
Every player record includes:
- consent_status: Current parental consent state (granted, revoked, expired, pending)
- consent_scope: Which data categories are authorized for Tier 3 sharing
- consent_history: Timestamped log of all consent changes
- access_log: Every read event against this record, with accessor identity and tier
These fields are not nullable. A record without consent status is an invalid record.
When consent is revoked, Tier 3 access terminates immediately. Cached or exported data held by external parties is outside the system's control, but the consent agreement must include a destruction clause requiring deletion upon revocation.
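A minimal sketch of a record type that enforces these constraints might look like the following. The four consent field names come from this spec; the PlayerRecord class, its methods, and the tuple format used in consent_history are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum


class ConsentStatus(Enum):
    GRANTED = "granted"
    REVOKED = "revoked"
    EXPIRED = "expired"
    PENDING = "pending"


@dataclass
class PlayerRecord:
    player_id: str
    consent_status: ConsentStatus                       # required, no default
    consent_scope: set[str] = field(default_factory=set)
    consent_history: list[tuple] = field(default_factory=list)
    access_log: list[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        # A record without a valid consent status is an invalid record.
        if not isinstance(self.consent_status, ConsentStatus):
            raise ValueError("consent_status must be a ConsentStatus")
        for name in ("consent_scope", "consent_history", "access_log"):
            if getattr(self, name) is None:
                raise ValueError(f"{name} must not be null")

    def revoke(self, now: datetime) -> None:
        """Revocation takes effect immediately and is recorded in the history."""
        self.consent_status = ConsentStatus.REVOKED
        self.consent_history.append((now, ConsentStatus.REVOKED))

    def tier3_can_read(self, category: str) -> bool:
        return (self.consent_status is ConsentStatus.GRANTED
                and category in self.consent_scope)
```

Checking status on every read, rather than caching a decision, is what makes revocation immediate inside the system. It does nothing for copies already held by external parties, which is why the destruction clause is still needed.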
3. Maturation Context
Informed by: Round 6 (Player Data)
Physical metrics without maturation context are misleading and harmful. The Round 6 verdict identified maturation bias -- where early-maturing players are systematically favored -- as the strongest empirical argument against open physical data. The solution is not to hide physical data but to contextualize it.
Every player record that includes physical metrics must also include:
- Predicted adult height (Khamis-Roche method: requires current height, current weight, and parental heights -- no medical assessment)
- Percentage of predicted adult height (the primary maturation indicator)
- Biological age estimate derived from the above
Physical metrics are always displayed alongside maturation context. The system must not present a sprint time or distance-covered figure without the corresponding maturation data. If maturation data is unavailable (e.g., parental heights not provided), physical metrics are suppressed from Tier 2 and Tier 3 views.
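The display rule can be sketched as follows. The example takes the predicted adult height as an input and does not reproduce the Khamis-Roche calculation itself; MaturationContext, physical_view, and the integer tier numbers are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MaturationContext:
    current_height_cm: float
    predicted_adult_height_cm: float  # assumed computed elsewhere (Khamis-Roche)

    @property
    def percent_adult_height(self) -> float:
        """Primary maturation indicator: share of predicted adult height reached."""
        return 100.0 * self.current_height_cm / self.predicted_adult_height_cm


def physical_view(tier: int, metrics: dict,
                  maturation: Optional[MaturationContext]) -> Optional[dict]:
    """Pair physical metrics with maturation context, or suppress them.

    Without maturation data, Tier 2 and Tier 3 get nothing; Tier 1 (the
    family) still sees its own metrics.
    """
    if maturation is None and tier in (2, 3):
        return None
    return {"metrics": metrics, "maturation": maturation}
```

For a player who is 162 cm tall with a predicted adult height of 180 cm, percent_adult_height is 90.0, and that figure travels with every sprint time or distance-covered value shown to a coach or scout.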
Technical Over Physical
Technical and tactical metrics take priority in all evaluation contexts. The default view for any player profile leads with:
- Technical skills (ball control, passing accuracy, shooting technique)
- Tactical awareness (positioning, decision-making, game reading)
- Behavioral indicators (coachability, effort, team play)
Physical metrics (speed, endurance, strength) appear in a secondary section, always with maturation context. This ordering is a deliberate design choice to counteract maturation bias in player identification.
4. Season One: Commodity Tools
Informed by: Round 10 (Technology Timing)
The data standards defined in this spec will not be implemented as custom software in season one. Round 10 was clear: build the community first, build the platform second. Season one runs on:
- Registration: Google Forms with Stripe for payments
- Player records: Google Sheets with access controlled via Google Workspace sharing permissions
- Consent tracking: A dedicated Google Form per family, with responses stored in a restricted Sheet accessible only to the registrar
- Coaching data: Shared Sheets scoped to each coach's roster
This is ugly. It is also free, immediately deployable, and carries no custom-build risk, though it enforces the access tiers only approximately (see Dissents & Risks).
Instrumentation Mandate
The entire point of running season one on manual tools is to observe what breaks. Every pain point is a future product requirement. During season one, the league must actively document:
- Every instance where the tiered access model is difficult to enforce in Sheets
- Every consent request that is slow, confusing, or error-prone
- Every case where maturation data is missing and physical metrics must be suppressed
- Every time a coach needs data they cannot access, or accesses data they should not have
- Time spent on manual data entry, access provisioning, and consent tracking
This observational data becomes the requirements document for the season-two platform.
5. Season Two: Purpose-Built Platform
Informed by: Round 10 (Technology Timing)
After one season of real operations, build the custom platform. The platform scope, per the Round 10 verdict:
- Registration with payment processing, age verification, medical waivers, and COPPA-compliant consent capture
- Scheduling with conflict detection
- Player records implementing the three-tier access model natively, with consent management, maturation context, and audit logging built into the data layer
- Governance dashboard with transparent financials
The platform is built from season-one observations, not from hypotheses. Architecture decisions that would otherwise be guesses -- how clubs handle sibling discounts, what registration edge cases exist, how coaches actually use development data -- are answered by real usage data.
6. Communication as Core Competency
Informed by: Semifinal 2 (Simplicity vs Optimization)
The data architecture described in this spec is sophisticated. Three access tiers, consent gating, maturation context, audit logging -- none of this is simple. The Semifinal 2 verdict established that the answer is not to simplify the system but to invest in making it navigable.
Protocol vs. Policy
The protocol layer -- tiered access, consent mechanisms, maturation calculations, audit trails -- should be as sophisticated as the evidence warrants. It operates independently of geography or local preference.
The policy layer -- how families interact with the system, what they see, what language is used, how consent is requested -- should be as simple as possible. It is locally determined by clubs within the cooperative structure.
A parent should never need to understand the three-tier access model. They should see: "Here is your child's development report. A scout from [Organization] has requested access to [these fields]. Approve or deny." The sophistication is real. The experience is simple.
Family Confusion as Signal
When families are confused by the data system, that is a UX failure, not evidence that the system is too complex. Track:
- Support requests related to data access or consent
- Time-to-completion for consent workflows
- Family satisfaction survey scores on "understanding my child's development data"
- Attrition correlated with data-system friction
These metrics feed back into UX improvement, not into system simplification. The Semifinal 2 majority was explicit: the first response to family confusion is better communication, not simpler systems.
No Blanket Simplicity Default
Design decisions about data presentation and access are evaluated case-by-case by club leadership within the cooperative governance structure. "It is more complex for families" is a legitimate concern that triggers UX investment, not an automatic veto on the underlying design.
Dissents & Risks
The Contrarian's Dissent (Semifinal 2)
The Contrarian judge voted AFF, arguing that "make it feel simple" is not a strategy available to a resource-constrained startup. The majority's prescription -- invest in communication excellence from day one -- assumes communication capacity the organization may not have. If year-one attrition exceeds 40%, or if family satisfaction surveys show systemic confusion with league data systems, the communication-excellence approach should be reconsidered in favor of genuine structural simplification.
This dissent is empirically testable. The metrics in Section 6 provide the test. If the numbers say families cannot navigate the system despite communication investment, simplify the system.
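The test the dissent proposes reduces to a simple check, sketched below. The 40% threshold is the dissent's figure; defining attrition as the share of season-one registrants who do not return is an assumption of this sketch, and "systemic confusion" is passed in as a judgment made from the survey data because the spec sets no numeric cutoff for it.

```python
ATTRITION_THRESHOLD = 0.40  # from the Contrarian's dissent


def attrition_rate(registered_at_start: int, returned_next_season: int) -> float:
    """Share of season-one registrants who did not return (assumed definition)."""
    if registered_at_start <= 0:
        raise ValueError("registered_at_start must be positive")
    return (registered_at_start - returned_next_season) / registered_at_start


def dissent_triggered(registered_at_start: int, returned_next_season: int,
                      systemic_confusion: bool) -> bool:
    """True when either of the dissent's two conditions holds."""
    exceeded = attrition_rate(registered_at_start, returned_next_season) > ATTRITION_THRESHOLD
    return exceeded or systemic_confusion
```

With 200 registrants and 110 returning, attrition is 45% and the condition is met; with 120 returning it is exactly 40%, which does not exceed the threshold.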
Maturation Data Gaps
The Khamis-Roche method requires parental heights. Some families will not provide this data (privacy concerns, single-parent households where one parent's height is unknown, adopted children). The spec mandates suppressing physical metrics when maturation context is unavailable, but this creates an asymmetry: players with complete maturation data have richer profiles than those without. This could disadvantage the very players the privacy protections are designed to help. Season one should track how often maturation data is missing and whether the suppression rule creates practical problems.
Commodity Tools May Not Enforce Tiers
Google Sheets with sharing permissions is a rough approximation of three-tier access. A coach with edit access to a Sheet could, in practice, see rows they should not. Season one will expose how often this matters and how severe the consequences are. If the answer is "frequently and seriously," it accelerates the timeline for the custom platform.
Consent Fatigue
Per-request, time-bounded, field-scoped consent is the most privacy-protective model. It is also the most burdensome for families. If a player is actively being recruited by multiple programs, their parents may face dozens of consent requests per season. Season one should track consent request volume and family feedback on the process. If consent fatigue leads families to blanket-approve everything, the protective mechanism has failed and the model needs revision.
The AFF's Valid Point on Information Asymmetry (Round 6)
The Round 6 AFF lost the debate but landed a real argument: the current scouting system's opacity disadvantages lower-income players who cannot afford showcase fees ($3,000-$8,000 for NCSA profiles) or scouting platform access ($1,000-$5,000 annually for InStat/Wyscout). The tiered-access model addresses this by making development data available to families at no cost (Tier 1), but it does not solve the broader discovery problem. A talented player at a small club with no scout relationships still depends on the consent-gated Tier 3 pathway, which requires scouts to know the player exists in the first place. This is a distribution problem the data standards alone do not solve.