Live Property Filtering Based on Identity Mapping with Fire Arrow
A practical reference architecture for serving different views of the same FHIR data set to different audiences. Covers the regulatory tension between comprehensive data capture and restricted access, how current FHIR servers handle multi-audience access, identity-based authorization with field-level property filters, search side-channel protection, and concrete configuration examples.
Executive Summary
Healthcare software teams face a structural tension. Clinical workflows, study protocols, and digital monitoring programs demand comprehensive data capture: demographics, identifiers, contact information, observations, adverse events, notes, workflow tags, and internal metadata. At the same time, regulations like HIPAA’s Minimum Necessary principle and GDPR’s data minimization require that each participant sees only what their role justifies. Treating clinicians may need full patient context. Study sponsors may need clinical outcomes but not direct identifiers. Analytics pipelines may need population-level data without PHI. External applications may need access to standard FHIR resources but not to internal system metadata.
The current FHIR server ecosystem does not provide a standard solution to this problem. The most widely used open-source FHIR server has no built-in per-user access control. SMART on FHIR scopes do not express field-level visibility. The FHIR Permission resource is still at maturity level 1. Cloud de-identification services create copies of the data set rather than filtering in place. Teams fill these gaps with custom API gateways, batch de-identification pipelines, and application-level field stripping. The filtering logic ends up distributed across custom code, maintained per endpoint, and difficult to audit.
The central design principle of this paper is: field-level visibility should be expressed as authorization rules evaluated against the caller’s FHIR identity, not as custom code distributed across endpoints or maintained in a separate anonymization pipeline.
Fire Arrow Server uses the FHIR database itself as the permission and filtering core. Requests are authenticated, mapped to FHIR identity resources such as Practitioner or Patient, evaluated against rule-based authorization, narrowed before search results are returned, and filtered at the property level where needed. Identity filters let rules apply only to selected subsets of users within the same role. Property filters remove or transform sensitive fields before the response is sent.
The practical result: the same Fire Arrow Server instance can serve different audiences from the same underlying data set, with different views, over standard FHIR APIs. Anonymization is one application. Others include automated filtering on bulk data extraction for analytics and ML pipelines, client privilege limitations on standard FHIR APIs, and hiding internal system data from external views.
This paper explains the design pattern, where it fits, how it compares to current approaches, and where teams still need to be careful.
1. Audience and Scope
This paper is intended for:
- CTOs, CIOs, and technical product leaders evaluating architectures for multi-audience healthcare data access.
- Solution and platform architects responsible for authorization design and data privacy in shared FHIR infrastructure.
- Security and compliance stakeholders reviewing how field-level redaction is enforced across REST, GraphQL, and HFQL.
- Engineering teams building clinical trial platforms, digital monitoring products, or analytics pipelines that need identity-dependent data views.
In scope
This paper covers:
- the regulatory tension between comprehensive data capture and restricted access,
- how current FHIR servers and cloud platforms handle (or fail to handle) field-level multi-audience access,
- identity-based authorization with field-level property filtering on a shared FHIR server,
- the authorization pipeline from authentication through search narrowing to response-time redaction,
- identity filters for differentiating users within the same role,
- property filter types (NullFilter, RandomFilter) and their configuration,
- search side-channel protection with blocked search parameters and includes,
- risk profiles for REST search, GraphQL read, GraphQL search, and HFQL,
- practical benefits beyond anonymization: bulk data filtering, client privilege limiting, internal data hiding,
- a concrete clinical trial configuration example with two access tiers,
- extending the pattern to blinded vs. unblinded access using identity filters.
Out of scope
This paper does not:
- provide legal advice or make blanket regulatory compliance claims,
- prescribe a complete HIPAA or GDPR de-identification strategy,
- address infrastructure-level isolation strategies (separate databases, separate deployments),
- cover network-level security controls or encryption at rest.
2. The Tension: Comprehensive Data vs. Restricted Access
Healthcare systems are pulled in two opposing directions.
2.1 The pressure to record everything
Clinical study protocols require structured capture of demographics, lab results, medications, adverse events, vital signs, and clinical notes. Digital monitoring systems want continuous data streams from wearables, patient-reported outcomes, and remote observations. Registries need longitudinal coverage. Analytics teams need population-level data sets. The value of the system grows with its completeness. Over 80% of healthcare data remains unstructured (free-text notes, imaging reports, pathology narratives), and recent work on eSource-enabled trials and EHR-to-EDC pipelines is specifically aimed at capturing more of it in structured form. The trend is toward richer, more comprehensive data, not less.
2.2 The pressure to restrict who sees what
At the same time, regulations require that access be limited. HIPAA’s Minimum Necessary principle requires covered entities to “make reasonable efforts to ensure that uses and disclosures of PHI is limited to the minimum necessary information to accomplish the intended purpose.” In 2025 congressional testimony, the American Health Information Management Association (AHIMA) noted that EHR systems “often lack the sophistication to sequester patients by assigned employees,” which “often leads to approval for ‘any and all’ access rather than imposing certain access restrictions on the PHI.” The EU General Data Protection Regulation (GDPR) takes a similar stance with its data minimization principle. The proposed 2026 HIPAA Security Rule update strengthens these expectations further, making network segmentation and granular access controls explicit requirements rather than addressable specifications.
Systems should collect the data needed for care and operations, and then enforce precise limits on who can see which parts of it. The regulatory frameworks agree on this point even where they differ on specifics.
2.3 The audience model is layered
A modern healthcare platform is rarely built for a single user type. A clinical study platform may serve investigators, study coordinators, sponsor monitors, and data analysts. A digital monitoring product may serve patients, clinicians, care managers, and operations staff. A hospital integration layer may expose data to internal applications, external partners, and analytics services. In all of these cases, the underlying data set is shared, but the acceptable view of that data is not.
A typical example is a multi-site clinical trial. Site investigators need identifiable patient data because they are directly involved in care and trial execution. Sponsor monitors need to review outcomes, measurements, medication events, and adverse-event documentation, but without seeing names, addresses, dates of birth, or phone numbers.
The same basic pattern appears outside trials:
- public-health reporting, where epidemiologists need clinical facts without direct identities,
- real-world evidence or analytics pipelines, where researchers need clinical data but not full person-level identifiers,
- customer-facing portals that should not expose internal administrative tagging,
- AI/ML model training, where service accounts should receive de-identified data so no PHI enters the training environment,
- internal and external applications sharing a common FHIR backend with different privilege levels.
What makes these scenarios difficult is not that the access rules are unusual. It is that the data is both sensitive and highly structured, while the audience model is layered and changes over time.
3. How Current Systems Handle This (and Where They Fall Short)
There are several established approaches. Each addresses part of the problem. None addresses the full scope of field-level, identity-based, query-time filtering on a standard FHIR API surface.
3.1 Custom API gateways in front of FHIR servers
The most common open-source FHIR server, HAPI FHIR, has no built-in per-user access control. Teams that need multi-audience access build a custom API gateway (typically in Python, Node.js, or Java) that sits between the client and the FHIR server. The gateway appends _tag parameters to scope searches by tenant, performs post-fetch verification on direct reads, and implements role-specific field stripping in application code.
The gateway pattern works, but it has well-documented gaps. HAPI ignores _tag on instance reads: GET /fhir/Patient/abc-123 returns the resource regardless of its tags, so the gateway must fetch the resource, check its tags, and discard it if they do not match. Every endpoint, every resource type, and every access path (REST, GraphQL, bulk export, subscriptions) must pass through the gateway. A single code path that forgets to apply the correct filter leaks data. And because the gateway is custom code, the filtering logic is distributed across controllers and middleware rather than declared in one place.
For field-level redaction specifically, the gateway must strip fields from response bodies before returning them. That means the gateway must parse FHIR resources, apply transformation rules, and re-serialize the output. Teams are building bespoke anonymization logic, per endpoint, maintained alongside the rest of their application code. When the data model changes (new resource types, new fields, new API surfaces), the filtering code must change with it.
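The post-fetch check and the field stripping described above can be sketched as follows. This is a minimal illustration of the gateway pattern, not code from any particular gateway: `fetch_resource`, the tenant tag system URL, and the `STRIP_FIELDS` table are hypothetical, and the FHIR server is replaced by an in-memory dict.

```python
from copy import deepcopy

# Hypothetical stand-in for the FHIR server: resources keyed by "Type/id".
FAKE_STORE = {
    "Patient/abc-123": {
        "resourceType": "Patient",
        "id": "abc-123",
        "meta": {"tag": [{"system": "http://example.org/tenant", "code": "tenant-a"}]},
        "name": [{"family": "Smith", "given": ["Jane"]}],
        "gender": "female",
    }
}

# Hypothetical per-role field stripping table maintained in gateway code.
STRIP_FIELDS = {"sponsor": {"Patient": ["name", "telecom", "address"]}}

TENANT_SYSTEM = "http://example.org/tenant"


def fetch_resource(path):
    """Pretend instance read; a real gateway would issue an HTTP GET here."""
    return deepcopy(FAKE_STORE.get(path))


def gateway_read(path, tenant, role):
    resource = fetch_resource(path)
    if resource is None:
        return None
    # Post-fetch verification: the instance read itself did not filter by tag.
    tags = resource.get("meta", {}).get("tag", [])
    if not any(t.get("system") == TENANT_SYSTEM and t.get("code") == tenant for t in tags):
        return None  # discard: the resource belongs to another tenant
    # Role-specific field stripping, implemented in application code.
    for field in STRIP_FIELDS.get(role, {}).get(resource["resourceType"], []):
        resource.pop(field, None)
    return resource
```

Every additional access path (search, GraphQL, bulk export) would need its own equivalent of `gateway_read`, which is where the maintenance burden comes from.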
3.2 Copy-based de-identification pipelines
Cloud platforms offer de-identification as a batch operation. Google Cloud Healthcare API, for example, creates a copy of a FHIR store with PHI removed or transformed. The original data stays untouched. The de-identified copy is a separate FHIR store that downstream consumers query independently.
This approach works for some use cases, but it introduces delay. The copy must be regenerated whenever the underlying data changes. In practice, teams run de-identification on a schedule (daily or less frequently), which means the anonymized data set always lags behind the operational system. The cost of batch synchronization is concrete: one hospital discovered that 17% of its referrals were delayed because data synced only once a day. For clinical trial sponsors who need to review adverse events promptly, a 24-hour lag in the anonymized data set is a similar operational problem.
Copy-based de-identification also doubles the storage footprint and creates a second system boundary that must be secured, monitored, and maintained. If the de-identification configuration is incomplete (a new resource type was added to the operational store but not to the de-identification pipeline), the copy silently omits data that downstream consumers expect. Teams now maintain two serving surfaces: the operational API and the anonymized API.
3.3 SMART on FHIR scopes
SMART on FHIR standardizes how applications obtain authorized access to a FHIR server. Its scope syntax expresses which resource types and operations a token holder may perform (patient/Observation.rs, user/MedicationRequest.cruds). Version 2.0 added granular search-parameter constraints.
SMART scopes do not express field-level visibility. A scope can say “this token holder may read Observations,” but it cannot say “this token holder should receive Patient resources with the name randomized and the address removed.” The specification is explicit about this delegation: “Neither SMART on FHIR nor the FHIR Core specification provide a way to model the ‘underlying’ permissions at play here; this is a lower-level responsibility in the access control stack.” Field-level filtering, identity-based rule selection, and search side-channel protection all fall into that lower layer.
3.4 The FHIR Permission resource
The FHIR R6 ballot includes a Permission resource (maturity level 1, trial use) designed for declarative attribute-based access control. The HL7 Data Access Policies Implementation Guide describes an architecture where a middleware layer sits between the client and the FHIR server, evaluating Permission resources to filter responses.
The approach addresses the right problem but is still early. The middleware must intercept every request, evaluate policies, and transform responses before returning them. The Permission resource’s limit element uses a CodeableConcept that cannot yet refer to specific properties inside a resource for field-level redaction (this is noted as an open issue in the IG). The IG is a draft, and production deployments are rare.
3.5 Tag-based and label-based isolation
Some systems use security labels or metadata tags on FHIR resources to mark ownership or sensitivity. The server or gateway filters responses by matching the request’s context against the resource’s label. Labels must be applied before data is loaded. Retrofitting labels to an existing data set requires a migration. And labels have no inherent connection to clinical relationships: the label says “this resource belongs to Tenant A” but does not say “this practitioner has a cardiology role in Tenant A and should see lab results but not mental health notes.”
3.6 The common cost
None of these approaches is inherently wrong. Each solves a piece of the problem. The issue is that field-level, identity-based, query-time filtering with search side-channel protection does not exist as a standard capability in the current FHIR server ecosystem. Teams assemble it from gateways, middleware, batch pipelines, and custom code. The more the filtering logic lives outside the FHIR server and outside the request pipeline that serves data, the more coordination work is required to keep behavior correct across all interfaces, all resource types, and all access paths.
For technical teams, the cost is maintainability: every new resource type, every new API surface, every new client integration must be audited against the filtering logic. For product teams, the cost is slower iteration: adding a new audience tier requires changes to the gateway, the de-identification pipeline, and potentially the bulk export process. For compliance teams, the cost is audit difficulty: the effective access policy is the emergent behavior of scattered code, not a single inspectable configuration.
4. Design Principles
4.1 Express visibility rules as authorization, not as application code
If two users see different fields on the same resource, that difference should be declared in the authorization configuration, not implemented in a controller or resolver.
4.2 Keep filtering in the same pipeline as access control
Property filters should apply after authorization succeeds, inside the same request pipeline. Clients should not need to call a separate “anonymized API.”
4.3 Use FHIR identity as the control point
The caller’s resolved FHIR identity resource (Practitioner, Patient, RelatedPerson, Device) should determine which rules apply. Identity properties, not just role codes, should be available as selection criteria.
4.4 Treat search side-channels as a first-class concern
Response-time field redaction alone does not prevent data inference through search parameters, sort order, or includes. The server must block search vectors that could reveal redacted values.
4.5 Prefer read access over search access for filtered roles
When a role receives property-filtered data, the safest pattern is to grant read and GraphQL read access rather than search access. Search access requires careful blocking of every parameter that could reveal a redacted field.
4.6 Keep rules declarative and ordered
Authorization rules should be inspectable configuration, not scattered code. When the same role needs different filtering for different user subsets, identity filters should select the applicable rule without branching application logic.
5. Fire Arrow’s Authorization Pipeline
Fire Arrow Server uses a rule-based authorization pipeline. Each incoming request passes through a defined sequence of stages:
- Authenticate the caller using a JWT (OAuth 2.0 / OIDC) or API token.
- Resolve identity by mapping the token to a FHIR resource (Patient, Practitioner, RelatedPerson, or Device) via identifier lookup, email fallback, or optional auto-create.
- Build the applicable rule set by matching the client’s role, the requested resource type, and the operation against all configured rules. For each candidate rule, the identity filter (if present) is evaluated against the caller’s resolved FHIR identity resource. If the filter returns false or an empty result, the rule is skipped. All rules whose identity filters pass contribute to the effective access.
- Check blocked parameters for search requests where a matching rule disallows specific search parameters or includes.
- Narrow the search so that only authorized resources can appear in results. On REST, Fire Arrow appends additional search constraints derived from the matching validators. On GraphQL, it builds alternative search parameter maps with OR semantics, each executed independently and merged by resource ID.
- Execute and validate the request using the configured validators.
- Apply property filters to outgoing resources where the matching rules define field-level transformations. This happens after the operation completes, before the response is sent.
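The GraphQL narrowing step in the list above (several alternative parameter maps, each executed independently and merged by resource ID) can be sketched roughly like this. `run_search` and the in-memory data are hypothetical stand-ins; the sketch only illustrates the OR-and-merge idea and is not Fire Arrow’s implementation.

```python
# Hypothetical in-memory data set standing in for the FHIR store.
OBSERVATIONS = [
    {"id": "o1", "subject": "Patient/p1", "performer": "Practitioner/d1"},
    {"id": "o2", "subject": "Patient/p2", "performer": "Practitioner/d2"},
    {"id": "o3", "subject": "Patient/p1", "performer": "Practitioner/d2"},
]


def run_search(params):
    """Return resources matching every key/value pair in params (AND semantics)."""
    return [r for r in OBSERVATIONS if all(r.get(k) == v for k, v in params.items())]


def narrowed_search(alternatives):
    """Run each alternative parameter map on its own and merge by resource ID."""
    merged = {}
    for params in alternatives:
        for resource in run_search(params):
            merged.setdefault(resource["id"], resource)  # de-duplicate by ID
    return list(merged.values())
```

A resource that satisfies more than one alternative (here `o3`) appears once in the merged result.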
Rules are expressed against combinations of:
- client role (Practitioner, Patient, RelatedPerson, Device),
- resource type (Patient, Observation, Condition, etc.),
- operation (read, search, create, update, graphql-read, graphql-search, etc.),
- validator (LegitimateInterest, CareTeam, PatientCompartment, PractitionerCompartment, Allowed, Forbidden, etc.),
with optional constraints:
- practitioner-role-system and practitioner-role-code to restrict a rule to practitioners holding a specific role,
- identity-filter for FHIRPath-based conditions on the caller’s identity resource,
- property-filters for field-level response redaction,
- blocked-search-params and blocked-includes to prevent specific search vectors.
The default validator is Forbidden. If no rule matches a given request, the server denies it.
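As a rough illustration of rule matching with a default-deny fallback, the sketch below models a rule as a small record and returns the validators of all matching rules, or Forbidden when nothing matches. The `Rule` class and `decide` function are hypothetical names for this example; identity filters and the validators themselves are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    client_role: str
    resource: str
    operation: str
    validator: str
    practitioner_role_code: Optional[str] = None  # None means "any role code"


def candidate_rules(rules, client_role, resource, operation, role_codes):
    """Return the rules matching role, resource type, operation, and role code."""
    return [
        r for r in rules
        if r.client_role == client_role
        and r.resource == resource
        and r.operation == operation
        and (r.practitioner_role_code is None or r.practitioner_role_code in role_codes)
    ]


def decide(rules, client_role, resource, operation, role_codes=()):
    """Return validator names of matching rules; default to Forbidden."""
    matches = candidate_rules(rules, client_role, resource, operation, role_codes)
    return [r.validator for r in matches] or ["Forbidden"]
```

With a single read rule for site investigators, a search request by the same user, or a read request by a study sponsor, falls through to `["Forbidden"]`.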
6. Identity Mapping as the Control Point
Fire Arrow resolves the authenticated caller to a FHIR identity resource. That identity then becomes part of the authorization decision. Access rules can be formulated in the same domain model as the data being protected.
The identity resource also makes a distinction available that role codes alone cannot express: precise identity state within a broad role.
A role may indicate that someone is a Practitioner. Within that role, some users may be investigators, some study sponsors, some blinded, some unblinded. Identity filters address this case. An identity filter is a FHIRPath expression attached to a rule. Before the validator runs, Fire Arrow evaluates the expression against the caller’s resolved FHIR identity resource. If the expression returns false or an empty result, the rule is skipped and evaluation moves to the next matching rule.
How identity filter evaluation works
The FHIRPath expression is evaluated with the caller’s identity resource as the root. Standard FHIRPath truthiness applies: an empty collection or a Boolean false is false, and any other non-empty collection is true. If the identity resource has not been resolved (for example, the token does not map to a FHIR resource), the filter returns false and the rule is skipped.
Practical example: blinded vs. unblinded monitors
A sponsor organization might have both blinded monitors (who must not see PHI) and unblinded monitors (who need full access for safety reporting). Both hold the same study-sponsor role code, but they need different anonymization behavior.
Instead of creating separate role codes, tag blinded practitioners with a meta tag on their Practitioner resource. Then use an identity filter to select the applicable rule:
    fire-arrow:
      authorization:
        validation-rules:
          # Blinded sponsor monitors: anonymized access
          - client-role: Practitioner
            resource: Patient
            operation: read
            validator: LegitimateInterest
            practitioner-role-system: http://example.org/trial-roles
            practitioner-role-code: study-sponsor
            identity-filter: "meta.tag.where(system = 'http://example.org/trial-blinding' and code = 'blinded').exists()"
            property-filters:
              - property: "name"
                filter: RandomFilter
              - property: "telecom"
                filter: NullFilter
              - property: "address"
                filter: NullFilter
              - property: "birthDate"
                filter: NullFilter
              - property: "identifier"
                filter: NullFilter
              - property: "photo"
                filter: NullFilter
          # Unblinded sponsor monitors: full access
          - client-role: Practitioner
            resource: Patient
            operation: read
            validator: LegitimateInterest
            practitioner-role-system: http://example.org/trial-roles
            practitioner-role-code: study-sponsor
For a blinded monitor, the tag exists, the identity filter passes, and the rule with property filters applies. For an unblinded monitor, the filter fails, the rule is skipped, and the next matching rule (without property filters) takes effect.
Rules with identity filters should be placed before rules without them. A rule without an identity filter acts as a catch-all for that role/resource/operation combination. If it appears first, the more specific rule is never reached.
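The ordering behavior described in this section can be sketched as a first-match loop. The sketch is an illustration under stated assumptions: `is_blinded` is a plain Python stand-in for the FHIRPath identity filter in the example above, and `RULES` and `select_rule` are hypothetical names, not part of Fire Arrow.

```python
BLINDING_SYSTEM = "http://example.org/trial-blinding"


def is_blinded(identity):
    """Python stand-in for the FHIRPath identity filter in the example above."""
    tags = identity.get("meta", {}).get("tag", [])
    return any(t.get("system") == BLINDING_SYSTEM and t.get("code") == "blinded" for t in tags)


# Ordered rules for the same role/resource/operation combination.
RULES = [
    {"name": "blinded-sponsor", "identity_filter": is_blinded,
     "property_filters": ["name", "telecom", "address", "birthDate", "identifier", "photo"]},
    {"name": "unblinded-sponsor", "identity_filter": None, "property_filters": []},
]


def select_rule(rules, identity):
    """Return the first rule whose identity filter passes (or that has none)."""
    for rule in rules:
        flt = rule["identity_filter"]
        if flt is None:
            return rule  # catch-all for this role/resource/operation
        if identity is not None and flt(identity):
            return rule
        # Filter false, empty, or identity unresolved: skip this rule.
    return None
```

Reversing the list shows the pitfall: with the catch-all first, a blinded monitor is matched by the unfiltered rule and the property filters never apply.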
7. Property Filters in Practice
Once the authorization pipeline determines that the caller may access a resource, property filters can modify the returned resource before it is sent back.
7.1 Available filter types
Two built-in filter types are available:
- NullFilter removes the targeted property entirely. The field does not appear in the response. NullFilter works on any property path the FHIRPath engine can target: structured fields like name or address, free-text fields like note or conclusion, narrative text, or filtered subsets like telecom.where(system = 'phone').
- RandomFilter replaces the targeted property with randomly generated data of the same type. RandomFilter supports two FHIR data types: HumanName (generates a plausible random given and family name) and ContactPoint (generates a random value appropriate to the contact system: phone, email, URL). Other data types are not supported. If RandomFilter is configured for an unsupported type, the server rejects the operation (fail-closed).
Each request generates different random values for RandomFilter-transformed fields. The resource structure is preserved, but the values carry no information.
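To make the two behaviors concrete, here is a minimal sketch that applies them to a Patient represented as a Python dict. It is a simplification under several assumptions: properties are top-level keys rather than FHIRPath expressions, randomization is implemented for HumanName only (not ContactPoint), and the name pools and function names are hypothetical.

```python
import random
from copy import deepcopy

# Hypothetical name pools; a real generator would draw from larger lists.
GIVEN = ["Alex", "Sam", "Robin", "Kim"]
FAMILY = ["Miller", "Lopez", "Nguyen", "Okafor"]


def null_filter(resource, prop):
    """Remove the property entirely so it does not appear in the response."""
    resource.pop(prop, None)


def random_filter(resource, prop):
    """Replace HumanName entries with random ones; reject anything else."""
    if prop != "name":
        # Fail closed: this sketch only knows how to randomize HumanName.
        raise ValueError(f"RandomFilter not supported for property {prop!r}")
    if prop in resource:
        resource[prop] = [
            {"family": random.choice(FAMILY), "given": [random.choice(GIVEN)]}
            for _ in resource[prop]
        ]


FILTERS = {"NullFilter": null_filter, "RandomFilter": random_filter}


def apply_property_filters(resource, config):
    """Apply (property, filter-name) pairs to a copy of the resource."""
    out = deepcopy(resource)
    for prop, filter_name in config:
        FILTERS[filter_name](out, prop)
    return out
```

Because the filters work on a copy, the stored resource is unchanged; only the outgoing representation differs.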
7.2 What to filter
The Patient resource is the primary target for anonymization. It concentrates most direct identifiers:
| Property | PHI category | Recommended filter |
|---|---|---|
| name | Names | RandomFilter |
| telecom | Phone, fax, email | NullFilter |
| address | Geographic data | NullFilter |
| birthDate | Dates | NullFilter |
| identifier | MRN, SSN, insurance numbers | NullFilter |
| photo | Full-face photographs | NullFilter |
RelatedPerson resources carry the same demographic fields and can indirectly identify the patient. Apply the same filter pattern.
7.3 Free-text fields require separate attention
Structured identifiers are only part of the problem. Clinicians regularly type patient names, locations, or other identifying facts into free-text fields: Observation.note, Condition.note, DiagnosticReport.conclusion, or narrative text elements on any resource. A filtering strategy that covers only structured demographic fields will miss identifiers embedded in prose.
NullFilter on free-text fields removes them entirely. There is no built-in NLP-based redaction that selectively removes names from within a narrative while preserving the rest. If a use case requires retaining clinical narratives for filtered roles, that processing belongs in a separate pipeline.
| Resource | Property | Recommended filter |
|---|---|---|
| Observation | note | NullFilter |
| Condition | note | NullFilter |
| DiagnosticReport | conclusion | NullFilter |
| MedicationRequest | note | NullFilter |
| Any resource | text (narrative) | NullFilter |
7.4 Configuration
Property filters are defined per authorization rule. Because read and search are separate operations with separate rules, the same property filter list must appear on both if filtering should apply regardless of how the client retrieves the resource:
    fire-arrow:
      authorization:
        validation-rules:
          - client-role: Practitioner
            resource: Patient
            operation: read
            validator: LegitimateInterest
            practitioner-role-system: http://example.org/trial-roles
            practitioner-role-code: study-sponsor
            property-filters:
              - property: "name"
                filter: RandomFilter
              - property: "telecom"
                filter: NullFilter
              - property: "address"
                filter: NullFilter
              - property: "birthDate"
                filter: NullFilter
              - property: "identifier"
                filter: NullFilter
              - property: "photo"
                filter: NullFilter
          - client-role: Practitioner
            resource: Patient
            operation: search
            validator: LegitimateInterest
            practitioner-role-system: http://example.org/trial-roles
            practitioner-role-code: study-sponsor
            property-filters:
              - property: "name"
                filter: RandomFilter
              - property: "telecom"
                filter: NullFilter
              - property: "address"
                filter: NullFilter
              - property: "birthDate"
                filter: NullFilter
              - property: "identifier"
                filter: NullFilter
              - property: "photo"
                filter: NullFilter
            blocked-search-params:
              - "name"
              - "family"
              - "given"
              - "phonetic"
              - "telecom"
              - "phone"
              - "email"
              - "address"
              - "address-city"
              - "address-state"
              - "address-postalcode"
              - "address-country"
              - "birthdate"
              - "identifier"
blocked-search-params is only needed on search rules. Read operations do not accept search parameters.
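Because the filter list is duplicated across read and search rules, it is easy for the two copies to drift apart. One way a team might guard against that is a small consistency check over the parsed rule list. The sketch below assumes the rules have already been loaded into Python dicts that use the same keys as the YAML above; the function names are hypothetical, and rules that differ only by identity filter are not distinguished.

```python
def filter_signature(rule):
    """Order-independent view of a rule's property filters."""
    return frozenset(
        (f["property"], f["filter"]) for f in rule.get("property-filters", [])
    )


def rule_key(rule):
    """Group rules by role, resource type, and practitioner role code."""
    return (
        rule["client-role"],
        rule["resource"],
        rule.get("practitioner-role-system"),
        rule.get("practitioner-role-code"),
    )


def find_mismatches(rules, operations=("read", "search")):
    """Report rule groups whose property filters differ between operations."""
    by_key = {}
    for rule in rules:
        if rule["operation"] in operations:
            by_key.setdefault(rule_key(rule), {})[rule["operation"]] = filter_signature(rule)
    return [key for key, sigs in by_key.items() if len(set(sigs.values())) > 1]
```

An empty result means every role that has both a read and a search rule on a resource type uses the same set of property filters for both.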
8. Search Side-Channel Protection
Property filters redact response fields, but the underlying data remains fully indexed and searchable. A client could try to infer redacted values through targeted search parameters, sort order, includes, reverse chaining, or filter expressions if those are left open. Each vector must be explicitly blocked.
8.1 The threat
When a property filter removes a patient’s name, a client can still call GET /fhir/Patient?name=Smith. If the search returns results, the client knows a patient named “Smith” exists, even though the name was removed from the response.
Even without a direct search parameter, _sort=name reveals alphabetical ordering of redacted names across patients, grouping identical values and exposing relative rank.
_include follows references. If Patient.generalPractitioner is filtered out, GET /fhir/Patient?_include=Patient:general-practitioner reveals the relationship through the included resource in the response bundle.
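The following toy model shows why response-time redaction alone is not enough. It uses a hypothetical in-memory patient list with simplified flat fields and a search function that redacts names in the output but still honors name-based criteria.

```python
# Simplified, hypothetical patient records (flat fields for brevity).
PATIENTS = [
    {"id": "p1", "name": "Smith", "gender": "female"},
    {"id": "p2", "name": "Adams", "gender": "male"},
    {"id": "p3", "name": "Smith", "gender": "male"},
]


def redact(patient):
    """Response-time redaction: drop the name, keep everything else."""
    return {k: v for k, v in patient.items() if k != "name"}


def unprotected_search(name=None, sort=None):
    """Search that redacts output but still honors name-based criteria."""
    hits = [p for p in PATIENTS if name is None or p["name"] == name]
    if sort == "name":
        hits = sorted(hits, key=lambda p: p["name"])
    return [redact(p) for p in hits]
```

Calling `unprotected_search(name="Smith")` returns two redacted records, which tells the caller that two patients named Smith exist and which IDs they have. Sorting by name places `p2` ahead of `p1` and `p3`, exposing relative order even though no name is ever returned.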
8.2 Blocked search parameters
The blocked-search-params field on a rule lists FHIR search parameter names that must be rejected. When a client uses any of these parameters, the server returns 403 Forbidden. The error message does not reveal why the parameter is blocked, to avoid leaking filter configuration.
The mapping from FHIRPath properties to search parameters is many-to-many and must be listed explicitly. There is no automatic derivation, because an incorrect automatic mapping would create a false sense of security.
| FHIRPath property | Search parameters to block |
|---|---|
| name | name, family, given, phonetic |
| telecom | telecom, phone, email |
| address | address, address-city, address-state, address-postalcode, address-country |
| birthDate | birthdate |
| identifier | identifier |
8.3 Automatic handling of related parameters
When blocked-search-params is configured on a rule, the server also handles several related parameters:
- _sort: each sort key is checked against the blocked list. Sorting by a blocked parameter is rejected. Multi-key sorts are parsed individually; the request fails if any key is blocked.
- _filter, _has, _text, _content: rejected entirely on both REST and GraphQL search requests. These support arbitrary or full-text search criteria that could reference any blocked parameter. The _has reverse-chaining parameter is particularly opaque because it embeds search parameter names deep inside colon-delimited chains (e.g., _has:Patient:link:name=Smith). The server uses a fail-closed approach across all search paths.
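A check along these lines could look like the sketch below. It is an illustration of the described behavior, not Fire Arrow’s code: `is_search_allowed` is a hypothetical function that returns a Boolean instead of producing a 403 response, and chained parameters are not handled.

```python
from urllib.parse import parse_qsl

# Parameters rejected outright once any blocked-search-params are configured.
ALWAYS_REJECTED = ("_filter", "_has", "_text", "_content")


def base_name(param):
    """Strip modifiers: 'name:exact' -> 'name', '_has:Patient:...' -> '_has'."""
    return param.split(":", 1)[0]


def is_search_allowed(query_string, blocked):
    """Return False if the query uses a blocked parameter, directly or via _sort."""
    blocked = set(blocked)
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        name = base_name(key)
        if name in ALWAYS_REJECTED or name in blocked:
            return False
        if name == "_sort":
            # Multi-key sorts are comma-separated; '-' marks descending order.
            for sort_key in value.split(","):
                if sort_key.strip().lstrip("-") in blocked:
                    return False
    return True
```

For example, with `name` and `birthdate` blocked, `gender=female` passes, while `name=Smith`, `_sort=gender,-birthdate`, and any `_has:...` parameter are rejected.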
8.4 Blocked includes
The blocked-includes field lists _include and _revinclude directives in ResourceType:searchparam format. These are needed when property filters target reference fields:
    blocked-includes:
      - "Patient:general-practitioner"
      - "Patient:link"
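Matching a request against such a list can be sketched as below. The function is hypothetical and makes two assumptions that the text above does not state: an optional third component of the include value (the target type) is ignored when comparing, and wildcard includes are rejected to stay fail-closed.

```python
from urllib.parse import parse_qsl


def include_allowed(query_string, blocked_includes):
    """Reject _include/_revinclude values that match a blocked directive."""
    blocked = set(blocked_includes)
    for key, value in parse_qsl(query_string):
        # Handles modifiers such as '_include:iterate' as well.
        if key.split(":", 1)[0] in ("_include", "_revinclude"):
            if "*" in value:
                return False  # assumption: wildcards are rejected (fail-closed)
            # Compare on 'ResourceType:searchparam'; ignore an optional target type.
            directive = ":".join(value.split(":")[:2])
            if directive in blocked:
                return False
    return True
```

With the list above, `_include=Patient:general-practitioner:Practitioner` is rejected, while `_include=Patient:organization` passes.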
8.5 Risk profiles by access pattern
Not all access patterns carry the same risk. The safest approach for roles that receive property-filtered data is to grant only read and graphql-read access:
| Access pattern | Side-channel risk | Additional configuration needed |
|---|---|---|
| read (REST) | None | No |
| graphql-read (inline reference expansion) | None | No |
| HFQL (with blocked-search-params configured) | None (fail-closed) | blocked-search-params triggers rejection of all HFQL queries on the resource type |
| search (REST) | High | blocked-search-params, blocked-includes |
| graphql-search | High (same as REST search) | blocked-search-params, blocked-includes |
GraphQL read is the safest access pattern for clients that need to traverse resource graphs while property filters are active. References are resolved by resource ID (not by client-controlled search arguments), and property filters apply to all resolved resources automatically. The client has no way to inject search criteria into the resolution path.
HFQL takes a fail-closed approach: when blocked-search-params is configured for a resource type, all HFQL queries on that type are rejected outright. HFQL WHERE clauses use FHIRPath expressions rather than search parameter names, making fine-grained analysis complex. The fail-closed behavior prevents any partial leakage.
9. A Concrete Configuration: Clinical Trial with Two Access Tiers
Consider a multi-site clinical trial. Site investigators (treating physicians) need full read/write access to patient records. Study sponsor monitors need read-only access to clinical data, with patient-identifying information removed.
Both roles use the LegitimateInterest validator, so each user only sees data within their organizational scope. The difference is what happens to the data after authorization succeeds: the sponsor’s rules apply property filters that strip identifying information, and their search rules block parameters that correspond to redacted fields.
fire-arrow:
  authorization:
    default-validator: Forbidden
    validation-rules:
      # Site investigators: full clinical access, no anonymization
      - client-role: Practitioner
        resource: Patient
        operation: read
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Patient
        operation: search
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Patient
        operation: update
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Observation
        operation: read
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Observation
        operation: search
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Observation
        operation: create
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Condition
        operation: read
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      - client-role: Practitioner
        resource: Condition
        operation: search
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: site-investigator
      # Study sponsors: read-only access, anonymized patients, notes stripped
      - client-role: Practitioner
        resource: Patient
        operation: read
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: study-sponsor
        property-filters:
          - property: "name"
            filter: RandomFilter
          - property: "telecom"
            filter: NullFilter
          - property: "address"
            filter: NullFilter
          - property: "birthDate"
            filter: NullFilter
          - property: "identifier"
            filter: NullFilter
          - property: "photo"
            filter: NullFilter
      - client-role: Practitioner
        resource: Patient
        operation: search
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: study-sponsor
        property-filters:
          - property: "name"
            filter: RandomFilter
          - property: "telecom"
            filter: NullFilter
          - property: "address"
            filter: NullFilter
          - property: "birthDate"
            filter: NullFilter
          - property: "identifier"
            filter: NullFilter
          - property: "photo"
            filter: NullFilter
        blocked-search-params:
          - "name"
          - "family"
          - "given"
          - "phonetic"
          - "telecom"
          - "phone"
          - "email"
          - "address"
          - "address-city"
          - "address-state"
          - "address-postalcode"
          - "address-country"
          - "birthdate"
          - "identifier"
      - client-role: Practitioner
        resource: Observation
        operation: read
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: study-sponsor
        property-filters:
          - property: "note"
            filter: NullFilter
      - client-role: Practitioner
        resource: Observation
        operation: search
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: study-sponsor
        property-filters:
          - property: "note"
            filter: NullFilter
      - client-role: Practitioner
        resource: Condition
        operation: read
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: study-sponsor
        property-filters:
          - property: "note"
            filter: NullFilter
      - client-role: Practitioner
        resource: Condition
        operation: search
        validator: LegitimateInterest
        practitioner-role-system: http://example.org/trial-roles
        practitioner-role-code: study-sponsor
        property-filters:
          - property: "note"
            filter: NullFilter
      # Identity verification for all practitioners
      - client-role: Practitioner
        resource: Practitioner
        operation: me
        validator: Allowed
When the study sponsor reads a Patient resource, the response has a randomized name and no telecom, address, birthDate, identifier, or photo fields. Gender, managingOrganization, and clinical references remain intact. Clinical resources like Observation are returned with full structured data but without free-text notes.
When the site investigator reads the same Patient, the full unmodified resource is returned.
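To make the two views concrete, the sketch below imitates the effect of the sponsor rules on a sample Patient in plain Python. It is not Fire Arrow's implementation. The function apply_property_filters, the sample resource, and the behavior assumed for NullFilter (drop the property) and RandomFilter (replace the value with a random placeholder) are illustrative assumptions derived from the description above.

```python
import copy
import secrets

# Sponsor filters for Patient, copied from the configuration above.
SPONSOR_PATIENT_FILTERS = [
    ("name", "RandomFilter"),
    ("telecom", "NullFilter"),
    ("address", "NullFilter"),
    ("birthDate", "NullFilter"),
    ("identifier", "NullFilter"),
    ("photo", "NullFilter"),
]


def apply_property_filters(resource, filters):
    """Return a filtered copy of a FHIR resource given as a dict.

    Assumed semantics (illustrative only): NullFilter removes the
    property; RandomFilter replaces it with a random placeholder
    shaped like a HumanName list, since the rules only use it on name.
    """
    view = copy.deepcopy(resource)
    for prop, filter_name in filters:
        if prop not in view:
            continue
        if filter_name == "NullFilter":
            del view[prop]
        elif filter_name == "RandomFilter":
            view[prop] = [{"text": "anon-" + secrets.token_hex(4)}]
        else:
            raise ValueError(f"unknown filter: {filter_name}")
    return view


# Hypothetical sample data, not taken from any real system.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "telecom": [{"system": "phone", "value": "555-0100"}],
    "birthDate": "1950-01-01",
    "gender": "female",
    "managingOrganization": {"reference": "Organization/site-1"},
}

sponsor_view = apply_property_filters(patient, SPONSOR_PATIENT_FILTERS)
investigator_view = apply_property_filters(patient, [])
```

In this sketch the sponsor view keeps id, gender, and managingOrganization, while the investigator view equals the stored resource. The stored resource itself is never modified, which mirrors the query-time nature of the filtering.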
10. Practical Benefits Beyond Anonymization
Anonymization is a valid use case for property filters, but it is not the only one. Several practical patterns are at least as common in day-to-day operations.
10.1 Automated filtering on bulk data extraction
Analytics and machine learning pipelines consume healthcare data at scale. The FHIR Bulk Data Access Implementation Guide defines an export mechanism, but it does not specify how to apply field-level filtering to the exported data. The specification states that “the FHIR server SHALL limit the data returned to only those FHIR resources for which the client is authorized” but delegates all implementation details to the server.
In practice, teams that need filtered bulk exports build a separate pipeline: export the full data set, run it through a de-identification tool, store the result in a second location, and serve it from there. This creates the copy-based delay described earlier.
When the bulk extraction client authenticates against Fire Arrow with a service identity that matches rules with property filters, the filtering applies automatically to every resource returned. The export contains the filtered view from the start, without a separate pipeline, a second storage location, or additional delay. The same authorization rules that govern interactive access also govern batch access.
10.2 Explicit client privilege limitations on a standard FHIR API
A common requirement in healthcare platforms is to expose a FHIR API to external clients (partner applications, patient-facing apps, third-party integrations) while restricting which data those clients can see or search for. The requirement is access scoping, not anonymization: this client may read Observations but may not search Patients by name. This client may see Encounter resources but should not see internal workflow tags.
Without server-native support, teams implement this as gateway logic: parse the request, check the client’s permissions, strip disallowed parameters, filter the response. This works, but every new resource type, every new search parameter, and every new API endpoint must be covered by the gateway. A gap in coverage is a data leak.
Fire Arrow’s blocked-search-params and blocked-includes enforce these restrictions inside the authorization pipeline. A client that is configured with blocked search parameters receives a 403 when it uses them, regardless of which API surface it accesses (REST, GraphQL, HFQL). The restriction is declared once in the rule configuration and enforced consistently across all request paths. External clients can use stock FHIR client libraries and standard FHIR APIs. They do not need custom SDKs or non-standard endpoints. The server enforces the limits.
10.3 Filtering internal system data from external views
Healthcare data models often carry internal metadata that should not be visible to external consumers: administrative tags used for workflow routing, system-generated identifiers used for internal reference tracking, implementation-specific extensions, or operational flags that drive backend logic. These are not PHI. They are implementation details.
Exposing them to external clients creates two problems. First, clients may build dependencies on internal data structures that were never intended to be stable. Second, internal metadata may inadvertently reveal operational information (processing status, internal organization structure, cost center assignments) that the external client should not see.
Property filters can remove these fields per role. An internal practitioner sees the full resource. An external application, authenticating with a different role, receives the resource with internal extensions and administrative tags stripped. The filtering is a configuration change, not a code change in the API layer.
10.4 No second system for a different privilege level
The pattern that runs through all three use cases above is the same: the server serves one data set to multiple audiences through one API, with the filtering rules declared in one configuration. Adding a new audience tier is a new set of authorization rules, not a new system.
For teams building clinical study platforms, digital monitoring products, or multi-site healthcare SaaS applications, this reduces the number of moving parts that need to stay in sync. The cost reduction is primarily in engineering time: the time that would otherwise go into building, maintaining, debugging, and auditing custom filtering code across every access path.
10.5 Debug mode
When a request returns unexpected results, the X-Fire-Arrow-Debug header shows the full rule evaluation trace: every rule that was evaluated, whether the identity filter passed or failed, and which property filters were applied. The output also includes near-miss analysis with hints about why a request was denied (missing PractitionerRole, role code mismatch, identity filter failure, etc.).
Debug mode should not be enabled in production. The output exposes the full authorization configuration, including property filter settings.
11. Boundaries: Anonymization is Practical, Not Complete
Property filters are a practical tool for field-level de-identification, but they are not a complete privacy solution by themselves. Understanding the boundaries helps teams build the right overall architecture.
11.1 Search access must be handled carefully
Property filters redact response fields, but the underlying data remains indexed and searchable. If the anonymized role has search access, teams must explicitly configure blocked-search-params and blocked-includes on every search rule that co-exists with property filters. Missing a single search parameter from the blocked list can allow a client to circumvent anonymization. The safest pattern for anonymized roles is to grant read and graphql-read access only.
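One way to catch a missing entry before deployment is a small lint over the parsed rule list. The sketch below is an illustration, not a Fire Arrow tool. The function find_unblocked_params and the SEARCH_PARAMS_BY_PROPERTY table are assumptions. The table holds only the Patient mapping implied by the configuration in section 9 and would need to be extended by hand for other resource types and for custom search parameters.

```python
# Patient search parameters that can reveal each filtered property.
# Taken from the section 9 example; extend for other resources.
SEARCH_PARAMS_BY_PROPERTY = {
    "Patient": {
        "name": {"name", "family", "given", "phonetic"},
        "telecom": {"telecom", "phone", "email"},
        "address": {"address", "address-city", "address-state",
                    "address-postalcode", "address-country"},
        "birthDate": {"birthdate"},
        "identifier": {"identifier"},
    },
}


def find_unblocked_params(rules):
    """Yield (rule_index, missing_params) for search rules whose
    property filters are not fully covered by blocked-search-params."""
    for index, rule in enumerate(rules):
        if rule.get("operation") != "search":
            continue
        known = SEARCH_PARAMS_BY_PROPERTY.get(rule.get("resource"), {})
        required = set()
        for prop_filter in rule.get("property-filters", []):
            required |= known.get(prop_filter["property"], set())
        missing = required - set(rule.get("blocked-search-params", []))
        if missing:
            yield index, sorted(missing)


rules = [
    {   # Sponsor search rule that forgot to block "phone" and "email".
        "client-role": "Practitioner",
        "resource": "Patient",
        "operation": "search",
        "property-filters": [
            {"property": "name", "filter": "RandomFilter"},
            {"property": "telecom", "filter": "NullFilter"},
        ],
        "blocked-search-params": ["name", "family", "given",
                                  "phonetic", "telecom"],
    },
]

findings = list(find_unblocked_params(rules))
```

For the sample rule, findings reports rule 0 with email and phone missing. The check covers named search parameters only; blocked-includes still needs a separate review.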
11.2 Statistical de-identification is a different problem
Removing direct identifiers does not automatically prevent re-identification through small cohorts, rare clinical combinations, or temporal patterns. If a data set contains only one female patient aged 90+, gender and age range alone may re-identify that patient. If a use case requires k-anonymity-style guarantees or broader statistical disclosure control, that belongs in a higher-level de-identification design, not only in field filtering.
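The small-cohort risk can be checked with a few lines of code run over an already filtered extract. The following is a minimal sketch of a k-anonymity style check, assuming the extract has been flattened into rows. The field names gender and age_band, the function smallest_cohort, and the threshold k = 2 are hypothetical choices for the example.

```python
from collections import Counter


def smallest_cohort(records, quasi_identifiers):
    """Return (size, key) of the smallest group of records that share
    the same values for all quasi-identifier fields."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    key, size = min(groups.items(), key=lambda item: item[1])
    return size, key


# Rows as they might look after direct identifiers were removed.
records = [
    {"gender": "female", "age_band": "60-69"},
    {"gender": "female", "age_band": "60-69"},
    {"gender": "male", "age_band": "60-69"},
    {"gender": "male", "age_band": "60-69"},
    {"gender": "female", "age_band": "90+"},
]

size, key = smallest_cohort(records, ["gender", "age_band"])
k = 2
needs_review = size < k  # True here: one record stands alone
```

Here the smallest group is the single female patient in the 90+ band, so the extract fails even k = 2 although no direct identifier is present. What to do with such a group (suppress, generalize, or accept the risk) is a policy decision outside the scope of field filtering.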
11.3 Free text and coded values still need auditing
Property filters work on the properties they are configured to transform. If identifying information appears in unexpected fields (coded text, narrative blocks, custom extensions), those need to be audited and filtered deliberately. There is no built-in detection of PHI within arbitrary text.
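A simple audit script can help locate such fields before filter rules are written. The sketch below walks a resource and reports the paths of strings that look like email addresses or phone numbers. It is an assumption-laden illustration: the two regular expressions are deliberately crude (the phone pattern also matches dates, for example), the sample Observation is invented, and a match is only a prompt for manual review, not a PHI detector.

```python
import re

# Illustrative patterns only; real audits need a broader, reviewed set.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{6,}\d"),
}


def find_suspect_strings(node, path=""):
    """Walk a FHIR resource (dicts, lists, strings) and yield
    (path, pattern_name) for every string matching a pattern."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            yield from find_suspect_strings(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_suspect_strings(value, f"{path}[{index}]")
    elif isinstance(node, str):
        for name, pattern in PATTERNS.items():
            if pattern.search(node):
                yield path, name


# Hypothetical resource with identifying text in unexpected places.
observation = {
    "resourceType": "Observation",
    "status": "final",
    "text": {"status": "generated",
             "div": "<div>Call patient at 555 010 0199</div>"},
    "extension": [{
        "url": "http://example.org/fhir/contact-hint",
        "valueString": "reach via jane.doe@example.org",
    }],
}

hits = sorted(find_suspect_strings(observation))
```

For the sample resource, hits points at extension[0].valueString and text.div, two properties that the section 9 configuration does not filter.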
11.4 Configuration discipline matters
Because rules are ordered and identity-filtered rules should precede catch-all rules for the same role/resource/operation combination, teams need to understand precedence. Rule hygiene is normal for a declarative authorization model, but it still requires review and testing discipline. The debug header helps verify that rules are evaluated as expected during development.
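The ordering guideline can also be checked mechanically as part of configuration review. The sketch below flags a catch-all rule that is listed before a narrower rule for the same client-role, resource, and operation. How a rule is recognized as narrowed depends on the actual configuration, so the check takes a caller-supplied predicate. The example predicate has_role_code, which treats the presence of practitioner-role-code as the narrowing criterion, is an assumption made only for this illustration, as is the function name find_shadowing.

```python
def find_shadowing(rules, is_narrowed):
    """Yield (catch_all_index, narrowed_index) pairs where a catch-all
    rule is listed before a narrowed rule for the same combination of
    client-role, resource, and operation."""
    first_catch_all = {}
    for index, rule in enumerate(rules):
        key = (rule["client-role"], rule["resource"], rule["operation"])
        if is_narrowed(rule):
            if key in first_catch_all:
                yield first_catch_all[key], index
        else:
            first_catch_all.setdefault(key, index)


def has_role_code(rule):
    # Assumption for this example: a rule counts as narrowed when it
    # names a practitioner-role-code.
    return "practitioner-role-code" in rule


rules = [
    {"client-role": "Practitioner", "resource": "Patient",
     "operation": "read", "validator": "LegitimateInterest"},
    {"client-role": "Practitioner", "resource": "Patient",
     "operation": "read", "validator": "LegitimateInterest",
     "practitioner-role-code": "study-sponsor"},
    {"client-role": "Practitioner", "resource": "Observation",
     "operation": "read", "validator": "LegitimateInterest",
     "practitioner-role-code": "study-sponsor"},
]

problems = list(find_shadowing(rules, has_role_code))
```

In the sample list, the catch-all Patient read rule at index 0 precedes the sponsor rule at index 1, so the pair (0, 1) is reported. The Observation rule has no catch-all in front of it and passes. A lint like this complements the debug header; it does not replace testing against the running server.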
12. When This Pattern Fits
This pattern is especially useful when the following are true:
You need to record comprehensive clinical data while restricting who sees what
Systems need rich, complete data for care delivery, research, and analytics, while HIPAA’s Minimum Necessary principle and GDPR’s data minimization require that each user sees only what their role demands. If this tension is part of the product requirements, server-native field filtering addresses it without building a separate policy layer.
You want one operational source of truth
If the clinical or study system should remain the primary live data store, query-time filtering avoids the need to continuously synchronize a second anonymized serving layer. Copy-based de-identification pipelines are viable for batch analytics, but they introduce delay and a second system to maintain. For use cases where filtered access must be live (sponsor monitoring during an active trial, real-time analytics dashboards, external client APIs), query-time filtering avoids the copy step.
You have multiple client types sharing a common FHIR backend
External applications, internal tools, analytics services, and patient-facing apps may all need access to the same data set through standard FHIR APIs. If these clients need different views (external clients should not see internal tags; analytics clients should not see patient names; patient apps should only see their own data), server-native filtering avoids building a custom gateway per client type.
Your team cannot justify a custom filtering gateway
Building a FHIR API gateway that handles authentication, field-level redaction, search side-channel protection, and multi-surface enforcement (REST, GraphQL, bulk export, subscriptions) is substantial engineering work. For teams where that work is not the core product, server-native filtering is an alternative to building a parallel access control layer.
You want policy close to the data model
When authorization logic is expressed using healthcare-native concepts (patient compartments, practitioner relationships, organization links, care teams, identity properties), it maps directly to the domain the compliance team already understands. A policy model expressed in custom gateway code requires reviewers to trace the mapping from domain concepts to implementation, and requires updates whenever the data model changes.
13. Conclusion
Healthcare systems are under pressure from both sides. Clinical workflows, study protocols, and digital monitoring programs demand comprehensive data capture. Regulations, privacy principles, and practical product requirements demand that each participant sees only what their role justifies. The intersection of these two pressures is the problem this paper addresses.
The current FHIR server ecosystem does not provide a standard answer. HAPI FHIR has no built-in per-user access control. SMART on FHIR scopes do not express field-level visibility. The FHIR Permission resource is still at maturity level 1. Cloud de-identification services create copies of the data set rather than filtering in place. Teams fill these gaps with custom API gateways, middleware layers, and batch export pipelines. The filtering logic ends up distributed across custom code, maintained per endpoint, and difficult to audit.
Fire Arrow Server’s model takes a different approach: keep the permission and filtering logic close to the data model and inside the same request pipeline that serves data. The server authenticates the caller, resolves a FHIR identity, evaluates ordered authorization rules with optional identity filters, narrows search results before they are returned, and applies property filters where the caller’s view must differ from the stored resource. The rules are declared in one configuration. The enforcement is consistent across REST, GraphQL, and HFQL. Search side-channel protections are part of the same pipeline.
The result is not “automatic privacy” and it is not a replacement for every de-identification requirement. Anonymization specifically is a viable but tricky use case that requires careful attention to search side-channels, free-text fields, and statistical re-identification risks. Server-native field filtering also addresses patterns that are less dramatic than anonymization but come up regularly: automated filtering on bulk data extraction for analytics and machine learning pipelines, client privilege limitations on standard FHIR APIs, and hiding internal system data from external views.
For teams building healthcare platforms where multiple audiences need different views of the same data set, the choice is between building a custom filtering layer outside the FHIR server or using a server that handles it natively.
References
- Fire Arrow Docs: Automatic Anonymization of Data. https://docs.firearrow.io/docs/server/how-to/automatic-anonymization/
- Fire Arrow Docs: Authorization Concepts. https://docs.firearrow.io/docs/server/authorization/concepts/
- Fire Arrow Docs: Identity Filters. https://docs.firearrow.io/docs/server/authorization/identity-filters/
- Fire Arrow Docs: Property Filters. https://docs.firearrow.io/docs/server/authorization/property-filters/
- Fire Arrow Docs: Validators. https://docs.firearrow.io/docs/category/validators-1/
- Fire Arrow Docs: Authentication Overview. https://docs.firearrow.io/docs/server/authentication/overview
- Fire Arrow Docs: Configuration Reference. https://docs.firearrow.io/docs/server/configuration
- HIPAA Journal: The HIPAA Minimum Necessary Rule Standard (updated 2026). https://www.hipaajournal.com/ahima-hipaa-minimum-necessary-standard-3481/
- HL7 SMART App Launch Implementation Guide (v2.2.0, STU 2.2). https://hl7.org/fhir/smart-app-launch/
- HL7 FHIR Bulk Data Access Implementation Guide (STU 2). https://hl7.org/fhir/uv/bulkdata/STU2/export.html
- HL7 FHIR R6 Permission Resource (ballot 3). https://hl7.org/fhir/6.0.0-ballot3/permission.html
- HL7 Data Access Policies Implementation Guide (draft). https://build.fhir.org/ig/HL7/data-access-policies/