How to Structure Role-Based Access Control in a Vector Database for Teams with Mixed Read and Write Responsibilities

Most teams that use a vector database have members with different access needs. Data engineers need to create and update collections. Analysts and other consumers need to query data but should not modify schema or delete records. Tenant administrators need to manage tenant lifecycle within defined boundaries. Operators need full control.

Without a permission model that reflects these differences, teams face a familiar set of tradeoffs: grant everyone broad access and accept the risk, or gate everything behind a single admin who becomes a bottleneck. Neither is sustainable.

Role-based access control (RBAC) addresses this by separating what users can do from who they are. Roles describe a set of permissions. Users are assigned to roles. When a user’s responsibilities change, the role assignment changes — the permission definition stays stable.
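
The separation between roles and users can be sketched in a few lines of plain Python. This is only an illustrative model, not any database's API: the role names, the permission strings such as "data:write", and the is_allowed helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Set

# Hypothetical role and permission names, for illustration only.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "analyst": frozenset({"data:read", "schema:read"}),
    "data_engineer": frozenset({"data:read", "data:write", "schema:read"}),
}


@dataclass
class User:
    name: str
    roles: Set[str] = field(default_factory=set)


def is_allowed(user: User, permission: str) -> bool:
    """Return True if any role assigned to the user grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, frozenset())
        for role in user.roles
    )


alice = User("alice", {"analyst"})
print(is_allowed(alice, "data:write"))  # False: analysts cannot write

# Responsibilities change: swap the assignment, leave the definitions alone.
alice.roles = {"data_engineer"}
print(is_allowed(alice, "data:write"))  # True
```

Note that ROLE_PERMISSIONS is never edited when alice changes jobs; only her assignment is.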

Why Mixed Read/Write Teams Need Custom Roles

Most platforms ship with a small set of predefined roles covering the extremes: a full-access admin role and a read-only viewer role. These are useful starting points but inadequate for real teams.

A data engineer needs write access to the collections they manage, but not to the rest of the system. An analyst needs read access to query data, but granting them the full admin role to get that access exposes schema modification and deletion they should never use. The predefined roles are too coarse for the nuance real teams require.

Custom roles close this gap. They are defined in terms of specific operations on specific resources, rather than blanket access levels. A custom role for a data engineer can grant write access to the collections they own and nothing else. A custom role for an analyst can grant read-only access without touching schema permissions.

Designing Roles Around Job Functions

The most maintainable RBAC design maps roles to job functions, not to individuals. A role named after a function — data_engineer, analyst, ingestion_service — is assigned to every person or service that performs that function. When someone joins a team, they get the role. When they leave, the role is revoked. The role definition itself rarely changes.

Roles designed around individuals, such as alice_role or bobs_key, force you to duplicate permission sets, and the copies diverge over time. Two engineers doing the same job end up with subtly different roles, and every permission change has to be applied to both.

The key questions when designing a role:
– Which resources does this role need to access?
– Does it need to read data, write data, modify schema, or some combination?
– Should it be scoped to specific collections, or all collections?
– Does it manage other roles or create users?
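
One way to make these questions concrete is to record the answers in a small structure before touching the database. The sketch below is a planning aid under assumed names (RoleSpec, review_flags, and the example collection pattern are hypothetical), not part of any client library.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RoleSpec:
    """Answers to the four design questions for one job function."""
    name: str
    collections: Tuple[str, ...]  # which resources: exact names or patterns
    read_data: bool = False
    write_data: bool = False
    modify_schema: bool = False
    manages_roles: bool = False   # can it create users or edit roles?

    def review_flags(self) -> List[str]:
        """List the choices that deserve a second look before rollout."""
        flags = []
        if "*" in self.collections:
            flags.append("scoped to all collections")
        if self.manages_roles:
            flags.append("can manage roles or users")
        return flags


analyst = RoleSpec("analyst", ("Reports*",), read_data=True)
platform_admin = RoleSpec(
    "platform_admin", ("*",),
    read_data=True, write_data=True, modify_schema=True, manages_roles=True,
)

print(analyst.review_flags())         # []
print(platform_admin.review_flags())  # ['scoped to all collections', 'can manage roles or users']
```

Writing the spec down first keeps the role tied to a function rather than to whoever happened to request access.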

Separating Schema and Data Permissions

A subtlety that matters in practice: schema permissions and data permissions are often independent. A role can have permission to read and write data objects without having permission to modify the collection schema — the structure of the data. A role can have permission to read schema configuration without being able to create or delete collections.

This separation matters for two reasons. First, engineers who are actively ingesting data can be given write access to objects without the ability to accidentally alter the underlying schema. Second, auditors and read-heavy services can be given access to configuration details (which collections exist, what their structure is) without being able to touch the data itself.

When designing custom roles, think separately about what schema operations a role needs and what data operations it needs. Do not conflate them.
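
A minimal sketch of this idea models schema operations and data operations as two independent sets of flags. The enum members and the Role class are hypothetical names chosen for illustration; they only loosely mirror the operations discussed here.

```python
from dataclasses import dataclass
from enum import Flag, auto


class SchemaOp(Flag):
    NONE = 0
    READ_CONFIG = auto()
    CREATE = auto()
    UPDATE_CONFIG = auto()
    DELETE = auto()


class DataOp(Flag):
    NONE = 0
    READ = auto()
    CREATE = auto()
    UPDATE = auto()
    DELETE = auto()


@dataclass(frozen=True)
class Role:
    name: str
    schema: SchemaOp = SchemaOp.NONE  # what the role may do to structure
    data: DataOp = DataOp.NONE        # what the role may do to objects


# Writes objects, but can only look at the schema.
ingester = Role(
    "ingestion_service",
    schema=SchemaOp.READ_CONFIG,
    data=DataOp.READ | DataOp.CREATE | DataOp.UPDATE,
)
# Sees configuration, never the data.
auditor = Role("auditor", schema=SchemaOp.READ_CONFIG)

print(DataOp.CREATE in ingester.data)             # True
print(SchemaOp.UPDATE_CONFIG in ingester.schema)  # False
print(DataOp.READ in auditor.data)                # False
```

Keeping the two fields separate makes it hard to grant schema rights by accident while widening data access.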

Wildcard Scoping for Scalable Role Definitions

When teams have many collections or collections that are provisioned dynamically — per-customer, per-project, per-environment — enumerating each collection in a role definition is impractical. Role definitions that list specific collections need to be updated every time a new collection is added.

Wildcard patterns solve this. A role definition that scopes to ProjectA* covers every collection with that prefix, including ones that do not exist yet. This makes roles forward-compatible: the role is defined once and automatically applies to new collections that match the pattern.

The tradeoff is that wildcard roles are broader than they appear. A pattern like * covers everything. Even a narrower pattern like Customer* covers more than intended if new collections are added with that prefix for unrelated reasons. Naming conventions for collections matter when you are relying on wildcard scoping.
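
The over-matching risk is easy to reproduce. The sketch below uses Python's fnmatch module as a stand-in for pattern matching; this is an assumption for illustration, since a given database may implement wildcard semantics differently. The collection names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List

# Hypothetical collection names. The third was added later by another team.
collections = [
    "Customer_Orders",
    "Customer_Profiles",
    "CustomerSupportBotLogs",
    "InternalMetrics",
]


def covered_by(pattern: str, names: List[str]) -> List[str]:
    """Return the names that a shell-style wildcard pattern would match."""
    return [name for name in names if fnmatchcase(name, pattern)]


print(covered_by("Customer*", collections))
# ['Customer_Orders', 'Customer_Profiles', 'CustomerSupportBotLogs']
print(covered_by("Customer_*", collections))
# ['Customer_Orders', 'Customer_Profiles']
print(len(covered_by("*", collections)))  # 4
```

A naming convention with an explicit separator, such as the underscore here, lets the pattern stay narrow as unrelated collections appear.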

Role Assignment and User Lifecycle

Roles are only as useful as the process for assigning and revoking them. A role design that is precise and well-structured provides no access control benefit if users are routinely assigned roles that do not match their function.

A healthy role assignment process has three properties:

Assignment is tied to onboarding. When a developer joins a project or a service account is created, role assignment is part of the provisioning step — not something that happens informally later.

Revocation is tied to offboarding. When a developer leaves a project or a service is decommissioned, the credential is revoked as part of the offboarding step. Access does not persist because nobody remembered to revoke it.

Minimum necessary access is the default. Start with the least permissive role that allows the user or service to do their work. Add permissions when there is a demonstrated need, not in advance.
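
These three properties can be expressed as a small provisioning helper. The sketch below is an in-memory stand-in for wherever assignments are really stored; AccessRegistry and its method names are hypothetical, and the role names are reused from this article purely as labels.

```python
from typing import Dict, Optional, Set


class AccessRegistry:
    """In-memory stand-in for the system that stores role assignments."""

    def __init__(self, default_role: str) -> None:
        self.default_role = default_role  # the least permissive role
        self._assignments: Dict[str, Set[str]] = {}

    def onboard(self, principal: str, role: Optional[str] = None) -> None:
        """Provisioning step: every new principal starts with exactly one role."""
        self._assignments[principal] = {role or self.default_role}

    def grant(self, principal: str, role: str, reason: str) -> None:
        """Widen access only for a stated need; principal must be onboarded."""
        if not reason.strip():
            raise ValueError("a reason is required to widen access")
        self._assignments[principal].add(role)

    def offboard(self, principal: str) -> None:
        """Offboarding step: remove every role so no access lingers."""
        self._assignments.pop(principal, None)

    def roles_of(self, principal: str) -> Set[str]:
        return set(self._assignments.get(principal, set()))


registry = AccessRegistry(default_role="viewer_role")
registry.onboard("new-analyst")                 # gets the default role
registry.onboard("ingest-bot", role="rw_role")
registry.grant("new-analyst", "tenant_manager", reason="owns tenant cleanup")
registry.offboard("ingest-bot")

print(sorted(registry.roles_of("new-analyst")))  # ['tenant_manager', 'viewer_role']
print(registry.roles_of("ingest-bot"))           # set()
```

Requiring a reason on grant is one simple way to leave a trail for later access reviews.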

Weaviate: Structuring RBAC for Mixed Read/Write Teams

Predefined Roles as a Baseline

Weaviate provides two predefined roles that cannot be modified:

root — full access to all resources
viewer — read-only access to all resources

These are the extremes. Most teams with mixed responsibilities need custom roles that sit between them.

Defining a Read/Write Role

For data engineers or services that manage collections and write data:

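from weaviate.classes.rbac import Permissions

# Assumes `client` is an already-connected Weaviate client authenticated
# as a user who is allowed to manage roles.
permissions = [
    # Schema-level operations on every matching collection
    Permissions.collections(
        collection="TargetCollection*",
        create_collection=True,
        read_config=True,
        update_config=True,
        delete_collection=True,
    ),
    # Object-level operations; deleting individual objects is withheld
    Permissions.data(
        collection="TargetCollection*",
        create=True,
        read=True,
        update=True,
        delete=False,
    ),
    # Backups, plus read access to node and cluster metadata
    Permissions.backup(collection="TargetCollection*", manage=True),
    Permissions.Nodes.verbose(collection="TargetCollection*", read=True),
    Permissions.cluster(read=True),
]

client.roles.create(role_name="rw_role", permissions=permissions)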

Notice that collection permissions and data permissions are separate. The collections block governs schema operations; the data block governs object operations. Setting delete=False on data objects prevents the role from deleting individual records. Note that delete_collection=True still lets the role drop an entire collection, which removes every object stored in it, so set delete_collection=False as well if the role should never be able to destroy data.

The TargetCollection* wildcard covers all collections with that prefix, including ones created after the role is defined.

Defining a Read-Only Role

For analysts, auditors, or consumers that only query data:

permissions = [
    Permissions.collections(
        collection="TargetCollection*",
        read_config=True,
    ),
    Permissions.data(collection="TargetCollection*", read=True),
]

client.roles.create(role_name="viewer_role", permissions=permissions)

This role cannot modify schema, cannot write data, and cannot delete anything. It is safe to assign broadly across team members who need query access.

Defining a Tenant Manager Role

For multi-tenant setups where some team members manage tenant lifecycle within specific collections:

permissions = [
    Permissions.tenants(
        collection="TargetCollection*",
        tenant="TargetTenant*",
        create=True,
        read=True,
        update=True,
        delete=True,
    ),
    Permissions.data(
        collection="TargetCollection*",
        tenant="TargetTenant*",
        create=True,
        read=True,
        update=True,
        delete=True,
    ),
]

client.roles.create(role_name="tenant_manager", permissions=permissions)

The tenants permission block governs tenant lifecycle (create, activate, deactivate, delete). The data block scoped to a specific tenant pattern governs object access within those tenants. Both use wildcard patterns so the role applies to new tenants as they are provisioned.

Assigning Roles to Users

Once roles are defined, creating a user and assigning the appropriate role is a two-step operation:

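# Create a database user; the return value is that user's API key.
user_api_key = client.users.db.create(user_id="custom-user")
# Attach the role that matches the user's job function.
client.users.db.assign_roles(user_id="custom-user", role_names=["rw_role"])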

The API key returned by create is the credential the user or service will use. The role assignment determines what that credential can do.
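
The offboarding counterpart can be sketched as a small helper. It assumes the Python client exposes revoke_roles and delete on client.users.db with the keyword arguments shown; check these names against the client version you have installed. The helper name offboard_db_user is hypothetical.

```python
from typing import Any, Sequence


def offboard_db_user(client: Any, user_id: str, role_names: Sequence[str]) -> None:
    """Revoke the user's roles, then delete the user itself.

    `client` is expected to be a connected Weaviate client. The two calls
    below are assumed client methods, not verified here.
    """
    # Remove permissions first so the credential is harmless even if the
    # delete step fails partway through.
    client.users.db.revoke_roles(user_id=user_id, role_names=list(role_names))
    # Deleting the user is expected to retire its API key as well.
    client.users.db.delete(user_id=user_id)


# Example call (requires a live client):
# offboard_db_user(client, "custom-user", ["rw_role"])
```

Running both steps from one function makes it harder for a decommissioned service to keep a working key.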

Role Design Summary

Team Function	Recommended Role	Key Permissions
Data engineers	Custom `rw_role`	Schema management plus create/read/update on data, scoped to matching collections
Analysts / auditors	Custom `viewer_role`	Read-only on schema config and data, scoped to matching collections
Tenant administrators	Custom `tenant_manager`	Full tenant + data access, scoped by collection and tenant pattern
Operators	Predefined `root`	Full access

Important Design Notes

Collection and data permissions are independent. You can grant data read/write without granting schema modification rights. Model these separately when designing roles.

Wildcard patterns are forward-compatible but broad. Scope patterns carefully based on collection naming conventions. A poorly chosen wildcard can grant unintended access as the collection space grows.

Predefined roles cannot be modified. root and viewer are stable baselines. Custom roles are the mechanism for everything in between.

Role management permissions carry escalation risk. A role that can create or modify other roles can in principle grant itself or others elevated access. Only grant role management permissions to roles that genuinely administer the permission system.
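
A periodic review can flag roles that carry this risk. The sketch below works on a plain dictionary of role names to permission strings; the permission labels and the ROLE_MANAGEMENT set are hypothetical and would need to be mapped onto whatever your platform actually reports.

```python
from typing import Dict, List, Set

# Hypothetical labels for permissions that can change who may do what.
ROLE_MANAGEMENT: Set[str] = {
    "roles:create",
    "roles:update",
    "roles:delete",
    "users:assign_roles",
}


def escalation_capable(roles: Dict[str, Set[str]]) -> List[str]:
    """Return, sorted, the roles holding any role-management permission."""
    return sorted(name for name, perms in roles.items() if perms & ROLE_MANAGEMENT)


roles = {
    "rw_role": {"data:create", "data:read", "data:update", "schema:update"},
    "viewer_role": {"data:read", "schema:read"},
    "team_lead": {"data:read", "users:assign_roles"},
    "rbac_admin": {"roles:create", "roles:update", "roles:delete"},
}

print(escalation_capable(roles))  # ['rbac_admin', 'team_lead']
```

Here team_lead looks harmless at first glance, yet its ability to assign roles is enough to warrant the same scrutiny as rbac_admin.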