Point Cloud Allemansrätten


What Does This Architecture Even Look Like?

The term "Point Cloud Allemansrätten" was recently coined by Michael Winston Dales in his "Weeknotes" blog post of April 6, 2026. Dales' approach centers on the Cloud Optimized Point Cloud (COPC) format, and this is a sound technical decision. COPC is a cloud-native profile of LAZ (compressed LAS): a valid LAZ file whose points are clustered into an octree, so clients can lazily fetch only the necessary parts via HTTP range requests rather than downloading the entire file. It is built for object storage, meaning the primary data repository will likely be Amazon S3, Azure Blob Storage, or Google Cloud Storage.
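
To make the format choice concrete: per the COPC 1.0 specification, the octree hierarchy is stored in pages of fixed 32-byte entries, and a client that decodes one page learns exactly which byte ranges to request next. A minimal decoding sketch (in practice the page bytes would arrive via an HTTP range request):

```python
import struct

# Each COPC hierarchy entry is 32 bytes, little-endian:
# VoxelKey (level, x, y, z as int32) + offset (uint64) +
# byteSize (int32) + pointCount (int32).
ENTRY = struct.Struct("<iiiiQii")

def parse_hierarchy_page(buf: bytes):
    """Decode a hierarchy page into (voxel_key, offset, size, count) tuples."""
    entries = []
    for i in range(0, len(buf), ENTRY.size):
        level, x, y, z, offset, byte_size, point_count = ENTRY.unpack_from(buf, i)
        entries.append(((level, x, y, z), offset, byte_size, point_count))
    return entries
```

This is why COPC suits "prodding at the data": the client never needs more than a few kilobytes of hierarchy to locate any region at any level of detail.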

The architecture likely follows a familiar pattern: the client asks a MetadataService (backed by an IndexStore) where the data lives, then fetches COPC segments from ObjectStorage through a CDN.

Here, the MetadataService handles hierarchical access, translating user requests for specific geographic areas or levels of detail into direct URLs for COPC segments in ObjectStorage. A CDN sits in front of the object storage to cache frequently accessed data and reduce latency for geographically dispersed users. This setup is a common and effective pattern for serving large static or semi-static assets, leveraging the CDN's strengths.
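
The MetadataService's job can be sketched in a few lines. Everything here is hypothetical (the index contents, the `resolve` function, the bucket URL); in production the lookup would hit the IndexStore rather than an in-memory dict:

```python
# Hypothetical in-memory index: voxel key (level, x, y, z) -> (offset, size).
# In the real system this lookup would go to the IndexStore (e.g. DynamoDB).
INDEX = {
    (0, 0, 0, 0): (1024, 4096),
    (1, 0, 0, 0): (5120, 2048),
    (1, 1, 0, 0): (7168, 2048),
}

def resolve(keys, base_url="https://cdn.example.com/sweden.copc.laz"):
    """Translate voxel keys into (url, Range header) pairs for the client.

    The client then issues plain HTTP range requests against the CDN,
    keeping the MetadataService out of the data path entirely.
    """
    requests = []
    for key in keys:
        offset, size = INDEX[key]
        requests.append((base_url, f"bytes={offset}-{offset + size - 1}"))
    return requests
```

The key design property: the service returns pointers, not points, so the heavy bytes flow from the CDN rather than through the service.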

Why the CDN Won't Save You From the Load

The "open access" and "right to roam" principles imply high availability and low latency for all users, globally. Lidar data, even optimized, is still massive. A single COPC file representing a large area can be gigabytes. When many users "prod at the data," as Dales puts it, the system is not just serving a few images; it is potentially streaming vast amounts of geometric information.

The immediate bottleneck is the CDN and ObjectStorage egress. While CDNs excel at caching, point cloud data is often highly localized and unique. If users are truly "roaming," they constantly request new, uncached data segments. This scenario presents several challenges:

  1. Cache Miss Storms: A high rate of CDN cache misses means the CDN constantly pulls from ObjectStorage, driving up egress costs and increasing latency for the end-user.
  2. Thundering Herd: Imagine a popular landmark in Sweden. If a thousand users simultaneously decide to "virtually explore" that specific area, their clients will all hit the same COPC segments. Even with a CDN, the initial requests will miss the cache, creating a thundering herd problem on the ObjectStorage and potentially overwhelming the MetadataService if it is not designed for that kind of concurrent load.
  3. Metadata Service Scaling: The MetadataService is the central control point. If it is a serverless function like AWS Lambda, it scales well, but cold starts can introduce latency. If it is backed by a database like DynamoDB, read capacity units require careful provisioning to handle bursts of requests for hierarchical lookups. An under-provisioned IndexStore will quickly become the single point of failure. In a project I worked on for a national mapping agency, we observed this exact failure pattern in similar geospatial data services, where an unexpected spike in regional interest could render the entire system unresponsive for hours.
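
The standard mitigation for the thundering herd is request coalescing ("single-flight"): when many concurrent requests ask for the same uncached segment, exactly one goes to the origin and the rest wait for its result. A minimal sketch, not tied to any particular CDN or library:

```python
import threading

class SingleFlight:
    """Coalesce concurrent fetches of the same key into one origin request."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}   # key -> Event the followers wait on
        self._results = {}    # key -> cached result

    def fetch(self, key, origin_fetch):
        with self._lock:
            if key in self._results:          # already cached: no origin hit
                return self._results[key]
            event = self._inflight.get(key)
            if event is None:                 # first caller becomes the leader
                event = threading.Event()
                self._inflight[key] = event
                leader = True
            else:
                leader = False
        if leader:
            result = origin_fetch(key)        # the ONE request to ObjectStorage
            with self._lock:
                self._results[key] = result
                del self._inflight[key]
            event.set()
            return result
        event.wait()                          # followers ride the leader's fetch
        return self._results[key]
```

A thousand simultaneous viewers of the same landmark then cost one origin read, not a thousand.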

Beyond raw bandwidth, the critical factor is the pattern of access. Random access across a vast dataset is much harder to optimize than sequential access or highly localized requests.

The Inevitable Trade-off: Consistency vs. Availability

A fundamental principle in distributed systems is Brewer's Theorem (the CAP theorem): when a network partition occurs, a system must sacrifice either consistency or availability; it cannot guarantee both. For "Point Cloud Allemansrätten," the priority is clear: availability. Without data access, the "right to roam" principle is void.

This means the system will likely operate under an Eventual Consistency model. When new lidar data is ingested and processed into COPC files, it will not appear instantly for all users globally. There will be a propagation delay as CDNs update their caches and IndexStore entries replicate across regions. For virtual exploration, this is generally acceptable. A few minutes or even hours of delay for new data is not a deal-breaker.
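
The behavior users would actually observe can be modeled in a few lines. This toy index (class names and region labels are illustrative) shows a write landing in one region and only reaching the other after replication runs:

```python
class Replica:
    """A region-local copy of the IndexStore."""
    def __init__(self):
        self.data = {}

class EventuallyConsistentIndex:
    """Writes land on one replica; others catch up only when replicate() runs,
    standing in for asynchronous cross-region replication."""

    def __init__(self, regions):
        self.replicas = {r: Replica() for r in regions}
        self.log = []   # ordered write log, the source of truth for catch-up

    def write(self, region, key, value):
        self.replicas[region].data[key] = value
        self.log.append((key, value))

    def read(self, region, key):
        return self.replicas[region].data.get(key)   # may be stale

    def replicate(self):
        for replica in self.replicas.values():
            for key, value in self.log:
                replica.data[key] = value
```

Between `write` and `replicate`, a reader in the other region simply does not see the new data, which is exactly the "few minutes of delay" the text accepts.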

However, if the system were to incorporate user-generated content—say, annotations or virtual markers—then the consistency requirements would shift. If I place a marker, I expect to see it immediately, and I expect my friend to see it immediately. This would force a more complex consistency model, potentially requiring distributed transactions or CRDTs, which adds significant overhead and complexity.
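
To show what a CRDT buys here: a last-writer-wins register converges without coordination, because merging is commutative and idempotent. A minimal sketch (the marker values are hypothetical; real systems would use hybrid logical clocks rather than bare integers):

```python
class LWWRegister:
    """Last-writer-wins register: merging two replicas keeps the newest write.

    Ties on timestamp are broken by node id so every replica resolves
    a conflict identically, guaranteeing convergence.
    """

    def __init__(self, value=None, timestamp=0, node=""):
        self.value, self.timestamp, self.node = value, timestamp, node

    def set(self, value, timestamp, node):
        if (timestamp, node) > (self.timestamp, self.node):
            self.value, self.timestamp, self.node = value, timestamp, node

    def merge(self, other):
        """Commutative, associative, idempotent: safe to apply in any order."""
        self.set(other.value, other.timestamp, other.node)
```

The cost the text alludes to is real, though: "last writer wins" silently discards one of two concurrent annotations, so richer user content would need richer CRDTs or explicit conflict handling.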

Ultimately, balancing the cost of high availability with user experience is the core trade-off. If infrastructure is insufficient, the result is a slow, unreliable experience, which defeats the purpose of "open access."

How to Build a Truly Open Point Cloud

To realize "Point Cloud Allemansrätten," the system must be designed for extreme scale from the outset. Key priorities for an architecture review include:

Aggressive caching via a global CDN is a fundamental requirement. Configure the CDN to cache COPC segments aggressively, even for short durations, to absorb request spikes. Edge compute capabilities, such as CloudFront Functions or Lambda@Edge, can dynamically route requests or pre-fetch adjacent data based on user movement patterns, further reducing perceived latency.
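
One concrete knob is the Cache-Control header attached to each COPC segment at upload time. A plausible policy (this is an illustrative sketch, not a CloudFront API call): coarse octree levels are requested by nearly every session and deserve long TTLs, while deep leaves are so localized that long TTLs just fill the cache with segments nobody revisits.

```python
def cache_control(level: int) -> str:
    """Pick a Cache-Control header for a COPC segment by octree depth.

    Thresholds here are assumptions to be tuned against real traffic.
    """
    if level <= 2:
        return "public, max-age=86400"   # overviews: cache for a day
    if level <= 6:
        return "public, max-age=3600"    # mid-levels: one hour
    return "public, max-age=300"         # deep leaves: five minutes
```

Even the five-minute floor matters: during a spike, it converts thousands of origin reads into one.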

Stateless operation for the MetadataService is key to achieving scalability. Serverless functions, like AWS Lambda or Google Cloud Functions, provide the necessary elasticity to scale out and handle millions of requests without server management overhead.
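
Statelessness here means the handler keeps nothing between invocations, so any instance can serve any request. A Lambda-style sketch (the event shape and segment path scheme are hypothetical; a real handler would consult the IndexStore instead of building the path inline):

```python
import json

def handler(event, context=None):
    """AWS Lambda-style entry point: no instance state, so the platform can
    scale copies out and in freely."""
    params = event.get("queryStringParameters") or {}
    level = int(params.get("level", 0))
    tile = params.get("tile", "0-0-0")
    # Deterministic mapping from request to COPC segment path; the real
    # lookup would query the IndexStore.
    body = {"segment": f"copc/{level}/{tile}.bin"}
    return {"statusCode": 200, "body": json.dumps(body)}
```

Because the response depends only on the input, the function is also trivially cacheable at the edge, which compounds the CDN strategy above.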

For the IndexStore, a DynamoDB single-table design offers predictable performance even under millions of concurrent lookups. All metadata—geographic bounds, COPC file pointers, and hierarchical relationships—can reside in a single table. Carefully chosen partition and sort keys enable efficient lookups for specific areas and zoom levels, even during traffic bursts.
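
The heart of a single-table design is the key scheme. One plausible layout (illustrative, not from Dales' post): partition on a coarse geographic tile so lookups for one area hit one partition, and sort by level plus voxel so a single Query can fetch one zoom level of that area.

```python
def item_keys(region_tile: str, level: int, voxel: tuple) -> dict:
    """Build single-table partition/sort keys for one COPC segment record.

    region_tile might be a geohash cell; zero-padding the level keeps
    sort-key ordering lexicographic, so range queries by level work.
    """
    x, y, z = voxel
    return {
        "pk": f"TILE#{region_tile}",
        "sk": f"LOD#{level:02d}#VOXEL#{x}-{y}-{z}",
    }
```

A query for `pk = TILE#u6sc AND begins_with(sk, "LOD#03")` then returns every level-3 segment in that area in one round trip.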

The client application must move beyond purely reactive data requests: it should predict user movement and pre-fetch adjacent COPC segments, significantly reducing perceived latency.
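
The simplest predictive policy just extrapolates the viewer's heading. A sketch in 2D tile space (the tile grid and lookahead depth are assumptions; a real client would work in octree voxels and weight by speed):

```python
def prefetch_tiles(current, velocity, lookahead=2):
    """Return the tiles to fetch before the viewer arrives.

    current: (x, y) tile the viewer occupies.
    velocity: (dx, dy) movement vector; only its direction is used here.
    """
    x, y = current
    step_x = (velocity[0] > 0) - (velocity[0] < 0)   # sign of movement
    step_y = (velocity[1] > 0) - (velocity[1] < 0)
    tiles = [(x + step_x * i, y + step_y * i) for i in range(1, lookahead + 1)]
    return [t for t in tiles if t != current]        # stationary: fetch nothing
```

Issued a second or two early, these fetches hide origin latency entirely, and they also warm the CDN for the next user headed the same way.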

While the primary focus is on read performance, idempotency in the ingestion pipeline for new lidar data is critical. If a COPC conversion fails mid-process, or an IndexStore update times out, retrying the operation must not introduce duplicate data or corrupted indices. Each step in the ingestion process must tolerate safe re-execution without side effects.
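
The usual building block for this is a conditional write keyed by the segment's identity, mimicking DynamoDB's attribute_not_exists condition. A toy sketch (the class and segment ids are illustrative):

```python
class IndexStore:
    """Toy index with a conditional put: the write succeeds only if the
    key is absent, so a retry of a completed write is a no-op."""

    def __init__(self):
        self.items = {}

    def put_if_absent(self, key, value):
        if key in self.items:
            return False   # a retry arrived after the first write succeeded
        self.items[key] = value
        return True

def ingest(store, segment_id, metadata):
    """Safe to re-run: re-ingesting the same segment changes nothing."""
    return store.put_if_absent(segment_id, metadata)
```

With every pipeline step written this way, a timeout can always be answered with a blind retry instead of a manual cleanup.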

Availability is only half the battle; its usability under the anticipated load of millions of users is the other. The vision of "prodding at the data and asking questions" only materializes if the system withstands the load generated by those inquiries. Current architectural decisions will determine if "Point Cloud Allemansrätten" remains a theoretical concept or evolves into a truly impactful, widely used resource. This is a solvable problem, requiring a deep understanding of distributed systems and a design that anticipates future demands, not merely present-day functionality.

Dr. Elena Vosk
specializes in large-scale distributed systems. Obsessed with CAP theorem and data consistency.