Trillions of points

Massive point clouds as infrastructure

Connor Manning

What is a point cloud?

iowa-bridge
University Ave., Cedar Falls, IA
Data source: Iowa DOT

Magnitudes of data

US Interstates - terrestrial

  • 46,876 miles
  • With 500 m buffer at 15 points/m²:
    • 850 billion points
    • 15.4 TB uncompressed

Magnitudes of data

All of USA - aerial

  • 3.8 million mi²
  • At 8 points/m² (USGS QL1):
    • 79 trillion points
    • 1.4 PB uncompressed
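
The aerial figures above can be reproduced with a few lines of arithmetic. This is a sketch under one loud assumption: the slides do not state a record size, and 20 bytes per point (the record length of LAS point format 0) is used here only because it lands close to the quoted sizes when they are read as binary units (TiB, PiB).

```python
SQ_M_PER_SQ_MILE = 1609.344 ** 2  # 1 mile is exactly 1609.344 m
BYTES_PER_POINT = 20              # ASSUMPTION: LAS point format 0 record size


def point_count(area_sq_miles, points_per_sq_m):
    """Number of points covering an area at a given density."""
    return area_sq_miles * SQ_M_PER_SQ_MILE * points_per_sq_m


def binary_size(points, power_of_two):
    """Uncompressed size in units of 2**power_of_two bytes (40 = TiB, 50 = PiB)."""
    return points * BYTES_PER_POINT / 2 ** power_of_two


usa_points = point_count(3.8e6, 8)  # all of USA at USGS QL1 density
print(f"USA aerial:  {usa_points / 1e12:.0f} trillion points")
print(f"USA aerial:  {binary_size(usa_points, 50):.1f} PiB uncompressed")
print(f"Interstates: {binary_size(850e9, 40):.1f} TiB uncompressed")
```

Under that assumption the aerial numbers come out at about 79 trillion points and 1.4 PiB, and the 850 billion interstate points come out within a tenth of the 15.4 TB quoted above.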

Tiles!

ahn-tiles-close
Netherlands
Data source: AHN

Tiles

ahn-bucket
Netherlands bucket explorer

Using tiles

  • Physically difficult
  • Logically difficult
  • Visualization only in pieces
red-rocks
Red Rocks Amphitheatre
Data source: DroneMapper

Spatial indexing

Re-organization of data

Entwine & Greyhound

Entwine

Spatial indexing

Built with:

PDAL

Entwine

Spatial indexing

  • entwine build
    • -i ~/data/boston/
    • -o ~/entwine/boston/
    • -r EPSG:3857
    • -t 12
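
Scripting the same build from Python might look like the sketch below. It only assembles the argument list from the flags on the slide. The helper name is hypothetical, and reading -r as the output reprojection and -t as the thread count is an interpretation of the slide, not something it states.

```python
import os


def entwine_build_args(input_dir, output_dir, srs, threads):
    """Assemble the `entwine build` argument list shown on the slide."""
    return [
        "entwine", "build",
        # No shell is involved when passing a list, so expand "~" ourselves.
        "-i", os.path.expanduser(input_dir),
        "-o", os.path.expanduser(output_dir),
        "-r", srs,
        "-t", str(threads),
    ]


args = entwine_build_args("~/data/boston/", "~/entwine/boston/", "EPSG:3857", 12)
print(" ".join(args))
# To actually run it (assumes entwine is installed and on the PATH):
#   import subprocess; subprocess.run(args, check=True)
```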

Entwine

AWS, Google, Dropbox, laptop

Greyhound

Server piece

  • HTTP(S) server
  • User-driven spatial queries
  • JSON communication
  • Binary point data

Entwine & Greyhound

  • A web-services solution
  • Lossless
  • Cloud-scalable
  • Visualization-first mindset
st-helens
Mt. St. Helens
Data source: Hobu, Inc.

Greyhound

Core API

  • /info
  • /files
  • /read

Read query


            /read?
                bounds=[1250,-750,0,1500,-500,250]&
                depthBegin=14&
                depthEnd=15&
                scale=0.1&
                offset=[637300,851210,520]
        

Read query


            /read?
                bounds=[1250,-750,0,1500,-500,250]&
                depthBegin=14&
                depthEnd=15&
                scale=0.1&
                offset=[637300,851210,520]&
                schema=[{"name":"X","type":"uint32"}, ...]
        

Read query


            /read?
                bounds=[1250,-750,0,1500,-500,250]&
                depthBegin=14&
                depthEnd=15&
                scale=0.1&
                offset=[637300,851210,520]&
                schema=[{"name":"X","type":"uint32"}, ...]&
                filter={"Classification":{"$in":[2,3]}}
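
A minimal sketch of assembling such a query string from Python. The endpoint URL is hypothetical, and serializing each non-string parameter as compact JSON before percent-encoding is an assumption about what the server accepts, not documented Greyhound behavior. The elided schema is left out.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def read_url(endpoint, **params):
    """Build a /read URL; non-string values are serialized as compact JSON."""
    encoded = {
        key: value if isinstance(value, str)
        else json.dumps(value, separators=(",", ":"))
        for key, value in params.items()
    }
    return endpoint + "?" + urlencode(encoded)


# Hypothetical endpoint; the parameter values are the ones on the slide.
url = read_url(
    "http://localhost:8080/read",
    bounds=[1250, -750, 0, 1500, -500, 250],
    depthBegin=14,
    depthEnd=15,
    scale=0.1,
    offset=[637300, 851210, 520],
    filter={"Classification": {"$in": [2, 3]}},
)
print(url)

# Round trip: decoding the query string recovers the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["bounds"][0])
```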
        

Read query


            /read?
                bounds=[1250,-750,0,1500,-500,250]&
                depthBegin=14&
                depthEnd=15&
                scale=0.1&
                offset=[637300,851210,520]&
                schema=[{"name":"X","type":"uint32"}, ...]&
                filter={"$or":[{"Z":{"$gt":200}},{"Z":{"$lt":200}}]}
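
The filter syntax resembles MongoDB-style query operators. The sketch below shows how such a filter could be evaluated against a single point. It illustrates the assumed semantics only, not Greyhound's implementation, and the Z thresholds in the second filter are illustrative values.

```python
import operator

# Comparison operators that appear in the slides' filters.
COMPARATORS = {
    "$gt": operator.gt,
    "$lt": operator.lt,
    "$ne": operator.ne,
    "$in": lambda value, options: value in options,
}


def matches(point, query):
    """Return True if `point` (dimension name -> value) satisfies `query`."""
    for key, condition in query.items():
        if key == "$or":
            ok = any(matches(point, sub) for sub in condition)
        elif key == "$and":
            ok = all(matches(point, sub) for sub in condition)
        elif isinstance(condition, dict):
            ok = all(COMPARATORS[op](point[key], arg)
                     for op, arg in condition.items())
        else:
            ok = point[key] == condition  # a bare value means equality
        if not ok:
            return False
    return True


ground = {"Classification": {"$in": [2, 3]}}
outside_band = {"$or": [{"Z": {"$gt": 200}}, {"Z": {"$lt": 0}}]}

point = {"Classification": 2, "Z": 50}
print(matches(point, ground), matches(point, outside_band))
```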
        

Visualization

Trillions of points in a browser

Potree / Plasio

dk nyc

Visualization

✓

...but so what?

So what?

Visualization-first ≠ visualization-only

  • Eases burden of point cloud fluffiness
fluffy-pup
Data source: Pinterest

Analytics

dublin-flights
Dublin
Data source: NYU

Can we estimate the density?

Analytics

Density estimation

  • Random samples within the dataset bounds:
  • 
                /read
                    ?bounds=[p.x, p.y, p.x + 10, p.y + 10]
                    &schema=["X","Y","Z"]
            
  • For queries with points, track points/unit²
  • Aggregate average density
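
The sampling loop above can be sketched as follows. The `count_points` callable is a hypothetical stand-in for issuing the /read query and counting the returned points. It is replaced here by a synthetic dataset so the sketch runs offline.

```python
import random


def estimate_density(bounds, count_points, samples=1000, size=10.0, seed=0):
    """Average points per square unit over random non-empty sample windows."""
    xmin, ymin, xmax, ymax = bounds
    rng = random.Random(seed)
    densities = []
    for _ in range(samples):
        # Random window origin, kept fully inside the dataset bounds.
        x = rng.uniform(xmin, xmax - size)
        y = rng.uniform(ymin, ymax - size)
        n = count_points((x, y, x + size, y + size))
        if n > 0:  # only queries that returned points contribute
            densities.append(n / (size * size))
    return sum(densities) / len(densities) if densities else 0.0


def fake_count(box):
    """Synthetic stand-in for /read: a uniform 8 points per square unit."""
    x0, y0, x1, y1 = box
    return round(8 * (x1 - x0) * (y1 - y0))


density = estimate_density((0, 0, 5000, 5000), fake_count)
print(f"{density:.2f} points per square unit")
```

Skipping empty windows keeps gaps in coverage (water, dataset edges) from dragging the average toward zero, which matches the "for queries with points" step.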

Intensity patchwork

midwest-intensity
Iowa & Minnesota
Data source: USGS

Intensity normalization

  • For each file, average the intensities at low resolution
  • Average those averages for a "target" intensity
  • Scale intensities by totalAverage/fileAverage
  • For p in files:
  • 
                    /read
                        ?filter={"$and": [
                            {"Path": p},                      // Select path.
                            {"Classification": {"$ne": 7}}]}  // Filter noise.
                        &depthEnd=12                        // Low resolution.
                        &schema=["Intensity"]               // Intensity only.
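
Once the low-resolution intensities have been fetched per file, the three normalization steps reduce to a few lines. A minimal sketch: the file names and values are made up, and the input dictionary stands in for the results of the per-file /read queries.

```python
def normalization_scales(intensities_by_file):
    """Map each file to the factor totalAverage / fileAverage."""
    # Step 1: average the intensities of each file (empty files are skipped).
    file_avgs = {
        path: sum(values) / len(values)
        for path, values in intensities_by_file.items()
        if values
    }
    # Step 2: average those averages to get the "target" intensity.
    target = sum(file_avgs.values()) / len(file_avgs)
    # Step 3: per-file scale; assumes no file averages exactly zero.
    return {path: target / avg for path, avg in file_avgs.items()}


# Hypothetical sample: one bright file and one dim file.
sample = {
    "a.laz": [100, 200, 300],  # average 200
    "b.laz": [20, 40, 60],     # average 40
}
scales = normalization_scales(sample)
print(scales)
```

Multiplying every intensity in a file by its factor moves that file's average onto the shared target, here 120 for both files.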
                

Before/After

450B points in 5 minutes

midwest-intensity-results

Analytics

  • Sampling - density, intensity, etc.
  • Visual quality-control
  • Large areas at low resolution:
    • Boundary mapping
    • Average vegetative heights
    • Density of planar surfaces
      • Higher density → more urban

Create

✓

Read

✓

Update

?

Create

✓

Read

✓

Update

✓

Push new attributes

append-dimensions
  • GET /read?schema=["X","Y","Z"]&...
  • PUT /write?schema=["MyAttr","Mask"]&...

Classify Central Park

central-park

Classify Central Park

nyc-potree-lasso

Density Sampling

nyc-sample

Results

nyc-results

PMF

nyc-results

SMRF

nyc-results

SMRF ⊕ PMF

nyc-results

Zoomed in

nyc-closeup

PMF

nyc-closeup

SMRF

nyc-closeup

PMF ⊕ SMRF

nyc-closeup

Default Classification

autzen

PMF

autzen

SMRF

autzen

SMRF ⊕ PMF

autzen

Adding data

  • Algorithm development
  • Classifiers
  • Application-specific data
    • Feature markings
    • User annotations

Create

✓

Read

✓

Update

✓

Delete

✗

Visualization-first

...but not "-only"

  • User-driven access patterns
  • Millisecond response times
sncf
SNCF Railway
Data source: SNCF

Future work

  • Appending data attributes: prototype → release
  • Expanding the meta-API
  • Further server-side integration with PDAL
    • Pipeline execution
    • Filtering via polygon
    • Rasterization, colorization, etc.

Links