|Exercise 5|Parallel Processing with Custom Transformer|
|Data|3D Point Clouds (ASPRS Lidar Data Exchange Format (LAS))|
|Overall Goal|Create a custom transformer to parallel process data|
|Demonstrates|Custom Transformers and Parallel Processing|
The city has recently started collecting point cloud data and now it is ready for sharing with different departments. You have been asked to create a solution that converts the point clouds to a vector format that other departments can use.
You quickly create a workspace that also tiles and thins the data, so the destination datasets aren’t overwhelming in size.
However, the workspace takes longer to run than you would like. Because it will run daily, it would be useful to speed up the translation using parallel processing.
Since none of the transformers used has a parallel processing parameter, you’ll have to create a custom transformer to do this.
1) Open Workspace
Open the workspace C:\FMEData2016\Workspaces\DesktopAdvanced\CustomTransformers-Ex5-Begin.fmw
As you’ll see, this workspace processes some incoming point cloud data. Inspect the data to see what we’re dealing with. If you run the workspace as-is it will take approximately three minutes. To make it run a little faster you can increase the Thinning Interval parameter in the PointCloudThinner (say to 25).
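As a rough intuition for why a larger Thinning Interval shortens the run, the sketch below assumes that thinning keeps every Nth point of the cloud. That is an assumption made for illustration; the function name `thin_every_nth` is hypothetical and this is not the PointCloudThinner's actual implementation.

```python
def thin_every_nth(points, interval):
    """Keep every `interval`-th point, starting with the first one.

    Assumption: thinning by interval means "keep 1 point in N".
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return points[::interval]


# A stand-in "point cloud" of 1,000 (x, y, z) tuples.
cloud = [(float(i), float(i), 0.0) for i in range(1000)]
thinned = thin_every_nth(cloud, 25)
print(len(cloud), "->", len(thinned))
```

Under that assumption, an interval of 25 leaves about 1/25 of the points for every downstream transformer to handle.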
Open a task manager (process manager) tool for your operating system. Run the workspace. You’ll see a single FME engine process running (fme.exe):
|First Officer Transformer says…|
|You’ll also see an fmeworkbench.exe process, which is the process running the Workbench interface. This isn’t responsible for processing the workspace; the two are completely separate processes.|
2) Create Custom Transformer
Now select the PointCloudThinner and PointCloudCoercer transformers and turn them into a custom transformer.
It's important you don’t include the Tiler transformer, as this is creating the tiles that we’ll be using as a way to parallel process.
You can call the transformer something like PointCloudProcessing. It doesn’t matter what attribute reference handling you choose.
The transformer definition should look something like this:
3) Set Parallel Processing
In the Navigator window (of the custom transformer definition) locate and expand the section of custom transformer advanced parameters.
Double-click the Parallel Processing Level parameter to set it. Set the processing level to Moderate.
Click OK to close the dialog and you’ll notice the Parallel Process By parameter is now published.
4) Set Process By
Return to the main canvas and click on the parameters button for the custom transformer instance. Select both _column and _row as the attributes to process by:
This means that each unique combination of _column and _row (i.e. each tile) will be run under a separate process, up to a maximum of one process per processor core.
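The idea of processing by group can be sketched outside FME as follows. This is a minimal illustration, not how FME is implemented: features are plain dictionaries, `process_tile` is a hypothetical stand-in for the work done inside the custom transformer, and the worker limit defaults to the number of cores, mirroring the Moderate level described above.

```python
import os
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def group_by_tile(features, keys=("_column", "_row")):
    """Group features by each unique combination of the given attributes."""
    groups = defaultdict(list)
    for feature in features:
        groups[tuple(feature[k] for k in keys)].append(feature)
    return dict(groups)


def process_tile(item):
    """Hypothetical per-tile work: here it only counts the tile's features."""
    tile, features = item
    return tile, len(features)


def run_groups(groups, max_workers=None, executor_cls=ProcessPoolExecutor):
    """Run one task per group, with at most `max_workers` running at once.

    Defaults to one worker per core. Pass ThreadPoolExecutor as
    `executor_cls` for a quick in-process trial.
    """
    max_workers = max_workers or os.cpu_count() or 1
    with executor_cls(max_workers=max_workers) as pool:
        return dict(pool.map(process_tile, groups.items()))
```

When trying this with the default process pool on Windows, call `run_groups` from inside an `if __name__ == "__main__":` block, because each worker process re-imports the script. Note that the number of groups caps the useful parallelism: with fewer tiles than cores, some workers have nothing to do.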
5) Run Workspace
Run the workspace, again with a task manager window open. Once the tiling is complete and the rest of the workspace is being processed, you’ll notice a number of FME worker processes (fmeworker.exe).
In Moderate mode, you’ll see up to one fmeworker process for each core. This time the translation should complete in roughly half the time, approximately one minute and thirty seconds.
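If you would rather count the processes than watch them in a task manager, the following sketch tallies the FME executables named in this exercise from the output of the Windows `tasklist` command. The helper names are hypothetical, and the parsing assumes the CSV layout produced by `tasklist /FO CSV /NH`, where the image name is the first column.

```python
import csv
import io
import subprocess
from collections import Counter

FME_PROCESSES = ("fme.exe", "fmeworker.exe", "fmeworkbench.exe")


def count_fme_processes(tasklist_csv):
    """Count FME-related processes in `tasklist /FO CSV /NH` output."""
    counts = Counter()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        # Skip blank lines; compare image names case-insensitively.
        if row and row[0].lower() in FME_PROCESSES:
            counts[row[0].lower()] += 1
    return dict(counts)


def read_tasklist():
    """Return the current process list as CSV text (Windows only)."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `count_fme_processes(read_tasklist())` a few times while the workspace runs gives a snapshot of how many fmeworker processes are active at that moment.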
Absolutely do NOT run this in "breakpoint mode". If you do, parallel processing won't work!
You are in breakpoint mode when Run with Breakpoints is set under the Run menu - whether or not you have any breakpoints set!
Similarly, do NOT run this in "full inspection mode". In that case the workspace will fail with a fatal error. Again, you are in full inspection mode when Run with Full Inspection is set under the Run menu.
6) Experiment with Parallel Processing Level
If you have time, re-run the workspace with a different processing level, say Aggressive. Does it run any quicker than the Moderate processing level? If not, why might that be? Does adjusting the number of tiles make it better or worse?
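One way to reason about these questions is Amdahl's law, which estimates the speedup when only part of a job can run in parallel. The sketch below is a back-of-the-envelope model, not a measurement: the parallel fraction and worker counts are hypothetical, and the law assumes each worker has a core to itself, so starting more processes than cores may yield less than it predicts.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Speedup predicted by Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def estimated_runtime(serial_seconds, parallel_fraction, workers):
    """Estimated run time given the single-process run time."""
    return serial_seconds / amdahl_speedup(parallel_fraction, workers)


# Hypothetical figures: a 180-second run in which two thirds of the work
# (everything after tiling) can be spread across workers.
for n in (1, 4, 6, 8):
    print(n, "workers:", round(estimated_runtime(180, 2 / 3, n)), "s")
```

With these made-up figures, four workers bring the run to about 90 seconds while six only reach about 80, so the serial part of the workspace (reading and tiling) quickly dominates. The number of tiles matters too: too few tiles leave workers idle, while very many small tiles add per-process startup overhead.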
By completing this exercise you have learned how to: