Walkthrough: Integrating Multiple Feature Types into a Geodatabase
In this section we will integrate data in several different formats into a single database: an Esri geodatabase, a format commonly used in GIS applications.
This walkthrough will reiterate procedures from the previous exercises, in addition to showing how to write multiple feature types using a single writer. After integrating the data into a single database, we will carry out some basic analysis of the data.
1) Start Workbench
Use the Generate Workspace dialog to create a workspace using these parameters:
|Reader Format||Google KML|
|Writer Format||Esri Geodatabase (File Geodb Open API)|
Note: Make sure you select File Geodb Open API for your geodatabase format or you may run into problems later on. The Open API implementation of GDB will allow you to create and view this geodatabase in FME Data Inspector without an Esri license. However, you will require a licensed copy of ArcMap, ArcGIS Pro, or ArcGIS Online to view the GDB in its native software. Alternatively, your instructor might ask you to use a different format, for example, a PostGIS database. FME makes it easy to write whatever format you want.
Your dialog should look like this:
Click the Parameters buttons in the Generate Workspace dialog to check the reader/writer parameters. We will change one for this exercise. Under your Writer Parameters, in the Database Connections section, check Overwrite Existing Geodatabase. With that checked we will recreate the database entirely every time we write out the data. Your dialog should look like this:
Click OK twice. You will be presented with a Select Feature Types dialog because our reader data set contains multiple layers. We only need the feature type named Neighborhoods, which contains the polygons of neighborhood boundaries. Make sure Neighborhoods is the only feature type selected (you can use Select All to toggle all layers off first):
Note: If you pressed OK before setting the Writer Parameters, you can change this after generating your workspace. In the Navigator window, expand your CityDatabase FILEGDB writer and set Parameters > Overwrite Existing Geodatabase to Yes.
2) Clean Up Generated Workspace
The generated workspace added a geometry filter automatically. This happens because Esri geodatabases store layers with only one geometry type.
However, our reader feature type only contains polygons, so we can clean up the starting workspace by:
- deleting the GeometryFilter
- deleting the Neighborhoods_point, Neighborhoods_line, and Neighborhoods_geom writer feature types
- renaming the Neighborhoods_polygon writer feature type to just "Neighborhoods"
Your workspace should now look like this:
3) Add Excel Reader
Let's add two more readers. We'll look at two datasets from the list above that need changes to the defaults to work as we wish. Other situations that might require similar changes are covered in the Additional Procedures section.
You can add readers by clicking Readers > Add Reader, or by clicking an empty space on the canvas and typing the name or file extension of the format you wish to add and picking it from the Quick Add Menu.
First, let's add an Excel file of public art:
|Reader Format||Microsoft Excel|
We will make two changes to the reader defaults.
First, change Workflow Options from Individual Feature Types (the default) to Single Merged Feature Type. The Excel reader treats rows as features and worksheets as feature types. This means by default, FME would try to read every sheet in the file as a separate feature type. If you inspect the data, you'll find it has one sheet per neighborhood. This works in Excel, but we would prefer that our database treat all the public art as one point layer. Choosing Single Merged Feature Type does this for us.
Second, we need to add a coordinate system. The reader has latitude and longitude coordinates in the Excel file, but because Excel does not store coordinate system information, we have to tell FME which one to use. We know that it is LL84, so type that into the Coordinate System box in your dialog.
Your dialog should look like this:
Before clicking OK, click on Parameters. We have to ensure the reader recognizes the right columns in the Excel sheet as coordinates. In the Attributes table, column C (named Longitude) should be type x_coordinate and column D (named Latitude) should be type y_coordinate. Your dialog should look like this:
Click OK twice. This adds a new Excel reader to our Navigator window and gives it a feature type on our canvas named <All>, the default name for merged feature types. This isn't very descriptive, however. Let's add an annotation to indicate that this is the merged public art data. Right-click your feature type and select Attach Annotation. You can write something like, "Public Art Installations (points)":
Even though we added the public art installations from all neighborhoods, we can still distinguish between neighborhoods through the fme_feature_type attribute. This attribute simply gives the name of the feature type the feature belongs to. It exists on all FME features, but is not always exposed. It is exposed by default whenever you add a merged feature type. You can see it if you click the drop-down arrow next to your new <All> reader feature type.
Use Run Just This on the new <All> feature type and then inspect it using Visual Preview. You can see in Table View that the data is still organized by neighborhood: the fme_feature_type attribute contains the neighborhood name.
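If it helps to picture what Single Merged Feature Type does, here is a minimal Python sketch of the idea. It is not how FME implements the reader; the sheet names, titles, and coordinates are made up for illustration. Every row from every sheet ends up in one list, and each row remembers its source sheet in fme_feature_type.

```python
# Hypothetical workbook: one sheet per neighborhood, one row per installation.
# All names and coordinates below are invented for illustration.
sheets = {
    "Downtown": [
        {"Title": "Sculpture A", "Longitude": -123.12, "Latitude": 49.28},
    ],
    "Kitsilano": [
        {"Title": "Mural B", "Longitude": -123.16, "Latitude": 49.27},
        {"Title": "Fountain C", "Longitude": -123.15, "Latitude": 49.26},
    ],
}


def merge_sheets(sheets):
    """Flatten all sheets into one list, tagging each row with its sheet name."""
    merged = []
    for sheet_name, rows in sheets.items():
        for row in rows:
            feature = dict(row)  # copy so the source rows stay untouched
            feature["fme_feature_type"] = sheet_name
            merged.append(feature)
    return merged


features = merge_sheets(sheets)
for feature in features:
    print(feature["fme_feature_type"], feature["Title"])
```

The single merged list plays the role of the <All> feature type, while fme_feature_type still lets you tell the neighborhoods apart.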
Let's turn that fme_feature_type attribute into something more meaningful for our geodatabase. Add an AttributeManager transformer after the <All> reader feature type. Open the Parameter Editor and click in the Attribute Actions table in the Output Attributes column. This will let you rename that attribute. Let's change it to NeighborhoodName instead of fme_feature_type. The Action column will automatically change to Rename. Your Attribute Actions table should now look like this:
4) Add Writer Feature Type for Public Art
Now we need a writer feature type for our public art points.
|Remember, we already have our geodatabase writer set up. We don't want to add another writer, because we want all of our data to be written to the same output database. Instead, we need to use feature types, because we want our data to be organized by layer.|
We can add a feature type to a writer in the menu bar under Writers > Add Feature Type. Because we only have one writer it will automatically be selected. Name your new feature type PublicArt and give it Geometry type geodb_point. Your Feature Type dialog should look like this:
Click OK. Connect your new feature type to the Output port of your AttributeManager. Your canvas should look like this:
5) Add Shapefile Reader
Add another reader by clicking Readers > Add Reader, or by clicking an empty space on the canvas and typing the name or file extension of the format you wish to add and picking it from the Quick Add Menu. In this case that would be shp for Esri Shapefile.
|Reader Format||Esri Shapefile|
When you select your dataset, use Ctrl or Shift click to select all three bike path shapefiles.
The bike path data is split up into three shapefiles by length of the bike path (L for long, M for medium, and S for short). Just like the public art points, we don't need these features separated in our database. Therefore, let's change Workflow Options from Individual Feature Types to Single Merged Feature Type. Because shapefiles contain coordinate system information, we don't need to change that here. Your dialog should look like this:
Click OK. This adds another <All> feature type to our canvas. Again, we should add an annotation so it is clear which reader is which:
If you inspect the reader feature type you'll find the data already has a PathType attribute with values S, M, or L. Therefore, we don't need to use AttributeManager to rename fme_feature_type like we did for the public art data. However, we also don't want fme_feature_type to be written to our final data.
We can unexpose it on our reader feature type. Select your bike paths reader feature type and go to the Format Attributes tab in the Parameter Editor. You'll see that fme_feature_type is checked. Simply uncheck it and click OK/Apply:
Now if you click the drop-down arrow on your bike paths reader feature type you'll see that fme_feature_type has been removed from the schema as we wanted.
6) Add Writer Feature Type for Bike Paths
Now we need a writer feature type for our bike path lines. Click Writers > Add Feature Type, name your new feature type BikePaths, and give it Geometry type geodb_polyline. Your Feature Type dialog should look like this:
Click OK. Connect your <All> reader feature type containing the bike path features to your writer feature type BikePaths. Your canvas should look like this:
7) Reproject Data
Add a Reprojector transformer to the canvas and connect it between the Neighborhoods reader feature type and the Neighborhoods writer feature type. Choose UTM83-10 as the Destination Coordinate System. Right-click the Reprojector transformer and select Duplicate to add another. Connect this one between your AttributeManager output port and your PublicArt writer feature type.
This will ensure our neighborhoods and public art data is in the same coordinate system as our bike paths. Your canvas should look like this:
Save and run your workspace.
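To get a feel for what the Reprojector does to the public art coordinates, here is a rough Python sketch of the projection math only. It assumes the GRS80 ellipsoid (the one behind NAD83), uses the standard Krüger series for Transverse Mercator, and ignores any datum shift between LL84 and NAD83. The function name and the sample point are hypothetical; FME's own reprojection engine handles all of this for you.

```python
import math

# GRS80 ellipsoid (assumed here because UTM83-10 is based on NAD83).
SEMI_MAJOR_AXIS = 6378137.0
FLATTENING = 1 / 298.257222101
SCALE_FACTOR = 0.9996       # UTM scale on the central meridian
FALSE_EASTING = 500000.0    # metres; northern hemisphere has no false northing


def ll_to_utm_north(lat_deg, lon_deg, zone):
    """Project latitude/longitude (degrees) to northern-hemisphere UTM metres."""
    central_meridian = math.radians(zone * 6 - 183)  # zone 10 -> -123 degrees
    phi = math.radians(lat_deg)
    dlam = math.radians(lon_deg) - central_meridian

    n = FLATTENING / (2 - FLATTENING)
    radius = SEMI_MAJOR_AXIS / (1 + n) * (1 + n**2 / 4 + n**4 / 64)
    alpha = (
        n / 2 - 2 * n**2 / 3 + 5 * n**3 / 16,
        13 * n**2 / 48 - 3 * n**3 / 5,
        61 * n**3 / 240,
    )

    ecc = 2 * math.sqrt(n) / (1 + n)  # first eccentricity of the ellipsoid
    t = math.sinh(math.atanh(math.sin(phi)) - ecc * math.atanh(ecc * math.sin(phi)))
    xi = math.atan2(t, math.cos(dlam))
    eta = math.atanh(math.sin(dlam) / math.sqrt(1 + t * t))

    easting = FALSE_EASTING + SCALE_FACTOR * radius * (
        eta + sum(a * math.cos(2 * j * xi) * math.sinh(2 * j * eta)
                  for j, a in enumerate(alpha, start=1))
    )
    northing = SCALE_FACTOR * radius * (
        xi + sum(a * math.sin(2 * j * xi) * math.cosh(2 * j * eta)
                 for j, a in enumerate(alpha, start=1))
    )
    return easting, northing


# Hypothetical point near Vancouver, given as latitude and longitude.
easting, northing = ll_to_utm_north(49.25, -123.1, zone=10)
print(round(easting), round(northing))
```

The key takeaway is that the output is in metres rather than degrees, which is why lengths and distances calculated after reprojection are meaningful.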
8) Inspect Output
Let's take a look at our data integrated into one geodatabase. Open C:\FMEData2019\Output\Training\CityDatabase.gdb in Data Inspector. You should see all three layers, now all stored in the same format in a centralized database:
9) Example Data Integration Analysis
Let's look at an example of how integrating data facilitates analysis.
What if the City Planning Department wanted to know the total length of bike paths and number of public art installations by neighborhood? How would we do this using this workspace?
Take a minute and write or draw out how you would tackle this problem using what you have learned so far. Don't worry if you can't remember the exact name of transformers. Instead focus on outlining the process you would undertake to perform this analysis.
Let's find out if you were right! Note that there are usually multiple ways to solve a problem in FME, so your solution might still be valid.
Here are the steps we will take to conduct this analysis:
- Sum the bike path lengths by neighborhood.
- Count the public art installations by neighborhood.
- Output a table or chart to show the results.
Let's walk through how to do that in Workbench.
10) Calculate Statistics for Public Art
Add a StatisticsCalculator transformer after your public art Reprojector transformer. We will use this to count the number of public art installations by neighborhood. Your canvas should look like this:
Open the parameters for the StatisticsCalculator. Select any attribute for Attributes to Analyze (it is just counting if there is a value, so any attribute will do).
With the exception of Total Count Attribute, clear all the boxes in Calculate Attributes so they are not generated.
Finally, set Group By to NeighborhoodName. Your dialog should look like this:
Click OK/Apply. The Summary output port will now output a table with the count of public art installations by neighborhood.
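Conceptually, this is a grouped count. The following Python sketch shows the same idea on a handful of made-up features; it is only an illustration of the logic, not the transformer's implementation, and the helper name count_by_group is hypothetical.

```python
from collections import Counter

# Hypothetical public art features, after fme_feature_type was renamed.
public_art = [
    {"Title": "Sculpture A", "NeighborhoodName": "Downtown"},
    {"Title": "Mural B", "NeighborhoodName": "Kitsilano"},
    {"Title": "Fountain C", "NeighborhoodName": "Downtown"},
]


def count_by_group(features, group_by):
    """Return one summary row per group with the number of features in it."""
    counts = Counter(feature[group_by] for feature in features)
    return [
        {group_by: name, "_count": total}
        for name, total in sorted(counts.items())
    ]


summary = count_by_group(public_art, "NeighborhoodName")
for row in summary:
    print(row)
```

Each summary row carries the group value plus a _count attribute, which is the shape of table the Summary port produces.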
Let's write that out as a table in our geodatabase. Click Writers > Add Feature Type. Call it PublicArtSummaryStats and give it Geometry type geodb_no_geom. This will store it without geometry as a table. Click OK.
Once your new feature type is added, connect it to the Summary port of your StatisticsCalculator. Your canvas should look like this:
You can run the translation and inspect the table if you want.
11) Calculate Statistics for Bike Paths
Add a LineOnAreaOverlayer transformer to your canvas. This transformer adds attributes from the neighborhoods to the bike paths that overlap them, which tells us which neighborhood each bike path segment is in. Connect the neighborhoods Reprojector to the Area port (because it carries the neighborhood polygons) and the bike path reader feature type to the Line port. You will have overlapping connections for now, but we will improve the workspace layout in a later step:
Open the LineOnAreaOverlayer transformer parameters and check the box Attribute Accumulation > Merge Attributes. Your dialog should look like this:
We want to use this overlaid data to sum up the length of each feature and report the total length of bike paths, by neighborhood. However, if you inspect the bike path data, you'll notice that the PathLength attribute reports the total length of the entire bike path, rather than the individual line segment. You can confirm this by inspecting the bike paths reader feature type and noticing that PathLength is the same for all segments sharing the same PathName, e.g. all Seaside segments have a PathLength of 31,669 m:
This fact means we have to create a new attribute to measure the actual length of each bike path segment. We can easily do this by adding a LengthCalculator transformer after the LineOnAreaOverlayer transformer Line output port:
We can keep the defaults for this transformer, which will create a new attribute, _length.
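As a rough picture of what that new attribute holds, the sketch below measures a line in Python by adding up the straight-line distance between consecutive vertices. It assumes planar coordinates in metres (which is what we have after working in UTM83-10); the vertex values are invented, and polyline_length is a hypothetical helper rather than anything FME exposes.

```python
import math


def polyline_length(coords):
    """Planar length of a line: sum of distances between consecutive vertices."""
    return sum(math.dist(start, end) for start, end in zip(coords, coords[1:]))


# Hypothetical segment of the Seaside path, in projected metres.
feature = {
    "PathName": "Seaside",
    "PathLength": 31669,  # length of the whole path, not of this segment
    "coords": [
        (491000.0, 5458000.0),
        (491300.0, 5458400.0),
        (491300.0, 5458900.0),
    ],
}

feature["_length"] = polyline_length(feature["coords"])
print(feature["_length"])
```

Notice that _length describes only this segment, while PathLength still reports the full path. That difference is exactly why we need the new attribute for per-neighborhood totals.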
Add a StatisticsCalculator transformer and connect it to the output port of the LengthCalculator:
Open the parameters for StatisticsCalculator_2. Set Group By to NeighborhoodName. Set Attributes to Analyze to _length. With the exception of Sum Attribute, clear all the boxes in Calculate Attributes so they are not generated. Your dialog should look like this:
If you inspect the Summary port of the StatisticsCalculator_2 (using feature caching), you'll find that one of the neighborhood names is blank:
This is because some of the bike path lines do not intersect a neighborhood polygon. If you inspect the data, these paths are either around Stanley Park or along the water's edge. Let's fix this problem.
Add a NullAttributeMapper transformer between the LengthCalculator output port and the StatisticsCalculator_2 input port:
Set its parameters as follows:
|If Attribute Value Is:||"Missing (Selected Attributes Only)"|
|Map To:||New Value|
|New Value||Stanley Park|
Your dialog should look like this:
Click OK. Great! Now our NullAttributeMapper will set any missing NeighborhoodName values to Stanley Park. If you use Run to This on the StatisticsCalculator_2 and inspect the Summary port, you should now see Stanley Park in the table:
Let's write these results as a table in our geodatabase as well. Click Writers > Add Feature Type. Call it BikePathSummaryStats and give it Geometry type geodb_no_geom. Click OK. Once your new feature type is added, connect it to the Summary port of your bike paths StatisticsCalculator. Your canvas should look like this:
Now when you run the translation, a summary table of bike path length by neighborhood will be written as a table in the geodatabase.
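The last few transformers in this step boil down to "fill in the missing neighborhood name, then sum the lengths per neighborhood". Here is a small Python sketch of that logic with invented segment lengths; the helper names are hypothetical and this is only meant to illustrate the idea, not to mirror FME internals.

```python
from collections import defaultdict

# Hypothetical overlaid segments; one has no NeighborhoodName because it
# did not fall inside any neighborhood polygon.
segments = [
    {"NeighborhoodName": "Downtown", "_length": 120.0},
    {"NeighborhoodName": "Downtown", "_length": 80.0},
    {"_length": 250.0},
    {"NeighborhoodName": "Kitsilano", "_length": 300.0},
]


def map_missing(features, attribute, new_value):
    """Give features that lack the attribute a replacement value."""
    return [{**f, attribute: f.get(attribute, new_value)} for f in features]


def sum_by_group(features, group_by, value_attr):
    """Return one summary row per group with the summed value."""
    totals = defaultdict(float)
    for feature in features:
        totals[feature[group_by]] += feature[value_attr]
    return [{group_by: name, "_sum": total} for name, total in sorted(totals.items())]


cleaned = map_missing(segments, "NeighborhoodName", "Stanley Park")
summary = sum_by_group(cleaned, "NeighborhoodName", "_length")
for row in summary:
    print(row)
```

Without the mapping step, the unnamed segment would end up in a group with a blank name, which is the blank row we saw in the Summary port.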
12) Create Charts
Finally, let's create two charts to summarize our findings.
Add a ChartGenerator transformer to the canvas. Attach it to the Summary output port of your public art StatisticsCalculator as a second connection. We'll still be sending features to the PublicArtSummaryStats feature type, but we want to create a second stream of features to use to create a chart.
Open its parameters. Under Chart Settings, for Type select Pie. For Title enter: "Distribution of public art by neighborhood, Vancouver, BC." Change the following parameters:
- Style > Title Font: Arial, 6
- Data Series > Data Attribute: _count
- Data Series > Label Attribute: NeighborhoodName
- Data Series > Show Percentages: Yes
Your dialog should look like this:
|If you want to change the order the neighborhoods are displayed in for this or the bike paths chart, add a Sorter transformer before the ChartGenerator.|
Add another ChartGenerator to the canvas, this time connected to the Summary output port of your bike paths StatisticsCalculator_2. Let's make this chart type Bar and title it: "Bike path length by neighborhood, Vancouver, BC." Change the following parameters:
- Axes > X Axis Label: Neighborhood (type this in, not an attribute)
- Axes > Y Axis Label: Length (m)
- Axes > Axes Font: Arial, 6
- Axes > Tick Font: Arial, 6
- Style > Title Font: Arial, 10
- Data Series > Data Attribute: _sum
- Data Series > Data Label Attribute: NeighborhoodName
- Data Series > Label Angle: Slanted
This will generate a chart that shows the total length of bike paths in each neighborhood. Your dialog should look like this:
13) Write Charts to PNGs
Now let's connect these ChartGenerators to writers to write the charts as images. Click Writers > Add Writer and use the following parameters:
|Writer Format||PNG (Portable Network Graphics)|
For Add Feature Type(s) > Raster File Definition, choose Manual. We are choosing this because we don't want these chart images to map any schemas that already exist in our workspace. The dialog should look like this:
Another dialog will open to specify a feature type for our PNG writer. For Raster File Name enter PublicArtChart. Change Raster > World File Generation to No. Your dialog should look like this:
Click OK. Now connect your new writer feature type to the Output port of your public art ChartGenerator.
Repeat this process for a new feature type named BikePathChart: click Writers > Add Feature Type and ensure Writer is set to PNGRASTER. Connect that to the bike path ChartGenerator. Since this is as far as we'll be going in this example, add bookmarks and/or annotations to explain your workspace (see the Best Practice section of the lab if you need a reminder how to do this). After that your canvas should look something like this:
Save your workspace and then run it. You can inspect the output by opening Data Inspector, clicking Open Dataset, and using the following parameters:
|Reader Format||Esri Geodatabase (File Geodb Open API)|
Your dialog should look like this:
Here you can see all the layers in the geodatabase:
You can also navigate to C:\FMEData2019\Output\Training and open the charts:
By completing this exercise you have learned how to:
Now that you have some practice integrating data, it's your turn! Use the skills you gained in the previous exercises to add one more dataset to the workspace to answer a question or solve a problem.
Here are some example questions:
- How many addresses (C:\FMEData2019\Data\Addresses\Addresses.gdb) are within 100 meters of a bike path?
- Hint: use the Bufferer and PointOnAreaOverlayer transformers. Don't forget to make sure all your data shares a coordinate system before analyzing it.
- Where could the city locate a new public art installation? First find out which neighborhood has the fewest public art installations. Then find a city-owned property (C:\FMEData2019\Data\Parcels\CityProperties\CityProperties.shp) in that neighborhood that is the furthest away from existing public art installations. This is the site for a new installation.
- Hint: use the Sorter, Tester, PointOnAreaOverlayer, NeighborFinder (check out the _distance attribute), and Sampler (check out Sampling Type: First N Features) transformers.
- Do any city parks (C:\FMEData2019\Data\Parks\Parks.tab) not have access to bike lanes? If so, which ones? If not, which have the best and worst access?
- Hint: use the Bufferer, LineOnAreaOverlayer, and/or NeighborFinder.
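If you want to reason about the "within 100 meters" style of question before building it, the following Python sketch may help. A point lies inside a 100 m buffer of a line exactly when its distance to the line is 100 m or less, so the sketch tests that distance directly. It assumes planar coordinates in metres, uses invented points and a single straight path, and the function names are hypothetical. In Workbench you would still use the transformers named in the hints.

```python
import math


def point_segment_distance(point, start, end):
    """Shortest planar distance from a point to one straight line segment."""
    px, py = point
    ax, ay = start
    dx, dy = end[0] - ax, end[1] - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: both ends are the same vertex
        return math.hypot(px - ax, py - ay)
    # Position of the closest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_distance(point, path, limit):
    """True if the point is within `limit` of any segment of the path."""
    return any(
        point_segment_distance(point, start, end) <= limit
        for start, end in zip(path, path[1:])
    )


# Hypothetical bike path and address points, in projected metres.
path = [(0.0, 0.0), (1000.0, 0.0)]
addresses = [(500.0, 60.0), (500.0, 150.0), (1080.0, 0.0), (1080.0, 80.0)]

near = [pt for pt in addresses if within_distance(pt, path, 100.0)]
print(len(near), "of", len(addresses), "addresses are within 100 m")
```

This only works because everything is in the same projected coordinate system, which is the reason the hint reminds you to check coordinate systems before analyzing.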
As a reminder, please refer to your lab requirements.
The next section contains optional advice on some of the procedures you may have to carry out depending on the data you choose.
Finally, don't forget to answer your lab questions.