|Exercise 2.3||Feature Caching|
|Data||3-1-1 case location details (XLS hosted on the web)|
|Overall Goal||Inspect data|
|Demonstrates||Inspect data using feature caches Sort Visual Preview table|
Let's use feature caching to inspect our data.
1) Start FME Workbench
Start FME Workbench (if necessary) and open the workspace from Exercise 2.2. Alternatively you can open C:\FMEData2019\Workspaces\IntroToDesktop\Ex2.3-Begin.fmw.
2) Use Feature Caching
Feature caching is turned on by default; you can confirm by clicking the dropdown arrow next to the run button to see that Enable Feature Caching is on:
Run your workspace by clicking the run button if you have not done so in a previous exercise. Once it has run, you should notice a little green square with a magnifying glass on your reader feature type:
This icon represents a feature cache. The cache stores features at that point of the translation, and you can click it to inspect the data. Click the green icon or select the feature type to inspect the data at that point of the translation; your original Excel data will be displayed in the Visual Preview window:
3) Use Partial Runs with Feature Caching
Feature caching also allows us to selectively run sections of our workspace.
If you select your reader feature type by clicking it, you will see some tool-tip icons appear above it. Here you have two partial runs options. Because this feature type is the first object in the workspace, you can choose to Run Just This, which in this case reads the data and caches it. You can also choose to Run From This, which will run from this object to the end of the workspace, running only what it is connected to.
If you mouse over each option, you will see the objects that will run if you click are highlighted. For Run From This, just the reader feature type is highlighted:
For Run From This, the writer feature type will also run:
Finally, you can select the writer feature type and see the option Run To This, which will run all connected objects preceding this one:
In this specific scenario these tools are of limited use, since we only have two objects in our workspace. However, as our workspace grows, partial runs help us test and develop small sections of your workspace at a time. For now, use Run Just This on your reader feature type. Then, click the green icon to inspect your data.
4) Sort Table to Check for Missing and Inconsistent Values
Now that the Visual Preview window is displaying our data, let's use its table to identify problems in our source data.
The column headers in the table contain the names of each attribute. If you right-click them you get a context menu with many options: to sort the column naturally (FME will guess if it contains numbers or text), alphabetically, or numerically, or to clear all sorting:
Since ultimately we want to report on 3-1-1 calls by local planning area, let's see if any of our features are missing values for the
Local_Area attribute. To do so, scroll over in your table until you see the
Local_Area attribute. Then, right-click and select Sort Alphabetically Ascending:
If you scroll down, you will find there are 4,122 features with missing values for
If you take the time to look through, it turns out we have two kinds of missing data:
- Some records do not have street addresses, leading them to not be associated with a local area.
- There are also two rows of missing data in between each month of records. You can verify this by scrolling down to find the gaps between each month (where the
Monthattribute changes from
If you have a keen eye for data problems, you might also spot inconsistent naming for the values in the
Local Area attribute. Two of the area names have some values with a hyphen in some of the values, but not all: Dunbar Southlands and Dunbar-Southlands and Arbutus Ridge and Arbutus-Ridge:
You can verify this problem by using the search bar in the bottom-left of the table. Click the drop-down and select
Compare the results (visible in the bottom right) of typing in "Arbutus-" and "Arbutus" in the text box with the magnifying glass. You'll find that of the 2,236 features in the Arbutus-Ridge local area, 116 have a hyphen between the words:
We will address both these data quality issues in a later exercise.
|FME Lizard says...|
|This kind of visual inspection of data is a good first step, but more thorough data validation is usually required. As this is an introduction, we won't go into details regarding validation. If you want to learn more about data validation with FME, check out this article.|
By completing this exercise, you have learned how to: