Delimit your data in ModelBuilder

July 17, 2017

Tom Thompson

Working with tabular data can sometimes mean that the data isn’t formatted properly. Your project may need to sort attributes into their own fields for further analysis. What if the data has been separated by a particular character? How can we utilize the power of ModelBuilder to delimit our data into their own fields and automate the process? It can be done! Here’s how.

ModelBuilder is a visual tool that creates geoprocessing workflows. You can use model builder to automate long and repetitive geoprocessing tasks. Modelbuilder can also be utilized to automate delimiting data! By following this blog, you’ll learn how to accomplish this. Let’s get started.

We’ve acquired field data of elements and compounds detected in our study area. We want to run analysis on these elements and compounds for a study we’re conducting, but the data is grouped into a single field, making the analysis difficult. We don’t have the time to enter this manually, and we have a lot of data which will be coming in the same format. How can we delimit these data into their own fields, and how can we automate this process? We’ll utilize ModelBuilder.

Source field data table containing elements and compounds

1. First, we must determine the number of individual elements found in our elements field.

--With the new model open, add the Add Field tool.

Add Field tool in ModelBuilder

Within the Add Field tool, add a new attribute named IndexLength with a Field Type of SHORT. This field will hold the number of individual elements, separated by commas, found in the elements field.

Create a new field using the Add Field tool

2. Next, we’ll count the number of elements found in the elements field.

Add the Calculate Field tool.

Calculate field tool added in ModelBuilder

Within the Calculate Field tool, select the newly created IndexLength field for the output of the calculation. Within the Code Block, define the function name as indexCount and set the input parameter name to fields. As our elements field is separated by commas, we’ll set the .split() function to split based on that character and return the length of the .split(‘,’) function to count the number of elements separated by a comma. We’ll set the expression name to indexCount(!ELEMENTS!), matching the name of the function defined in the code block. Finally, we’ll set the expression type as PYTHON.

Calculate Field tool with a Code Block to count the IndexLength

Once completed, save and run the model. You’ll see the IndexLength field populated by the number of elements separated by commas.

Updated table showing the IndexLength field

3. To avoid repetition and to automate this process, we’ll add a For Loop to the model.

The for loop will be used to iterate through each element counted in the IndexCount field and will then create a new field for that single element, as well as pull the element value from the elements field into our newly created fields.

For loop added into ModelBuilder

Set the for loop parameters From Value to 1 and the To Value to the largest number listed in the IndexLength field. In our case, the max value is 9. We'll increment by 1 with each iteration of the loop by setting the By Value to 1.

Set the for loop tool parameters where the To Value matches the highest value from IndexLength field

4. Now that we know the number of attributes and have a loop to go through each of them, we have to add more fields to store those attributes.

Add a second Add Field tool to the model.

Second Add Field tool added to ModelBuilder

Within our second Add Field tool, we’ll utilize inline variable substitution in the Field Name parameter. We’ll add the Field%Value%, which will pull the variable Value from the for loop and place it at the end of Field, creating a new field for each loop from 1 to 9. The Field Type will be set as TEXT which matches the field type of our elements field.

Add Field tool with inline variable substitution

Now we'll run the model! The model will loop 9 times and will add a new field with each loop.

Updated table showing each new field for our delimited elements

5. Now that we have a new field for each element found in Elements, we need pull the elements values from Elements.

Add a second Calculate Field tool.

Second Calculate Field tool added to ModelBuilder

Again, we’ll utilize inline variable substitution to pull the element value from the .split() function list and place it into its corresponding newly created field. We subtract the inline variable by one because the values are stored in a list from 0-8 and not 1-9 as our %Value% is setup

Calculate Field tool with inline variable substitution

6. Lastly, for this approach to work without running into an out of range error, we must select all attributes and remove the lowest value from the selection with each iteration of the for loop.

Add a Select Layer by Attribute tool.

Select Layer by Attribute tool added in ModelBuilder

As in the past few tools, we’ll again utilize inline variable substitution in the Expression where we will remove the current for loop value from the selection.

Select by Attribute tool with inline variable substitution

7. Now we’re ready to run the whole model!

Make sure you’ve selected all fields through the attribute tables options first before running the model. Once all attributes are selected, validate the model and select run. You’ll receive some warnings about IndexLength already existing as the tool goes through each loop; this is normal and can be ignored. If setup properly, your table should look something like this!

Table with delimited attributes each in their own field

Your table may not show the changes right away. To fix this, re-load the table cache.

Re-load Cache found in Table Options of your attribute table

Now our attributes have been delimited by the commas and placed into their own fields for further analysis. This was all done with a single click of a button to execute our model built in ModelBuilder. As we receive more data in this format, we can modify the for loop length to match the new number of attributes found in the IndexLength field and re-use this model again and save ourselves a lot of time. This model can also be very quickly modified to delimit based on another character other than a comma.

I hope you learned a lot and enjoyed this blog!