Skip to main content

Data Transformation

Data rarely arrives in exactly the shape you need it. Field values might need reformatting, columns might need combining, data from multiple sources might need aggregating, or you might need to run custom logic before or after a sync completes. Data Sync provides four transformation tools to handle these scenarios, each suited to a different level of complexity.

Choosing the Right Tool

FeatureWhat it's forComplexity
Calculated ColumnsTransform or derive values using built-in functionsLow — no coding required
Dynamic ColumnsWrite C# properties for full .NET Framework accessMedium — requires C#
Project AutomationRun custom code at specific points in the sync lifecycleMedium–High — requires C#
Analysis GroupingAggregate and group rows from one or more project filesLow — configured via UI

Calculated Columns

Calculated Columns are the most common transformation tool and the right starting point for most scenarios. They let you create new columns on your data source using a library of built-in functions — similar in feel to Excel formulas — without writing any code.

You can use them to reformat values, combine fields, perform date arithmetic, do lookups against other data sources handle nulls, convert data types, and much more. The result appears as a virtual column in your schema map that can be mapped to the target just like any other column.

Use calculated columns when you need to:

  • Reformat or combine field values (e.g. build a full name from first and last name)
  • Convert data types or handle null values
  • Generate values based on row data (e.g. derive an email address from an ID)
  • Perform lookups against a second data source
  • Apply conditional logic to produce different output values per row

Read more about Calculated Columns here.

Dynamic Columns

Dynamic Columns take the same concept as Calculated Columns but let you write full C# class properties instead of using the built-in function library. This gives you access to the entire .NET Framework, making it the right choice when your transformation is too complex for the available functions.

A Dynamic Column is defined as a C# property getter — the property name becomes the column name and the return value is calculated for each row at data load time. You can also use BeginRow() to access the current row values before each calculation, and return false from BeginRow() to filter a row out of the results entirely.

Use dynamic columns when you need to:

  • Write transformation logic that goes beyond the built-in function library
  • Use .NET Framework classes or external libraries in your transformation
  • Filter rows out of the data source based on complex conditions
  • Share transformation logic across multiple columns in the same class

Read more about Dynamic Columns here.

Project Automation

Project Automation lets you run your own C# code at specific points in the Data Sync process — before the data loads, after the compare, before and after the sync, and on error. Unlike Calculated and Dynamic Columns which operate at the row level, Project Automation operates at the project level and is used to control or extend the sync process itself.

Common uses include dynamically changing a data source filter at runtime based on the current date, calling a stored procedure or webhook once the sync completes, logging results to an external system, or sending a notification when an error occurs.

Use project automation when you need to:

  • Dynamically adjust data source properties or filters before the sync runs
  • Trigger an action in another system when the sync completes or fails
  • Log or report results to somewhere outside of Data Sync
  • Control whether the sync proceeds based on the compare results
  • Handle item-level events (before/after each add, update, or delete)

Read more about Project Automation here.

Analysis Grouping

Analysis Grouping is a different kind of transformation — rather than modifying values within rows, it changes the shape of the data itself by grouping rows together and applying aggregate functions across them.

It works by taking one or more Data Sync project files as its source and applying a group-by operation, producing a condensed result set. For example if you have a products table where each product row contains a supplier ID, Analysis Grouping can collapse those rows into one row per supplier with an array of their associated products.

The available aggregate operations include Count, Sum, Min, Max, Average, First, Last, Mode, Median, Variance, Range, Standard Deviation, and both String and Integer arrays.

Use analysis grouping when you need to:

  • Aggregate rows that share a common key into a single result row
  • Calculate summary statistics (totals, averages, counts) across a data set
  • Produce array values from multiple related rows
  • Combine and summarise the output of multiple Data Sync projects

Read more about Analysis Grouping here.