
ETL testing

Tags: Knowledge sessions

On Wednesday 19 June, ABN AMRO hosted an IT-Circle event on testing ETL (the process of Extracting, Transforming and Loading information). 18 participants from all Circles were present. To kick off the meeting, two flip charts were set up on which participants could write down their goals and challenges for ETL testing. ABN AMRO then gave a brief introduction to their own challenges and concerns regarding the testing of ETL processes.
With 18 participants, we split into two groups. Each group held its own discussion and then briefly presented a summary to the other group.

The first discussion round was on automating ETL testing. It was interesting to see that one group focussed more on the why of automation, whereas the other group focussed on which tests to automate. One notable observation was that the technical tests are easier to automate, whereas the true acceptance tests seem to remain a manual effort.

Group one focussed on which tests are available. Most of the tests the group came up with belonged to the data and staging layer. Easy tests include row counts, content (expected data), consistency and validation of complete data sets. Tests for the later stages were harder, and there the discussion tended to shift towards the reason and need for automating them. Automation is valuable, however, because automated tests can be used as regression tests and help teams to test unchanged pieces of their service.
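As an illustration of the "easy" staging-layer tests mentioned above, here is a minimal sketch of a row count check and a content check between a source table and a staging table. It was not shown at the session. The table names (`src_customers`, `stg_customers`), the helper names and the use of an in-memory SQLite database are all assumptions made for the example, and a real ETL load would normally transform data rather than copy it one-to-one.

```python
import sqlite3


def row_count(conn, table):
    # Table names cannot be bound as SQL parameters, so pass trusted names only.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def rows_missing_from(conn, source, target):
    """Return rows that occur in `source` but not in `target`."""
    query = f"SELECT * FROM {source} EXCEPT SELECT * FROM {target}"
    return conn.execute(query).fetchall()


def check_staging_load(conn, source, staging):
    """Return a list of failure messages; an empty list means all checks passed."""
    failures = []
    src_n, stg_n = row_count(conn, source), row_count(conn, staging)
    if src_n != stg_n:
        failures.append(f"row count: {source}={src_n}, {staging}={stg_n}")
    missing = rows_missing_from(conn, source, staging)
    if missing:
        failures.append(f"{len(missing)} source row(s) missing from {staging}")
    unexpected = rows_missing_from(conn, staging, source)
    if unexpected:
        failures.append(f"{len(unexpected)} unexpected row(s) in {staging}")
    return failures


def demo():
    # Hypothetical data: the staging table starts out one row short.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_customers (id INTEGER, name TEXT);
        CREATE TABLE stg_customers (id INTEGER, name TEXT);
        INSERT INTO src_customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cas');
        INSERT INTO stg_customers VALUES (1, 'Ann'), (2, 'Bob');
    """)
    before = check_staging_load(conn, "src_customers", "stg_customers")
    conn.execute("INSERT INTO stg_customers VALUES (3, 'Cas')")
    after = check_staging_load(conn, "src_customers", "stg_customers")
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before fix:", before)
    print("after fix:", after)
```

Because the checks return messages instead of raising errors, the same function could run as a regression test after every load or feed a monitoring alert in a test environment.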

The need for test automation, and more specifically CI/CD, was also discussed, since not all participants actually recognized the need for automation in their environment. The idea of using monitoring, as is already set up for production, to raise red flags in test environments as early warning signs resonated with a number of people. So did the idea of setting up testing (and test automation) in parallel with the development of the system, which brings the possibility of early feedback and cross-sharing of knowledge and insights (and is very agile as well!).

After shuffling the groups we entered the second discussion round, which was on testing strategy and risk management. Again the approach in the two groups was radically different: one group focussed on risks and impacts and how to deal with side steering, whereas the other group looked at the different quality attributes and how to deal with them in ETL testing. Both groups arrived at the distinction between historical (source) data and derived data, where the history layer of data was seen as the most important.
Not all data and not all processes have the same impact or priority, so identifying the impact seems to be an interesting way of determining the depth of testing for each of them.

In the second group we acknowledged that, from a risk-based perspective, many of the tests you can think of apply more to the other layers in the system, such as data warehouses and data marts.

Time flies when you are having fun, so before we knew it, it was 17:00 and time to break up. All in all, the afternoon was well spent. Everybody learned something and connections were made for follow-up.

Interested in participating in one of these roundtables or in another activity? Then contact the coordinator of your organization or send a message to info@itcircle-nederland.nl

This article was written for the IT-Circle website by one of the participants in this session. To Bart Knaack (ABN Amro): thank you for your positive feedback!
