Privacy
Data privacy
If you decide to allow teams to build models based on real personal data stored by PayU, you must consider the following two points.
Preparation of training data
Data science activities should not be carried out on raw data but on data that has been deidentified (or even anonymized) as far as possible (e.g. by removing names, surnames and addresses, hashing card data, etc.). If a team intends to use any real user data, the scope of the data used and the manner of deidentification should be agreed between you and the relevant team. As we discussed, only a small number of teams plan to work on PayU personal data to create their models, so the consultation should not be too time-consuming.
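As a starting point for that discussion, the deidentification step could look roughly like the sketch below. It is only an illustration under stated assumptions: the field names (first_name, surname, address, card_number) and the key are hypothetical, and the actual fields and method must be agreed with each team. It uses a keyed hash (HMAC-SHA256) rather than a plain hash, because card numbers come from a small enough space that unkeyed hashes can be reversed by brute force.

```python
import hashlib
import hmac

# Hypothetical field names; the real schema must be agreed with each team.
DIRECT_IDENTIFIERS = {"first_name", "surname", "address"}
HASHED_FIELDS = {"card_number"}


def pseudonymize(value: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of the value as a hex string."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, key: bytes) -> dict:
    """Drop direct identifiers and replace hashed fields with keyed digests."""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # removed entirely from the training data
        if field in HASHED_FIELDS:
            cleaned[field] = pseudonymize(str(value), key)
        else:
            cleaned[field] = value
    return cleaned


if __name__ == "__main__":
    # Illustrative key only; a real key should stay outside the hackathon environment.
    key = b"illustrative-secret-key"
    row = {
        "first_name": "Jan",
        "surname": "Kowalski",
        "address": "ul. Przykladowa 1",
        "card_number": "4111111111111111",
        "amount": 120.5,
    }
    print(deidentify(row, key))
```

Please note that data treated this way is pseudonymized rather than anonymized: whoever holds the key can still link the hashed values back to the original ones, so the key should not be shared with participants.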
Safe environment for cooperation
Hackathon data science activities should be carried out in a dedicated safe environment that allows participants to share information and discuss ideas. The tools used by participants should keep all information secure, restrict access to verified users, and provide appropriate controls limiting the export of data. Any personal data should be deleted shortly after the event.
Also, please remember that any AI project should be assessed from a fairness and ethics perspective, especially with respect to bias control, transparency, explainability, etc. However, given that these models will not be used in production but will be removed after the hackathon, in this case we can waive such additional assessments.