Abstract:
Big data is significantly changing the way companies define their strategies and business models. The amount of data generated worldwide has reached dimensions that were unthinkable until a few years ago. However, this growth in the sources, volume, and velocity of collected data has also increased the storage and use of private information, such as personally identifiable information (PII), and with it the vulnerability of privacy. The healthcare sector, biomedical companies, the advertising sector, private companies, and government agencies invest heavily in collecting, aggregating, and sharing huge amounts of personal data, such as names, addresses, and credit card numbers, for the development of AI/ML systems, and this data needs to be protected. Big data can contain sensitive personal information that requires protection from unauthorized access and release. From a security point of view, the greatest challenge is to protect the privacy of individuals, and ensuring the privacy of people's data is mandatory under privacy laws. A possible solution for protecting this data is anonymization, a process that protects private information by deleting or encrypting the identifiers that link a specific individual to the data generated.
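As a minimal illustration of the anonymization idea described above, the following Python sketch deletes some direct identifiers from a record and replaces another with an irreversible token. The field names, the sample record, and the `anonymize` helper are hypothetical and chosen only for this example. The sketch also uses a keyed hash (HMAC-SHA256) in place of encryption, which is a substitution made here for simplicity and is not a method taken from this work.

```python
import hashlib
import hmac

# Hypothetical field names, chosen only for illustration.
DROP_FIELDS = {"name", "address", "credit_card"}
PSEUDONYMIZE_FIELDS = {"patient_id"}


def anonymize(record: dict, secret_key: bytes) -> dict:
    """Return a copy of record with direct identifiers deleted or replaced."""
    result = {}
    for field, value in record.items():
        if field in DROP_FIELDS:
            # Delete the identifier entirely.
            continue
        if field in PSEUDONYMIZE_FIELDS:
            # Assumption: a keyed hash (HMAC-SHA256) stands in for encryption.
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            result[field] = digest.hexdigest()
        else:
            # Non-identifying attributes are kept unchanged.
            result[field] = value
    return result


# Made-up sample record; none of these values come from real data.
record = {
    "patient_id": "P-1001",
    "name": "Jane Doe",
    "address": "1 Example Street",
    "credit_card": "0000-0000-0000-0000",
    "diagnosis": "asthma",
}
anonymized = anonymize(record, secret_key=b"demo-key-not-for-production")
```

Because the same key and input always produce the same token, records that belong to one individual remain linkable for analysis without exposing the original identifier.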