Comparison of reconstruction-based and heuristic-based privacy-preserving (PP) techniques

| Property | Reconstruction-based techniques | Heuristic-based techniques |
|---|---|---|
| Definition | Perturb raw data in a way that still lets the data collector reconstruct the original distribution | Alter records before public release to anonymise them |
| Examples | Randomisation, data swapping, synthetic data generation | k-anonymity, l-diversity, t-closeness |
| Data utility | Preserves data utility, suitable for ML-based analysis | May result in loss of data utility due to anonymisation |
| ML algorithm requirement | May require specific ML algorithms designed for perturbed data | May require adjusting certain ML algorithms to handle anonymised data |
| Suitability | Protects individual data points, useful when data collector is untrustworthy | Effective for public data sharing while protecting individual identities |
| Complexity | Varies based on perturbation type and dataset size | Depends on specific heuristic used and level of anonymisation |
| Computational overhead | May have higher overhead due to individual perturbation | Generally lower overhead compared to reconstruction-based techniques |
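
To make the two columns concrete, here is a minimal Python sketch, offered as an illustration rather than as the method of any particular system. For the reconstruction-based side it assumes one simple form of randomisation, a Warner-style randomised response on a single binary attribute, and shows how the collector can estimate the original proportion from the perturbed reports. For the heuristic-based side it assumes k-anonymity is checked by counting records per quasi-identifier combination after a simple age generalisation. All function names, field names (`age`, `zip`, `disease`), and parameter values are hypothetical.

```python
import random
from collections import Counter


def randomized_response(true_bits, p_truth, rng):
    """Reconstruction-based side: each individual reports the true bit with
    probability p_truth and the flipped bit otherwise."""
    return [b if rng.random() < p_truth else 1 - b for b in true_bits]


def reconstruct_proportion(reported_bits, p_truth):
    """Estimate the original share of 1s from the perturbed reports.

    E[observed] = p * pi + (1 - p) * (1 - pi), so
    pi = (observed - (1 - p)) / (2p - 1), which requires p != 0.5.
    """
    observed = sum(reported_bits) / len(reported_bits)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


def is_k_anonymous(records, quasi_identifiers, k):
    """Heuristic-based side: every combination of quasi-identifier values
    must be shared by at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def generalise_age(record, width=10):
    """Replace an exact age with a coarser range (assumed bucket width)."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


# Reconstruction-based: individual answers are noisy, the aggregate is recoverable.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000  # true proportion of 1s is 0.3
reported = randomized_response(true_bits, p_truth=0.8, rng=rng)
estimate = reconstruct_proportion(reported, p_truth=0.8)
print(f"estimated proportion: {estimate:.3f}")

# Heuristic-based: generalise quasi-identifiers until each group has >= k records.
records = [
    {"age": 23, "zip": "13053", "disease": "flu"},
    {"age": 27, "zip": "13053", "disease": "cold"},
    {"age": 31, "zip": "13068", "disease": "flu"},
    {"age": 38, "zip": "13068", "disease": "asthma"},
]
generalised = [generalise_age(r) for r in records]
print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(generalised, ["age", "zip"], k=2))
```

The sketch mirrors the trade-off in the table: the randomised reports need a dedicated estimation step before analysis, while the generalised records can be released directly but carry coarser age values.
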