Comparison with other state-of-the-art de-identification methods
| Paper | Bosch et al. (2020) | Farrow et al. (2023) | Holmes et al. (2023a) | Holmes et al. (2023b) | This paper | |||
|---|---|---|---|---|---|---|---|---|
| PII | Names | Names | Names | Names | Personal URL, emails | Names, locations and links | ||
| Method | Extra-trees + deep neural nets | Regular Expressions | Fine-tuned RoBERTa | Fine-tuned RoBERTa | Regular Expressions | GPT-4o | Llama-3.3 | Llama-3.1 |
| Names required | Yes | Yes | Yes | Yes | No | No | No | No |
| Precision | 0.827 | 0.550 | 0.680 | 0.740 | 0.270 | 0.579 | 0.506 | 0.262 |
| Recall | 0.970 | 0.905 | 0.840 | 0.700 | 0.870 | 0.946 | 0.962 | 0.928 |
Source(s): Authors’ own creation