com.johnsnowlabs.nlp.annotators.deid

DeIdentificationParams

trait DeIdentificationParams extends BaseDeidParams

A trait that contains all the params that are common between DeIdentificationModel and DeIdentification annotators.

See also

DeIdentification

DeIdentificationModel

BaseDeidParams

Linear Supertypes
BaseDeidParams, Params, Serializable, Serializable, Identifiable, AnyRef, Any

Abstract Value Members

  1. abstract def copy(extra: ParamMap): Params
    Definition Classes
    Params
  2. abstract val uid: String
    Definition Classes
    Identifiable

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def $[T](param: Param[T]): T
    Attributes
    protected
    Definition Classes
    Params
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. val ageRanges: IntArrayParam

    List of integers specifying limits of the age groups to preserve during obfuscation

    Definition Classes
    BaseDeidParams
  6. val ageRangesByHipaa: BooleanParam

    Whether to obfuscate ages according to the HIPAA (Health Insurance Portability and Accountability Act) Privacy Rule.

    The HIPAA Privacy Rule mandates that ages of patients older than 90 years must be obfuscated, while ages of patients 90 years or younger can remain unchanged.

    When true, age entities greater than 90 are obfuscated as per the HIPAA Privacy Rule and the others remain unchanged. When false, the ageRanges parameter applies.
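
The two age-related parameters can be illustrated with a small standalone sketch. It is not the library's implementation: the object `AgeSketch`, both helper names, and the reading of `ageRanges` as half-open groups between consecutive limits are assumptions made only for illustration.

```scala
object AgeSketch {
  // Rule taken from the description above: only ages above 90 are obfuscated.
  def obfuscateUnderHipaa(age: Int): Boolean = age > 90

  // Assumption: limits such as Array(1, 4, 12, 20) define the half-open
  // groups [1, 4), [4, 12) and [12, 20). Returns the group containing `age`,
  // or None when the age falls outside every group.
  def ageGroup(age: Int, limits: Array[Int]): Option[(Int, Int)] =
    limits.sorted.sliding(2).collectFirst {
      case Array(lo, hi) if age >= lo && age < hi => (lo, hi)
    }
}
```

Under these assumptions, an obfuscator that preserves age groups would pick a replacement age from the returned interval, so a 7-year-old stays in the group between 4 and 12.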

  7. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  8. val blackList: StringArrayParam

    List of entities that will be ignored in the regex file. The rest will be processed. The default values are "IBAN", "ZIP", "NPI", "URL", "DLN", "PASSPORT", "EMAIL", "C_CARD", "DEA", "SSN".

  9. final def clear(param: Param[_]): DeIdentificationParams.this.type
    Definition Classes
    Params
  10. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  11. val consistentObfuscation: BooleanParam

    Whether to replace very similar entities in a document with the same randomized term (default: true). The similarity is based on the Levenshtein distance between the words.

  12. def copyValues[T <: Params](to: T, extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  13. val dateFormats: StringArrayParam

    Format of dates to displace

    Definition Classes
    BaseDeidParams
  14. val dateTag: Param[String]

    Tag of the NER entity that represents dates (default: DATE).

  15. val dateToYear: BooleanParam

    true if dates must be converted to years, false otherwise

  16. val days: IntParam

    Number of days to obfuscate the dates by displacement. If not provided, a random integer between 1 and 60 will be used.

    Definition Classes
    BaseDeidParams
  17. final def defaultCopy[T <: Params](extra: ParamMap): T
    Attributes
    protected
    Definition Classes
    Params
  18. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  19. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  20. def explainParam(param: Param[_]): String
    Definition Classes
    Params
  21. def explainParams(): String
    Definition Classes
    Params
  22. final def extractParamMap(): ParamMap
    Definition Classes
    Params
  23. final def extractParamMap(extra: ParamMap): ParamMap
    Definition Classes
    Params
  24. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  25. val fixedMaskLength: IntParam

    Select the fixed mask length: this is the length of the masking sequence that will be used when the 'fixed_length_chars' masking policy is selected.

  26. val genderAwareness: BooleanParam

    Whether to use gender-aware names or not during obfuscation. This param affects only names. If the value is true, it might decrease performance. Default: false

    Definition Classes
    BaseDeidParams
  27. final def get[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  28. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  29. def getConsistentObfuscation: Boolean
  30. def getDateFormats: Array[String]
    Definition Classes
    BaseDeidParams
  31. def getDateTag: String
  32. def getDateToYear: Boolean
  33. def getDays: Int
    Definition Classes
    BaseDeidParams
  34. final def getDefault[T](param: Param[T]): Option[T]
    Definition Classes
    Params
  35. def getIgnoreRegex: Boolean
  36. def getLanguage: String
    Definition Classes
    BaseDeidParams
  37. def getMappingsColumn: String
  38. def getMaskingPolicy: String
  39. def getMetadataMaskingPolicy: String

    Gets metadataMaskingPolicy param

  40. def getMinYear: Int
  41. def getMode: String
  42. def getObfuscateDate: Boolean
  43. def getObfuscateRefSource: String
    Definition Classes
    BaseDeidParams
  44. def getObfuscationStrategyOnException: String
  45. final def getOrDefault[T](param: Param[T]): T
    Definition Classes
    Params
  46. def getParam(paramName: String): Param[Any]
    Definition Classes
    Params
  47. def getRegexOverride: Boolean
  48. def getReturnEntityMappings: Boolean
  49. def getSameEntityThreshold: Double
  50. def getSameLengthFormattedEntities(): Array[String]
    Definition Classes
    BaseDeidParams
  51. def getSeed(): Int
    Definition Classes
    BaseDeidParams
  52. def getUseShiftDays: Boolean

    Getter method of useShiftDays

  53. def getZipCodeTag: String
  54. final def hasDefault[T](param: Param[T]): Boolean
    Definition Classes
    Params
  55. def hasParam(paramName: String): Boolean
    Definition Classes
    Params
  56. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  57. val ignoreRegex: BooleanParam

    Whether to ignore the regex file loaded in the model. If true, the default regex file will not be used. The default value is false.

  58. final def isDefined(param: Param[_]): Boolean
    Definition Classes
    Params
  59. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  60. val isRandomDateDisplacement: BooleanParam

    Whether to use a random number of displacement days for date entities. The random number is based on DeIdentificationParams.seed. If true, random displacement days are used for date entities; if false, DeIdentificationParams.days is used. The default value is false.
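
The interplay between days, seed and isRandomDateDisplacement can be sketched with plain `java.time`. This is only an illustration, not the annotator's code: `DateShiftSketch` and its methods are hypothetical names, and drawing the seeded value from 1 to 60 is an assumption borrowed from the description of the days parameter.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Random

object DateShiftSketch {
  // Shifts a date string by `days`, keeping the original format.
  def shift(date: String, pattern: String, days: Int): String = {
    val fmt = DateTimeFormatter.ofPattern(pattern)
    LocalDate.parse(date, fmt).plusDays(days.toLong).format(fmt)
  }

  // Assumption: the seeded displacement lies between 1 and 60 (inclusive).
  def randomDays(seed: Int): Int = new Random(seed.toLong).nextInt(60) + 1

  // Chooses the seeded random displacement or the fixed `days` value.
  def displacement(isRandom: Boolean, seed: Int, days: Int): Int =
    if (isRandom) randomDays(seed) else days
}
```

Because the random value depends only on the seed, running the sketch twice with the same seed shifts every date by the same amount.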

  61. final def isSet(param: Param[_]): Boolean
    Definition Classes
    Params
  62. val language: Param[String]

    The language used to select the regex file and some faker entities: 'en' (English), 'de' (German), 'es' (Spanish), 'fr' (French), 'ar' (Arabic) or 'ro' (Romanian). Default: 'en'

    Definition Classes
    BaseDeidParams
  63. val mappingsColumn: Param[String]

    This is the mapping column that will return the Annotations chunks with the fake entities

  64. val maskingPolicy: Param[String]

    Select the masking policy:

    • 'entity_labels': Replace the values with the entity label.
    • 'same_length_chars': Replace the entity with asterisks of the same length minus two, plus a bracket on each end. If the entity is shorter than 3 characters (like Jo, or 5), only asterisks are used, without brackets.
    • 'fixed_length_chars': Replace the entity with a masking sequence composed of a fixed number of asterisks.
    • Default: 'entity_labels'
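
The three masking policies described above can be mimicked with a short pure function. This is a sketch of the described behaviour rather than the library's code; `MaskingPolicySketch.mask` and its parameters are hypothetical, and rendering the label as `<LABEL>` follows the example shown for the mode parameter.

```scala
object MaskingPolicySketch {
  // Masks a single entity chunk according to the chosen policy.
  def mask(chunk: String, label: String, policy: String, fixedMaskLength: Int): String =
    policy match {
      // Replace the chunk with its entity label.
      case "entity_labels" => s"<$label>"
      // Same overall length: brackets plus (length - 2) asterisks,
      // or only asterisks when the chunk is shorter than 3 characters.
      case "same_length_chars" =>
        if (chunk.length < 3) "*" * chunk.length
        else "[" + "*" * (chunk.length - 2) + "]"
      // Fixed number of asterisks, independent of the chunk length.
      case "fixed_length_chars" => "*" * fixedMaskLength
      case other =>
        throw new IllegalArgumentException(s"Unknown masking policy: $other")
    }
}
```

For the chunk "David Hale" with label PERSON, the sketch yields `<PERSON>`, `[********]` and, with a fixed length of 4, `****`.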
  65. val metadataMaskingPolicy: Param[String]

    If specified, the metadata includes the masked form of the document. Select one of the following masking policies to return the masked form in the metadata:

    • 'entity_labels': Replace the values with the entity label.
    • 'same_length_chars': Replace the entity with asterisks of the same length minus two, plus a bracket on each end. If the entity is shorter than 3 characters (like Jo, or 5), only asterisks are used, without brackets.
    • 'fixed_length_chars': Replace the entity with a masking sequence composed of a fixed number of asterisks.
    • Default: ""
  66. val minYear: IntParam

    Minimum year to use when converting date to year

  67. val mode: Param[String]

    Mode for Anonymizer ['mask'|'obfuscate']. Default: 'mask'

    • Mask mode: The entities will be replaced by their entity types.
    • Obfuscate mode: The entity is replaced by an obfuscator's term.

    Example: given the text "David Hale visited EEUU a couple of years ago":

    • Mask mode: "<PERSON> visited <COUNTRY> a couple of years ago"
    • Obfuscate mode: "Bryan Johnson visited Japan a couple of years ago"
  68. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  69. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  70. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  71. val obfuscateDate: BooleanParam

    When mode=="obfuscate", whether to obfuscate dates or not. This param helps in consistency to make dateFormats more visible. When setting it to true, make sure the dateFormats param fits your needs. If the value is true and obfuscation fails, DeIdentificationParams.unnormalizedDateMode is applied. When set to false, the date is masked to <DATE>. Default: false

  72. val obfuscateRefSource: Param[String]

    The source of obfuscation to obfuscate the entities. The values are the following:

    • 'file': Takes the entities from the obfuscatorRefFile.
    • 'faker': Takes the entities from the Faker module.
    • 'both': Takes the entities from the obfuscatorRefFile and the Faker module randomly.

    Definition Classes
    BaseDeidParams
  73. val obfuscationStrategyOnException: Param[String]

    The obfuscation strategy to be applied when an exception occurs.

    The obfuscation strategy determines how obfuscation is handled in case of an exception. Four possible values are supported:

    • "mask": The original chunk is replaced with a masking pattern.
    • "default": The original chunk is replaced with a default faker.
    • "skip": The original chunk is not replaced with any faker.
    • "exception": Throws the exception.

    The default obfuscation strategy is "default".
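
The four strategies listed above amount to a fallback applied when obfuscating a chunk throws. The sketch below shows that dispatch in isolation. It is not the annotator's implementation: `ExceptionStrategySketch`, the `defaultFaker` argument, and the use of `<LABEL>` as the masking pattern are assumptions for illustration.

```scala
import scala.util.{Failure, Success, Try}

object ExceptionStrategySketch {
  // Runs `obfuscate` on the chunk and applies the chosen strategy if it throws.
  def obfuscateWithFallback(chunk: String, label: String, strategy: String, defaultFaker: String)(
      obfuscate: String => String): String =
    Try(obfuscate(chunk)) match {
      case Success(fake) => fake
      case Failure(error) =>
        strategy match {
          case "mask"      => s"<$label>"   // replace with a masking pattern
          case "default"   => defaultFaker  // replace with a default fake value
          case "skip"      => chunk         // leave the original chunk untouched
          case "exception" => throw error   // propagate the original failure
          case other =>
            throw new IllegalArgumentException(s"Unknown strategy: $other")
        }
    }
}
```

Note that "skip" leaves the original text in the output, so it trades privacy for robustness, while "exception" stops processing at the first failing chunk.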

  74. val outputAsDocument: BooleanParam

    Whether to return all sentences joined into a single document

  75. lazy val params: Array[Param[_]]
    Definition Classes
    Params
  76. val regexOverride: BooleanParam

    If the value is true, prioritize the regex entities; if the value is false, prioritize the NER entities. The default value is false. If DeIdentification.combineRegexPatterns is true, this value has no effect.

  77. val region: Param[String]

    With this property, you can select particular dateFormats. It is used mainly when obfuscating dates: it decides whether the first or the second part of a date such as 11/11/2023 is the day. The values are the following:

    • 'eu' for European Union
    • 'us' for USA
    • Default: 'eu'
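
Why the region matters becomes clear with an ambiguous date. The sketch below parses the same string under both conventions using `java.time`. The mapping of 'eu' to `dd/MM/yyyy` and 'us' to `MM/dd/yyyy` is an assumption for illustration, and `RegionSketch` is a hypothetical name, not part of the library.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object RegionSketch {
  // Assumption: 'eu' reads day first, 'us' reads month first.
  def parse(date: String, region: String): LocalDate = {
    val pattern = region match {
      case "eu" => "dd/MM/yyyy"
      case "us" => "MM/dd/yyyy"
      case other =>
        throw new IllegalArgumentException(s"Unknown region: $other")
    }
    LocalDate.parse(date, DateTimeFormatter.ofPattern(pattern))
  }
}
```

With this sketch, "03/04/2023" is 3 April 2023 under 'eu' and 4 March 2023 under 'us', so any day displacement applied afterwards produces different results.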

  78. val returnEntityMappings: BooleanParam

    Whether to return the mapping column.

  79. val sameEntityThreshold: DoubleParam

    Similarity threshold [0.0-1.0] to consider two appearances of an entity as the same (default: 0.9). This does not apply to date entities.
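
A threshold in the range 0.0 to 1.0 implies some normalization of the Levenshtein distance mentioned for consistentObfuscation. The sketch below uses one common choice, one minus the distance divided by the longer length. That normalization is an assumption, not a documented detail of the library, and `SimilaritySketch` is a hypothetical name.

```scala
object SimilaritySketch {
  // Classic dynamic-programming Levenshtein distance, keeping two rows.
  def levenshtein(a: String, b: String): Int = {
    var prev = Array.range(0, b.length + 1)
    for (i <- 1 to a.length) {
      val curr = new Array[Int](b.length + 1)
      curr(0) = i
      for (j <- 1 to b.length) {
        val cost = if (a(i - 1) == b(j - 1)) 0 else 1
        curr(j) = math.min(math.min(curr(j - 1) + 1, prev(j) + 1), prev(j - 1) + cost)
      }
      prev = curr
    }
    prev(b.length)
  }

  // Assumption: similarity = 1 - distance / length of the longer string.
  def similarity(a: String, b: String): Double = {
    val longest = math.max(a.length, b.length)
    if (longest == 0) 1.0 else 1.0 - levenshtein(a, b).toDouble / longest
  }

  // Two mentions count as the same entity when similarity reaches the threshold.
  def sameEntity(a: String, b: String, threshold: Double): Boolean =
    similarity(a, b) >= threshold
}
```

Under this normalization, "Jonathan Smith" and "Jonathan Smyth" (one edit over 14 characters) pass a threshold of 0.9, while "John" and "Joan" (one edit over 4 characters) do not.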

  80. val sameLengthFormattedEntities: StringArrayParam

    List of formatted entities to generate the same length outputs as original ones during obfuscation. The supported and default formatted entities are: "phone", "fax", "id", "idnum", "bioid", "medicalrecord", "zip", "vin", "ssn", "dln", "plate", "license", "IRS", "CFN".

    Definition Classes
    BaseDeidParams
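
One way to picture same-length output for a formatted entity such as a phone number is to replace each digit and letter with a random one while keeping punctuation in place. This is only a sketch under that assumption; the library may generate its values differently, and `SameLengthSketch.fakeLike` is a hypothetical helper.

```scala
import scala.util.Random

object SameLengthSketch {
  // Replaces digits and ASCII-cased letters with seeded random ones,
  // leaving separators such as '(', ')', '-' and spaces untouched.
  def fakeLike(original: String, seed: Int): String = {
    val rnd = new Random(seed.toLong)
    original.map {
      case c if c.isDigit => ('0' + rnd.nextInt(10)).toChar
      case c if c.isUpper => ('A' + rnd.nextInt(26)).toChar
      case c if c.isLower => ('a' + rnd.nextInt(26)).toChar
      case c              => c
    }
  }
}
```

Called with "(555) 123-4567", the sketch returns a string of the same length with the parentheses, space and hyphen in their original positions, and the same seed always gives the same result.
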
  81. val seed: IntParam

    The seed used to select the entities in obfuscate mode. With the same seed, you can repeat an execution several times and get the same output.

    Definition Classes
    BaseDeidParams
  82. final def set(paramPair: ParamPair[_]): DeIdentificationParams.this.type
    Attributes
    protected
    Definition Classes
    Params
  83. final def set(param: String, value: Any): DeIdentificationParams.this.type
    Attributes
    protected
    Definition Classes
    Params
  84. final def set[T](param: Param[T], value: T): DeIdentificationParams.this.type
    Definition Classes
    Params
  85. def setAgeRanges(mode: Array[Int]): DeIdentificationParams.this.type

    List of integers specifying limits of the age groups to preserve during obfuscation

    Definition Classes
    BaseDeidParams
  86. def setAgeRangesByHipaa(value: Boolean): DeIdentificationParams.this.type

    Sets whether to obfuscate ages according to the HIPAA (Health Insurance Portability and Accountability Act) Privacy Rule.

    The HIPAA Privacy Rule mandates that ages of patients older than 90 years must be obfuscated, while ages of patients 90 years or younger can remain unchanged.

    value: If true, age entities greater than 90 are obfuscated as per the HIPAA Privacy Rule and the others remain unchanged. If false, the ageRanges parameter applies. Default: false.

  87. def setBlackList(list: Array[String]): DeIdentificationParams.this.type

    List of entities that will be ignored in the regex file. The rest will be processed. The default values are "IBAN", "ZIP", "NPI", "URL", "DLN", "PASSPORT", "EMAIL", "C_CARD", "DEA", "SSN".

  88. def setConsistentObfuscation(s: Boolean): DeIdentificationParams.this.type

    Whether to replace very similar entities in a document with the same randomized term (default: true). The similarity is based on the Levenshtein distance between the words.

  89. def setDateFormats(s: Array[String]): DeIdentificationParams.this.type

    Format of dates to displace

    Definition Classes
    BaseDeidParams
  90. def setDateTag(s: String): DeIdentificationParams.this.type

    Tag of the NER entity that represents dates (default: DATE).

  91. def setDateToYear(s: Boolean): DeIdentificationParams.this.type

    true if dates must be converted to years, false otherwise

  92. def setDays(k: Int): DeIdentificationParams.this.type

    Number of days to obfuscate the dates by displacement. If not provided, a random integer between 1 and 60 will be used.

    Definition Classes
    BaseDeidParams
  93. final def setDefault(paramPairs: ParamPair[_]*): DeIdentificationParams.this.type
    Attributes
    protected
    Definition Classes
    Params
  94. final def setDefault[T](param: Param[T], value: T): DeIdentificationParams.this.type
    Attributes
    protected
    Definition Classes
    Params
  95. def setFixedMaskLength(value: Int): DeIdentificationParams.this.type

    Sets the fixed mask length: the length of the masking sequence that will be used when the 'fixed_length_chars' masking policy is selected.

  96. def setGenderAwareness(value: Boolean): DeIdentificationParams.this.type

    Whether to use gender-aware names or not during obfuscation. This param affects only names. If the value is true, it might decrease performance. Default: false

    Definition Classes
    BaseDeidParams
  97. def setIgnoreRegex(s: Boolean): DeIdentificationParams.this.type

    Whether to ignore the regex file loaded in the model. If true, the default regex file will not be used. The default value is false.

  98. def setIsRandomDateDisplacement(s: Boolean): DeIdentificationParams.this.type

    Whether to use a random number of displacement days for date entities. The random number is based on DeIdentificationParams.seed. If true, random displacement days are used for date entities; if false, DeIdentificationParams.days is used. The default value is false.

  99. def setLanguage(s: String): DeIdentificationParams.this.type

    The language used to select the regex file and some faker entities: 'en' (English), 'de' (German), 'es' (Spanish), 'fr' (French), 'ar' (Arabic) or 'ro' (Romanian). Default: 'en'

    Definition Classes
    BaseDeidParams
  100. def setMappingsColumn(s: String): DeIdentificationParams.this.type

    This is the mapping column that will return the Annotations chunks with the fake entities

  101. def setMaskingPolicy(value: String): DeIdentificationParams.this.type

    Select the masking policy:

    • 'entity_labels': Replace the values with the entity label.
    • 'same_length_chars': Replace the entity with asterisks of the same length minus two, plus a bracket on each end. If the entity is shorter than 3 characters (like Jo, or 5), only asterisks are used, without brackets.
    • 'fixed_length_chars': Replace the entity with a masking sequence composed of a fixed number of asterisks.
    • Default: 'entity_labels'
  102. def setMetadataMaskingPolicy(value: String): DeIdentificationParams.this.type

    If specified, the metadata includes the masked form of the document. Select one of the following masking policies to return the masked form in the metadata:

    • 'entity_labels': Replace the values with the entity label.
    • 'same_length_chars': Replace the entity with asterisks of the same length minus two, plus a bracket on each end. If the entity is shorter than 3 characters (like Jo, or 5), only asterisks are used, without brackets.
    • 'fixed_length_chars': Replace the entity with a masking sequence composed of a fixed number of asterisks.
    • Default: ""
  103. def setMinYear(s: Int): DeIdentificationParams.this.type

    Minimum year to use when converting date to year

  104. def setMode(m: String): DeIdentificationParams.this.type

    Mode for Anonymizer ['mask'|'obfuscate']. Default: 'mask'

    • Mask mode: The entities will be replaced by their entity types.
    • Obfuscate mode: The entity is replaced by an obfuscator's term.

    Example: given the text "David Hale visited EEUU a couple of years ago":

    • Mask mode: "<PERSON> visited <COUNTRY> a couple of years ago"
    • Obfuscate mode: "Bryan Johnson visited Japan a couple of years ago"
  105. def setObfuscateDate(s: Boolean): DeIdentificationParams.this.type

    When mode=="obfuscate", whether to obfuscate dates or not. This param helps in consistency to make dateFormats more visible. When setting it to true, make sure the dateFormats param fits your needs. If the value is true and obfuscation fails, DeIdentificationParams.unnormalizedDateMode is applied. When set to false, the date is masked to <DATE>. Default: false

  106. def setObfuscateRefSource(s: String): DeIdentificationParams.this.type

    The source of obfuscation to obfuscate the entities. The values are the following:

    • 'file': Takes the entities from the obfuscatorRefFile.
    • 'faker': Takes the entities from the Faker module.
    • 'both': Takes the entities from the obfuscatorRefFile and the Faker module randomly.

    Definition Classes
    BaseDeidParams
  107. def setObfuscationStrategyOnException(value: String): DeIdentificationParams.this.type

    Sets the obfuscation strategy to be applied when an exception occurs.

    The obfuscation strategy determines how obfuscation is handled in case of an exception. Four possible values are supported:

    • "mask": The original chunk is replaced with a masking pattern.
    • "default": The original chunk is replaced with a default faker.
    • "skip": The original chunk is not replaced with any faker.
    • "exception": Throws the exception.

    The default obfuscation strategy is "default".

  108. def setOutputAsDocument(mode: Boolean): DeIdentificationParams.this.type

    Whether to return all sentences joined into a single document

  109. def setRegexOverride(s: Boolean): DeIdentificationParams.this.type

    If the value is true, prioritize the regex entities; if the value is false, prioritize the NER entities. The default value is false. If DeIdentification.combineRegexPatterns is true, this value has no effect.

  110. def setRegion(s: String): DeIdentificationParams.this.type

    With this property, you can select particular dateFormats. It is used mainly when obfuscating dates: it decides whether the first or the second part of a date such as 11/11/2023 is the day. The values are the following:

    • 'eu' for European Union
    • 'us' for USA
    • Default: 'eu'

  111. def setReturnEntityMappings(s: Boolean): DeIdentificationParams.this.type

    Whether to return the mapping column.

  112. def setSameEntityThreshold(s: Double): DeIdentificationParams.this.type

    Similarity threshold [0.0-1.0] to consider two appearances of an entity as the same (default: 0.9). This does not apply to date entities.

  113. def setSameLengthFormattedEntities(entities: Array[String]): DeIdentificationParams.this.type

    List of formatted entities to generate the same length outputs as original ones during obfuscation. The supported and default formatted entities are: PHONE, FAX, ID, IDNUM, BIOID, MEDICALRECORD, ZIP, VIN, SSN, DLN, LICENSE, PLATE, IRS, CFN.

    Definition Classes
    BaseDeidParams
  114. def setSeed(s: Int): DeIdentificationParams.this.type

    The seed used to select the entities in obfuscate mode. With the same seed, you can repeat an execution several times and get the same output.

    Definition Classes
    BaseDeidParams
  115. def setUnnormalizedDateMode(mode: String): DeIdentificationParams.this.type

    The mode to use if the date is not formatted: one of [mask, obfuscate, skip]. Default: obfuscate

  116. def setUseShiftDays(s: Boolean): DeIdentificationParams.this.type
  117. def setZipCodeTag(s: String): DeIdentificationParams.this.type
  118. val supportedFormattedEntities: Array[String]
    Attributes
    protected
    Definition Classes
    BaseDeidParams
  119. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  120. def toString(): String
    Definition Classes
    Identifiable → AnyRef → Any
  121. val unnormalizedDateMode: Param[String]

    The mode to use if the date is not formatted: one of [mask, obfuscate, skip]. Default: obfuscate

  122. val useShifDays: BooleanParam

    Whether to use the random shift day when the document has this in its metadata. Default: false

  123. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  124. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  125. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  126. val zipCodeTag: Param[String]

Deprecated Value Members

  1. def setUseShiftDayse(s: Boolean): DeIdentificationParams.this.type
    Annotations
    @deprecated
    Deprecated

    deprecated because of typo
